Polaris-4B-Preview from POLARIS-Project – Run On-Device with Mirai.

Polaris-4B-Preview

Run locally on Apple devices with Mirai

Type: Local
From: POLARIS-Project
Quantisation: No
Precision: float16
Size: 4B

Polaris is an open-source post-training method that uses reinforcement learning (RL) to strengthen a model's reasoning abilities. It shows that even a small model such as Qwen3-4B can improve significantly on challenging reasoning tasks through RL optimization, with reported benchmark results surpassing commercial systems such as Claude-4-Opus, Grok-3-Beta, and o3-mini-high. The method combines several techniques: data difficulty analysis and distribution mapping; diversity-based rollout sampling with progressively increasing temperature; inference-time length extrapolation, which generates longer chain-of-thought reasoning than the sequences used in training; and multi-stage training to improve exploration efficiency. Polaris relies on open-source data and academic-level resources to advance open-recipe reasoning models.
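To make the idea of raising the sampling temperature across training stages concrete, here is a minimal Python sketch. It is not the Polaris implementation: the toy logits, the per-stage temperature values, and all function names are hypothetical and chosen only for illustration. It shows that dividing logits by a larger temperature flattens the sampling distribution, so the entropy (a rough proxy for rollout diversity) rises from stage to stage.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats, used here as a rough proxy for diversity."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def sample_rollouts(logits, temperature, n, rng):
    """Draw n token indices from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=n)


# Hypothetical values: toy next-token logits and a made-up stage schedule.
TOY_LOGITS = [4.0, 2.0, 1.0, 0.5]
STAGE_TEMPERATURES = [1.0, 1.2, 1.4]

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs match
    for stage, temp in enumerate(STAGE_TEMPERATURES, start=1):
        probs = softmax_with_temperature(TOY_LOGITS, temp)
        rollouts = sample_rollouts(TOY_LOGITS, temp, 8, rng)
        print(f"stage {stage}: T={temp} entropy={entropy(probs):.3f} rollouts={rollouts}")
```

In a real RL setup the logits would come from the policy model at each decoding step, and the schedule would be tuned rather than fixed in advance; the sketch only isolates the effect of temperature on the sampling distribution.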

1. Choose a framework
2. Add the Mirai SDK as a Swift Package Manager (SPM) dependency: https://github.com/trymirai/uzu-swift
3. Set your Mirai API key
4. Apply the code
