Qwen3-0.6B

Run locally on Apple devices with Mirai

Type

Local

From

Alibaba

Quantisation

No

Precision

float16

Size

0.6B

Source

Hugging Face

Qwen3 is the latest generation of the Qwen series of large language models and includes both dense and mixture-of-experts models. The series improves on earlier Qwen releases in reasoning, instruction following, agent capabilities, and multilingual support. Qwen3-0.6B is a compact causal language model with 0.6 billion parameters. It can switch between a thinking mode for complex logical reasoning, mathematics, and coding tasks and a non-thinking mode for efficient general-purpose dialogue. The model has 28 layers with grouped query attention and supports a context length of 32,768 tokens. It is aligned to human preferences for creative writing and multi-turn dialogue, supports tool integration for agent use in both modes, and covers more than 100 languages and dialects, including multilingual instruction following and translation.
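The figures on this card (0.6B parameters, float16 precision, 28 layers, a 32,768-token context) are enough for a rough estimate of how much memory the model needs on a device. The sketch below is a back-of-the-envelope calculation, not a measurement of Mirai's runtime. The number of key/value heads (8) and the per-head dimension (128) are assumptions that do not appear on this card, and the names ModelShape, weight_bytes, and kv_cache_bytes are hypothetical helpers introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelShape:
    params: int           # nominal parameter count (card: 0.6B)
    layers: int           # transformer layers (card: 28)
    context_tokens: int   # maximum context length (card: 32,768)
    kv_heads: int         # ASSUMPTION: key/value heads under grouped query attention
    head_dim: int         # ASSUMPTION: dimension of each attention head
    bytes_per_value: int  # float16 stores each value in 2 bytes


def weight_bytes(shape: ModelShape) -> int:
    """Approximate size of the unquantised weights."""
    return shape.params * shape.bytes_per_value


def kv_cache_bytes(shape: ModelShape, tokens: int) -> int:
    """Approximate key/value cache size for a given number of tokens.

    Each token stores one key and one value vector per layer and per KV head.
    """
    per_token = 2 * shape.layers * shape.kv_heads * shape.head_dim * shape.bytes_per_value
    return per_token * tokens


# kv_heads and head_dim are assumed values, not taken from this card.
QWEN3_0_6B = ModelShape(
    params=600_000_000,
    layers=28,
    context_tokens=32_768,
    kv_heads=8,
    head_dim=128,
    bytes_per_value=2,
)

if __name__ == "__main__":
    gib = 1024 ** 3
    weights = weight_bytes(QWEN3_0_6B)
    cache = kv_cache_bytes(QWEN3_0_6B, QWEN3_0_6B.context_tokens)
    print(f"weights:  {weights / gib:.2f} GiB")
    print(f"KV cache: {cache / gib:.2f} GiB at full context")
```

Under these assumptions the weights take roughly 1.2 GB, while the key/value cache grows linearly with the number of tokens in the context, so shorter conversations need far less memory than the full-context figure suggests.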