Qwen3-4B is a 4-billion-parameter causal language model from the latest generation of the Qwen series. It supports seamless switching between a thinking mode for complex logical reasoning, mathematics, and coding, and a non-thinking mode for efficient general-purpose dialogue, all within a single model, so optimal performance across scenarios does not require separate models. Its reasoning surpasses the previous QwQ model in thinking mode and the Qwen2.5 instruct models in non-thinking mode on mathematics, code generation, and commonsense logical reasoning. Beyond reasoning, Qwen3-4B excels at human-preference alignment for creative writing, role-playing, multi-turn dialogue, and instruction following, and offers strong agent capabilities for precise integration with external tools. It supports over 100 languages and dialects with advanced multilingual instruction following and translation. Qwen3-4B natively supports a context length of 32,768 tokens, extensible to 131,072 tokens with YaRN scaling. With 36 layers and grouped query attention (32 query heads, 8 key-value heads), the model is optimized for both performance and efficiency across diverse applications.
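The efficiency benefit of grouped query attention comes mainly from the smaller KV cache: keys and values are stored only for the 8 KV heads, not all 32 query heads. A rough back-of-the-envelope sketch, using the layer and head counts stated above; the head dimension (128) and fp16 storage (2 bytes per element) are illustrative assumptions, not figures from this page:

```python
# Rough KV-cache size: multi-head attention (all 32 heads cached)
# vs grouped query attention (only the 8 KV heads cached), for a
# 36-layer model at the native 32,768-token context length.
# head_dim=128 and fp16 (2 bytes/elem) are assumed for illustration.

def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    # 2x for the separate key and value tensors,
    # per layer, per cached head, per sequence position.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

LAYERS, Q_HEADS, KV_HEADS, HEAD_DIM = 36, 32, 8, 128
SEQ = 32_768  # native context length

mha = kv_cache_bytes(LAYERS, Q_HEADS, HEAD_DIM, SEQ)   # cache every head
gqa = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ)  # cache KV heads only

print(f"full cache: {mha / 2**30:.1f} GiB")  # 18.0 GiB under these assumptions
print(f"GQA cache:  {gqa / 2**30:.1f} GiB")  # 4.5 GiB, 4x smaller
```

With 32 query heads sharing 8 KV heads, the cache shrinks by exactly the 32/8 = 4x grouping factor, which is what makes long contexts tractable on memory-constrained local hardware.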
Alibaba
available local models on Mirai:

Name                          Quantisation  Size
Qwen2.5-Coder-0.5B-Instruct   uint8         0.5B
Qwen2.5-Coder-1.5B-Instruct   uint8         1.5B
Qwen2.5-Coder-14B-Instruct    uint8         14B
Qwen2.5-Coder-32B-Instruct    uint8         32B
Qwen2.5-Coder-3B-Instruct     uint8         3B
Qwen2.5-Coder-7B-Instruct     uint8         7B
Qwen3-0.6B                    uint8         0.6B
Qwen3-0.6B-MLX-4bit           uint8         0.6B
Qwen3-0.6B-MLX-8bit           uint8         0.6B
Qwen3-1.7B                    uint8         1.7B
Qwen3-1.7B-MLX-4bit           uint8         1.7B
Qwen3-1.7B-MLX-8bit           uint8         1.7B
Qwen3-14B                     uint8         14B
Qwen3-14B-AWQ                 uint8         14B
Qwen3-14B-MLX-4bit            uint8         14B
Qwen3-14B-MLX-8bit            uint8         14B
Qwen3-32B                     uint8         32B
Qwen3-32B-AWQ                 uint8         32B
Qwen3-32B-MLX-4bit            uint8         32B
Qwen3-4B                      uint8         4B
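The YaRN-based extension to 131,072 tokens mentioned above is typically enabled through a rope-scaling entry in the model's configuration. A hedged sketch of what such a `config.json` fragment can look like (the factor 4.0 follows from 131072 / 32768; the exact field names follow the Hugging Face Transformers convention for Qwen models and should be checked against the official Qwen3 model card, as this page does not show them):

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768
  }
}
```

Because static rope scaling can degrade quality on short inputs, such a setting is usually enabled only when prompts actually approach or exceed the native 32,768-token window.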