Qwen3-8B-MLX-8bit is an 8.2-billion-parameter causal language model from the latest generation of the Qwen series. Within a single model, it can switch between a thinking mode for complex logical reasoning, mathematics, and coding tasks and a non-thinking mode for efficient general-purpose dialogue. Compared with earlier Qwen generations, it improves reasoning, human preference alignment for creative writing and multi-turn conversations, and agent capabilities with tool integration. The model supports over 100 languages and dialects, with strong multilingual instruction-following and translation capabilities. It natively supports context lengths of up to 32,768 tokens and can extend to 131,072 tokens using YaRN scaling. This MLX-optimized, 8-bit quantized version is designed for efficient inference while retaining the capabilities of the full Qwen3 model family.
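The context-extension figures above imply a YaRN scaling factor of 4 (131,072 / 32,768). The short Python sketch below works that out and assembles a configuration dictionary. The function names are hypothetical, and the dictionary keys follow the `rope_scaling` convention commonly seen in Qwen model configs, which is an assumption here; check the runtime you use for the exact field names it expects.

```python
# Context lengths quoted in the model description.
NATIVE_CONTEXT = 32_768
EXTENDED_CONTEXT = 131_072


def yarn_factor(native: int, target: int) -> float:
    """Return the factor needed to stretch `native` positions to `target`."""
    if native <= 0 or target < native:
        raise ValueError("target must be >= native > 0")
    return target / native


def rope_scaling_config(native: int, target: int) -> dict:
    """Build a rope_scaling-style dictionary (key names are an assumption)."""
    return {
        "rope_type": "yarn",
        "factor": yarn_factor(native, target),
        "original_max_position_embeddings": native,
    }


config = rope_scaling_config(NATIVE_CONTEXT, EXTENDED_CONTEXT)
print(config)
```

A factor of 4.0 covers the full 131,072-token window. If typical prompts are shorter, a smaller target (and so a smaller factor) may be a reasonable choice.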
Alibaba
available local models on Mirai:
| Name | Quantisation | Size |
| --- | --- | --- |
| Qwen2.5-Coder-0.5B-Instruct | uint8 | 0.5B |
| Qwen2.5-Coder-1.5B-Instruct | uint8 | 1.5B |
| Qwen2.5-Coder-14B-Instruct | uint8 | 14B |
| Qwen2.5-Coder-32B-Instruct | uint8 | 32B |
| Qwen2.5-Coder-3B-Instruct | uint8 | 3B |
| Qwen2.5-Coder-7B-Instruct | uint8 | 7B |
| Qwen3-0.6B | uint8 | 0.6B |
| Qwen3-0.6B-MLX-4bit | uint8 | 0.6B |
| Qwen3-0.6B-MLX-8bit | uint8 | 0.6B |
| Qwen3-1.7B | uint8 | 1.7B |
| Qwen3-1.7B-MLX-4bit | uint8 | 1.7B |
| Qwen3-1.7B-MLX-8bit | uint8 | 1.7B |
| Qwen3-14B | uint8 | 14B |
| Qwen3-14B-AWQ | uint8 | 14B |
| Qwen3-14B-MLX-4bit | uint8 | 14B |
| Qwen3-14B-MLX-8bit | uint8 | 14B |
| Qwen3-32B | uint8 | 32B |
| Qwen3-32B-AWQ | uint8 | 32B |
| Qwen3-32B-MLX-4bit | uint8 | 32B |
| Qwen3-4B | uint8 | 4B |