Qwen3-14B-MLX-8bit is an 8-bit quantized version of Qwen3-14B optimized for the MLX framework. Qwen3 is the latest generation of large language models in the Qwen series and includes both dense and mixture-of-experts models, with improvements in reasoning, instruction following, agent capabilities, and multilingual support. Its distinguishing feature is that a single model can switch between a thinking mode, for complex logical reasoning, mathematics, and coding, and a non-thinking mode, for efficient general-purpose dialogue. In reasoning, it surpasses the earlier QwQ and Qwen2.5 instruct models on mathematics, code generation, and commonsense logical reasoning. It is also well aligned with human preferences in creative writing, role-playing, multi-turn dialogue, and instruction following. The model supports over 100 languages and dialects, with strong multilingual instruction following and translation, and it integrates precisely with external tools for agent use. Qwen3-14B has 14.8 billion parameters and 40 layers; it natively supports a context length of 32,768 tokens, extendable to 131,072 tokens with YaRN scaling.
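The figures in the description can be checked with a little arithmetic. The sketch below is illustrative only: it derives the YaRN scaling factor implied by the two context lengths and a rough lower bound on the 8-bit weight footprint. The `rope_scaling` key names follow the convention commonly used in Hugging Face-style `config.json` files and are an assumption here; this page does not say how Mirai or MLX expects the setting to be supplied. The footprint estimate ignores quantization scales, activations, and the KV cache, so real memory use will be higher.

```python
import json

# Figures taken from the model description.
NATIVE_CONTEXT = 32_768       # native context length in tokens
EXTENDED_CONTEXT = 131_072    # context length with YaRN scaling
PARAM_COUNT = 14.8e9          # total parameters


def yarn_factor(native: int, extended: int) -> float:
    """Scaling factor needed to stretch `native` positions to `extended`."""
    return extended / native


def rope_scaling_config(native: int, extended: int) -> dict:
    """Hypothetical config fragment; key names are an assumption (see lead-in)."""
    return {
        "rope_type": "yarn",
        "factor": yarn_factor(native, extended),
        "original_max_position_embeddings": native,
    }


def approx_weight_gib(n_params: float, bits_per_weight: int) -> float:
    """Lower-bound weight size in GiB: parameters * bits / 8, nothing else."""
    return n_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    print(json.dumps(rope_scaling_config(NATIVE_CONTEXT, EXTENDED_CONTEXT), indent=2))
    print(f"approx. 8-bit weights: {approx_weight_gib(PARAM_COUNT, 8):.1f} GiB")
```

The factor works out to 4.0, since 131,072 is exactly four times 32,768. Scaling of this kind is typically enabled only when prompts actually exceed the native window.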
Alibaba
available local models on Mirai:
| Name | Quantisation | Size |
|---|---|---|
| Qwen2.5-Coder-0.5B-Instruct | uint8 | 0.5B |
| Qwen2.5-Coder-1.5B-Instruct | uint8 | 1.5B |
| Qwen2.5-Coder-14B-Instruct | uint8 | 14B |
| Qwen2.5-Coder-32B-Instruct | uint8 | 32B |
| Qwen2.5-Coder-3B-Instruct | uint8 | 3B |
| Qwen2.5-Coder-7B-Instruct | uint8 | 7B |
| Qwen3-0.6B | uint8 | 0.6B |
| Qwen3-0.6B-MLX-4bit | uint8 | 0.6B |
| Qwen3-0.6B-MLX-8bit | uint8 | 0.6B |
| Qwen3-1.7B | uint8 | 1.7B |
| Qwen3-1.7B-MLX-4bit | uint8 | 1.7B |
| Qwen3-1.7B-MLX-8bit | uint8 | 1.7B |
| Qwen3-14B | uint8 | 14B |
| Qwen3-14B-AWQ | uint8 | 14B |
| Qwen3-14B-MLX-4bit | uint8 | 14B |
| Qwen3-14B-MLX-8bit | uint8 | 14B |
| Qwen3-32B | uint8 | 32B |
| Qwen3-32B-AWQ | uint8 | 32B |
| Qwen3-32B-MLX-4bit | uint8 | 32B |
| Qwen3-4B | uint8 | 4B |