Qwen3-4B-AWQ is a 4 billion parameter quantized language model that represents the latest generation in the Qwen series. It is a quantized version of Qwen3-4B using 4-bit AWQ quantization, offering a compact yet capable model for efficient deployment. The model features a unique dual-mode capability that allows seamless switching between a thinking mode for complex logical reasoning, mathematics, and coding tasks, and a non-thinking mode for efficient general-purpose dialogue. This flexibility enables users to optimize performance across different types of queries within a single model. Qwen3-4B-AWQ supports 100+ languages and dialects with strong multilingual instruction-following capabilities. The model natively supports context lengths of 32,768 tokens and can be extended to 131,072 tokens using the YaRN scaling technique. It excels in reasoning, instruction following, agent capabilities with tool use, creative writing, role-playing, and multi-turn conversations.
Alibaba
available local models on Mirai:
available local models on Mirai:
Name
Quantisation
Size
Qwen2.5-Coder-0.5B-Instruct
uint4
0.5B
Quant.
uint4
Size
0.5B
Qwen2.5-Coder-1.5B-Instruct
uint4
1.5B
Quant.
uint4
Size
1.5B
Qwen2.5-Coder-14B-Instruct
uint4
14B
Quant.
uint4
Size
14B
Qwen2.5-Coder-32B-Instruct
uint4
32B
Quant.
uint4
Size
32B
Qwen2.5-Coder-3B-Instruct
uint4
3B
Quant.
uint4
Size
3B
Qwen2.5-Coder-7B-Instruct
uint4
7B
Quant.
uint4
Size
7B
Qwen3-0.6B
uint4
0.6B
Quant.
uint4
Size
0.6B
Qwen3-0.6B-MLX-4bit
uint4
0.6B
Quant.
uint4
Size
0.6B
Qwen3-0.6B-MLX-8bit
uint4
0.6B
Quant.
uint4
Size
0.6B
Qwen3-1.7B
uint4
1.7B
Quant.
uint4
Size
1.7B
Qwen3-1.7B-MLX-4bit
uint4
1.7B
Quant.
uint4
Size
1.7B
Qwen3-1.7B-MLX-8bit
uint4
1.7B
Quant.
uint4
Size
1.7B
Qwen3-14B
uint4
14B
Quant.
uint4
Size
14B
Qwen3-14B-AWQ
uint4
14B
Quant.
uint4
Size
14B
Qwen3-14B-MLX-4bit
uint4
14B
Quant.
uint4
Size
14B
Qwen3-14B-MLX-8bit
uint4
14B
Quant.
uint4
Size
14B
Qwen3-32B
uint4
32B
Quant.
uint4
Size
32B
Qwen3-32B-AWQ
uint4
32B
Quant.
uint4
Size
32B
Qwen3-32B-MLX-4bit
uint4
32B
Quant.
uint4
Size
32B
Qwen3-4B
uint4
4B
Quant.
uint4
Size
4B
Qwen3-4B-AWQ is a 4 billion parameter quantized language model that represents the latest generation in the Qwen series. It is a quantized version of Qwen3-4B using 4-bit AWQ quantization, offering a compact yet capable model for efficient deployment. The model features a unique dual-mode capability that allows seamless switching between a thinking mode for complex logical reasoning, mathematics, and coding tasks, and a non-thinking mode for efficient general-purpose dialogue. This flexibility enables users to optimize performance across different types of queries within a single model. Qwen3-4B-AWQ supports 100+ languages and dialects with strong multilingual instruction-following capabilities. The model natively supports context lengths of 32,768 tokens and can be extended to 131,072 tokens using the YaRN scaling technique. It excels in reasoning, instruction following, agent capabilities with tool use, creative writing, role-playing, and multi-turn conversations.
Alibaba
available local models on Mirai:
Name
Quantisation
Size
Qwen2.5-Coder-0.5B-Instruct
uint4
0.5B
Quant.
uint4
Size
0.5B
Qwen2.5-Coder-1.5B-Instruct
uint4
1.5B
Quant.
uint4
Size
1.5B
Qwen2.5-Coder-14B-Instruct
uint4
14B
Quant.
uint4
Size
14B
Qwen2.5-Coder-32B-Instruct
uint4
32B
Quant.
uint4
Size
32B
Qwen2.5-Coder-3B-Instruct
uint4
3B
Quant.
uint4
Size
3B
Qwen2.5-Coder-7B-Instruct
uint4
7B
Quant.
uint4
Size
7B
Qwen3-0.6B
uint4
0.6B
Quant.
uint4
Size
0.6B
Qwen3-0.6B-MLX-4bit
uint4
0.6B
Quant.
uint4
Size
0.6B
Qwen3-0.6B-MLX-8bit
uint4
0.6B
Quant.
uint4
Size
0.6B
Qwen3-1.7B
uint4
1.7B
Quant.
uint4
Size
1.7B
Qwen3-1.7B-MLX-4bit
uint4
1.7B
Quant.
uint4
Size
1.7B
Qwen3-1.7B-MLX-8bit
uint4
1.7B
Quant.
uint4
Size
1.7B
Qwen3-14B
uint4
14B
Quant.
uint4
Size
14B
Qwen3-14B-AWQ
uint4
14B
Quant.
uint4
Size
14B
Qwen3-14B-MLX-4bit
uint4
14B
Quant.
uint4
Size
14B
Qwen3-14B-MLX-8bit
uint4
14B
Quant.
uint4
Size
14B
Qwen3-32B
uint4
32B
Quant.
uint4
Size
32B
Qwen3-32B-AWQ
uint4
32B
Quant.
uint4
Size
32B
Qwen3-32B-MLX-4bit
uint4
32B
Quant.
uint4
Size
32B
Qwen3-4B
uint4
4B
Quant.
uint4
Size
4B