Qwen3-8B-AWQ is an 8.2-billion-parameter language model quantized to 4-bit with AWQ (Activation-aware Weight Quantization). It belongs to the latest generation of the Qwen series and can switch, within a single model, between a thinking mode for complex logical reasoning, mathematics, and coding and a non-thinking mode for efficient general-purpose dialogue. The model advances reasoning, instruction following, agent capabilities, and multilingual support across 100+ languages and dialects. It supports a native context length of 32,768 tokens, extendable to 131,072 tokens with YaRN scaling. Key features include reasoning that surpasses the earlier QwQ and Qwen2.5 models on mathematics and code generation tasks, stronger human preference alignment for creative writing and multi-turn dialogue, agent capabilities for precise tool integration, and multilingual instruction following. AWQ quantization reduces the model's memory footprint, which makes it suitable for deployment on resource-constrained hardware while preserving the reasoning and instruction-following capabilities of the base Qwen3-8B model.
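The context-length and memory figures above can be checked with a little arithmetic. The sketch below is illustrative only: the helper names are hypothetical, the memory estimate counts raw weight bits and ignores AWQ scales, zero points, activations, and the KV cache, and the `rope_scaling` keys follow the Hugging Face configuration convention as an assumption rather than anything stated on this page.

```python
# Rough arithmetic for the numbers quoted in the description.
# All names here are illustrative, not part of any Mirai or Qwen API.

NATIVE_CTX = 32_768      # native context length, from the description
EXTENDED_CTX = 131_072   # YaRN-extended context length, from the description
N_PARAMS = 8.2e9         # parameter count, from the description


def yarn_factor(native_ctx: int, target_ctx: int) -> float:
    """Scaling factor needed to stretch the native context to the target."""
    return target_ctx / native_ctx


def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (10^9 bytes), weights only."""
    return n_params * bits_per_weight / 8 / 1e9


factor = yarn_factor(NATIVE_CTX, EXTENDED_CTX)

# Assumed config fragment in the Hugging Face style; verify the key names
# against the runtime you actually deploy with.
rope_scaling = {
    "rope_type": "yarn",
    "factor": factor,
    "original_max_position_embeddings": NATIVE_CTX,
}

if __name__ == "__main__":
    print(f"YaRN factor: {factor}")
    print(f"4-bit weights:  ~{weight_memory_gb(N_PARAMS, 4):.1f} GB")
    print(f"16-bit weights: ~{weight_memory_gb(N_PARAMS, 16):.1f} GB")
```

Real memory use will be higher than the weight-only estimate, since quantization metadata and the KV cache for long contexts add to it.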
Alibaba
Available local models on Mirai:
Name                          Quantisation  Size
Qwen2.5-Coder-0.5B-Instruct   uint4         0.5B
Qwen2.5-Coder-1.5B-Instruct   uint4         1.5B
Qwen2.5-Coder-14B-Instruct    uint4         14B
Qwen2.5-Coder-32B-Instruct    uint4         32B
Qwen2.5-Coder-3B-Instruct     uint4         3B
Qwen2.5-Coder-7B-Instruct     uint4         7B
Qwen3-0.6B                    uint4         0.6B
Qwen3-0.6B-MLX-4bit           uint4         0.6B
Qwen3-0.6B-MLX-8bit           uint4         0.6B
Qwen3-1.7B                    uint4         1.7B
Qwen3-1.7B-MLX-4bit           uint4         1.7B
Qwen3-1.7B-MLX-8bit           uint4         1.7B
Qwen3-14B                     uint4         14B
Qwen3-14B-AWQ                 uint4         14B
Qwen3-14B-MLX-4bit            uint4         14B
Qwen3-14B-MLX-8bit            uint4         14B
Qwen3-32B                     uint4         32B
Qwen3-32B-AWQ                 uint4         32B
Qwen3-32B-MLX-4bit            uint4         32B
Qwen3-4B                      uint4         4B