Qwen3-8B-MLX-8bit

Run locally on Apple devices with Mirai

Type

Local

From

Alibaba

Quantisation

uint8

Precision

float16

Size

8B

Source

Hugging Face

Qwen3-8B-MLX-8bit is an 8.2-billion-parameter causal language model from the latest generation of the Qwen series. A single model switches between a thinking mode, for complex logical reasoning, mathematics, and coding tasks, and a non-thinking mode, for efficient general-purpose dialogue. It strengthens reasoning, human preference alignment in creative writing and multi-turn conversations, and agent capabilities with tool integration. The model supports over 100 languages and dialects, with strong multilingual instruction-following and translation capabilities. It natively supports context lengths of up to 32,768 tokens and can extend to 131,072 tokens using YaRN scaling. This MLX-optimized, 8-bit quantized version is designed for efficient inference while retaining the capabilities of the full Qwen3 model family.
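
The two context lengths quoted above imply a YaRN scaling factor of 131,072 / 32,768 = 4. The sketch below only works through that arithmetic; it is not Mirai's or Qwen's implementation. The helper names and the dictionary keys (`rope_type`, `factor`, `original_max_position_embeddings`) are assumptions modelled on common RoPE-scaling configs, so check them against the runtime you actually use.

```python
NATIVE_CONTEXT = 32_768      # native context length quoted in the description
EXTENDED_CONTEXT = 131_072   # extended context length quoted in the description


def yarn_factor(target_tokens: int, native_tokens: int = NATIVE_CONTEXT) -> float:
    """Return the target-to-native context ratio, never below 1.0."""
    if target_tokens <= 0 or native_tokens <= 0:
        raise ValueError("context lengths must be positive")
    # No scaling is needed when the target already fits the native window.
    return max(1.0, target_tokens / native_tokens)


def rope_scaling_config(target_tokens: int) -> dict:
    """Build a hypothetical rope-scaling dict for the target length."""
    return {
        "rope_type": "yarn",  # assumed key and value, not confirmed by the source
        "factor": yarn_factor(target_tokens),
        "original_max_position_embeddings": NATIVE_CONTEXT,
    }


if __name__ == "__main__":
    print(rope_scaling_config(EXTENDED_CONTEXT))
```

Prompts that fit within 32,768 tokens yield a factor of 1.0 here, which reflects the design choice of leaving the native window unscaled unless a longer context is actually needed.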
