Qwen3-32B-MLX-4bit

Run locally on Apple devices with Mirai

Type: Local

From: Alibaba

Quantisation: uint4

Precision: float16

Size: 32B

Source: Hugging Face

Qwen3-32B is the latest generation large language model from Qwen, offering a 32.8 billion parameter model with unique capabilities for both reasoning and efficient dialogue. The model features seamless switching between thinking mode for complex logical reasoning, mathematics, and coding tasks, and non-thinking mode for faster general-purpose conversation, all within a single unified architecture. Qwen3-32B demonstrates significant improvements in reasoning capabilities, human preference alignment for creative writing and multi-turn dialogues, agent capabilities for tool integration, and support for over 100 languages with strong multilingual instruction following. The model has a native context length of 32,768 tokens, expandable to 131,072 tokens using YaRN scaling, and is available as a 4-bit quantized version optimized for MLX inference frameworks.
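The 4x context extension mentioned above corresponds to a YaRN scaling factor of 131,072 / 32,768 = 4. A minimal sketch of the arithmetic, using the `rope_scaling` dictionary convention from Hugging Face `transformers` (the exact key names are an assumption here and should be checked against the official Qwen3 model card):

```python
# Sketch: extending Qwen3's native 32,768-token context to 131,072
# tokens via YaRN. Key names follow the transformers rope_scaling
# convention and are assumptions; verify against the Qwen3 docs.
NATIVE_CONTEXT = 32_768
EXTENDED_CONTEXT = 131_072

rope_scaling = {
    "rope_type": "yarn",
    "factor": EXTENDED_CONTEXT / NATIVE_CONTEXT,  # 4.0
    "original_max_position_embeddings": NATIVE_CONTEXT,
}

print(rope_scaling["factor"])  # 4.0
```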

1. Choose framework
2. Add the Mirai SDK via Swift Package Manager: https://github.com/trymirai/uzu-swift
3. Set your Mirai API key
4. Apply code
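Step 2 can be sketched as a Package.swift manifest entry. The version requirement and product name below are assumptions, not taken from the uzu-swift repository; consult its README for the current release and exposed product:

```swift
// swift-tools-version:5.9
// Sketch: adding the Mirai SDK (uzu-swift) as an SPM dependency.
// Branch and product name are assumptions; check the repository.
import PackageDescription

let package = Package(
    name: "MyApp",
    dependencies: [
        .package(url: "https://github.com/trymirai/uzu-swift", branch: "main"),
    ],
    targets: [
        .executableTarget(
            name: "MyApp",
            dependencies: [
                // "Uzu" as the product name is a placeholder assumption.
                .product(name: "Uzu", package: "uzu-swift"),
            ]
        ),
    ]
)
```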
