Qwen3-8B-AWQ is an 8.2-billion-parameter language model quantized to 4-bit with AWQ (Activation-aware Weight Quantization). It belongs to the latest generation of the Qwen series and supports seamless switching, within a single model, between a thinking mode for complex logical reasoning, mathematics, and coding, and a non-thinking mode for efficient general-purpose dialogue. The model demonstrates significant advances in reasoning, instruction following, agent capabilities, and multilingual support across 100+ languages and dialects. It natively supports context lengths of up to 32,768 tokens and can extend to 131,072 tokens using YaRN scaling. Key features include enhanced reasoning that surpasses previous models such as QwQ and Qwen2.5 on mathematics and code-generation tasks, stronger human-preference alignment for creative writing and multi-turn dialogue, robust agent capabilities for precise tool integration, and multilingual instruction following. The AWQ quantization reduces the model's memory footprint while maintaining strong performance, making it suitable for deployment on resource-constrained hardware while preserving the reasoning and instruction-following capabilities of the base Qwen3-8B model.
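The thinking/non-thinking switch is exposed through the chat template rather than through separate checkpoints. Below is a minimal sketch of loading the AWQ checkpoint with Hugging Face Transformers and toggling modes via the `enable_thinking` argument to `apply_chat_template`, as documented for Qwen3; the prompt and generation settings are illustrative only.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-8B-AWQ"

# Load the 4-bit AWQ checkpoint; device_map="auto" places weights on available GPUs.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

messages = [{"role": "user", "content": "Prove that the square root of 2 is irrational."}]

# enable_thinking=True selects thinking mode (step-by-step reasoning emitted before the answer);
# set it to False for the faster non-thinking dialogue mode.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=1024)

# Decode only the newly generated tokens.
response = tokenizer.decode(
    output_ids[0][inputs.input_ids.shape[1]:],
    skip_special_tokens=True,
)
print(response)
```

Note that extending the context window beyond the native 32,768 tokens requires enabling YaRN rope scaling in the model configuration or serving framework; consult the model card for the exact settings, since static YaRN scaling can degrade quality on shorter inputs.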