Qwen3 is the latest generation of large language models in the Qwen series, offering both dense and mixture-of-experts models built upon extensive training. The series delivers significant advancements in reasoning, instruction following, agent capabilities, and multilingual support.

Qwen3-0.6B is a compact causal language model with 0.6 billion parameters. Its distinguishing feature is the ability to switch seamlessly between a thinking mode, suited to complex logical reasoning, mathematics, and coding, and a non-thinking mode for efficient general-purpose dialogue. The model has 28 layers with grouped query attention and supports a context length of 32,768 tokens.

Compared with previous Qwen models, it significantly enhances reasoning capabilities, excels in human preference alignment with strengths in creative writing and multi-turn dialogue, demonstrates strong agent capabilities for tool integration in both modes, and supports over 100 languages and dialects with strong multilingual instruction-following and translation abilities.
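In thinking mode, the model emits its reasoning before the final reply, so downstream code typically separates the two. The sketch below shows one way to do that, assuming the reasoning is wrapped in `<think>...</think>` tags as Qwen3's chat template produces in thinking mode; the `split_thinking` helper and the sample string are illustrative, not part of any official API.

```python
# Sketch: separating "thinking" output from the final reply in a Qwen3
# completion. Assumes reasoning appears inside a <think>...</think> block
# before the answer; the sample string below is illustrative, not real output.

def split_thinking(text: str) -> tuple[str, str]:
    """Return (thinking_content, final_answer) from a raw completion."""
    open_tag, close_tag = "<think>", "</think>"
    if close_tag not in text:
        # Non-thinking mode: no reasoning block was produced.
        return "", text.strip()
    before, _, after = text.partition(close_tag)
    thinking = before.replace(open_tag, "", 1).strip()
    return thinking, after.strip()

sample = "<think>2 + 2 is 4.</think>The answer is 4."
thinking, answer = split_thinking(sample)
print(thinking)  # -> 2 + 2 is 4.
print(answer)    # -> The answer is 4.
```

When the model runs in non-thinking mode, no `<think>` block is generated, and the helper simply returns the whole completion as the answer.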