Qwen3-4B-Instruct-2507 is an updated version of Qwen3-4B operating in non-thinking mode, with 4 billion parameters and a native context length of 262,144 tokens. It brings significant improvements across general capabilities, including instruction following, logical reasoning, text comprehension, mathematics, science, coding, and tool usage, along with substantially broader long-tail knowledge coverage across multiple languages and better alignment with user preferences on subjective and open-ended tasks. Architecturally, it is a causal language model with 36 layers and grouped-query attention, trained through pretraining and post-training stages. It is optimized for a wide range of applications, from knowledge-intensive tasks to reasoning, coding, creative writing, and agentic use cases with tool calling.
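Since the model is a standard causal language model, a minimal chat-style inference sketch with the Hugging Face `transformers` library might look like the following. The repository id `Qwen/Qwen3-4B-Instruct-2507`, the prompt, and the generation settings are illustrative assumptions, not details taken from this description.

```python
# Minimal inference sketch, assuming the checkpoint is published on the
# Hugging Face Hub as "Qwen/Qwen3-4B-Instruct-2507" (assumed repo id)
# and exposes the standard transformers causal-LM chat interface.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-4B-Instruct-2507"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # place layers on available GPU(s)/CPU
)

# Build a chat prompt via the tokenizer's built-in chat template;
# as a non-thinking instruct model, it answers directly without a
# separate reasoning phase.
messages = [
    {"role": "user", "content": "Give me a short introduction to large language models."}
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=512)

# Strip the prompt tokens and decode only the newly generated answer.
response = tokenizer.decode(
    output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True
)
print(response)
```

The same interface scales to much longer inputs: the 262,144-token native context means very large prompts can in principle be encoded in a single pass, memory permitting.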