Qwen2.5-Coder is the latest series of code-specific Qwen large language models, available in six sizes ranging from 0.5 to 32 billion parameters. This instruction-tuned 1.5B variant is a causal language model with significant improvements in code generation, code reasoning, and code fixing, built on the strong Qwen2.5 foundation and trained on 5.5 trillion tokens spanning source code, text-code grounding data, and synthetic data. The model retains comprehensive capabilities for real-world applications such as code agents while preserving strengths in mathematics and general competencies. With a context length of 32,768 tokens and a 28-layer architecture using grouped query attention, Qwen2.5-Coder-1.5B-Instruct is designed to provide efficient coding assistance across a range of development scenarios.
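
To make this concrete, below is a minimal sketch of loading the model and generating a chat-style completion with Hugging Face `transformers`, assuming the model is published under the ID `Qwen/Qwen2.5-Coder-1.5B-Instruct`; the system prompt, user prompt, and generation parameters are illustrative choices, not values prescribed by this model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face model ID for this instruct variant
model_name = "Qwen/Qwen2.5-Coder-1.5B-Instruct"

# device_map="auto" requires the `accelerate` package; it places the
# model on GPU when one is available and falls back to CPU otherwise.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Example coding request; any instruction-style prompt works here.
messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."},
]

# Render the conversation with the model's built-in chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=512)

# Strip the prompt tokens so only the newly generated completion remains.
completion_ids = output_ids[0][len(inputs.input_ids[0]):]
print(tokenizer.decode(completion_ids, skip_special_tokens=True))
```

Because the 1.5B model is small, this loop runs comfortably on a single consumer GPU or even CPU-only for short generations, which is the kind of lightweight deployment the model is intended for.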