Qwen2.5-Coder is the latest series of code-specific large language models from Alibaba Cloud, available in six mainstream sizes from 0.5 to 32 billion parameters. Built on the strong Qwen2.5 foundation and trained on 5.5 trillion tokens spanning source code, text-code grounding data, and synthetic data, it brings significant improvements in code generation, code reasoning, and code fixing. The 32B variant has achieved state-of-the-art performance among open-source code LLMs, with coding abilities matching GPT-4o.

This instruction-tuned 14B variant is a causal language model with 48 transformer layers and grouped-query attention, using 40 query heads that share 8 key-value heads, and it supports context lengths of up to 128K tokens. The architecture includes the now-standard enhancements of RoPE positional embeddings, SwiGLU activation, and RMSNorm. Beyond coding, the model retains strong capabilities in mathematics and general tasks, making it a comprehensive foundation for real-world applications such as code agents. The sketches below illustrate the attention layout and typical usage.
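To make the grouped-query attention numbers concrete, here is a minimal sketch of how 40 query heads share 8 key-value heads. The hidden size of 5120 and head dimension of 128 are illustrative assumptions, not values stated above.

```python
import torch

# Assumed illustrative dimensions (not stated in the description above).
hidden_size, num_q_heads, num_kv_heads = 5120, 40, 8
head_dim = hidden_size // num_q_heads      # 128
group_size = num_q_heads // num_kv_heads   # 5 query heads per KV head

# Dummy activations: (batch, seq_len, heads, head_dim)
q = torch.randn(1, 16, num_q_heads, head_dim)
k = torch.randn(1, 16, num_kv_heads, head_dim)  # only 8 KV heads are stored

# Each stored KV head is reused by its group of 5 query heads.
k_expanded = k.repeat_interleave(group_size, dim=2)
assert k_expanded.shape == q.shape
```

Storing 8 key-value heads instead of 40 shrinks the KV cache to one fifth of its full multi-head size, which is a large part of what makes long contexts like 128K tokens practical to serve.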
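As an illustration of how such an instruct model is typically used, here is a minimal chat-completion sketch with Hugging Face transformers. The repository id `Qwen/Qwen2.5-Coder-14B-Instruct` and the system prompt follow the usual Qwen conventions and should be treated as assumptions, not guarantees.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub repository id for this 14B instruct variant.
model_name = "Qwen/Qwen2.5-Coder-14B-Instruct"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # pick the checkpoint's native dtype
    device_map="auto",    # spread layers across available devices
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a quicksort function in Python."},
]

# Render the chat into the model's expected prompt format.
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(**model_inputs, max_new_tokens=512)
# Strip the prompt tokens so only the model's reply is decoded.
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])
```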