LFM2.5-1.2B-Instruct is a general-purpose instruction-tuned language model from Liquid AI designed for on-device deployment. With 1.17 billion parameters across 16 layers combining double-gated LIV convolution blocks and GQA attention, the model delivers best-in-class performance for its size, rivaling much larger models while running in under 1 GB of memory. Trained on 28 trillion tokens with extended pre-training and large-scale reinforcement learning, it supports a 32,768-token context length and is multilingual across English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish. The model excels at fast edge inference, achieving 239 tokens per second on AMD CPUs and 82 tokens per second on mobile NPUs, with day-one support for the llama.cpp, MLX, and vLLM frameworks. LFM2.5-1.2B-Instruct is recommended for agentic tasks, data extraction, and retrieval-augmented generation, and supports function calling and tool use through a ChatML-like chat template. The model is not recommended for knowledge-intensive tasks or programming.
LiquidAI
Available local models on Mirai:

| Name | Quantisation | Size |
| --- | --- | --- |
| LFM2-1.2B | No | 1.2B |
| LFM2-2.6B | No | 2.6B |
| LFM2-350M | No | 350M |
| LFM2-700M | No | 700M |
| LFM2.5-1.2B-Instruct | No | 1.2B |
| LFM2.5-1.2B-Instruct-MLX-4bit | No | 1.2B |
| LFM2.5-1.2B-Instruct-MLX-8bit | No | 1.2B |
| LFM2.5-1.2B-Thinking | No | 1.2B |
| LFM2-1.2B-4bit | No | 1.2B |
| LFM2-1.2B-8bit | No | 1.2B |
| LFM2-2.6B-4bit | No | 2.6B |
| LFM2-2.6B-8bit | No | 2.6B |
| LFM2-350M-4bit | No | 350M |
| LFM2-350M-8bit | No | 350M |
| LFM2-700M-4bit | No | 700M |
| LFM2-700M-8bit | No | 700M |
| LFM2.5-1.2B-Thinking-4bit | No | 1.2B |
| LFM2.5-1.2B-Thinking-8bit | No | 1.2B |
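The description above mentions that LFM2.5-1.2B-Instruct supports function calling and tool use through a ChatML-like chat template. As a rough illustration only: the `<|im_start|>`/`<|im_end|>` markers below are the generic ChatML convention, not necessarily the model's exact special tokens; in practice the authoritative template ships with the model's tokenizer, and `tokenizer.apply_chat_template()` from the `transformers` library should be used instead of hand-rolling prompts.

```python
# Illustrative ChatML-style prompt builder (assumed generic ChatML markers;
# the model's actual template is defined by its bundled tokenizer config).

def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML-style string."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>")
    if add_generation_prompt:
        # Leave the assistant turn open so the model completes from here.
        parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Extract the dates from this invoice."},
])
```

With the real model, the same message list would be passed to `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` so that the template stored with the checkpoint is applied exactly.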