LFM2.5-1.2B-Thinking is a compact, 1.2-billion-parameter language model designed for on-device deployment. It builds on the LFM2 architecture with extended pre-training of up to 28 trillion tokens and large-scale reinforcement learning, achieving best-in-class performance for its size while rivaling much larger models. The model has a 32,768-token context length and supports eight languages: English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish. It is built for fast edge inference with very low memory requirements, running in under 1 GB of memory and delivering 239 tokens per second on AMD CPUs and 82 tokens per second on mobile NPUs, with day-one support for inference frameworks including llama.cpp, MLX, and vLLM. LFM2.5-1.2B-Thinking uses a hybrid architecture that combines double-gated LIV convolution blocks with GQA blocks, making it particularly effective for agentic tasks, data extraction, and retrieval-augmented generation.
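As an illustration only, the sketch below shows how a model of this kind is typically loaded and prompted with Hugging Face Transformers; the repository id `LiquidAI/LFM2.5-1.2B-Thinking` and the example prompt are assumptions, not taken from the model card, so check the published checkpoint for the actual identifier and recommended generation settings.

```python
# Minimal sketch: chat-style inference with Hugging Face Transformers.
# The repo id below is an assumption; replace it with the published checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2.5-1.2B-Thinking"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")

# Example data-extraction prompt (hypothetical).
messages = [
    {"role": "user", "content": "Extract the invoice number from: 'Invoice #4821, due March 3.'"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=256)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The same checkpoint can also be served through llama.cpp, MLX, or vLLM for edge or server deployment; the exact conversion and launch steps depend on the framework and the quantization chosen.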