Your models. Every Apple device.
The fastest inference engine for Apple Silicon.


New Chat
Chats
Local Models
Agents
Routers
Deploy to Apple...
Settings
Local chat · Offline · Powered by Mirai
How can I run my model on Apple devices?
Running on Mac, iPhone or iPad · Offline · Private
What makes Mirai different?
Qwen3-0.6B
Eject model
Qwen3-0.6B
0.1.18

Convert, optimize, distribute & run your models on Apple devices.

terminal — mirai
user %

Convert.

Convert and optimize your model for edge devices.

One line model conversion.

Quantize with outstanding quality.

Many supported architectures out of the box.

Benchmark.

Distribute.

Run.
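As a sketch of what a one-line conversion could look like (the `mirai` command name, flags, and output path below are hypothetical placeholders for illustration, not the documented CLI):

```shell
# Hypothetical invocation for illustration only — check the Mirai docs
# for the real command name and flags.
mirai convert Qwen/Qwen3-0.6B --quantize int4 --output ./qwen3-0.6b.mirai
```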


What Apple Silicon delivers today with Mirai.

Real-world AI queries (%) · Can be served locally on consumer hardware.

Neural Engine on M4 (TOPS) · Mirai squeezes everything out of it.

Unified memory bandwidth on M4 (GB/s) · Your model loads once, runs everywhere on chip.

Qwen3-0.6B on M4 Max (t/s) · Fast real-time generation on device.

Use cases that benefit from local inference.

Text

Summarization & extraction
Documents, emails, meeting notes

Classification
Intent detection, content tagging

Routing
Route complex requests to a cloud model

Translation
With no internet connection

Voice coming soon

Speech-to-text
Real-time transcription on device

Text-to-speech
Effortless narration of text

Speech-to-speech
Translation and voice assistants

Voice commands
Predictable outputs

We built a native on-device inference layer for Apple Silicon.

Seamless distribution.

Reach every Apple device.
Your model works on every Apple device.

Performance without compromise.
Your model runs faster on Mirai than any other on-device runtime.

One conversion pipeline.
Convert from Hugging Face. Quantize, optimize, distribute.

Offline by default.

Zero inference cost.

Data stays on device.

Local chat · Powered by Mirai
Does it work on every Apple device?
Yes. iPhone, iPad and Mac. One conversion pipeline, one SDK. Your model reaches every Apple device.
How fast is it?
Your model runs faster on Mirai than any other on-device runtime. Built natively for Apple Silicon, with near-zero latency.
Send a message

Seamless distribution.

Real-time features.

Consistent UX.

Simpler code.

Offline by default.

Zero inference cost.

Data stays on device.


Supported models

Route models between device and cloud.

Run compact models locally on Apple Silicon. Route larger workloads to cloud infrastructure when necessary.

Reduce cloud cost.

Maintain full control.

Keep sensitive data on the user's device.


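A device-to-cloud routing policy could be sketched in Swift roughly like this (every type and method name below is an illustrative assumption, not Mirai's actual API):

```swift
// Illustrative sketch only — `TextModel` and `Router` are hypothetical
// names, not Mirai's documented API.
protocol TextModel {
    func generate(_ prompt: String) async throws -> String
}

struct Router {
    let local: any TextModel   // compact model running on Apple Silicon
    let cloud: any TextModel   // larger model behind a cloud endpoint
    let maxLocalPromptLength = 2_000

    // Route short prompts to the on-device model; fall back to the
    // cloud only for long or complex workloads.
    func respond(to prompt: String) async throws -> String {
        if prompt.count <= maxLocalPromptLength {
            return try await local.generate(prompt)
        }
        return try await cloud.generate(prompt)
    }
}
```

Keeping the routing threshold on the client like this is one way to cut cloud costs while sensitive, everyday requests never leave the device.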

Common questions:

How does model support work?

What architectures are supported?

How does Mirai compare to other inference engines?

What is the maximum supported model size?

How can I run benchmarks myself?

How can we discuss a specific use case?