Inference Engine

The fastest inference runtime for iPhone, iPad and Mac.

Your models. Every Apple device. The fastest inference engine for Apple Silicon.

Optimize and run your model on every Apple device. Up to 38% faster prompt processing and 18% faster generation than MLX.


What Apple Silicon delivers today with Mirai.

Convert. Integrate. Run.

1. Choose a framework.

2. Add the Mirai SDK package via Swift Package Manager:
SPM: https://github.com/trymirai/uzu-swift

3. Set your Mirai API key (Get API Key).

4. Apply the code.

You built the model. We get it running on 2 billion Apple devices.

Model companies.

You train and ship models. Mirai optimizes them for Apple Silicon, benchmarks on real hardware, and distributes across Apple devices.


AI researchers & labs.


You publish on Hugging Face. Mirai converts your model and puts it in front of real users on real devices, not just leaderboards.


Independent model makers.


You're fine-tuning or training from scratch. Mirai gives your model the same Apple device reach as LiquidAI, OpenAI, and DeepSeek.


Your model gets these out of the box.

Speculative decoding.

A smaller draft model predicts multiple tokens ahead. Your main model verifies them in one pass. Fewer forward passes, same output quality.
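The draft-and-verify loop can be sketched in a few lines. This is an illustrative toy in Rust (the language of Mirai's uzu core), not Mirai's implementation: closures over the token history stand in for the draft and target models, and the target substitutes its own token at the first disagreement.

```rust
// Toy speculative decoding: a cheap draft proposes k tokens, the target
// verifies them and keeps the longest agreeing prefix in one pass.
fn speculative_step(
    history: &mut Vec<u32>,
    draft: impl Fn(&[u32]) -> u32,
    target: impl Fn(&[u32]) -> u32,
    k: usize,
) -> usize {
    // Draft model proposes k tokens ahead of the current history.
    let mut ctx = history.clone();
    let mut proposed = Vec::with_capacity(k);
    for _ in 0..k {
        let t = draft(&ctx);
        proposed.push(t);
        ctx.push(t);
    }
    // Target model checks each proposal; at the first disagreement it
    // substitutes its own token and the step ends.
    let mut accepted = 0;
    for &tok in &proposed {
        let expected = target(history);
        if tok == expected {
            history.push(tok);
            accepted += 1;
        } else {
            history.push(expected);
            break;
        }
    }
    accepted
}

fn main() {
    // Stand-in "models": next token is a function of history length, so
    // the two agree for a while and then diverge.
    let draft = |h: &[u32]| (h.len() as u32) % 5;
    let target = |h: &[u32]| (h.len() as u32) % 3;
    let mut history = vec![0u32];
    let accepted = speculative_step(&mut history, draft, target, 4);
    println!("accepted {accepted} of 4 draft tokens, history = {history:?}");
}
```

When draft and target agree, every verified token costs a fraction of a full forward pass; when they diverge, the output is exactly what the target alone would have produced.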

Structured output.

JSON mode, constrained generation, grammar-guided decoding. Your model outputs structured data reliably.
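The core idea of constrained generation can be shown with a toy mask. This Rust sketch is illustrative, not Mirai's grammar engine: a stand-in rule for JSON-object prefixes filters the candidate tokens before the highest-scoring one is picked, so only structurally valid continuations survive.

```rust
// Stand-in grammar rule: inside a JSON object, an opening brace must be
// followed by a key quote or a closing brace, never a bare identifier.
fn json_prefix_allows(prefix: &str, next: char) -> bool {
    match prefix.chars().last() {
        Some('{') => next == '"' || next == '}',
        _ => true,
    }
}

// Pick the highest-scoring candidate token the grammar allows.
fn pick_constrained(
    prefix: &str,
    candidates: &[(char, f32)],
    allowed: impl Fn(&str, char) -> bool,
) -> Option<char> {
    candidates
        .iter()
        .filter(|(c, _)| allowed(prefix, *c))
        .max_by(|a, b| a.1.partial_cmp(&b.1).unwrap())
        .map(|(c, _)| *c)
}

fn main() {
    // The model's raw preference is a bare identifier ('n'), but the mask
    // forces a quoted key, keeping the JSON well-formed.
    let candidates = [('n', 0.9), ('"', 0.6), ('}', 0.1)];
    let tok = pick_constrained("{", &candidates, json_prefix_allows);
    println!("picked {tok:?}");
}
```

A real engine compiles a full grammar (for example, a JSON schema) into this kind of per-step token mask; the principle is the same.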

Task-specific sessions.

Optimized execution modes for classification, summarization, and generation. Each mode is tuned for the workload.
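The idea of per-task tuning can be sketched as a mapping from workload to decoding settings. The Rust below is purely illustrative: the `Task` enum, `SessionConfig` fields, and the numeric defaults are made up for this sketch and are not Mirai's API or values.

```rust
// Hypothetical task-to-settings mapping illustrating workload-tuned modes.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Task {
    Classification,
    Summarization,
    Generation,
}

#[derive(Debug, PartialEq)]
struct SessionConfig {
    temperature: f32,
    max_tokens: usize,
}

fn session_config(task: Task) -> SessionConfig {
    match task {
        // A classifier should emit one deterministic label.
        Task::Classification => SessionConfig { temperature: 0.0, max_tokens: 8 },
        // A summary stays close to the source but needs a paragraph of room.
        Task::Summarization => SessionConfig { temperature: 0.3, max_tokens: 256 },
        // Open-ended generation trades determinism for variety.
        Task::Generation => SessionConfig { temperature: 0.8, max_tokens: 1024 },
    }
}

fn main() {
    for task in [Task::Classification, Task::Summarization, Task::Generation] {
        println!("{task:?} -> {:?}", session_config(task));
    }
}
```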

Built-in performance metrics.

Tokens per second, time to first token, generation duration. Every run returns detailed metrics automatically.
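These metrics derive directly from per-token timestamps. The Rust sketch below shows the arithmetic under one stated assumption: timestamps are measured relative to request start, and throughput counts the tokens after the first over the window between first and last token. It illustrates the calculation, not Mirai's reporting API.

```rust
use std::time::Duration;

// Metrics derived from per-token timestamps (relative to request start).
#[derive(Debug)]
struct RunMetrics {
    time_to_first_token: Duration,
    tokens_per_second: f64,
    generation_duration: Duration,
}

fn compute_metrics(token_times: &[Duration]) -> Option<RunMetrics> {
    let first = *token_times.first()?;
    let last = *token_times.last()?;
    let window = last.checked_sub(first)?;
    // Throughput: tokens after the first, over the generation window.
    let tps = if window.is_zero() {
        0.0
    } else {
        (token_times.len() - 1) as f64 / window.as_secs_f64()
    };
    Some(RunMetrics {
        time_to_first_token: first,
        tokens_per_second: tps,
        generation_duration: window,
    })
}

fn main() {
    // Five tokens: the first lands at 120 ms, then one every 50 ms.
    let times: Vec<Duration> = (0..5u64)
        .map(|i| Duration::from_millis(120 + 50 * i))
        .collect();
    let m = compute_metrics(&times).unwrap();
    println!("{m:?}");
}
```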


Works with any stack.

Simple, high-level API.

Hybrid architecture.

Unified model configurations.

Traceable computations.

Utilizes unified memory on Apple devices.

Language      Package            Distribution
Rust          Mirai UZU          Cargo
Swift         Mirai UZU Swift    Swift Package Manager
TypeScript    Mirai UZU TS       NPM (Node.js)
Kotlin        Coming Soon
Python        Coming Soon

Supported models


Route models between device and cloud.
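One common routing policy is capacity-based: run on-device when the model fits in available memory, otherwise fall back to a cloud endpoint. The Rust sketch below illustrates that policy only; the 25% headroom margin and the decision rule are assumptions for this example, not Mirai's routing logic.

```rust
// Toy device/cloud router based on whether the model fits in memory.
#[derive(Debug, PartialEq)]
enum Route {
    OnDevice,
    Cloud,
}

fn route(model_bytes: u64, free_device_bytes: u64, cloud_available: bool) -> Route {
    // Require 25% headroom over the raw weights (illustrative margin for
    // the KV cache and activations).
    let fits = model_bytes.saturating_mul(5) / 4 <= free_device_bytes;
    if fits || !cloud_available {
        // Without a cloud fallback, attempt on-device regardless and let
        // the runtime surface any failure.
        Route::OnDevice
    } else {
        Route::Cloud
    }
}

fn main() {
    const GB: u64 = 1 << 30;
    println!("{:?}", route(3 * GB, 6 * GB, true));  // fits: on-device
    println!("{:?}", route(12 * GB, 6 * GB, true)); // too big: cloud
}
```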

Common questions.

How does model support work?

What architectures are supported?

How does Mirai compare to other inference engines?

What is the maximum supported model size?

How can I run benchmarks myself?

How can we discuss a specific use case?
