Inference Engine

The fastest inference runtime for iPhone, iPad and Mac.

Optimize and run your model on every Apple device.
Up to 38% faster prompt processing vs MLX.

uzu by mirai labs

Run your model on 2 billion Apple devices. Perfect for:

Model companies.

You train and ship models. Mirai optimizes them for Apple Silicon, benchmarks them on real hardware, and distributes them.

AI researchers & labs.

Mirai converts your model and puts it in front of real users on Apple devices, not just leaderboards.

Independent makers.

You're fine-tuning or training from scratch. Mirai gives your model the same device reach as OpenAI and DeepSeek.

What Apple Silicon delivers today with Mirai.

Convert. Integrate. Run.

1. Choose a framework.
2. Add the Mirai SDK through Swift Package Manager (SPM): https://github.com/trymirai/uzu.git
3. Set your Mirai API key.
4. Apply the code:
import Uzu

public func runChat() async throws {
    // Create the engine with its default configuration.
    let engineConfig = EngineConfig.create()
    let engine = try await Engine.create(config: engineConfig)

    // Look up the model by identifier; return early if it is not found.
    guard let model = try await engine.model(identifier: "cartesia-ai/Llamba-1B") else {
        return
    }
    // Download the model, printing each progress update.
    for try await update in try await engine.download(model: model).iterator() {
        print("Download progress: \(update.progress())")
    }

    let messages = [
        ChatMessage.system().withText(text: "You are a helpful assistant"),
        ChatMessage.user().withText(text: "Tell me a short, funny story about a robot")
    ]
    // Open a chat session and stream the reply, keeping the latest message.
    let session = try await engine.chat(model: model, config: .create())
    let stream = await session.replyWithStream(input: messages, config: .create())
    var message: ChatMessage? = nil
    for try await update in stream.iterator() {
        switch update {
        case .replies(let replies):
            message = replies.last?.message
        case .error(let error):
            print("Error: \(error)")
        }
    }
    print("Text: \(message?.text() ?? "empty")")
}

One inference engine. Integrate from any language.

Language     Distribution            Snippet
Rust         Cargo                   cargo add uzu --git https://github.com/trymirai/uzu
Swift        Swift Package Manager   https://github.com/trymirai/uzu.git
TypeScript   pnpm                    pnpm add @trymirai/uzu
Python       uv                      uv add uzu
Kotlin       Coming Soon

Same high-level API across all languages.

Full performance of the Rust core from every language.

Convert once, integrate anywhere.

Built-in features every model gets automatically:

Speculative decoding.

A draft model predicts several tokens ahead, and your model verifies them in one pass. Up to 2x faster generation.

Structured output.

Task-specific sessions.

Built-in performance metrics.
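
To make the speculative decoding idea concrete, here is a minimal, self-contained sketch of the greedy variant in Swift. It does not use the Uzu API and is not Mirai's implementation: the two "models" are hypothetical toy closures that map a token prefix to the next token, and names such as NextToken, speculativeStep, and generate are invented for illustration. A real engine would verify all drafted positions in one batched forward pass; the sketch loops over them for clarity.

```swift
// Hypothetical stand-in for a model: maps a token prefix to the next token.
typealias NextToken = ([Int]) -> Int

/// One speculative step: the draft proposes `k` tokens, the target checks them.
/// Returns the tokens accepted in this step (always at least one).
func speculativeStep(prefix: [Int], draft: NextToken, target: NextToken, k: Int) -> [Int] {
    // 1. The cheap draft model proposes k tokens autoregressively.
    var proposed: [Int] = []
    for _ in 0..<k {
        proposed.append(draft(prefix + proposed))
    }
    // 2. The target model verifies each proposed position in order.
    var accepted: [Int] = []
    for token in proposed {
        let expected = target(prefix + accepted)
        if expected == token {
            accepted.append(token)
        } else {
            // First mismatch: keep the target's token and discard the rest.
            accepted.append(expected)
            return accepted
        }
    }
    // 3. Every draft token matched, so the target contributes one extra token.
    accepted.append(target(prefix + accepted))
    return accepted
}

/// Generates `maxNewTokens` tokens after `prompt` using speculative steps.
func generate(prompt: [Int], draft: NextToken, target: NextToken,
              k: Int, maxNewTokens: Int) -> [Int] {
    var tokens = prompt
    while tokens.count - prompt.count < maxNewTokens {
        tokens += speculativeStep(prefix: tokens, draft: draft, target: target, k: k)
    }
    // A step may overshoot the budget, so trim to the requested length.
    return Array(tokens.prefix(prompt.count + maxNewTokens))
}

// Toy target model: counts upward modulo 10.
let target: NextToken = { prefix in ((prefix.last ?? 0) + 1) % 10 }
// Toy draft model: agrees with the target except right after token 4.
let draft: NextToken = { prefix in
    let last = prefix.last ?? 0
    return last == 4 ? 0 : (last + 1) % 10
}

let output = generate(prompt: [0], draft: draft, target: target, k: 3, maxNewTokens: 8)
print(output)  // [0, 1, 2, 3, 4, 5, 6, 7, 8]
```

The output is identical to what the target model would produce on its own, one token at a time. The gain comes from steps where the draft is right: in this toy run, a fully accepted step yields four tokens for a single verification round, while a rejected draft still yields one.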

Supported models

Common questions:

How does model support work?

What architectures are supported?

How does Mirai compare to other inference engines?

What is the maximum supported model size?

How can I run benchmarks myself?

How can we discuss a specific use case?
