Mirai iOS SDK Public Preview

By Mirai team

May 3, 2025

Today, we’re excited to share the first public preview of our inference engine for iOS devices. It fully leverages the potential of the hardware, as described in this blog post, to achieve superior performance in specific use cases.

Here is a side-by-side comparison between the Mirai inference engine and MLX (Apple's framework):

Chat, iPhone 16 Pro: Mirai inference engine vs. MLX (Apple framework)

Summarization, iPhone 16 Pro: Mirai inference engine vs. MLX (Apple framework)

As part of the preview, you can run Llama-3.2-1b-Instruct-float16 on your device and choose one of the following configurations:

  • Chat

  • Summarization

  • Classification
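
To give a feel for how choosing a configuration might look in an app, here is a minimal, hypothetical Swift sketch. The names used (MiraiEngine, Configuration, generate) are illustrative placeholders, not the actual SDK API, which will be documented with the preview.

```swift
// Hypothetical sketch only: "MiraiEngine", "Configuration", and "generate"
// are illustrative names, not confirmed Mirai SDK symbols.
import Foundation

// The preview exposes three configurations: chat, summarization, classification.
enum Configuration {
    case chat
    case summarization
    case classification
}

// Illustrative wrapper for the intended flow: pick a configuration,
// point at Llama-3.2-1b-Instruct-float16, and run inference on device.
struct MiraiEngine {
    let model: String
    let configuration: Configuration

    func generate(prompt: String) -> String {
        // Placeholder: the SDK would perform on-device inference here.
        return "(generated text for: \(prompt))"
    }
}

let engine = MiraiEngine(
    model: "Llama-3.2-1b-Instruct-float16",
    configuration: .summarization
)
print(engine.generate(prompt: "Summarize this article in two sentences."))
```
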

In upcoming releases, we’ll add support for additional models, including VLMs, and provide specific configurations for more use cases, such as structured output.

If you have any questions, feel free to drop us a message at contact@getmirai.co.
