About us

We’re a small, senior team building the fastest on-device AI inference engine.

What we’re building

We are a 14-person team that, in under a year, built a complete on-device inference stack: from model optimization and export tooling to a proprietary runtime and deployment layer.

On supported models, Mirai outperforms MLX and llama.cpp, while staying production-ready for real applications.

We're not building demos, and we're not optimizing for benchmarks that exist in a vacuum. We're making local inference something you can ship.

Why on-device?

Cloud inference won’t disappear. But it’s not enough.

Latency, cost, privacy, and reliability all break down when every interaction depends on a round trip to a server. As AI moves deeper into products and workflows, intelligence needs to live closer to the user.

Modern consumer devices can now run meaningful AI workloads locally, unlocking a new execution surface beyond the cloud.

On-device inference enables:

  • Near-zero latency experiences

  • Strong privacy guarantees

  • Predictable costs at scale

  • Offline and degraded-network use cases

Mirai exists to make this practical.

What makes Mirai different?

Mirai isn't a wrapper around existing inference stacks. It's built from scratch for on-device execution.

Most inference engines start with cross-platform abstractions and work downward to fit mobile hardware. That makes portability easy. But you pay for it in performance, memory efficiency, and reliability.

We went the other way.

Mirai was designed for Apple Silicon specifically.

We own the whole stack:

  • Model optimization

  • Execution

  • Memory management

  • Scheduling

  • Deployment

Mirai controls every layer to optimize how models actually run.

Mirai is different:

  • Hardware-aware execution instead of generic kernels

  • No cross-platform abstraction tax

  • Predictable performance under real production constraints

Who we are

Mirai was founded by proven entrepreneurs who built and scaled consumer AI leaders like Reface (200M+ users, backed by Andreessen Horowitz) and Prisma (100M+ users).

Our team is small (14 people), senior, and deeply technical. We ship fast and own problems end-to-end.

We’re advised by a former Apple Distinguished Engineer who worked on MLX, and backed by leading AI-focused funds and individuals.

Where we’re going

Our goal is to turn our Apple Silicon technical lead into market dominance.

We are focusing on:

  • Maintaining a clear performance lead over open stacks.

  • Expanding model support without sacrificing speed or reliability.

  • Building world-class developer tooling, documentation, and benchmarks.

  • Powering companies where latency, cost, and privacy actually matter.

Our vision

For AI to work seamlessly, the core must live on device.

As AI becomes a core part of software, inference can’t rely entirely on the cloud. Latency, privacy, cost, and reliability require intelligence to run where users are.

We believe the next generation of software will be built on a new system layer. Not just models, not just runtimes, but a tightly integrated stack that makes intelligence native to the device.

Mirai is building that layer: an LLM OS that combines optimized inference, models, and deployment into a reliable on-device foundation for intelligent software.

Over the next 10 years, AI will make it possible for everyone to build software for themselves. We won't need apps; we'll need only a screen or surface where we can collaborate with machines and services.

Backed by leading AI builders and investors.