Smart routing

Decide what runs local vs cloud. Automatically

Mirai’s routing engine gives you full control over performance, privacy, and price

On device or cloud. In real time

On-device when it’s fast and private. Cloud when it’s heavy and contextual. All handled through our routing engine — with no overhead on your side.

Dynamic runtime routing

Automatically route inference to device or cloud based on prompt type, latency constraints, or user context — no manual ops required.

Fully programmable policies

Route by prompt length, device capabilities, confidence thresholds, or user segments. You define the logic — Mirai handles execution.
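A programmable policy of this kind could be sketched as follows. This is a minimal illustration, not Mirai's SDK API: the `Request` and `Policy` types, their field names, and the threshold values are all hypothetical and chosen only to show how prompt length, device capabilities, confidence thresholds, and user segments might combine into one routing decision.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class Route(Enum):
    ON_DEVICE = "on_device"
    CLOUD = "cloud"


@dataclass
class Request:
    """Hypothetical request description; not a Mirai type."""
    prompt: str
    device_ram_gb: float                      # stand-in for "device capabilities"
    user_segment: str = "default"
    local_confidence: Optional[float] = None  # score from an earlier on-device attempt, if any


@dataclass
class Policy:
    """Hypothetical policy; every threshold here is an assumed example value."""
    max_local_prompt_chars: int = 2000
    min_device_ram_gb: float = 4.0
    min_local_confidence: float = 0.7
    cloud_only_segments: FrozenSet[str] = frozenset()

    def route(self, req: Request) -> Route:
        # User segments pinned to the cloud take priority over everything else.
        if req.user_segment in self.cloud_only_segments:
            return Route.CLOUD
        # Devices below the capability floor cannot run the local model.
        if req.device_ram_gb < self.min_device_ram_gb:
            return Route.CLOUD
        # Long prompts go to the cloud, where longer context is available.
        if len(req.prompt) > self.max_local_prompt_chars:
            return Route.CLOUD
        # Escalate when an on-device attempt reported low confidence.
        if (req.local_confidence is not None
                and req.local_confidence < self.min_local_confidence):
            return Route.CLOUD
        return Route.ON_DEVICE


policy = Policy(cloud_only_segments=frozenset({"enterprise"}))
print(policy.route(Request(prompt="Say hello to the new user", device_ram_gb=8.0)))
```

The checks run from the most restrictive rule to the least, so a request stays on device only when no rule sends it to the cloud.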

Mirai Routing Optimises

Speed

when low latency matters

Privacy

for sensitive data

Accuracy

when longer context is needed

Cost-efficiency

to run less in the cloud

Task: Welcome message generation
Route: 📱 On device (Llama 3.2 1B)
Benefits: Sub-400ms response time

Task: Content moderation
Route: 📱 On-device or Hybrid
Benefits: Custom safety tuning, full privacy for user data

Task: Code refactor request
Route: ☁️ Cloud (GPT-4 or Claude), requires long context
Benefits: More accurate results

Task: Inline classification
Route: 📱 Local (using speculative decoding)
Benefits: 3x faster than baseline run

Configure routing without changing your app logic

Specify hard rules (always use cloud, never use cloud), prompt templates with routing hints, model families per task type and more…
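Because smart routing is not yet available in the SDK, the following is only a sketch of how hard rules and per-task model families might be expressed as configuration kept separate from app logic. The config keys, the `resolve` helper, the task names, and the model identifier strings are all assumptions made for illustration, not Mirai's actual configuration format.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical rule names mirroring "always use cloud" / "never use cloud".
ALWAYS_CLOUD = "always_cloud"
NEVER_CLOUD = "never_cloud"

# Hypothetical configuration; task names and model labels are illustrative only.
ROUTING_CONFIG: Dict[str, Any] = {
    "hard_rules": {
        "content_moderation": NEVER_CLOUD,  # keep user data on the device
        "code_refactor": ALWAYS_CLOUD,      # needs long context
    },
    "model_families": {
        "welcome_message": {"on_device": "llama-3.2-1b"},
        "code_refactor": {"cloud": "gpt-4"},
    },
}


def resolve(task: str, config: Dict[str, Any],
            default_route: str = "on_device") -> Tuple[str, Optional[str]]:
    """Return (route, model) for a task; hard rules override the default route."""
    rule = config["hard_rules"].get(task)
    if rule == ALWAYS_CLOUD:
        route = "cloud"
    elif rule == NEVER_CLOUD:
        route = "on_device"
    else:
        route = default_route
    # The model is None when no family is configured for this task and route.
    model = config["model_families"].get(task, {}).get(route)
    return route, model


print(resolve("code_refactor", ROUTING_CONFIG))
print(resolve("welcome_message", ROUTING_CONFIG))
```

Keeping the rules in data rather than in code means the app only names the task it is performing, and routing can change by editing the configuration.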

Smart routing will be available soon in our SDK

Set up your AI project in 10 minutes for free

Full control and performance

First 10K devices for free

Run the most popular small LLMs