Smart routing
Decide what runs local vs cloud. Automatically
Mirai’s routing engine gives you full control over performance, privacy, and price
On device or cloud. In real time
On-device when it’s fast and private. Cloud when it’s heavy and contextual. All handled through our routing engine — with no overhead on your side.
Dynamic runtime routing
Automatically route inference to device or cloud based on prompt type, latency constraints, or user context — no manual ops required.
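As a rough illustration of the decision described above, the sketch below picks a target from the prompt type, a latency budget, and a user-context flag. Smart routing is not yet available in the SDK, so every name here (`Request`, `choose_target`, the task labels, the 400 ms cutoff) is a hypothetical stand-in and not Mirai's API.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    DEVICE = "device"
    CLOUD = "cloud"


@dataclass
class Request:
    prompt_type: str          # hypothetical label, e.g. "welcome_message"
    latency_budget_ms: int    # how long the caller is willing to wait
    sensitive: bool = False   # user context: does the prompt carry private data?


# Hypothetical set of task types assumed to need a larger cloud model.
HEAVY_PROMPT_TYPES = {"code_refactor"}


def choose_target(req: Request) -> Target:
    """Pick a target at request time; illustrative logic only."""
    if req.sensitive:
        return Target.DEVICE            # privacy: keep the data on the device
    if req.latency_budget_ms < 400:
        return Target.DEVICE            # speed: tight budget, skip the network
    if req.prompt_type in HEAVY_PROMPT_TYPES:
        return Target.CLOUD             # accuracy: heavy, long-context task
    return Target.DEVICE                # cost: default to running locally
```

The checks are ordered so that privacy wins over latency, and latency wins over accuracy. That ordering is an assumption made for this example; a real policy could rank them differently.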
Fully programmable policies
Route by prompt length, device capabilities, confidence thresholds, or user segments. You define the logic — Mirai handles execution.
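One way to picture a programmable policy is an ordered list of rules where the first match decides the route. The sketch below assumes that model; `Context`, `Rule`, `apply_policy`, and all thresholds are hypothetical names and values chosen for illustration, not part of Mirai's SDK.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Context:
    prompt_tokens: int        # prompt length
    device_ram_gb: float      # device capability
    local_confidence: float   # confidence of an on-device answer, 0.0 to 1.0
    user_segment: str         # e.g. "free" or "enterprise"


@dataclass
class Rule:
    name: str
    matches: Callable[[Context], bool]
    route: str                # "device" or "cloud"


def apply_policy(ctx: Context, rules: List[Rule], default: str = "device") -> str:
    """Return the route of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(ctx):
            return rule.route
    return default


# Example policy; every threshold below is an arbitrary illustration.
POLICY = [
    Rule("enterprise stays local", lambda c: c.user_segment == "enterprise", "device"),
    Rule("long prompt", lambda c: c.prompt_tokens > 2048, "cloud"),
    Rule("low-memory device", lambda c: c.device_ram_gb < 4, "cloud"),
    Rule("low confidence", lambda c: c.local_confidence < 0.7, "cloud"),
]
```

Because the rules are plain data, reordering or replacing them changes routing behavior without touching the code that calls `apply_policy`.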
Mirai Routing Optimizes
Speed: when low latency matters
Privacy: for sensitive data
Accuracy: when longer context is needed
Cost-efficiency: to run less in the cloud
Task | Route | Benefits
Welcome message generation | 📱 On-device (Llama 3.2 1B) | Sub-400ms response time
Content moderation | 📱 On-device or Hybrid | Custom safety tuning; full privacy for user data
Code refactor request | ☁️ Cloud (GPT-4 or Claude) | Requires long context; more accurate results
Inline classification | 📱 Local (using speculative decoding) | 3x faster than baseline run
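The page does not define the "Hybrid" route. One plausible reading, assumed here, is to try the on-device model first and escalate to the cloud only when its confidence is low. The sketch below follows that assumption; `hybrid_infer`, the stub models, and the 0.8 threshold are all hypothetical.

```python
from typing import Callable, Tuple

# A model is any callable returning (answer, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def hybrid_infer(prompt: str, local: Model, cloud: Model,
                 threshold: float = 0.8) -> Tuple[str, str]:
    """Try the local model first; fall back to the cloud on low confidence.

    Returns (answer, route_taken).
    """
    answer, confidence = local(prompt)
    if confidence >= threshold:
        return answer, "device"
    answer, _ = cloud(prompt)
    return answer, "cloud"


# Stub models standing in for real inference calls.
def fake_local(prompt: str) -> Tuple[str, float]:
    # Pretend short prompts are easy for the small model.
    return ("allow", 0.95) if len(prompt) < 40 else ("allow", 0.55)


def fake_cloud(prompt: str) -> Tuple[str, float]:
    return ("flag", 0.99)
```

Note that any escalation sends the prompt off the device, so a privacy-sensitive task would need a rule that disables the fallback.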
Configure routing without changing your app logic
Specify hard rules (always use cloud, never use cloud), prompt templates with routing hints, model families per task type, and more.
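To make the idea concrete, the sketch below keeps hard rules, per-task model families, and an optional routing hint in one configuration object that the app never has to inspect. The configuration shape, the key names, the `resolve` helper, and the `cloud-large` label are assumptions for illustration; the actual SDK format has not been published.

```python
from typing import Optional

# Hypothetical configuration shape; none of these keys come from Mirai's SDK.
ROUTING_CONFIG = {
    "hard_rules": {
        "always_cloud": {"code_refactor"},
        "never_cloud": {"content_moderation"},
    },
    "model_families": {
        "welcome_message": "llama-3.2-1b",
        "code_refactor": "cloud-large",   # placeholder name
    },
    "default_route": "device",
}


def resolve(task: str, hint: Optional[str] = None,
            config: dict = ROUTING_CONFIG) -> dict:
    """Resolve a route and model family for a task.

    Hard rules win over a template's routing hint; the hint wins over the default.
    """
    rules = config["hard_rules"]
    if task in rules["never_cloud"]:
        route = "device"
    elif task in rules["always_cloud"]:
        route = "cloud"
    elif hint in ("device", "cloud"):
        route = hint
    else:
        route = config["default_route"]
    return {"route": route, "model": config["model_families"].get(task)}
```

The app only calls `resolve(task, hint)`, so editing `ROUTING_CONFIG` changes where a task runs without any change to app logic.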




Smart routing will be available soon in our SDK
Set up your AI project in 10 minutes for free
Full control and performance
First 10K devices for free
Run most popular small LLMs