Routing & Speculation for on device AI
Mirai’s routing engine gives you full control over performance, privacy, and price, with speculative decoding built in.
On device or cloud. In real time
Optimised for
when low-latency matters
for sensitive data
when longer context is needed
to run less in the cloud
Configure routing without changing your app logic
Specify hard rules (always use cloud
, never use cloud
), prompt templates with routing hints, model families per task type and more…
Routing & Speculation is currently available in our SDK