Ship AI to every device
without the cloud tax
Run speech, vision, and language models on the device your users hold, with automatic cloud fallback for the long tail.
Oncillo automatically routes audio between on-device and cloud inference: clear audio stays on-device, noisy audio goes to the cloud.
Oncillo routes agent commands based on complexity: on-device for simple tasks, cloud for complex operations.
Set the thermostat to 72 degrees
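As a rough illustration of complexity-based routing, the sketch below sends short, single-step commands to the device and everything else to the cloud. The heuristic, the marker list, the word cutoff, and the names `route_command` and `Route` are all assumptions made for this example; the page does not describe how Oncillo actually scores complexity.

```python
from dataclasses import dataclass

# Hypothetical markers of multi-step or open-ended requests (not Oncillo's actual rules).
COMPLEX_MARKERS = ("and then", "summarize", "compare", "explain")
MAX_SIMPLE_WORDS = 8  # assumed cutoff for a "simple" command


@dataclass
class Route:
    target: str  # "on-device" or "cloud"
    reason: str


def route_command(text: str) -> Route:
    """Pick an inference target from a crude complexity estimate."""
    lowered = text.lower()
    # Any marker phrase suggests a multi-step request, so send it to the cloud.
    for marker in COMPLEX_MARKERS:
        if marker in lowered:
            return Route("cloud", f"contains '{marker}'")
    # Long commands are treated as complex even without a marker.
    word_count = len(lowered.split())
    if word_count > MAX_SIMPLE_WORDS:
        return Route("cloud", f"{word_count} words exceeds {MAX_SIMPLE_WORDS}")
    return Route("on-device", "short single-step command")


if __name__ == "__main__":
    print(route_command("Set the thermostat to 72 degrees").target)
    print(route_command("Compare this week's energy use with last week and then email me a summary").target)
```

A production router would more plausibly rely on a small classifier or the model's own confidence than on keyword matching; the point here is only the shape of the decision.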
Built by a team from
Powered by the Oncillo Engine.
The fastest on-device runtime.
Open Source
Fully auditable and community-driven. Inspect every line that runs on your users' devices.
Optimized Execution
Quantized models with hardware-specific acceleration. Tuned for battery-efficient inference.
Zero-copy Memory Mapping
Minimal RAM usage and near-instant model loading with zero-copy memory mapping.
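To show the general idea behind memory-mapped loading, here is a minimal sketch using Python's standard `mmap` module. It is not the Oncillo Engine's loader: the flat float32 file layout and the helper names `write_demo_weights` and `read_weight` are assumptions for illustration. The operating system pages in only the bytes that are touched, and the `memoryview` exposes them without copying the file into a separate buffer.

```python
import mmap
import os
import struct
import tempfile


def write_demo_weights(path: str, values: list[float]) -> None:
    """Write float32 values in native byte order as a stand-in for a weight file."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"{len(values)}f", *values))


def read_weight(path: str, index: int) -> float:
    """Read one float32 from a memory-mapped file without reading the whole file."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # The views share the mapped pages; no bytes are copied until indexing.
            with memoryview(mapped) as raw, raw.cast("f") as floats:
                return floats[index]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        weights_path = os.path.join(tmp, "weights.bin")
        write_demo_weights(weights_path, [0.5, 1.5, -2.0])
        print(read_weight(weights_path, 1))
```

Because the mapping is backed by the file itself, load time stays close to constant as the file grows, and clean pages can be evicted under memory pressure without being written to swap.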
Cross-Platform
iOS, Android, macOS, and wearables from a single SDK. Write once, deploy anywhere.
Oncillo Hybrid Cloud
Cloud accuracy. Without the cloud cost.
Oncillo hands off only complex requests to the cloud and runs simple tasks on-device.
Over 80% of production transcription and LLM inference can be handled on-device.
Real-time transcription. No round-trip to the cloud for clear audio.
We built Oncillo as an on-device engine first, optimized for the fastest inference on smartphones, laptops, and wearables.
Automatic Handoff
Oncillo monitors audio quality in real time. When conditions change, it switches seamlessly between on-device and cloud inference. Your app doesn't need to know the difference.
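One plausible way to implement this kind of handoff is to track a per-chunk signal-to-noise estimate and switch targets with hysteresis, so the route does not flap when quality hovers near a threshold. The sketch below is an assumption-laden illustration: the SNR metric, the 10 dB and 15 dB thresholds, and the `HandoffRouter` class are hypothetical and are not taken from Oncillo.

```python
import math


def snr_db(signal_rms: float, noise_rms: float) -> float:
    """Signal-to-noise ratio in decibels, computed from RMS amplitudes."""
    return 20.0 * math.log10(signal_rms / noise_rms)


class HandoffRouter:
    """Switch between on-device and cloud inference with hysteresis."""

    def __init__(self, to_cloud_below_db: float = 10.0, to_device_above_db: float = 15.0):
        # The gap between the two thresholds is what prevents rapid back-and-forth switching.
        self.to_cloud_below_db = to_cloud_below_db
        self.to_device_above_db = to_device_above_db
        self.target = "on-device"

    def update(self, chunk_snr_db: float) -> str:
        """Feed the SNR of the latest audio chunk and return the current target."""
        if self.target == "on-device" and chunk_snr_db < self.to_cloud_below_db:
            self.target = "cloud"
        elif self.target == "cloud" and chunk_snr_db > self.to_device_above_db:
            self.target = "on-device"
        return self.target


if __name__ == "__main__":
    router = HandoffRouter()
    for level in [20.0, 12.0, 8.0, 12.0, 16.0]:
        print(level, router.update(level))
```

In this run, the 12 dB chunk is handled on-device before the drop to 8 dB and in the cloud after it, which is the hysteresis at work: the route changes only once quality clearly crosses a threshold.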
Privacy When You Need It
For sensitive applications, lock transcription to on-device only. Audio data never leaves the user's device. HIPAA-friendly, GDPR-compliant, zero data retention.
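An on-device lock can be modeled as a policy flag that overrides whatever the router would otherwise choose. The sketch below assumes a hypothetical `RoutingPolicy` type and `choose_target` function; the actual Oncillo SDK option for this is not shown on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingPolicy:
    # Hypothetical flag: when True, audio must never be sent to the cloud.
    on_device_only: bool = False


def choose_target(policy: RoutingPolicy, router_prefers_cloud: bool) -> str:
    """Apply the privacy lock before honoring the router's preference."""
    if policy.on_device_only:
        return "on-device"
    return "cloud" if router_prefers_cloud else "on-device"


if __name__ == "__main__":
    locked = RoutingPolicy(on_device_only=True)
    print(choose_target(locked, router_prefers_cloud=True))
    print(choose_target(RoutingPolicy(), router_prefers_cloud=True))
```

Checking the lock first, rather than inside the quality-based logic, keeps the privacy guarantee independent of any routing heuristic that might change later.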
No compromise
Get the best of both on-device and cloud.
Built for the edge
From phones to glasses, Oncillo runs wherever your users are.
Mobile Voice Assistant
Real-time voice commands and dictation for iOS and Android apps with sub-150ms latency.
Desktop Notetaker
Meeting transcription and note-taking for macOS with automatic speaker detection.
Wearable Intelligence
Always-on transcription for smart glasses and AR devices with minimal battery impact.
Ready to get started?
Add transcription to your app in minutes. Free to start, scales with you.