Hi, we’re amigoh. We help in rooms where interpreters don’t show up. One person speaks; every listener hears their own language, within about a second. Driving schools, hotels, classrooms, hospitals, front desks — places where waiting isn’t an option.
No credit card required.
We didn’t build this for conference stages. We built it for the small, high-stakes rooms where people are already trying to get through to each other, and where an hour’s delay would cancel the whole thing.
The instructor speaks Japanese. Each learner — Portuguese, Tagalog, Vietnamese — hears it in their own language. That matters when what’s being said is “brake now”.
Concierges take check-ins, medical emergencies, the 2am noise complaint. No calling the interpreter line. Guests hear their own language; staff keep speaking theirs.
One teacher, twenty kids, seven home languages. Captions on the back wall; audio in each kid’s earbud. Nobody has to translate for their seatmate.
Two modes, depending on the room. Broadcast mode is one-to-many: a classroom, a lobby, a townhall. Conversation mode is one-to-one and hands-free — it talks back out loud, so eyes can stay on the road.
Share an 8-digit code or QR. Listeners join on their own phone, pick a language, and follow along as captions or audio. Built for classrooms, lobbies, townhalls, and similar rooms.
A two-way conversation between instructor and learner, or driver and passenger. amigoh listens, translates, and reads the answer back out loud. No screens.
In Broadcast mode, hosts share an 8-digit code and listeners receive the translations. In Conversation mode, hosts just start talking and amigoh replies out loud in the learner’s language.
Listeners scan a QR or type a code, pick a language, and they’re in. No app, no login.
Seats, invoices, team roles, audit log. Admins see how minutes are used, which languages come up most, which rooms run the busiest.
Simultaneous interpreters are extraordinary. They’re also expensive, booked out weeks in advance, and they can’t show up to a 6am driving lesson. amigoh fills the rooms where they were never going to be.
| | Human interpreter | amigoh |
|---|---|---|
| Cost per hour | ¥20,000 – ¥38,000 (Tokyo average) | Flat monthly subscription — no per-hour billing |
| Lead time | 24–72 hours, often longer for niche pairs | Instant — share a code |
| Languages covered in one session | 1 pair, sometimes 2 with relay | All 53, simultaneously |
| After-hours / weekends | Premium rate, if available | Same price, any hour |
| Nuance & cultural judgement | Unmatched — use them when it matters | Solid for common pairs. Getting better every month. |
A lot of these conversations aren’t meant for anyone outside the room. Here’s what we’ve built so far.
Our servers are in Tokyo. Speech recognition and translation use Japan-based endpoints wherever possible.
Everything on the wire uses TLS. Hosts, listeners, admins alike.
Admins handle billing and members. Hosts run sessions. When an admin changes billing or permissions, it’s logged.
Tell us a little about your team, and we’ll set up a 30-minute walkthrough built around your actual use case. Not a canned slide deck. Someone from the team replies within one business day.
Accuracy depends on the language pair and the speaker. JA ↔ EN, JA ↔ PT, EN ↔ ES are solid. Fast, casual speech in a rarer language gets harder. The host sees the live transcript as it happens, so if a word comes out wrong they just say it again.
Latency is sub-second on a decent network. To a listener it feels like the same sentence, just in their language.
No, listeners don’t need an app or an account. They scan a QR or type an 8-digit code into any browser. That’s the whole onboarding.
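For teams wiring the join step into their own signage or kiosk page, here is a minimal sketch of how a typed code might be cleaned up before it is submitted. It is an illustration only: the function name `normalize_join_code` and the choice to accept spaces and hyphens are assumptions of this example, not documented amigoh behaviour. The only detail taken from this page is that the code has 8 digits.

```python
import re
from typing import Optional

CODE_LENGTH = 8  # this page describes an 8-digit code


def normalize_join_code(raw: str) -> Optional[str]:
    """Return the bare 8-digit code, or None if the input is not a valid code.

    Assumption of this sketch: people often type codes in groups
    ("1234 5678" or "1234-5678"), so whitespace and hyphens are dropped
    before checking. Everything else must be an ASCII digit.
    """
    cleaned = re.sub(r"[\s-]", "", raw)
    if len(cleaned) == CODE_LENGTH and cleaned.isascii() and cleaned.isdigit():
        return cleaned
    return None


if __name__ == "__main__":
    print(normalize_join_code("1234 5678"))
    print(normalize_join_code("1234-567"))
```

Rejecting anything that is not exactly eight ASCII digits keeps a mistyped code from reaching the server at all, so the listener can be prompted to retype it right away.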
53 languages with full speech-to-text and text-to-speech. That covers all UN official languages, every major EU language, Japanese, Portuguese (BR and PT), Tagalog, Vietnamese, Thai, Hindi, Bengali, Urdu, Persian, Swahili. Another 130+ are text-translation only. If your pair isn’t listed, ask.
No, amigoh doesn’t work offline: translation needs a connection. 3G works fine. On a drop-out, listeners see a “reconnecting” state instead of silence, and nothing gets lost.
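One common way to get "nothing gets lost" after a drop-out is to number each caption and replay whatever a listener missed when they reconnect. The sketch below shows that general pattern. It is not a description of how amigoh is built: `CaptionLog`, `publish`, `replay_after`, and the `max_kept` limit are all hypothetical names chosen for this example.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass(frozen=True)
class Caption:
    seq: int   # increasing sequence number, starting at 1
    text: str  # translated caption text


class CaptionLog:
    """Keeps recent captions so a reconnecting listener can catch up.

    Hypothetical sketch: only the newest `max_kept` captions are retained.
    """

    def __init__(self, max_kept: int = 500) -> None:
        self._kept: Deque[Caption] = deque(maxlen=max_kept)
        self._next_seq = 1

    def publish(self, text: str) -> Caption:
        """Record a new caption and return it with its sequence number."""
        caption = Caption(self._next_seq, text)
        self._next_seq += 1
        self._kept.append(caption)
        return caption

    def replay_after(self, last_seen_seq: int) -> List[Caption]:
        """Return every retained caption newer than the one last seen."""
        return [c for c in self._kept if c.seq > last_seen_seq]


if __name__ == "__main__":
    log = CaptionLog()
    for line in ["Check your mirrors.", "Signal left.", "Brake now."]:
        log.publish(line)
    # A listener who dropped out after caption 1 asks for the rest.
    for missed in log.replay_after(1):
        print(missed.seq, missed.text)
```

The listener's device only has to remember the last sequence number it displayed. On reconnect it sends that number and receives the gap in order, which is what lets a "reconnecting" state resume without skipping a sentence.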
One host reaches many listener languages at once, so each person picks their own and you don’t set up a separate session per language. The join flow is built for people already in the same room: a code on the wall, a QR on a sign. Nothing to install, nobody to sign up.
We stream audio for recognition, translation, and voice output in real time. We don’t store raw audio. What we do keep: session transcripts and usage records, for billing and troubleshooting. Full details in the privacy policy.
Thirty minutes with us, and you’ll know whether amigoh fits your floor. If it doesn’t, we’ll tell you.