Live in 53 languages · Built in Iga

Speak once.
Be heard in everyone’s tongue.

Hi, we’re amigoh. We help in rooms where interpreters don’t show up. One person speaks; every listener hears their own language, within about a second. Driving schools, hotels, classrooms, hospitals, front desks — places where waiting isn’t an option.

No credit card.

🇯🇵 Host · JA
制動距離は速度の二乗に比例する — 覚えておいてください。 (Braking distance is proportional to the square of the speed. Please remember that.)
Listener · EN
Listener · PT

Where amigoh lives

Rooms where every minute matters.

We didn’t build this for conference stages. We built it for the small, high-stakes rooms where people are already trying to get through to each other, and where an hour’s delay would cancel the whole thing.

Driving schools

The instructor speaks Japanese. Each learner — Portuguese, Tagalog, Vietnamese — hears it in their own language. That matters when what’s being said is “brake now”.

Hotels & front desks

Concierges take check-ins, medical emergencies, the 2am noise complaint. No calling the interpreter line. Guests hear their own language; staff keep speaking theirs.

Classrooms

One teacher, twenty kids, seven home languages. Captions on the back wall; audio in each kid’s earbud. Nobody has to translate for their seatmate.

Built for the whole team

Two modes. Whoever’s in the room, there’s something for them.

Pick the mode that fits the room. Broadcast mode is one-to-many: a classroom, a lobby, a town hall. Conversation mode is one-to-one and hands-free: it talks back out loud, so eyes can stay on the road.

Broadcast mode · 1 → many

One host, any number of listeners.

Share an 8-digit code or a QR code. Listeners join on their own phone, pick a language, and follow along as captions or audio. Built for classrooms, lobbies, town halls, and similar rooms.

Output: Captions + audio
Best for: Groups, broadcasts
Join: QR or 8-digit code

Conversation mode · 1 ↔ 1

Two people, eyes forward, hands full.

A two-way conversation between instructor and learner, or driver and passenger. amigoh listens, translates, and reads the answer back out loud. No screens.

Output: Natural TTS voice
Best for: Driving lessons, cabs, fieldwork
Latency: < 400 ms

For hosts

The teacher, the concierge, the driver.

In Broadcast mode they share an 8-digit code and the listeners receive the translations. In Conversation mode they just start talking and amigoh replies out loud in the learner’s language.

  • Live transcript as you speak
  • One host, many listener languages at once
  • Broadcast mode and Conversation mode in one app

For listeners

The student, the guest, the new hire.

They scan a QR or type a code, pick a language, and they’re in. No app, no login.

  • No app install — a QR or a short URL
  • Captions or audio, their pick
  • Works on 3G, handles drop-outs cleanly

For admins

The person keeping the lights on.

Seats, invoices, team roles, audit log. Admins see how minutes are used, which languages come up most, which rooms run the busiest.

  • Top-language breakdowns and usage over time
  • Role-based access (Admin / Host)
  • Billing audit log

vs. Human interpreters

We’re not replacing the brilliant ones.

A simultaneous interpreter is extraordinary. They’re also expensive, booked out weeks in advance, and they can’t show up to a 6am driving lesson. amigoh fills the rooms where they were never going to be.

Criterion | Human interpreter | amigoh
Cost per hour | ¥20,000 – ¥38,000 (Tokyo average) | Flat monthly subscription, no per-hour billing
Lead time | 24–72 hours, often longer for niche pairs | Instant: share a code
Languages covered in one session | 1 pair, sometimes 2 with relay | All 53, simultaneously
After-hours / weekends | Premium rate, if available | Same price, any hour
Nuance & cultural judgement | Unmatched: use them when it matters | Solid for common pairs. Getting better every month.

Security

What’s actually in place.

A lot of these conversations aren’t meant for anyone outside the room. Here’s what we’ve built so far.

Tokyo region
Processed in Japan.

Our servers are in Tokyo. Speech recognition and translation use Japan-based endpoints wherever possible.

TLS everywhere
Encrypted in transit.

Everything on the wire uses TLS. Hosts, listeners, admins alike.

Role-based access
Admin vs. host.

Admins handle billing and members. Hosts run sessions. When an admin changes billing or permissions, it’s logged.

Request a demo

See it running in a room like yours.

Tell us a little about your team, and we’ll set up a 30-minute walkthrough built around your actual use case. Not a canned slide deck. Someone from the team replies within one business day.

In Japan? 0595-21-1000, weekdays 9:00–19:00 JST
Elsewhere? hello@amigoh.io

By submitting, you agree to our privacy policy. We don’t sell your details, and we won’t send marketing emails unless you ask.

FAQ

The things everyone asks us.

Missing something? Drop us a line at hello@amigoh.io.

How accurate is it, really?

Depends on the pair and the speaker. JA ↔ EN, JA ↔ PT, and EN ↔ ES are solid. Fast, casual speech in a less common language is harder. The host sees the live transcript as it happens, so if a word comes out wrong they just say it again.

What’s the delay?

Sub-second on a decent network. To a listener it feels like the same sentence, just in their language.

Do listeners need an app?

No. They scan a QR or type an 8-digit code into any browser. That’s the whole onboarding.

Which languages are you live in?

53 with full speech-to-text and text-to-speech. That covers all UN official languages, every major EU language, Japanese, Portuguese (BR and PT), Tagalog, Vietnamese, Thai, Hindi, Bengali, Urdu, Persian, Swahili. Another 130+ are text-translation only. If your pair isn’t listed, ask.

Does it work offline?

No. Translation needs a connection, but 3G works fine. On a drop-out, listeners see a “reconnecting” state instead of silence, and nothing gets lost.

How is this different from Zoom live captions or Google Translate?

One host reaches many listener languages at once, so each person picks their own and you don’t set up a separate session per language. The join flow is built for people already in the same room: a code on the wall, a QR on a sign. Nothing to install, nobody to sign up.

What happens to the audio?

We stream audio for recognition, translation, and voice output in real time. We don’t store raw audio. What we do keep: session transcripts and usage records, for billing and troubleshooting. Full details in the privacy policy.

One room.
Every language.

Thirty minutes with us, and you’ll know whether amigoh fits your floor. If it doesn’t, we’ll tell you.