Typhoon Isan: Open-Source ASR and a Language Technology Suite for Thailand’s Largest Dialect
Typhoon Isan, developed by SCB 10X, is Thailand’s first open-source AI that understands the Isan language. It combines datasets, clear transcription standards, and two speech-to-text models: one built for real-time use and one built for accuracy.

Typhoon Isan is an open-source AI technology designed to understand and transcribe the Isan dialect, which over 20 million people in Thailand speak every day. Mainstream voice recognition systems have historically struggled with Isan because it is primarily a spoken language with no widely accepted spelling or writing standard. To address this, the Typhoon team worked with local speakers and linguistic experts to build the first production-ready AI that can accurately process Isan speech.
Key Insights
- A Complete Language Foundation: Instead of just releasing an AI model, the team built a comprehensive “language suite”. They created standardized spelling rules, transcription guidelines, a phonetic dictionary, and a massive collection of recorded speech to give the AI the data it needed to learn.
- Two Specialized Models: The project released two versions of the AI to meet different needs:
  - Typhoon Isan ASR Real-time: A lightweight, high-speed model that runs easily on standard hardware, suited to live transcription and online meetings.
  - Typhoon Isan ASR Whisper: A highly accurate model designed for recorded audio, especially good at handling speakers who mix Isan, Thai, and English.
- Rivaling Big Tech: In testing, Typhoon Isan’s open-source models match or even outperform large, expensive commercial AI models such as Gemini at understanding the Isan dialect.
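For readers who want to see how this kind of accuracy comparison is typically scored, here is a minimal sketch of character error rate (CER). The summary above does not say which metric the team used, so CER is an assumption; it is a common choice for Thai-script speech recognition because the script does not put spaces between words. The function names `edit_distance` and `cer` are illustrative and not part of any Typhoon release.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning the first i reference characters into "" costs i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length.

    Whitespace is removed first (an assumption of this sketch), so spacing
    differences between transcripts are not counted as errors.
    """
    ref = "".join(reference.split())
    hyp = "".join(hypothesis.split())
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Placeholder strings, not real Isan transcripts: one substitution in four characters.
print(cer("abcd", "abxd"))  # 0.25
```

A lower score means the model's transcript is closer to the human reference. Python iterates over Unicode code points, so Thai vowel signs and tone marks each count as separate characters in this sketch.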
Practical Benefits for Consumers
- Digital Inclusion for Rural Communities: This technology ensures that millions of people are not left behind in the digital age, allowing them to use voice-activated AI and digital tools in their own native dialect.
- Everyday Convenience: Consumers will soon see this technology powering smart home assistants, local call centers, and smart city services that can actually understand them when they speak naturally.
- Better Subtitles and Media: The AI can automatically generate highly accurate subtitles for videos, podcasts, and interviews involving local Isan communities, making media more accessible.
- Lower Costs for Local Businesses: Because the technology is free to use (open-source) and runs efficiently on everyday computers, local schools, businesses, and government agencies can adopt powerful voice technology without paying for expensive foreign services or high-end equipment.



