Speechmatics, the leading speech recognition technology scaleup, unveils plans to amplify its real-time transcription capabilities by providing real-time speech translation in an all-in-one API.
This offering will integrate real-time translation with its industry-beating real-time transcription in an all-in-one API.
Breaking down language barriers enables more people to consume content regardless of industry and unlocks the ability to automatically translate live content from multiple regions. This combined offering enables customers to use the world’s most accurate speech-to-text engine and translate speech for 69 language pairs*.
Real-time translation comes a month after Speechmatics’ launch of Ursa – the world’s most accurate speech-to-text engine, which is 25% more accurate than OpenAI’s Whisper and 38% more accurate than Google. Speechmatics has built on these capabilities to develop real-time translation, offering language pairs to and from English*, including German, Spanish, and Vietnamese. The all-in-one API can also translate into multiple languages in one request – for example, a single audio stream can provide real-time English transcription and translation to Japanese, French, Hindi, Mandarin, and Korean simultaneously.
Speechmatics’ real-time transcription and translation deliver the same level of accuracy as its pre-recorded (batch) service, and provide a sliding scale that lets customers trade off speed (latency) against accuracy to meet their needs. By combining real-time transcription and translation, the all-in-one API streamlines processes and speeds up workflows for businesses.
Businesses across multiple industries can reach a wider geographical audience in settings where real-time translation has previously been a challenging and costly manual task. For the broadcast industry in particular – valued at over $300 billion in the US alone in 2022 – generating quick and highly accurate translated speech in one API unlocks the ability to caption live stream content and news for viewers around the world. Similarly, contact centres, where scale is essential, can extend operations to handle multiple languages using cost-effective automation technology and offer improved customer experiences in customers’ native languages.
Damir Derd, Head of Sales Engineering at Speechmatics, said, “This is a landmark development for speech recognition technology, and we are proud to remain at the forefront of innovation, demonstrating the commitment to our mission to understand every voice. This new offering opens up a truly global market for our customers with almost instant translation from the spoken word. As demand from viewers in different regions increases for TV shows and broadcast, sports, events, podcasts, game streaming, YouTube and social media videos, the need for captioned videos in multiple languages has too. We are excited to launch this capability to our customers in the next few weeks and will be continuing to work towards adding even more languages and enabling the engine to translate between languages, so the default isn’t always English.”
Ken Frommert, President of ENCO, said, “Speechmatics provides the most accurate speech-to-text on the market for pre-recorded files and live streams. Adding real-time translation to its all-in-one API is game-changing for live broadcast captions. The ability to not only transcribe but now leverage Speechmatics to translate in real-time to provide highly accurate captions globally.”