Testing the Massively Multilingual Speech (MMS) Model that Supports 1162 Languages
<p>Massively Multilingual Speech (MMS)¹ is Meta AI's latest release, announced just a few days ago. It pushes the boundaries of speech technology by expanding its reach from about 100 languages to over 1,000, achieved by building a single multilingual speech recognition model. The model can also identify over 4,000 languages, a 40-fold increase over previous capabilities.</p>
<p>The MMS project aims to make it easier for people to access information and use devices in their preferred language. It expands text-to-speech and speech-to-text technology to underserved languages, continuing to reduce language barriers in our global world. Existing applications, such as virtual assistants and voice-activated devices, can now support a wider variety of languages. At the same time, new use cases emerge in cross-cultural communication, for example in messaging services or virtual and augmented reality.</p>
<p>In this article, we will walk through the use of MMS for Automatic Speech Recognition (ASR) in English and Portuguese and provide a step-by-step guide on setting up the environment to run the model.</p>
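<p>As a preview of that walkthrough, here is a minimal sketch of one possible route: running MMS ASR through the Hugging Face transformers integration of the <code>facebook/mms-1b-all</code> checkpoint (installed with <code>pip install transformers torchaudio</code>). The file names <code>english.wav</code> and <code>portuguese.wav</code> are placeholders for your own recordings; the language codes are ISO 639-3.</p>
<pre><code>import torch
import torchaudio
from transformers import AutoProcessor, Wav2Vec2ForCTC

MODEL_ID = "facebook/mms-1b-all"  # MMS checkpoint with per-language adapters

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)

def transcribe(path: str, lang: str) -> str:
    """Transcribe an audio file using the MMS adapter for the given language."""
    # Swap in the tokenizer vocabulary and adapter weights for the target language
    processor.tokenizer.set_target_lang(lang)
    model.load_adapter(lang)

    # MMS expects mono audio at a 16 kHz sampling rate
    waveform, sample_rate = torchaudio.load(path)
    waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000).mean(dim=0)

    inputs = processor(waveform.numpy(), sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    ids = torch.argmax(logits, dim=-1)[0]
    return processor.decode(ids)

# Placeholder file names; replace with paths to your own 16 kHz-compatible audio
print(transcribe("english.wav", "eng"))
print(transcribe("portuguese.wav", "por"))
</code></pre>
<p>A design note on this sketch: MMS ships small per-language adapter weights on top of a shared backbone, so switching from English to Portuguese only loads a new adapter rather than a second full model.</p>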