A team of data scientists that Waverley partners with applied neural networks and deep learning to develop an innovative speech recognition tool that could potentially work with all languages and dialects by converting sounds into text while maintaining a high level of accuracy.
The team developed innovative AI-based automatic speech recognition software designed to serve as a single tool for recognizing any language and its dialects. In the team's tests, it even outperformed Google Cloud Speech API in accuracy.
The engineers applied deep learning methods and neural networks, training the platform on publicly available datasets. For now it recognizes only English and its dialects, but the technique itself can be extended to any language because it is brilliant in its simplicity: it converts sounds into text. The platform is easy to use: the user uploads an audio file in any format containing recorded speech and receives a finished transcript a few moments later. To verify the program's accuracy, the engineers tested it on English-language Christmas messages from world leaders, measured the error rate, and compared it with the results achieved by Google Cloud Speech API, as shown in the results below.
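For readers curious how an error rate of this kind is typically computed, here is a minimal sketch. It assumes the metric is word error rate (WER), the usual measure for speech recognition; the article does not say which metric the engineers used. The function name and the sample transcripts are hypothetical and are not the team's test data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: text is only lowercased and split on whitespace;
    punctuation is not stripped, so clean the inputs first if needed.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts, for illustration only.
reference = "we wish you all a merry christmas"
engine_a = "we wish you all a merry christmas"
engine_b = "we wish you a mary christmas"

print("Engine A WER:", word_error_rate(reference, engine_a))
print("Engine B WER:", word_error_rate(reference, engine_b))
```

A lower value means a more accurate transcript. Running the same reference recordings through two engines and comparing the scores gives a like-for-like comparison of the kind described above.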
There are numerous applications for this cutting-edge technology, but the main one the team targeted is the healthcare industry. Doctors today spend a lot of their precious time filling out forms and maintaining patients' medical histories. A technology like this could follow the conversation between a patient and a physician and fill in the forms automatically, thus streamlining and improving actual patient care.