Amazon has confirmed that it is using end-to-end models to improve the speech recognition capabilities of the Alexa platform. With an end-to-end model, the entire speech recognition process can be completed on the device itself, from audio input all the way through to the final transcription. That contrasts with previous versions of Alexa, which processed data in the cloud because the models were too big to install on a standalone device.
Those earlier iterations of Alexa broke speech recognition down into multiple stages, such as acoustic modeling and language modeling, each of which required a separate model. The new version, on the other hand, handles the entire task with a single unified network.
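The difference between the two approaches can be sketched roughly as follows. This is a loose conceptual illustration only; the function names, toy stage implementations, and simplified interfaces are assumptions made for the example, not Amazon's actual code.

```python
# Conceptual sketch: a multi-stage ASR pipeline versus a single end-to-end
# model. The stage functions below are toy stand-ins (hypothetical), each
# representing a separately trained component in the old-style pipeline.

def acoustic_model(audio):
    # Toy stand-in: map each audio "frame" to a phoneme-like label.
    return [frame.lower() for frame in audio]

def pronunciation_model(phonemes):
    # Toy stand-in: group phoneme labels into candidate words.
    return ["".join(phonemes)]

def language_model(words):
    # Toy stand-in: pick the most likely final transcription.
    return " ".join(words)

def pipeline_asr(audio):
    """Older approach: separate models chained together; each component is
    trained and deployed independently, inflating the total footprint."""
    return language_model(pronunciation_model(acoustic_model(audio)))

def end_to_end_asr(audio, network):
    """Newer approach: one neural network maps audio directly to text,
    which is what makes a compact on-device model feasible."""
    return network(audio)
```

Because the end-to-end version replaces several large, independently trained components with one combined network, the total size on disk can shrink dramatically, which is the reduction Mevawalla describes below.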
“With an end-to-end model, you end up getting away from having these separate pieces and end up with a combined neural network,” said Automatic Speech Recognition Head Shehzad Mevawalla in an interview with VentureBeat. “You’re going from gigabytes down to less than 100MB in size. That allows us to run these things in very constrained spaces.”
Despite the smaller footprint, the new Alexa model still needs to be paired with an on-device accelerator to deliver the expected performance speeds. With that in mind, Amazon has teamed up with MediaTek to develop the AZ1 Neural Edge processor, which has been deployed in the latest versions of Amazon’s various Echo devices.
According to Mevawalla, end-to-end models have also enhanced Alexa’s ability to identify individual speakers. The Natural Turn Taking feature is able to distinguish Alexa requests from regular background noise, and to use a camera to determine whether the speaker is directing their comments to Alexa or to another person or device in the room. The feature will still function without a camera, but is more accurate on devices that can capture video.
Mevawalla went on to claim that the use of end-to-end models has improved Alexa’s accuracy by as much as 25 percent. Natural Turn Taking, however, will be available only in English when it debuts in 2021.
Amazon recently accredited Kudelski IoT Labs to test products with built-in Alexa capabilities. The tech giant is one of several companies working toward on-device speech and voice recognition. Frost & Sullivan has predicted that car manufacturers will prioritize hybrid voice assistants, while NXP has released a new MCU that will support offline voice recognition in IoT devices.