Amazon has confirmed that it is using end-to-end models to improve the speech recognition capabilities of the Alexa platform. With an end-to-end model, the entire speech recognition process can run on the device itself, from raw audio input all the way through to the final transcription. That contrasts with previous versions of Alexa, which processed data in the cloud because the models were too big to fit on a standalone device.
Those earlier iterations of Alexa broke speech recognition into separate components, such as acoustics and language, each of which was handled by its own model. The new version, on the other hand, processes speech with a single unified model that maps audio directly to text.
“With an end-to-end model, you end up getting away from having these separate pieces and end up with a combined neural network,” Shehzad Mevawalla, Amazon’s head of automatic speech recognition, told VentureBeat. “You’re going from gigabytes down to less than 100MB in size. That allows us to run these things in very constrained spaces.”
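Amazon has not published the architecture of the new model, but a minimal sketch of the general idea, written in PyTorch with assumed layer sizes, feature dimensions, and vocabulary, might look something like this:

```python
# Illustrative sketch only: a minimal CTC-style end-to-end recognizer in PyTorch.
# Layer sizes, feature dimensions, and the vocabulary are assumptions for
# demonstration; Amazon has not published the Alexa model's architecture.
import torch.nn as nn

class TinyEndToEndASR(nn.Module):
    """Maps audio features directly to character probabilities in one network,
    replacing the separate acoustic, pronunciation, and language models of a
    classic hybrid pipeline."""

    def __init__(self, n_mels=80, hidden=256, vocab_size=29):
        # vocab_size: 26 letters + space + apostrophe + CTC blank (assumed)
        super().__init__()
        self.encoder = nn.LSTM(n_mels, hidden, num_layers=3,
                               batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, vocab_size)

    def forward(self, features):            # features: (batch, time, n_mels)
        encoded, _ = self.encoder(features)
        return self.classifier(encoded)     # per-frame logits over the characters

model = TinyEndToEndASR()
n_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {n_params:,}  (~{n_params * 4 / 1e6:.1f} MB at 32-bit precision)")
```

In a hybrid system, the output of a network like this would still have to be combined with a separate pronunciation lexicon and language model at decode time; folding everything into one network is the consolidation behind the gigabytes-to-megabytes reduction Mevawalla describes.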
Despite the smaller footprint, the new Alexa model still needs to be paired with an on-device accelerator to deliver the expected performance. With that in mind, Amazon teamed up with MediaTek to develop the AZ1 Neural Edge processor, which has been deployed in the latest versions of Amazon’s various Echo devices.
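Neither the AZ1’s toolchain nor Amazon’s deployment pipeline is public, but preparing a network for constrained edge hardware typically involves post-training compression. The sketch below applies PyTorch’s dynamic 8-bit quantization to a stand-in model; the model, file names, and resulting sizes are illustrative assumptions only:

```python
# Illustrative only: post-training dynamic quantization in PyTorch, the kind of
# 8-bit compression commonly applied before deploying a network to constrained
# edge hardware. This is not Amazon's published toolchain for the AZ1 Neural Edge.
import os
import torch
import torch.nn as nn

class StandInRecognizer(nn.Module):
    """Stand-in for a trained end-to-end recognizer (sizes are assumptions)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.LSTM(80, 256, num_layers=3, batch_first=True)
        self.classifier = nn.Linear(256, 29)

    def forward(self, x):
        out, _ = self.encoder(x)
        return self.classifier(out)

model = StandInRecognizer()

# Swap LSTM and Linear layers for 8-bit quantized equivalents.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.LSTM, nn.Linear}, dtype=torch.qint8)

# Compare serialized sizes on disk; exact numbers will vary.
torch.save(model.state_dict(), "asr_fp32.pt")
torch.save(quantized.state_dict(), "asr_int8.pt")
print(f"fp32: {os.path.getsize('asr_fp32.pt') / 1e6:.1f} MB")
print(f"int8: {os.path.getsize('asr_int8.pt') / 1e6:.1f} MB")
```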
According to Mevawalla, end-to-end models have also enhanced Alexa’s ability to identify individual speakers. The Natural Turn Taking feature filters Alexa requests out of regular background noise and can use a camera to determine whether the speaker is addressing Alexa or a person or device somewhere else in the room. The feature still works without a camera, but it is more accurate on devices that can capture video.
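Amazon has not described how Natural Turn Taking combines these signals, but conceptually the decision can be thought of as a gate over an audio “device-directedness” score plus, where a camera is present, a visual gaze score. The function below is a purely hypothetical illustration; the scores, weights, and threshold are assumptions:

```python
# Purely illustrative: one way a "device-directed speech" gate could blend an
# audio confidence score with an optional camera-based gaze score. The weights,
# threshold, and function are assumptions; Amazon has not published how Natural
# Turn Taking makes this decision.
from typing import Optional

def is_device_directed(audio_score: float,
                       gaze_score: Optional[float] = None,
                       threshold: float = 0.7) -> bool:
    """Return True if an utterance appears to be addressed to the assistant.

    audio_score: confidence from an acoustic/lexical classifier (0 to 1).
    gaze_score:  confidence from a vision model that the speaker is facing the
                 device (0 to 1), or None on camera-less devices.
    """
    if gaze_score is None:
        # Audio-only devices fall back to the acoustic signal alone, which is
        # consistent with the feature being less accurate without a camera.
        return audio_score >= threshold
    # With video available, blend the two signals (weights are assumed).
    combined = 0.6 * audio_score + 0.4 * gaze_score
    return combined >= threshold

# Example: ambiguous audio in a noisy room, but the speaker is facing the device.
print(is_device_directed(audio_score=0.6, gaze_score=0.9))   # True
print(is_device_directed(audio_score=0.6))                   # False without video
```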
Mevawalla went on to claim that the use of end-to-end models has improved the accuracy of Alexa by as much as 25 percent. However, Natural Turn Taking will only be available in English when it debuts in 2021.
Amazon recently accredited Kudelski IoT Labs to test products with built-in Alexa capabilities. The tech giant is one of several companies working toward on-device speech and voice recognition. Frost & Sullivan has predicted that car manufacturers will prioritize hybrid voice assistants, while NXP has released a new MCU that will support offline voice recognition in IoT devices.
Source: VentureBeat