Microsoft Reaches Human Parity in Speech Recognition Tech

“We’ve reached human parity,” Microsoft’s head speech scientist, Xuedong Huang, has proclaimed, referring to the company’s speech recognition technology.

That is Xuedong Huang’s assessment of the latest results from Microsoft Artificial Intelligence and Research, which has announced in a new paper that its speech recognition technology now achieves a word error rate equal to or better than that of human transcriptionists. The technology’s word error rate has dropped from 6.3 percent to 5.9 percent, the lowest such rate ever recorded on Switchboard, an industry-standard test.
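For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that general definition only. The function name `word_error_rate` is hypothetical, the whitespace tokenization is a simplifying assumption, and it is not Microsoft's scoring procedure, which the article does not describe.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: splits on whitespace and does no text
    normalization, which real benchmark scoring typically does.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Under this definition, a 5.9 percent word error rate corresponds to roughly six word errors for every 100 words in the reference transcript.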

The pioneering system was developed using Microsoft’s Computational Network Toolkit, an open source deep learning system that the team used to train the speech recognizer’s neural networks. The research team says its goal now is to improve the system’s performance in real-world settings, where background noise and regional accents remain a challenge.

The work could prove critical going forward, given the rising importance of speech recognition technology. It’s widely expected that such technology could provide the primary user interface for connected devices associated with the Internet of Things, and other major tech companies such as Google and Apple are stepping up their own investments in the technology accordingly.