Microsoft Speech Recognition Reaches Human Parity (Again)

Microsoft has reached another milestone in its speech recognition technology: in a new post on the Microsoft Research Blog, head speech scientist Xuedong Huang announced that the system has achieved a word error rate of 5.1 percent.

The new record bests Microsoft’s achievement last autumn, when its speech recognition technology reached a word error rate of 5.9 percent, and it beats IBM’s word error rate of 5.5 percent from this past March.

Commenting on the achievement, Huang notes that while Microsoft had previously considered 5.9 percent to be human parity for speech recognition, “other researchers conducted their own study, employing a more involved multi-transcriber process, which yielded a 5.1 human parity word error rate.” Hence the celebration of Microsoft’s new 5.1 percent record, affording the company its second opportunity to proclaim that it has achieved human parity in its speech recognition technology.

It is a big deal, to be sure. Speech is becoming an increasingly important user interface on smartphones and devices associated with the Internet of Things, and various companies are investing heavily in technological advancement in this area. With its latest achievement, Microsoft can boast not only of having field-leading speech recognition, but also of having the machine learning and cloud computing tools, namely Microsoft Cognitive Toolkit and Azure GPUs, that helped it reduce its word error rate by 12 percent over the past year.

Source: Microsoft Research Blog