Speech Analytics vs. Word Spotting

Speech analytics is a hot topic in the contact center market. Many of our peers in the industry (clients, prospects, and business partners) come to us looking for information on this exciting new technology. They are interested in call recording, and they want to use speech analytics in their quality monitoring program to quickly determine which calls should be thoroughly evaluated. A lot of people I talk to only understand half of what speech analytics can do, and the other half of their understanding usually involves a lot of things it can’t do…yet.

With analytics, the real value is not just in finding a keyword or phrase (word spotting), but in understanding the context in which that keyword or phrase is used. For example, keyword spotting may tell you when a competitor’s name is mentioned, but what you really need to know is why that competitor’s name is mentioned. True speech analytics does this by looking at the surrounding language and determining whether there are indicators of churn or praise, for instance. There is a big difference between a customer saying “I am leaving you to go to competitor X” and one saying “I am staying with you because you are so much better than competitor X.” The competitor’s name is mentioned in both circumstances, but the data has little meaning until you can determine why it was mentioned. If you are not able to answer that question, you are missing the value afforded by speech analytics and you’ll end up swimming directionless in a sea of data.
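
To make the difference concrete, here is a minimal Python sketch of the two approaches applied to the example sentences above. It is only an illustration: it works on text that has already been transcribed, the function names and cue lists are hypothetical choices of mine, and real speech analytics products rely on far richer language models than a handful of hand-picked phrases.

```python
import re

# Hypothetical cue phrases, chosen only for illustration. A real system
# would curate or learn these from actual call data.
CHURN_CUES = ("leaving", "cancel", "switch to", "go to")
PRAISE_CUES = ("staying", "better than", "happy with")


def spot_keyword(transcript: str, keyword: str) -> bool:
    """Word spotting: report only whether the keyword occurs."""
    return keyword.lower() in transcript.lower()


def classify_mention(transcript: str, keyword: str) -> str:
    """Toy context check: inspect the sentence that contains the keyword."""
    for sentence in re.split(r"[.!?]+", transcript.lower()):
        if keyword.lower() not in sentence:
            continue
        # Count how many cue phrases of each kind surround the keyword.
        churn = sum(cue in sentence for cue in CHURN_CUES)
        praise = sum(cue in sentence for cue in PRAISE_CUES)
        if churn > praise:
            return "churn"
        if praise > churn:
            return "praise"
    return "unknown"


leaving = "I am leaving you to go to competitor X"
staying = "I am staying with you because you are so much better than competitor X"

for call in (leaving, staying):
    print(spot_keyword(call, "competitor X"), classify_mention(call, "competitor X"))
```

Word spotting returns True for both calls, so on its own it cannot tell them apart. The context check labels the first call "churn" and the second "praise", which is the "why" that the keyword hit alone does not provide.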

One of the biggest misconceptions I’ve come across is the tendency to confuse speech analytics with a dictation machine. Training a technology to understand your voice for dictation is very different from building a technology that understands millions of voices, each with different accents, colloquialisms, and vocal undertones. The variation in voices and the larger vocabulary raise the complexity dramatically, which means more servers, more processors, and more time to complete the analysis. Speech analytics has not yet reached the point where you can confidently “read” the full content of a recorded call. And if you could, would you want to? Remember, spoken language is different from written language. A transcript does not give you the benefit of punctuation or tidy grammar, and it loses tonal inflections such as sarcasm. Sometimes your ear is the best tool for the job!