G711! G729! Hike! Hike! Hike!


Let’s start with a little technical mumbo jumbo. Most of the energy of human speech is contained within a frequency range of roughly 300-4000 Hz (cycles/second), so when the telephone network processes your voice, the first thing it does is constrain the audio to that range. It is then ready to digitize your speech for transmission. The traditional codec for speech has been μ-law G.711, which relies on the Nyquist theorem: you can accurately digitize an analog signal if you sample it at a rate of at least twice its maximum frequency. So 2 x 4000 Hz = 8000 samples per second, and if each sample is encoded as an 8-bit byte, that is 8 bits x 8000 samples/second = 64,000 bits per second to encode full uncompressed speech. That is the gold standard of speech quality: G.711, 64 kb/s.
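For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation. The function names are made up for illustration; only the 4000 Hz, 8 bit and 64,000 bit/s figures come from the text.

```python
def nyquist_sample_rate(max_frequency_hz):
    """Minimum sampling rate: twice the highest frequency to capture."""
    return 2 * max_frequency_hz


def pcm_bit_rate(sample_rate_hz, bits_per_sample):
    """Bits per second for uncompressed samples of a fixed size."""
    return sample_rate_hz * bits_per_sample


sample_rate = nyquist_sample_rate(4000)   # 8000 samples per second
bit_rate = pcm_bit_rate(sample_rate, 8)   # 64000 bits per second

print(sample_rate, "samples/s")
print(bit_rate, "bits/s")
```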

There is, however, a problem. Bandwidth can be expensive, which creates a need for compressed speech that requires less of it. One commonly used codec for this is G.729, which carries speech at 8 kb/s, one eighth of the G.711 rate. Codecs that compress speech rely on a couple of principles. The first is that human speech and its analog waveform change relatively slowly and somewhat predictably. The second is that once you have a point of reference to work from, you can simply send the changes (deltas) in the waveform; you do not have to send the entire audible signal with each byte. This means you can achieve a considerable reduction in the bandwidth needed to send human speech. This is advantageous for international circuits, and for VOIP, where a great deal of other traffic competes for the available bandwidth.
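The "send only the changes" idea can be shown with a toy delta encoder. To be clear, this is not the G.729 algorithm, which is far more elaborate; it is only a sketch of the principle, and the 200 Hz test tone, the amplitude of 127 and all function names are assumptions chosen for illustration.

```python
import math

SAMPLE_RATE_HZ = 8000  # telephone sampling rate


def tone(freq_hz, n_samples, amplitude=127):
    """Integer samples of a sine tone (a stand-in for a slowly changing signal)."""
    return [
        round(amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE_HZ))
        for n in range(n_samples)
    ]


def delta_encode(samples):
    """Keep the first sample as the reference, then store only the changes."""
    if not samples:
        return []
    deltas = [samples[0]]
    for previous, current in zip(samples, samples[1:]):
        deltas.append(current - previous)
    return deltas


def delta_decode(deltas):
    """Rebuild the samples by accumulating the changes."""
    samples = []
    total = 0
    for delta in deltas:
        total += delta
        samples.append(total)
    return samples


samples = tone(200, 80)          # two cycles of a 200 Hz tone
deltas = delta_encode(samples)
largest_sample = max(abs(s) for s in samples)
largest_delta = max(abs(d) for d in deltas[1:])

print("largest sample value:", largest_sample)
print("largest change between samples:", largest_delta)
```

The changes between neighbouring samples are much smaller than the samples themselves, so they can be carried in fewer bits, and the decoder still gets the original samples back.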

So which codec is ‘better’? G.711 or G.729? For my money, I can’t hear the difference between G.711 and G.729 UNLESS you are trying to pass music over the call. Music will not sound very good when using G.729. This phenomenon is the result of what I explained earlier: speech compression works on the principle of slow, predictable changes to the analog waveform. Music is far more dynamic, and consequently the G.729 algorithm simply cannot accurately predict and encode it into a signal that we consider pleasant and harmonious. Music and even ringing may sound broken and of poor quality, but when speech resumes the codec will function perfectly again.
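A rough way to see why a faster-changing signal is harder to handle is to compare how far the waveform jumps from one sample to the next. This toy comparison only mirrors the explanation above and says nothing about how G.729 actually works internally; the 200 Hz and 3500 Hz tones and the function name are assumptions picked for illustration.

```python
import math

SAMPLE_RATE_HZ = 8000  # telephone sampling rate


def max_step(freq_hz, n_samples=800, amplitude=127):
    """Largest change between neighbouring samples of a sine tone."""
    samples = [
        round(amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE_HZ))
        for n in range(n_samples)
    ]
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))


slow = max_step(200)    # slowly changing, speech-like tone
fast = max_step(3500)   # rapidly changing tone near the top of the band

print("largest step at 200 Hz:", slow)
print("largest step at 3500 Hz:", fast)
```

The high tone swings across most of the sample range between neighbouring samples, so "just send the small change" stops being a saving.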

InContact VOIP services support both G.711 and G.729 speech encoding to satisfy whatever your speech requirements are. If you are interested in VOIP or have questions, please contact your CSM and have them put you in touch with a VOIP Sales Engineer.