Sunday, September 11, 2011

Digital Audio

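Sound is analog. The sound of a plucked guitar string is a smooth, continuous, curved wave. It repeats in a regular pattern much like a sine wave, although a real string also produces overtones that make the shape of the wave more complex than an ideal sine.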

The pitch or frequency of sound is measured in hertz (Hz), or cycles per second (cps). Humans can normally hear sound in the range of twenty to twenty thousand cycles per second, 20 Hz to 20 kHz (kilohertz).

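Digitizing means taking a series of measurements of the sound wave and converting those measurements to a series of numbers. Taking the measurements at regular intervals is called sampling; rounding each measurement to the nearest available number is called quantizing.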

Each measurement is called a sample. Sampling accuracy determines the fidelity, or how closely the digitized data resembles the original sound. Sampling accuracy depends on the sampling rate (how often a sample is taken) and the bit depth (how accurately each sample is measured).
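The two steps can be sketched in a few lines of Python. This is only an illustration, not how any particular recorder works: the function names sample_sine and quantize are made up for this example, and the input is assumed to be a pure sine wave scaled between -1.0 and 1.0.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, num_samples):
    """Measure a sine wave of the given frequency at evenly spaced instants."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(num_samples)]

def quantize(value, bit_depth):
    """Round a value in [-1.0, 1.0] to the nearest of 2**bit_depth levels.

    Returns the level number, from 0 (lowest) to 2**bit_depth - 1 (highest).
    """
    levels = 2 ** bit_depth
    step = 2.0 / (levels - 1)          # distance between neighboring levels
    return round((value + 1.0) / step)

# One cycle of a 1 kHz tone sampled at 8 kHz, measured with only 3 bits.
samples = sample_sine(1000, 8000, 8)
print([quantize(s, 3) for s in samples])
```

With only 3 bits, many different input values land on the same level number, which is the loss of fidelity described above.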

The Nyquist limit states that the sampling rate must be more than twice the highest frequency in the original sound. To accurately digitize the highest pitch humans can hear, 20 kHz, the sampling rate must be above 40 kHz. The sampling rate for audio CDs is 44.1 kHz; camcorders usually sample at 48 kHz. The lab's R-09HR recorders can sample up to 96 kHz. The higher the sampling rate, the better the fidelity, but the larger the data file.
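As a rough check of the numbers above, the sketch below computes the highest frequency each sampling rate can capture (half the rate). It also shows what happens when the limit is ignored: a tone above that frequency is recorded as a false, lower tone, an effect known as aliasing. The function names are hypothetical and chosen only for this example.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency a given sampling rate can represent."""
    return sample_rate_hz / 2

def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency that a pure tone appears to have after sampling.

    Tones below half the sampling rate come back unchanged; tones above
    it fold back to a lower frequency.
    """
    nearest_multiple = sample_rate_hz * round(freq_hz / sample_rate_hz)
    return abs(freq_hz - nearest_multiple)

for rate in (44100, 48000, 96000):
    print(rate, "Hz sampling captures up to", nyquist_frequency(rate), "Hz")

print(alias_frequency(20000, 44100))  # 20000: within the limit, unchanged
print(alias_frequency(30000, 44100))  # 14100: above the limit, folded down
```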

Digitizing must produce a set of numbers a computer can read. Computers use binary, a series of ones and zeros, representing the on or off state of a switch. Each digit is called a bit.

Since a switch has two states, binary numbers are based on powers of 2. Each digit represents another power of two. The first ten powers of two equal 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024. (These numbers are also used to measure memory capacity, so some of them will be familiar from the labels on flash drives or other computer media.)
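A short Python sketch can reproduce the list of powers of two and show how a number looks in binary. The helper names to_binary and from_binary are invented for this example; format and int are Python built-ins.

```python
# The first ten powers of two, 2**1 through 2**10.
powers = [2 ** n for n in range(1, 11)]
print(powers)

def to_binary(number):
    """Write a non-negative whole number as a string of ones and zeros."""
    return format(number, "b")

def from_binary(bits):
    """Read a string of ones and zeros back into a whole number."""
    return int(bits, 2)

# 13 = 8 + 4 + 0 + 1, so the switches for 8, 4 and 1 are on.
print(to_binary(13))        # 1101
print(from_binary("1101"))  # 13
```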

Bit depth is the number of bits that can be used to measure a sample. A bit depth of 3 (2 to the 3rd power, or 2x2x2=8) would only have 8 levels between silence and maximum. This would give very poor fidelity. A bit depth of 10 (2 to the tenth power) would have 1024 levels, and give much better fidelity. Camcorders generally sample sound at 16-bit. The lab's R-09HR recorders can sample at either 16-bit or 24-bit. 16-bit has 65,536 levels (64k); 24-bit has 16,777,216 levels (16M). Like sampling rate, the larger the bit depth, the better the fidelity, but the larger the data file.
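The trade-off between fidelity and file size can be worked out directly. The sketch below assumes uncompressed stereo audio and a bit depth that is a whole number of bytes (16 or 24); the function names are made up for this example.

```python
def quantization_levels(bit_depth):
    """Number of distinct values a sample can take at this bit depth."""
    return 2 ** bit_depth

def bytes_per_second(sample_rate_hz, bit_depth, channels=2):
    """Data rate of uncompressed audio (bit depth must be a multiple of 8)."""
    return sample_rate_hz * (bit_depth // 8) * channels

for depth in (3, 10, 16, 24):
    print(depth, "bits:", quantization_levels(depth), "levels")

# One minute of stereo sound at two of the settings mentioned above.
print(bytes_per_second(48000, 16) * 60)  # 11520000 bytes, about 11.5 MB
print(bytes_per_second(96000, 24) * 60)  # 34560000 bytes, about 34.6 MB
```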

Resampling is the process of converting a digitized recording from one sampling rate to another; recordings can be converted from one bit depth to another in a similar way. For example, you might have to resample a CD audio recording from 44.1 kHz to 48 kHz to make it compatible with a video project you are editing. This usually causes some loss of fidelity, so it is important to make the original recording with settings that will not require resampling. Generally, you should record sound at the same settings as the video camera you will be using. 16-bit, 48 kHz is the most common setting at present.
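To see why resampling costs fidelity, note that 48 kHz and 44.1 kHz do not divide evenly: the ratio reduces to 160/147, so almost every new sample falls between two original samples and has to be estimated. The sketch below uses straight-line (linear) interpolation as the simplest possible estimate. It is a deliberately naive illustration; real audio software uses more careful filtering, and the function name resample_linear is invented here.

```python
from fractions import Fraction

def resample_linear(samples, src_rate, dst_rate):
    """Estimate new samples by drawing straight lines between old ones."""
    ratio = Fraction(src_rate, dst_rate)   # old samples per new sample
    out_len = int((len(samples) - 1) / ratio) + 1
    out = []
    for i in range(out_len):
        pos = i * ratio                    # position on the old sample grid
        left = int(pos)                    # old sample just before pos
        right = min(left + 1, len(samples) - 1)
        frac = float(pos - left)           # how far pos is past 'left'
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out

print(Fraction(48000, 44100))                     # 160/147
print(resample_linear([0.0, 1.0, 2.0], 2, 4))     # [0.0, 0.5, 1.0, 1.5, 2.0]
```

In the small example the in-between values happen to be exact because the input is a straight ramp. A curved sound wave would be approximated, not reproduced.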

Digitized sound data can be stored in a variety of file formats. WAV is a common audio file format that can be used on both Windows and MacOS platforms. AIFF is a native MacOS format that can be imported by Audacity on Windows as well. WAV and AIFF files are normally uncompressed, that is, all the original data is intact. WMA is a Windows format that must be converted before it can be used on MacOS; it is usually compressed, although a lossless variant exists.
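As an illustration of what an uncompressed WAV file holds, the sketch below writes a one-second test tone using the wave module from Python's standard library. The settings (mono, 16-bit, 48 kHz, a 440 Hz tone at half volume) and the function name write_tone_wav are assumptions chosen for this example, not settings taken from the recorders.

```python
import math
import os
import struct
import tempfile
import wave

def write_tone_wav(path, freq_hz=440.0, seconds=1.0, sample_rate=48000):
    """Write a mono, 16-bit, uncompressed sine tone and return the frame count."""
    num_frames = round(seconds * sample_rate)
    peak = 32767                      # largest value of a signed 16-bit sample
    frames = bytearray()
    for n in range(num_frames):
        value = math.sin(2 * math.pi * freq_hz * n / sample_rate)
        # "<h" packs one little-endian signed 16-bit number, as WAV expects.
        frames += struct.pack("<h", round(value * peak * 0.5))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)      # mono
        wav_file.setsampwidth(2)      # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
    return num_frames

if __name__ == "__main__":
    out_path = os.path.join(tempfile.mkdtemp(), "tone.wav")
    print(write_tone_wav(out_path), "frames written to", out_path)
```

Every sample is stored as written, so the file can be opened, edited and saved again without losing data.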

Another common audio file format is MP3, which is short for MPEG-1 Audio Layer III (MPEG stands for Moving Picture Experts Group). It was developed as the audio part of the MPEG-1 standard for digital video and audio, which predates DVDs. It is a lossy compression format, which means it throws away part of the sound data so the file size will be smaller. Once a digitized sound is compressed in the MP3 format, the discarded data is lost and cannot be reconstructed. If an MP3 file is edited and saved again, it loses even more data to compression, like making a photocopy of a photocopy.

You should always make original recordings in an uncompressed, lossless file format so you can edit them without losing fidelity. Lossy compression formats like MP3 should only be used for distributing your finished work. The lab's R-09HR recorders can record in both WAV and MP3; make sure you set the recorder to WAV for your recordings.