Sample rate conversion

From Wikipedia, the free encyclopedia

Scaling (signal) redirects here. It may also refer to upsampling or downsampling.

Sample rate conversion is the process of converting a (usually digital) signal from one sampling rate to another, while changing the information carried by the signal as little as possible. When applied to an image, this process is often called scaling.

Sample rate conversion is needed because different systems use different sampling rates, for engineering, economic, or historical reasons. The physics of sampling merely sets a minimum sampling rate (an analog signal can be sampled at any rate above twice the highest frequency contained in the signal; see Nyquist frequency), so other factors determine the actual rates used. For example, different audio systems use rates of 44.1, 48, and 96 kHz. As another example, American television, European television, and movies all use different numbers of frames per second. Users would like to transfer source material between these systems. Simply replaying the existing data at the new rate will not normally work: it introduces large changes in pitch (for audio) and in the speed of movement (for video), and it cannot be done in real time. Hence sample rate conversion is required.

Two basic approaches are:

Convert to analog, then re-sample at the new rate.
Digital signal processing — compute the values of the new samples from the old samples.

Modern systems almost all use the latter, since it introduces less noise and distortion and is practical given modern processing power.

Probably the most famous example of analog rate conversion was converting the slow-scan TV signals from the Apollo moon missions to the conventional TV rates for the viewers at home.


Digital sample rate conversion

There are at least two ways to perform digital sample rate conversion:

(a) If the two frequencies are in a fixed rational ratio, the conversion can be done as follows: Let F be the least common multiple of the two frequencies. Generate a signal sampled at F by inserting zeros between the original samples. This also introduces images of the baseband spectrum around multiples of the original sampling frequency. Remove these with a digital low-pass filter, so that only content below half of the lower of the input and output sample frequencies remains. Then reduce the sample rate to the output rate by discarding the appropriate samples.

(b) Another approach is to treat the samples as a time series, and create any needed new points by interpolation. In theory any interpolation method can be used, though linear (for simplicity) and a truncated sinc function (from theory) are most common.
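Method (a) can be sketched in a few lines of Python. This is a minimal illustration rather than a production design: the function names, the Hamming-windowed sinc filter, and the `taps_per_phase` parameter are assumptions made for the example, and the filter is applied by direct convolution with no attempt at efficiency.

```python
import math
from fractions import Fraction

def lowpass_taps(cutoff, num_taps):
    """Hamming-windowed sinc low-pass; cutoff is in cycles per sample (0 to 0.5)."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        if t == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def resample_rational(x, rate_in, rate_out, taps_per_phase=16):
    """Zero-stuff to the common rate F, low-pass filter, then discard samples."""
    ratio = Fraction(rate_out, rate_in)
    up, down = ratio.numerator, ratio.denominator

    # Step 1: insert up-1 zeros after every input sample (rate is now F).
    stuffed = [0.0] * (len(x) * up)
    for i, v in enumerate(x):
        stuffed[i * up] = v

    # Step 2: low-pass at half the lower of the two rates, expressed relative to F.
    # The gain of `up` restores the amplitude lost by zero insertion.
    cutoff = 0.5 / max(up, down)
    num_taps = 2 * taps_per_phase * max(up, down) + 1
    h = [up * c for c in lowpass_taps(cutoff, num_taps)]
    delay = (num_taps - 1) // 2  # compensate the filter's group delay

    # Step 3: keep only every `down`-th filtered sample.
    out = []
    for m in range(0, len(stuffed), down):
        acc = 0.0
        for k, c in enumerate(h):
            j = m + delay - k
            if 0 <= j < len(stuffed):
                acc += c * stuffed[j]
        out.append(acc)
    return out
```

For example, `resample_rational(samples, 44100, 48000)` would use up = 160 and down = 147; the filter length grows with the larger of the two factors, which is one reason practical converters avoid this literal form.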

Although the two approaches seem very different, they are mathematically equivalent. Picking an interpolation function in the second scheme is equivalent to picking the impulse response of the digital filter in the first scheme. Linear interpolation corresponds to a triangular impulse response; a truncated sinc() approximates the desirable "brick wall" filter, and approaches it as the number of points increases.
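The equivalence can be checked numerically for the linear case. The sketch below (function names are illustrative) upsamples by an integer factor L in two ways: by linear interpolation between neighbouring samples, and by zero insertion followed by a triangular FIR filter of length 2L−1. Both produce the same output.

```python
def upsample_linear(x, L):
    """Method (b): linear interpolation, L output points per input interval."""
    out = []
    for i in range(len(x) - 1):
        for p in range(L):
            frac = p / L
            out.append((1 - frac) * x[i] + frac * x[i + 1])
    out.append(x[-1])
    return out

def upsample_triangle_filter(x, L):
    """Method (a): insert zeros, then convolve with a triangular impulse response."""
    stuffed = [0.0] * ((len(x) - 1) * L + 1)
    for i, v in enumerate(x):
        stuffed[i * L] = v
    # Triangle centred on offset 0, covering offsets -(L-1) .. (L-1).
    tri = [1 - abs(k) / L for k in range(-L + 1, L)]
    out = []
    for m in range(len(stuffed)):
        acc = 0.0
        for idx, c in enumerate(tri):
            j = m - (idx - (L - 1))  # idx - (L-1) is the kernel offset k
            if 0 <= j < len(stuffed):
                acc += c * stuffed[j]
        out.append(acc)
    return out
```

At a position p samples past input sample i, only two non-zero samples fall under the triangle, weighted 1 − p/L and p/L, which are exactly the linear interpolation weights.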

If the sample rate ratio is known, fixed, and rational, method (a) is better in theory. The length of the impulse response of the filter in (a) corresponds to the number of points used in interpolation in (b). In approach (a), a slow precomputation such as the Remez algorithm can be used to compute the "best" response possible for a given number of points (best in terms of, for example, peak error in various frequency bands). Note that a truncated sinc() function, though correct in the limit of an infinite number of points, is not the most accurate filter for a finite number of points.

However, method (b) works in more general cases: where the sample rate ratio is not rational, where two real-time streams must be accommodated, or where the sample rates vary with time.
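A minimal sketch of method (b) with linear interpolation shows why it copes with arbitrary ratios: each output sample is placed at a fractional position in the input, and the ratio only enters as a floating-point step. The function name is illustrative, and linear interpolation is chosen for brevity, not quality.

```python
def resample_linear(x, rate_in, rate_out):
    """Resample x by linear interpolation; the rate ratio need not be rational."""
    step = rate_in / rate_out  # input samples advanced per output sample
    out = []
    n = 0
    while True:
        pos = n * step          # fractional position of output n in the input
        i = int(pos)
        if i + 1 >= len(x):     # stop once there is no right-hand neighbour
            break
        frac = pos - i
        out.append((1 - frac) * x[i] + frac * x[i + 1])
        n += 1
    return out
```

For time-varying rates, `step` could be recomputed per output sample and `pos` accumulated instead of multiplied; that variation is not shown here.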

Because of the mathematical operations employed, the output samples of sample rate conversion are almost always computed to more precision than the output format can hold. Conversion to the output bit size can be done by simple rounding, or by more sophisticated methods such as dither or noise shaping.
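As an illustration of the final word-length reduction, the sketch below converts floating-point samples in the range −1 to 1 to 16-bit integers, either by plain rounding or with triangular (TPDF) dither of ±1 least significant bit added before rounding. The function name, the scaling by 32767, and the choice of TPDF dither are assumptions for the example; noise shaping is not shown.

```python
import math
import random

def quantize_16bit(samples, dither=True, rng=None):
    """Convert floats in [-1, 1] to 16-bit integers, optionally with TPDF dither."""
    rng = rng or random.Random(0)  # fixed seed only to make the example repeatable
    out = []
    for s in samples:
        v = s * 32767.0
        if dither:
            # Difference of two uniform values has a triangular distribution on (-1, 1).
            v += rng.random() - rng.random()
        q = int(math.floor(v + 0.5))           # round to nearest integer
        out.append(max(-32768, min(32767, q)))  # clip to the 16-bit range
    return out
```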

Example

CDs are sampled at 44.1 kHz, but Digital Audio Tape (DAT) is usually sampled at 48 kHz. How can material be converted from one sample rate to the other? First, note that 44.1 and 48 are in the ratio 147/160. Therefore, to convert from 44.1 to 48 kHz, for example, the process is (conceptually):
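The ratio and the common intermediate rate follow from the greatest common divisor of the two rates. A small sketch (the helper name is illustrative):

```python
import math

def rate_ratio(rate_in, rate_out):
    """Return (up, down, common_rate) for converting rate_in to rate_out."""
    g = math.gcd(rate_in, rate_out)
    up = rate_out // g      # zero-insertion factor
    down = rate_in // g     # decimation factor
    common = rate_in * up   # least common multiple of the two rates
    return up, down, common

up, down, common = rate_ratio(44100, 48000)
print(up, down, common)  # 160 147 7056000
```

Here the greatest common divisor is 300 Hz, so 44.1 kHz × 160 = 48 kHz × 147 = 7.056 MHz.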

Insert 159 zeros between each pair of adjacent input samples. This raises the data rate to 7.056 MHz, the least common multiple of 44.1 and 48 kHz. Since this zero insertion is equivalent to reconstructing with Dirac delta functions, it also creates images of each frequency f at 44.1−f, 44.1+f, 88.2−f, 88.2+f, ... kHz.
Remove the images with a digital filter, leaving a signal containing only 0–20 kHz information, but still sampled at a rate of 7.056 MHz.
Discard 146 of every 147 output samples. It does not hurt to do so since the signal now has no significant content above 24 kHz. In practice, of course, there is no reason to compute the values of the samples that will be discarded.
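The observation that discarded samples need not be computed, together with the fact that most samples entering the filter are inserted zeros, leads to what is usually called a polyphase structure. The sketch below is one way to write it, assuming a causal FIR filter `h` designed for the 7.056 MHz rate (any list of coefficients works for the demonstration) and illustrative names throughout. For each output sample it visits only the filter taps that line up with real input samples.

```python
def polyphase_resample(x, h, up, down):
    """Compute only the kept outputs, using only the non-zero inputs.

    Equivalent to: insert up-1 zeros after each sample of x, convolve with the
    causal FIR filter h, then keep every `down`-th sample.
    """
    n_out = (len(x) * up + down - 1) // down
    out = []
    for n in range(n_out):
        m = n * down                  # index of this output in the high-rate stream
        phase, base = m % up, m // up
        acc = 0.0
        # Taps phase, phase+up, phase+2*up, ... align with inputs base, base-1, ...
        for j, k in enumerate(range(phase, len(h), up)):
            i = base - j
            if i < 0:
                break
            acc += h[k] * x[i]
        out.append(acc)
    return out
```

With up = 160 and down = 147, each output sample then costs about len(h)/160 multiplications instead of len(h), and no work is spent on the 146 of every 147 high-rate samples that would be thrown away.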

This requires a digital filter (almost always an FIR filter, since these can be designed to have no phase distortion) that is flat to 20 kHz and down at least x dB at 24 kHz. How big does x need to be? A first impression might be about 100 dB, since the maximum signal size is roughly ±32767 and the input quantization is ±1/2, so the input has a signal-to-broadband-noise ratio of at most 98 dB. However, the noise in the stopband (20 kHz to 3.5 MHz) is all folded into the passband by the decimation in the third step, so another 22 dB (a ratio of 160:1 expressed in dB) of stopband rejection is required to account for the noise folding. Thus 120 dB of rejection yields broadband noise roughly equal to the original quantizing noise.
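The decibel figures above can be reproduced with a short calculation. The 98 dB value is modelled here as the ratio of a full-scale 16-bit sine wave to uniformly distributed quantization error (RMS value 1/√12 of a step); that model is an assumption of this sketch, since the text itself only gives the result. The 160:1 folding ratio is treated as a power ratio.

```python
import math

def db_from_amplitude_ratio(r):
    return 20 * math.log10(r)

def db_from_power_ratio(r):
    return 10 * math.log10(r)

bits = 16
sine_rms = 2 ** (bits - 1) / math.sqrt(2)  # full-scale sine, in quantization steps
noise_rms = 1 / math.sqrt(12)              # uniform error over one step

snr_db = db_from_amplitude_ratio(sine_rms / noise_rms)  # about 98.1 dB
folding_db = db_from_power_ratio(160)                   # about 22.0 dB
required_db = snr_db + folding_db                       # about 120.1 dB

print(round(snr_db, 1), round(folding_db, 1), round(required_db, 1))
```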

There is no requirement that the resampling in the ratio 160:147 be done in one step. Using the same example, we could resample the original at a ratio of 10:7, then 8:7, then 2:3 (or do these in any order that does not reduce the sample rate below the initial or final rates, or use any other factorization of the ratio). There may be various technical reasons for using a single-step or multi-step process; typically the single-step process involves less total computation but requires more coefficient storage.
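The factorization and the ordering constraint can be checked with exact rational arithmetic. In this sketch (helper names are illustrative) each stage is an (up, down) pair, and an ordering is accepted only if it ends at the target rate and no intermediate rate falls below the lower of the initial and final rates.

```python
from fractions import Fraction
from itertools import permutations

def stage_rates(rate_in, stages):
    """Return the sample rate before the first stage and after each stage."""
    rates = [Fraction(rate_in)]
    for up, down in stages:
        rates.append(rates[-1] * Fraction(up, down))
    return rates

def is_valid_order(rate_in, rate_out, stages):
    rates = stage_rates(rate_in, stages)
    return rates[-1] == rate_out and min(rates) >= min(rate_in, rate_out)

stages = [(10, 7), (8, 7), (2, 3)]
valid_orders = [p for p in permutations(stages) if is_valid_order(44100, 48000, p)]
```

For the order given in the text the rates are 44.1 kHz, 63 kHz, 72 kHz, and 48 kHz. Of the six possible orderings of these three stages, only the two that apply 2:3 last satisfy the constraint; applying 2:3 first, for instance, would drop the rate to 29.4 kHz.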