DSD VS PCM is the latest in a series of short educational primers from Yuri Korzunov. Do both formats have more commonalities than differences?
This article on DSD vs PCM commonalities was reproduced from the educational archives of Yuri’s Sample Rate Converter website with his kind permission.
All copyright belongs to said website and any reproduction of this information below can only be done with the express permission of Yuri Korzunov.
DSD and PCM are generally considered to be different audio formats. However, I think that PCM and DSD are really two sides of a single phenomenon.
DSD (Direct Stream Digital) is a 1-bit audio format based on sigma-delta modulation. It uses noise shaping to increase dynamic range.
PCM (pulse code modulation) is a multi-bit audio format that uses the number of bits to expand the dynamic range.
Dynamic range is the difference between maximal and minimal allowable levels of the transmitted signal.
Exceeding the maximal signal level causes overload.
The minimal signal level is less than or equal to the quantization noise level (a.k.a. the noise floor). A signal buried under a high level of noise is perceived by the listener as poor in quality, particularly in terms of clarity.
Simplified dynamic range
The simplified dynamic range is the difference between the maximal level and the level of the noise floor.
Warning: this is not the technically correct definition. However, in the context of this article, we will use it for easier understanding.
We can discuss dynamic range as a common base for both PCM and DSD. It is a single measure that applies to any bit depth.
The maximum level in digital audio is accepted as 0 dB. It does not depend on bit depth. However, the noise floor or the minimum level does depend on the bit resolution.
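The dependence of the noise floor on bit depth can be illustrated with the common textbook estimate for ideal uniform quantization of a full-scale sine wave (roughly 6 dB per bit plus about 1.76 dB). This is a minimal sketch under that idealized assumption, not a measurement of any real converter; the function name is made up for illustration.

```python
import math

def ideal_pcm_dynamic_range_db(bits: int) -> float:
    """Textbook estimate for ideal uniform quantization and a full-scale sine."""
    steps_term = 20 * math.log10(2 ** bits)  # about 6.02 dB per bit
    sine_term = 10 * math.log10(1.5)         # about 1.76 dB (sine vs. uniform noise)
    return steps_term + sine_term

# The maximum stays at 0 dB; only the noise floor moves with bit depth.
for bits in (8, 16, 24):
    print(f"{bits:2d} bits: about {ideal_pcm_dynamic_range_db(bits):.1f} dB")
```

Each added bit lowers the estimated noise floor by about 6 dB while the 0 dB maximum stays where it is.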
From PCM to DSD
As an example, take PCM at some bit depth, which has a certain noise level. When the bit resolution is reduced, the noise level rises and the dynamic range is reduced accordingly.
Bit depth truncating (from PCM to DSD)
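The effect of reducing the bit depth can be checked numerically. The sketch below rounds a test tone to a uniform grid and measures the resulting error level. The tone frequency, amplitude and helper names are arbitrary choices for illustration, and no dither is applied.

```python
import math

def quantize(x: float, bits: int) -> float:
    """Round x to the nearest level of a uniform grid with 2**bits steps over [-1, 1)."""
    step = 2.0 / (2 ** bits)
    return math.floor(x / step + 0.5) * step

def noise_level_db(bits: int, n: int = 4096) -> float:
    """Mean power of the quantization error, in dB relative to full scale."""
    total = 0.0
    for i in range(n):
        x = 0.9 * math.sin(2 * math.pi * 0.01234 * i)  # arbitrary test tone
        err = quantize(x, bits) - x
        total += err * err
    return 10 * math.log10(total / n)

for bits in (16, 12, 8):
    print(f"{bits:2d} bits: noise level about {noise_level_db(bits):.1f} dB")
```

Going from 16 to 8 bits should raise the measured noise level by roughly 48 dB, which is the loss of dynamic range described above.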
However, the noise level can be kept low within a limited useful frequency band. The noise energy that grows due to bit depth truncation can be pushed out of that band via noise shaping.
Noise shaping, PCM to DSD transformation
The result is very similar to DSD. And, yes, it is a real multibit DSD!
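To make the idea concrete, here is a minimal sketch of bit depth reduction with and without noise shaping. It assumes the simplest possible shaper (first-order error feedback) and uses block averaging as a crude stand-in for a low-pass filter over the useful band. Real converters use higher-order shapers and proper filters; all names and values here are illustrative.

```python
import math

def requantize(samples, bits, noise_shaping):
    """Reduce resolution to `bits`; optionally feed the rounding error back (first order)."""
    step = 2.0 / (2 ** bits)
    out, prev_err = [], 0.0
    for x in samples:
        v = x - prev_err if noise_shaping else x
        y = math.floor(v / step + 0.5) * step
        prev_err = y - v  # rounding error made on this sample
        out.append(y)
    return out

def inband_noise_power(original, processed, block=64):
    """Error power left after averaging over `block` samples (a crude low-pass filter)."""
    powers = []
    for start in range(0, len(original) - block + 1, block):
        err_sum = sum(processed[i] - original[i] for i in range(start, start + block))
        powers.append((err_sum / block) ** 2)
    return sum(powers) / len(powers)

# Slow test tone standing in for content in the useful band (arbitrary values).
signal = [0.55 * math.sin(2 * math.pi * 0.0005 * i) for i in range(8192)]
plain = inband_noise_power(signal, requantize(signal, 4, noise_shaping=False))
shaped = inband_noise_power(signal, requantize(signal, 4, noise_shaping=True))
print(f"in-band noise, plain 4-bit:  {10 * math.log10(plain):.1f} dB")
print(f"in-band noise, shaped 4-bit: {10 * math.log10(shaped):.1f} dB")
```

With error feedback, the error on each sample is the difference of two consecutive rounding errors, so it largely cancels when averaged over a block. The noise has not disappeared; it has moved to high frequencies outside the averaged band.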
From DSD to PCM
Now let me go the other way, from DSD to PCM. Here we also have a given noise level. Noise shaping can be characterized by its steepness, which is defined by both the total signal band and the useful audio band.
Noise shaping steepness
Steeper noise shaping may be used to expand the useful band. However, high steepness can cause instability in sigma-delta modulators (noise shapers), so less steepness may be a good thing.
Adding a bit to the sample decreases the noise level, so the noise shaper's steepness can be decreased too. The lower the steepness, the less noise energy creeps into the useful bandwidth.
Adding bits lowers the steepness (DSD to PCM transform)
Eventually, the noise floor becomes flat, and the DSD signal is thus transformed into PCM.
Sample Rate Issue
DSD has a significantly higher sample rate than PCM. This is because a band reserve is required to push excess noise energy out of the useful frequency band.
However, even simple band expansion lowers the noise level, because the quantization noise energy is constant and distributed over the full band. Graphically, the noise energy is the area under the noise power spectrum. Think of it as a rectangle of fixed area: if its width (the band) is expanded, its height (the noise level) diminishes in proportion.
Sample rate and quantization noise level
Doubling the band lowers the noise level by about 3 dB, because the same noise energy is spread over twice the bandwidth.
So we can say that DSD decreases the noise level in the useful band in two ways:
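The rectangle picture can be turned into a one-line calculation. The sketch below assumes white quantization noise of fixed total power and expresses the level as power per unit bandwidth; the function name is made up for illustration.

```python
import math

def noise_density_change_db(band_ratio: float) -> float:
    """Change in noise level per unit bandwidth when the band is widened `band_ratio` times."""
    # Fixed area (total noise power): the height scales as 1 / width.
    return -10 * math.log10(band_ratio)

for ratio in (2, 4, 64):
    print(f"band x{ratio}: noise level changes by {noise_density_change_db(ratio):.2f} dB")
```

Under this model, a 64 times wider band (as in DSD64 relative to 44.1 kHz PCM) lowers the flat noise level by about 18 dB before any noise shaping is applied.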
by sample rate and
by noise shaping.
DSD and PCM are two sides of the same thing – transmitting a digital signal with a given dynamic range.
To transform DSD to PCM we need to increase the number of bits and remove noise shaping.
To transform PCM to DSD we need to reduce the number of bits while applying noise shaping and raising the sample rate.
In my opinion, the most obvious sign of DSD is that only part of the total signal band is used to transmit the signal, while the rest (the reserve) receives the noise energy pushed out of the useful frequency band.
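Both directions can be sketched as a toy round trip. This is only an illustration under strong simplifications: a first-order 1-bit modulator instead of the high-order modulators used for real DSD, block averaging instead of a proper low-pass filter, and an input that is already generated at the high sample rate (so the sample rate extension step is skipped). All names are hypothetical.

```python
import math

def pcm_to_one_bit(samples):
    """First-order error-feedback modulator with a 1-bit (+1/-1) quantizer."""
    bits, prev_err = [], 0.0
    for x in samples:
        v = x - prev_err
        y = 1.0 if v >= 0 else -1.0
        prev_err = y - v  # quantization error, fed back on the next sample
        bits.append(y)
    return bits

def one_bit_to_pcm(stream, block=64):
    """Average each block: a crude low-pass filter plus decimation by `block`."""
    return [sum(stream[i:i + block]) / block
            for i in range(0, len(stream) - block + 1, block)]

# Slow tone generated directly at the high rate (arbitrary values).
oversampled = [0.5 * math.sin(2 * math.pi * 0.0005 * i) for i in range(8192)]
stream = pcm_to_one_bit(oversampled)
recovered = one_bit_to_pcm(stream)
reference = one_bit_to_pcm(oversampled)  # same averaging applied to the input
worst = max(abs(r - s) for r, s in zip(recovered, reference))
print(f"{len(stream)} one-bit samples -> {len(recovered)} multi-bit samples")
print(f"worst deviation from the averaged input: {worst:.4f}")
```

Averaging removes the shaped high-frequency noise and yields samples with many possible levels at a lower rate, which is the "more bits, no noise shaping" side of the same signal.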