I have mentioned SDMs (Sigma Delta Modulators) many times in the past.  These are, in effect, complex filter structures that are used to produce DSD and other bitstreams.  I know I talk about DSD a lot, and I also know that digital audio is way more about PCM than it ever is, or ever will be, about DSD.  But, as I have already written, SDMs are in fact core to both ADCs and DACs and therefore also, I think, to resolving (or maybe just understanding) the debate concerning the relationship of DSD to PCM.  So I thought I would devote a post to an attempt to explain what SDMs are, how they work, and what their limitations are.  This will be doubly taxing, because I am far from being any sort of expert, and this is a deeply technical subject.  Finally, I will attempt to place the results of my ramblings in the context of the PCM-vs-DSD debate, with perhaps a surprising result.

The words Sigma and Delta refer to two Greek letters, Σ and Δ, which are used by convention in mathematics to denote summation (Σ) and difference (Δ), that is, addition and subtraction.  Negative feedback, where the output signal is subtracted from the input signal, is a form of Delta modulation.  Similarly, in an unstable amplifier, where the output signal is inadvertently added to the input signal causing it to increase uncontrollably, this is a form of Sigma modulation.  Sigma Delta Modulators work by combining those functions into a single complex structure.  I use the term ‘structure’ intentionally, because SDMs can be implemented both in the analog domain (where they would be referred to as circuits) and in the digital domain (where they would be referred to as algorithms).  In this context, analog and digital refer only to the inputs of the SDM.  An SDM’s output is always digital.  For the remainder of this post I will refer only to digital SDMs, mainly because they are easier to describe.  But you should read it all as being equally applicable to the analog case.

At the core of an SDM lies the basic concept of a negative feedback loop.  This is where you take the output of the SDM and subtract it from its input.  We’ll call that the Delta stage.  If the output of the SDM is identical to its input, then the output of this Delta stage will always be zero.  Between the Delta stage and the SDM output is a Sigma stage.  A Sigma stage works by maintaining an accumulated value to which it adds every input value it receives.  This accumulated value then becomes its output and therefore also the output of the SDM itself.  Therefore, so long as the output of the SDM remains identical to its input, the output of the Delta stage will always be zero, and the Sigma stage will consequently keep adding zero to its accumulated value, which will therefore remain unchanged.  This is what we call the “steady-state case”.

But music is not steady-state.  It is always changing.  Let’s look at what happens when the input to the SDM increases slightly.  This results in a small difference between the input and the output of the SDM.  This difference appears at the output of the SDM’s Delta stage, and, consequently, at the input of its Sigma stage.  This causes the output of the Sigma stage to increase slightly.  The output of the Sigma stage is also the output of the SDM, and so the SDM’s output also increases slightly.  Now, the output of the SDM is once more identical to its input.  The same argument can be followed for a small decrease in the input to the SDM.  The SDM as described here is basically a structure whose output follows its input.  Which makes it a singularly useless construct.
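To make the loop concrete, here is a minimal Python sketch of the Delta and Sigma stages with no quantizer.  The function name follow and the sample values are my own illustrative choices, and I am assuming that the value fed back to the Delta stage is the output from the previous time step and that the loop starts from zero.

```python
def follow(inputs):
    """Delta stage feeding a Sigma stage, with no quantizer (hypothetical helper)."""
    acc = 0.0        # accumulated value held by the Sigma stage
    prev_out = 0.0   # SDM output at the previous time step (assumed to start at 0)
    outputs = []
    for x in inputs:
        diff = x - prev_out   # Delta stage: input minus the previous output
        acc += diff           # Sigma stage: add the difference to the running total
        prev_out = acc        # without a quantizer, the Sigma output is the SDM output
        outputs.append(acc)
    return outputs

signal = [0.0, 0.1, 0.1, 0.25, 0.2, -0.3]   # arbitrary test values
tracked = follow(signal)
print(tracked)
```

Under those assumptions each output should match the corresponding input to within floating-point rounding, which is exactly the "output follows input" behaviour described above.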

So now we will modify the SDM described above in order to make it useful.  What we will do is to place a Quantizer between the output of the Sigma stage and the output of the SDM, so that the output of the SDM is now the quantized output of the Sigma stage.  This apparently minor change will have dramatic implications.  For a start, this is what gives it its digital-only output.  To illustrate this, we will take it to its logical extreme.  Although we can choose to quantize the output to any bit depth we like, we will elect to quantize it to 1 bit, which means the output can only take on one of two values.  We’ll call these +1 and -1 although we will represent them digitally using the binary digits 1 and 0.  One result of this is that now the input and output values of the SDM will always be different, and the output of the Delta stage will never be zero.  The SDM is still trying to do the same job, which is to try to make the output signal as close as possible to the input signal.  However, since the output signal is now constrained to taking on only the values +1 or -1 it would appear that the SDM is going to flounder.
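A short Python sketch may help show that the loop does not in fact flounder.  It is only an illustration: the name sdm_1bit, the threshold at zero, the zero starting state and the constant input of 0.5 are all my assumptions, not a description of any real DSD encoder.

```python
def sdm_1bit(inputs):
    """First-order loop with a 1-bit quantizer; returns binary digits 1 and 0."""
    acc = 0.0        # Sigma stage running total
    prev_out = 0.0   # assumption: nothing is fed back on the very first step
    bits = []
    for x in inputs:
        acc += x - prev_out                      # Delta stage, then Sigma stage
        prev_out = 1.0 if acc >= 0.0 else -1.0   # Quantizer: output is +1 or -1
        bits.append(1 if prev_out > 0 else 0)    # stored as binary 1 or 0
    return bits

bits = sdm_1bit([0.5] * 4000)                    # a constant input of 0.5
average = sum(2 * b - 1 for b in bits) / len(bits)   # map 1/0 back to +1/-1
print(bits[:16], average)
```

No single output value equals the input, but the average of the +1 and -1 values should settle very close to 0.5.  The information about the input is carried by the density of ones in the stream rather than by any individual sample.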

At this point, mathematics takes over, and it is no longer practical to reduce what I am going to describe to simple illustrative concepts.  I hope you will bear with me.

In order to understand what the SDM is actually doing, we need to make some sort of model.  In other words we’ll need a set of equations which describe the SDM’s behaviour.  By solving those equations we can then gain an understanding of what the SDM is and is not capable of doing.  There is a problem, though.  The quantizer introduces a non-linear element.  If we know what the input value to the quantizer is, we can determine precisely what the output value will be.  However, the opposite is not true.  If we know the output of the quantizer, we cannot deduce what the input value was that resulted in that output value.  The way we treat problems such as this is to consider the quantizer instead as a noise source.  We consider that we are instead adding noise (i.e. random values) to the output of the Sigma stage, such that the output values of the SDM end up being either +1 or -1.

The next thing we do is to observe that one thing we have said about how the SDM works is not entirely correct.  We said that at the input to the Delta stage we take the SDM’s input and subtract from it the SDM’s output.  In fact what we subtract is the SDM’s output at the previous time step.  This is very important, because it means that we can use this one-step delay to express the SDM’s behaviour in terms of a digital transfer function, using theories developed to understand how filters work.  I have mentioned such matters before in my previous posts on “Pole Dancing”.  Transfer functions allow you to calculate the structure’s frequency response, and when we apply this approach to the SDM we come up with two equations which we call the Signal Transfer Function (STF) and the Noise Transfer Function (NTF).  These are two very useful properties.
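For the simple structure described so far, the two transfer functions can be worked out in a few lines.  This is a sketch under the noise-source assumption above, and the symbols are my own: x is the SDM input, s the Sigma stage output, e the noise standing in for the quantizer, y the SDM output, and z^{-1} the one-step delay.

```latex
\begin{aligned}
s[n] &= s[n-1] + \bigl(x[n] - y[n-1]\bigr) && \text{(Delta stage feeding the Sigma stage)}\\
y[n] &= s[n] + e[n] && \text{(quantizer modelled as added noise)}\\
(1 - z^{-1})\,S(z) &= X(z) - z^{-1}\,Y(z), \qquad S(z) = Y(z) - E(z)\\
Y(z) &= \underbrace{1}_{\text{STF}}\cdot X(z) \;+\; \underbrace{\bigl(1 - z^{-1}\bigr)}_{\text{NTF}}\cdot E(z)
\end{aligned}
```

In this simplest case the signal passes through unchanged, while the noise is multiplied by 1 - z^{-1}, a first difference whose gain is zero at DC and grows with frequency.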

The STF tells us how much of the signal applied to the input of the SDM makes it through and appears in the output, whereas the NTF tells us how much of the quantization noise generated by the quantizer makes it to the SDM’s output.  Both of these properties are strongly inter-related, and are strongly frequency dependent.  Generally, we would like to see STF~1 at low frequencies.  By contrast, we would like to have NTF~0 at the low frequencies but transition to NTF~1 at the high frequencies.  What exactly does all that gobbledygook mean?

The important thing is that at low frequencies we want the combination of STF~1 and NTF~0.  This means that at these low frequencies the output of the SDM contains all of the signal and none of the quantization noise.  However, at high frequencies we would like the opposite to be true, so that the output of the SDM contains none of the signal and all of the quantization noise.  If we can arrange it such that those so-called “low frequencies” actually comprise the audio frequency band, then our SDM can be capable of encoding that music signal with surprising precision even though the format has a bit depth of only 1 bit.  Analysis of the STF and NTF enables us to figure out how high the sample rate must be in order for the full 20kHz+ of the audio frequency bandwidth to fit into the “low frequency” part of the STF/NTF spectrum where sufficiently good performance can be obtained.  The answer, not surprisingly, is what drives DSD to use a sample rate of 2.8MHz (2.8224MHz, or 64 times the 44.1kHz rate of CD).
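To get a feel for how the sample rate enters into it, the sketch below evaluates the noise gain at 20kHz for several sample rates.  It assumes the simplest possible NTF, a first difference 1 - z^-1, whose magnitude at frequency f is 2|sin(pi f / fs)|.  That is my illustrative choice and not the NTF of any real DSD modulator, and the function name ntf_gain_db is likewise hypothetical.

```python
import math

def ntf_gain_db(freq_hz, sample_rate_hz):
    """Gain in dB of the assumed first-order NTF (1 - z^-1) at freq_hz."""
    magnitude = 2.0 * abs(math.sin(math.pi * freq_hz / sample_rate_hz))
    return 20.0 * math.log10(magnitude)

BASE_RATE = 44_100   # CD sample rate in Hz
for multiple in (1, 8, 64, 256):
    fs = BASE_RATE * multiple
    gain = ntf_gain_db(20_000, fs)
    print(f"{multiple:>3} x 44.1kHz: noise gain at 20kHz = {gain:6.1f} dB")
```

At the CD rate this assumed NTF actually amplifies the noise at 20kHz, and the attenuation only becomes appreciable as the sample rate climbs into the MHz range.  Even at 64 times the CD rate the figure for this simplest case is modest.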

A simpler way for the performance potential of this SDM to be viewed is to consider only the quantization noise.  This is nothing more than the difference between what the ideal (not quantized) output signal would look like and what the actual (quantized) output signal actually looks like.  If those differences could be stripped off, then what we would end up with is the ideal output signal in all its glory.  What the NTF of the SDM has done is to arrange for all of those differences to be concentrated into a certain band of high frequencies which are quite separate from the audio frequency band containing the ideal output.  By the simple expedient of applying a suitable low-pass filter, we can filter them out completely, and thereby faithfully reconstruct the ideal output signal.
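The low-pass reconstruction can be tried out in a few lines of Python.  This is a rough sketch under several assumptions of mine: a first-order 1-bit loop, a slow sine wave as the input, and a plain moving average standing in for the "suitable low-pass filter".  A real playback filter would be far more selective, and all names and parameter values here are illustrative only.

```python
import math

def sdm_1bit(inputs):
    """First-order 1-bit loop; returns +1.0 / -1.0 values."""
    acc = 0.0        # Sigma stage running total
    prev_out = 0.0   # assumption: nothing is fed back on the very first step
    out = []
    for x in inputs:
        acc += x - prev_out                      # Delta stage, then Sigma stage
        prev_out = 1.0 if acc >= 0.0 else -1.0   # 1-bit quantizer
        out.append(prev_out)
    return out

def moving_average(values, width):
    """Crude low-pass filter: mean of the latest `width` samples."""
    out, total = [], 0.0
    for i, v in enumerate(values):
        total += v
        if i >= width:
            total -= values[i - width]   # drop the sample that left the window
        out.append(total / min(i + 1, width))
    return out

N, PERIOD, WIDTH = 8192, 2048, 64   # illustrative values, not DSD parameters
sine = [0.5 * math.sin(2 * math.pi * n / PERIOD) for n in range(N)]
recovered = moving_average(sdm_1bit(sine), WIDTH)

# The averaging filter delays the signal by roughly half its width, so each
# recovered sample is compared with the input from WIDTH // 2 steps earlier.
DELAY = WIDTH // 2
worst_error = max(abs(recovered[n] - sine[n - DELAY]) for n in range(WIDTH, N))
print(f"worst-case reconstruction error: {worst_error:.3f}")
```

Even with such a crude filter, the recovered waveform should stay close to the original sine, with an error that is small compared with its 0.5 amplitude.  The residual is the part of the quantization noise that a moving average fails to remove.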

Unfortunately, the simplistic SDM I have just described is not quite up to the task I set for it.  The NTF is not good enough to meet our requirements.  In reality, there is a final step in the design of the SDM where we need to be able to fine tune the STF and NTF to acquire the characteristics needed to make a high-performance SDM.  What we do is to replace the Sigma stage with a filter, which is generally termed the Loop Filter.  The transfer function of the loop filter then determines the actual STF and NTF of the final SDM.  Designing the SDM then becomes the task of designing the loop filter.  This is a big challenge.
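As a rough indication of what a better loop filter can buy, the sketch below compares noise gain at 20kHz for a family of idealised NTFs of the form (1 - z^-1)^order at the DSD rate.  This textbook family is my own assumption for illustration.  Practical loop filters are designed quite differently, and the sketch says nothing about whether a given order would be stable with a 1-bit quantizer.  The function name shaped_noise_gain_db is hypothetical.

```python
import math

def shaped_noise_gain_db(freq_hz, sample_rate_hz, order):
    """Gain in dB of the assumed NTF (1 - z^-1)^order at freq_hz."""
    first_order = 2.0 * abs(math.sin(math.pi * freq_hz / sample_rate_hz))
    return 20.0 * order * math.log10(first_order)

DSD_RATE = 64 * 44_100   # 2.8224MHz
for order in (1, 2, 3, 5):
    gain = shaped_noise_gain_db(20_000, DSD_RATE, order)
    print(f"order {order}: noise gain at 20kHz = {gain:7.1f} dB")
```

Within this idealised family each extra order multiplies the dB figure, so the in-band noise falls quickly as the order rises.  The price is a correspondingly larger noise gain at high frequencies, which is part of what makes the design problem hard.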

In Part II I will discuss some of the limitations and challenges of SDM design, and conclude by attempting to place my observations in the context of the ongoing PCM-vs-DSD debate.