Monthly Archives: April 2013

From some of my recent posts you will have observed that BitPerfect has been heavily involved in DSD over the past several weeks.  DSD is a form of Sigma-Delta Modulation (SDM), which, as I have pointed out, is a mathematically challenging concept.  Just to grasp its most basic form is quite an achievement in its own right, but as soon as you think you have got your head around it you learn that there are yet further wrinkles you need to understand, and it just goes on and on and on.  It relies heavily on dense mathematics, and in fact you could earn a PhD studying and developing ever better forms of SDM, or coming up with newer and deeper understandings regarding distortion, stability, and the like.

For BitPerfect, we have been looking to find some “grown-up help”, in the form of a person or persons in the world of academia who can (a) help us to better understand the concepts; (b) help us to steer a path through the state-of-the-art in terms of both current implementations and the latest theoretical developments; (c) help us to avoid re-inventing wheels wherever possible; and (d) simply help to sort out facts from nonsense.  The last one of these is quite important – more so than you might imagine – because there is a lot of nonsense out there, mixed in with all the facts, and you really don’t want to waste brain cycles on any of it.

You would think it would be easy to develop the sort of relationships we are looking for, but not so.  Facts and nonsense still get in the way.  Take the Nutty Professor I recently met with.  This gentleman heads a group which calls itself something along the lines of the Faculty of Digital Music Technology (I’m not going to identify him).  Our conversation got off on the wrong foot when, right off the bat, he insisted that DSD and PCM were in essence the same thing, and that you could losslessly convert between one format and the other, just as you can between FLAC and Apple Lossless, for example.  In his view, both were simply digital storage formats and so they HAD to have direct equivalence.  He was quite adamant about this, but didn’t want to justify it.  I was to accept it as a fact.  Since a significant element of what I was looking for was clarity of thought on matters precisely like this, I came away from the encounter somewhat disappointed.  At the time, I wished I had the necessary understanding to present at least a simple argument to counter the Nutty Professor’s position, but I didn’t have one.

Today, I do – an argument which I think is sufficiently elegant that I want to share it with you.  And I don’t think you need a background in mathematics to grasp it.

Refer to the graph below.  I have plotted the noise floor – which determines the signal-to-noise ratio (SNR) – as a function of frequency.  The red line is a curve which is typical of DSD.  The noise floor is very low across the frequency range that is important for high quality music playback (20Hz – 20kHz), and rises very dramatically at higher frequencies.  This is the famous Noise Shaping (that I described in yesterday’s post) in action.  Superimposed upon that is the blue line representing PCM in its 24-bit 88.2kHz form.  One simple way to interpret these curves is that each format is capable of fully encoding musical signals at any point above its own line, and is incapable of fully encoding anything below it.
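
If you would like a feel for where that rising noise characteristic comes from, here is a purely illustrative little Python sketch – it is not how the graph was produced.  It assumes the textbook noise-transfer-function magnitude of (2 sin(πf/fs))^n for an n-th order noise shaper running at the standard DSD rate of 2.8224MHz; the choice of a 5th-order shaper is my own assumption, purely for illustration, and real DSD modulators are rather more sophisticated.  It prints how strongly the quantization noise is suppressed or boosted at a few spot frequencies, relative to the unshaped noise of a 1-bit quantizer.

import math

fs = 2_822_400          # standard DSD sample rate (64 x 44.1kHz)
order = 5               # assumed modulator order, purely for illustration

for f in (1_000, 20_000, 100_000, 500_000, 1_000_000):
    ntf = (2 * math.sin(math.pi * f / fs)) ** order
    print(f"{f/1000:6.0f} kHz: noise shaped by {20 * math.log10(ntf):+7.1f} dB")

In the audio band the noise is pushed down by an enormous amount, which is what gives the 1-bit format its usable dynamic range; by a few hundred kilohertz the same mechanism is boosting the noise instead, which is the steep rise you see in the red curve.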

Suppose we have music encoded in the DSD format, and we convert it to 24/88.2 PCM format.  If we do this, all of the musical information represented by the hashed region labeled [A] must by necessity be lost.  This information is encoded into the DSD data stream, but cannot be represented by the PCM data stream.  Likewise, suppose we convert the 24/88.2 PCM data stream to DSD.  In this case, all of the musical information represented by the hashed region labeled [B] must by necessity be lost.  This information is encoded into the PCM data stream, but cannot be represented by the DSD data stream.  Regardless of whether we are converting DSD to PCM or the other way round, information is being lost.

Of course, there is an argument to be made regarding lost information: if it represents something inaudible, then we can afford to throw it away.  In the example I have shown, the information contained in both the [A] and [B] regions is arguably inaudible.  But don’t tell me that the conversion is lossless.  With a computer it is quite trivial to convert back and forth as often as you like between FLAC and Apple Lossless.  You can do it hundreds, thousands, even millions of times (if you are prepared to wait) and the music will remain unchanged.  Do the same thing between DSD and 24/88.2 PCM, and even after a hundred cycles the music will be all but unlistenable.

The Nutty Professor will not be advising BitPerfect.

People often ask what dithering does.  Most seem to know that it involves adding random noise to digital music for the apparently contradictory purpose of making it sound better, but don’t know how it accomplishes that.  On the other hand, very few people understand what Noise Shaping is and what it actually does – or that it is in reality a form of dithering.  Since neither concept is particularly difficult to grasp, I thought you might appreciate a short post on the subject.  I warn you, though, there are going to be some NUMBERS involved, so you might want to have a pencil and a piece of paper to hand.

Suppose we encode a music signal as a PCM data stream (read my earlier post “So you think you understand Digital Audio?” if you are unsure how that works).  Each PCM “sample” represents the magnitude of the musical waveform at the instant it is measured, and its value is stored as that of the nearest “quantized level”.  These “quantized levels” are the limited set of values that the stored data can take, and so there will be some sort of error associated with quantization.  This “quantization error” is the difference between the actual signal value and the stored “quantized” value.  For example, suppose the value of the signal is 0.5004 and the two closest quantization levels are 0.500 and 0.501.  If we store the PCM sample as 0.500 then the associated quantization error is +0.0004.  If we had chosen to store the PCM sample as 0.501 then the associated quantization error would have been -0.0006.  Since the former is less of an error than the latter, it seems obvious that the former is the more accurate representation of the original waveform.  Are you with me so far?
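
To make that concrete, here is a tiny Python sketch of the quantization step just described, using the same hypothetical 0.001 level spacing and the same sign convention (error = actual value minus stored value):

step = 0.001                             # spacing of the quantization levels
signal = 0.5004                          # the hypothetical sample value
stored = round(signal / step) * step     # snap to the nearest level -> 0.500
error = signal - stored                  # quantization error -> +0.0004
print(f"stored {stored:.3f}, quantization error {error:+.4f}")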

One way to look at the PCM encoding process is to think of it as storing the exact music signal plus an added error signal comprising all of the quantization errors.  The quantization errors are the Noise or Distortion introduced by the quantization process.  The distinction between noise and distortion is critically important here.  The difference between the two is that distortion is related to the underlying signal (the term we use is “Correlated”), whereas noise is not.

I am going to go out of my way here to give a specific numerical example because it is quite important to grasp the notion of Correlation.  There will be a whole load of numbers, and it would be best if you had a pencil and paper handy to sketch them out in rough graphical form.  Suppose we have a musical waveform which is a sawtooth pattern, repeating the sequence:

0.3000, 0.4002, 0.5004, 0.6006, 0.7008, 0.8010, 0.7008, 0.6006, 0.5004, 0.4002, 0.3000 …
Now, let’s suppose that our quantization levels are equally spaced 0.001 apart.  Therefore the signal will be quantized to the following repeating sequence:
0.300, 0.400, 0.500, 0.601, 0.701, 0.801, 0.701, 0.601, 0.500, 0.400, 0.300 …
The resultant quantization errors will therefore comprise this repeating sequence:
0.0000, +0.0002, +0.0004, -0.0004, -0.0002, 0.0000, -0.0002, -0.0004, +0.0004, +0.0002, 0.0000 …
If you plot these repeating sequences on a graph, you will see that the sequence of Quantization Errors forms a pattern that is intriguingly similar to the original signal, but is not quite the same.  This is an example of a highly correlated quantization error signal.  What we want, ideally, is for the quantization errors to resemble a sequence of random numbers as closely as possible.  Totally random numbers represent pure noise, whereas highly correlated numbers represent pure distortion.
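
If you would rather let the computer do the arithmetic, this short Python sketch reproduces the worked example: it quantizes the sawtooth to the nearest 0.001 level and prints the quantized values and the error sequence, which you will recognize as the correlated pattern listed above.

step = 0.001
signal = [0.3000, 0.4002, 0.5004, 0.6006, 0.7008, 0.8010,
          0.7008, 0.6006, 0.5004, 0.4002, 0.3000]

quantized = [round(x / step) * step for x in signal]
errors = [x - q for x, q in zip(signal, quantized)]

print(", ".join(f"{q:.3f}" for q in quantized))
print(", ".join(f"{e:+.4f}" for e in errors))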

In reality, any set of real-world quantization error numbers can be broken down into a sum of two components – one component which is reasonably well correlated, and another which is pretty much random.  Psychoacoustically speaking, the ear is far more sensitive to distortion than to noise.  In other words, if we can replace a small amount of distortion with a larger amount of noise, then the result may be perceived as sounding better.

Time to go back to our piece of hypothetical data.  Suppose I take a random selection of samples, and modify them so that we choose not the closest quantization level, but the second-closest.  Here is one example – the signal is now quantized to the following repeating sequence:
0.300, 0.400, 0.501, 0.600, 0.701, 0.801, 0.700, 0.601, 0.500, 0.401, 0.300 …
The resultant quantization errors now comprise this repeating sequence:
0.0000, +0.0002, -0.0006, +0.0006, -0.0002, 0.0000, +0.0008, -0.0004, +0.0004, -0.0008, 0.0000…

There are three things we can take away from this revised quantization error sequence.  The first is that it no longer looks as though it is related to the original data, so it is no longer correlated, and looks a lot more like noise.  The second is that the overall level of the error signal has gone up, so we have replaced a certain amount of correlated distortion with a slightly larger amount of noise.  Third, and this is where we finally get around to the second element of this post, the noise seems to have quite a lot of high-frequency energy associated with it.
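
The second point is easy to check for yourself.  The little Python sketch below computes the RMS level of the two error sequences: the dithered errors come out at roughly 0.00047 against 0.00027 for the plain quantization – a slightly larger amount of noise, exactly as described.

import math

plain    = [ 0.0000, +0.0002, +0.0004, -0.0004, -0.0002, 0.0000,
            -0.0002, -0.0004, +0.0004, +0.0002, 0.0000]
dithered = [ 0.0000, +0.0002, -0.0006, +0.0006, -0.0002, 0.0000,
            +0.0008, -0.0004, +0.0004, -0.0008, 0.0000]

def rms(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

print(f"plain quantization error RMS:    {rms(plain):.5f}")
print(f"dithered quantization error RMS: {rms(dithered):.5f}")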

So here we have the concepts of Dither and Noise Shaping in a nutshell.  By carefully re-quantizing certain selected samples of the music data stream in a pseudo-random way, we can replace distortion with noise.  Likewise, using what amounts to the same technique, we can do something very similar and replace an amount of noise in the portion of the frequency band to which we are most sensitive, with a larger amount of noise in a different frequency band to which we are less sensitive, or which we know can be easily filtered out at some later stage.
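
For the curious, here is a minimal Python sketch of how a requantizer might combine the two ideas.  It is not BitPerfect’s implementation – just one common textbook arrangement that I am assuming for illustration: TPDF dither (the sum of two uniform random values, spanning plus or minus one quantization step) is added before rounding, and a first-order error-feedback loop subtracts the previous sample’s quantization error from the next input, which pushes the resulting noise towards high frequencies.  Real-world noise shapers are usually higher-order and psychoacoustically weighted.

import random

def requantize(samples, step):
    """Requantize with TPDF dither and first-order noise shaping (illustrative only)."""
    out = []
    prev_error = 0.0
    for x in samples:
        shaped = x - prev_error                              # feed back the last error
        dither = (random.random() - random.random()) * step  # TPDF dither, +/- one step
        y = round((shaped + dither) / step) * step           # quantize to the nearest level
        prev_error = y - shaped                              # error to feed back next time
        out.append(y)
    return out

out = requantize([0.3000, 0.4002, 0.5004, 0.6006, 0.7008, 0.8010], 0.001)
print([f"{y:.3f}" for y in out])

Each run gives a slightly different output because the dither is random – and that randomness is precisely what breaks up the correlation between the error and the music.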

One thing needs to be borne in mind, though.  Dithering and Noise Shaping operate only on the noise which is being added to the signal as a result of a quantization process, and not on any noise which is already present in the signal.  After the Dithering and Noise Shaping, all of this new noise is incorporated into the music signal, and is no longer separable.  So you have to be really careful about when you introduce Dither or Noise Shaping into the signal, and how often you do it, because their effects are cumulative.  If you do it too many times, it is easy to end up with an unacceptable amount of high frequency noise.

I hope you were able to follow that, and I apologize again for the ugly numbers 🙂

BitPerfect user John Bacon-Shone correctly pointed out in response to my recent musings on DSD that the SACD format delivers a huge amount of music in surround sound format, which is a particular boon to classical music listeners.  And not many people are aware of that.

Surround sound as a consumer format goes back to the 1970s, although its roots precede it by several decades in cinematic applications, and even in concert performances such as Pink Floyd’s “Games for May” concert of 1967.  The appeal of surround sound is quite obvious – why constrain the sonic image to the traditional one of a stage set out in front of you?  Arguably, the idea was first reduced to practice by Hector Berlioz in his “Grande Messe des Morts”, or Requiem, waaaaay back in 1837, which called for four brass bands located at the front, the back, and the two sides of the performance venue.

In the 1970s several consumer formats appeared, each aimed at extending the two-speaker stereo layout with an additional pair of rear speakers.  The term “Quadraphonic” was coined to describe this arrangement, and there was much enthusiasm in the music industry to support 4-channel technology with recorded material.  As we now know, when a new consumer technology tries to emerge, the major stakeholders take turns to shoot themselves in the foot.  In this case, the hardware manufacturers brought forth a plethora of incompatible solutions to deliver a four-channel experience.  QS, SQ, CD-4 (all LP-based formats), 8-track tape, and surprisingly many others, all came and went.  It was another 20 years before the movie industry, and its DVD technology, finally lit a fire under the surround sound concept.

One of the problems with surround sound is that it is much harder to create a solid 3-dimensional sonic image which presents the same soundfield to multiple listeners distributed throughout the listening room.  This problem is exacerbated for home theater applications where there is a physical image (the screen), and a need for much of the sound – particularly the dialogue – to appear to come from it.  This resulted in the adoption of the front center speaker, through which dialogue can be readily piped.  Also, in movie soundtracks the role of deep bass is dramatically different from its role in pure audio, and so a special channel which provides only deep bass (the “Low Frequency Effects” channel) was specified.  This complete configuration is well known today as “5.1”.  Additional main speakers tend to be added from time to time, and today’s home theater receivers often support up to “7.1” channels.

Now that surround-sound delivery formats have at last become established, the music industry can focus on recording and delivering music in multi-channel form.  The venerable CD is too old to be adapted to surround sound, and so SACD is now the only viable hardware format available for delivery of multi-channel audio content (its one-time competitor and supposed vanquisher, DVD-Audio, is all but extinct now).  Except that here in the West, as consumers, we omitted to climb on the SACD bandwagon.  If only Sony and Philips had marketed SACD as a surround-sound format rather than an audio quality format, things might have turned out differently.

Anyway, audiophiles being audiophiles, the surround sound debate is alive and well.  There is an emerging body of opinion that says the center speaker is actually ruinous when it comes to creating a stable sonic image.  Additionally, sub-woofer advocates believe that a single LFE channel is inadequate, and that each full-range speaker needs its own sub-woofer.  There is also (thankfully) some agreement that, for classical music at least, the two rear speakers do not need the full 20Hz bass response.  What we used to refer to as “Quadraphonic” is now called 4.0, and one of its keenest advocates is Peter McGrath, a revered recording engineer whose day job is Sales Director for Wilson Audio Specialties.  Peter’s classical recordings are absolutely the finest I have ever heard, so his opinion counts for something!  But relatively few multichannel SACDs are presented in 4.0 format.

The bleeding edge of the audiophile universe – inhabited by those of us who probe ever deeper into the outer reaches of diminishing returns in search of audio playback perfection – is strangely characterized by apparently outdated, abandoned, superseded technologies, shouting their last hurrahs in stunningly expensive Technicolor. Tubes and turntables are guilty as charged here, and I own both.

Why do some apparently stone-age technologies still persist, yet others less venerable vanish never to be heard from again (hello cassette tapes, receivers, and soon CD players)? In the cases of tubes and turntables, I venture to suggest that these are technologies which, at their zenith, were the products of craftsmanship and ‘black art’ rather than the concentrated application of science. Their full scientific potential was never truly reached, and they were replaced for reasons of practicality, convenience, and cost. But they still have not gone away.

A slightly different situation arose for the SACD, a technology developed by Sony and Philips as an intended replacement for the CD around the turn of the millennium. SACD was designed from the start to be a vehicle for delivering notably superior sound quality compared to the CD, which is strange, since the same two companies foisted CD on us under the pretext of “Pure Perfect Sound, Forever”. But whereas in the 1980s they were able to create real consumer demand for a delivery platform which was convincingly marketed as being superior to the LP, with SACD they found that there was in fact no market interest in a sound quality superior to CD. In fact, their customers were more preoccupied with a delivery format of demonstrably INFERIOR sound quality – the MP3 file. But that is another story.

The SACD fizzled upon launch, but thanks to the Japanese, it didn’t actually die. There is a healthy market for the SACD in Japan, and this is sufficient to keep the format alive, if not necessarily healthy. So what is it with the SACD? Does it actually sound better? And if so, how does it do that?

Well, yes, there is broadly held agreement that SACD does indeed sound markedly better than CD, and arguably even better than CD’s high-resolution PCM cousins (with 24-bit depth and higher sampling rates). You see, SACD stores its digital music in a totally different way from CD. It uses a format called DSD, which I shall not go into here, save to say that conversion from DSD to PCM seems to consistently result in some significant sacrifice of sound quality.

Here in the West, where we never really adopted the SACD, we moved from listening to music on CDs to listening to music stored in computer files. So, instead of wondering whether or not to adopt the SACD, we ask whether or not we can store music in DSD format in computer files and have the best of both worlds. Well, of course we can! What did you think?…

Two file formats, one developed by Sony called DSF, and one developed by Philips called DFF, seem to have recently emerged. If you have a PC, you can easily send DSD bitstreams from DSF and DFF files to DACs that support DSD. On the Mac, it is a little more complicated, and there is an emerging standard called DoP (DSD over PCM) which enables Mac users to transmit DSD over USB and other asynchronous communications interfaces. Boutique record labels are emerging, such as Blue Coast Records, which records exclusively in DSD and sells DSF/DFF files for download.
http://bluecoastrecords.com/
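
For those curious about what DoP actually does, here is a rough Python sketch of the packing idea as I understand the openly published DoP convention (simplified to a single channel; the pack_dop helper is just something I made up for illustration, not production code): sixteen DSD bits go into the lower two bytes of each 24-bit “PCM” sample, and the top byte carries an alternating marker (0x05, 0xFA) that tells a DoP-aware DAC the payload is DSD rather than ordinary PCM.

MARKERS = (0x05, 0xFA)      # alternating DoP marker bytes

def pack_dop(dsd_bytes):
    """Pack raw DSD bytes (one channel) into 24-bit DoP frames (illustrative only)."""
    frames = []
    for i in range(0, len(dsd_bytes) - 1, 2):
        marker = MARKERS[(i // 2) % 2]
        frames.append((marker << 16) | (dsd_bytes[i] << 8) | dsd_bytes[i + 1])
    return frames

print([f"{frame:06x}" for frame in pack_dop([0xAA, 0x55, 0xF0, 0x0F])])
# -> ['05aa55', 'faf00f']

The DAC simply strips the markers and reconstructs the original one-bit stream, which is why DoP is a packaging scheme rather than a conversion – the DSD data passes through untouched.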

Perhaps most intriguing is that many of the major labels – but DON’T go looking for much in the way of public acknowledgement – have discovered a preference for using the DSD format for archival of their analog tape back catalog, having once already gone down the path of digitizing it to PCM and finding it to have been sadly lacking. Don’t look for this to happen any time soon, but this lays the groundwork for the major labels to finally release their back catalog in a format that truly captures the sound quality of the original master tapes. Before that happens, the labels are going to have to realize that the only sustainable format for music distribution is going to be one that works on-line, and they are going to have to find a way to make that work for them.

DSD could end up emerging as the format of choice for audiophile quality audio playback.

Naim Audio now provides setup instructions for using BitPerfect with their DAC-V1, which you can download from their web site.  A very nice route to obtain “World Class Sound”!
