As support for regular DSD (aka DSD64) comes close to being a requirement for manufacturers not only of high-end DACs but also of a number of entry-level models, so the cutting edge of audio technology moves ever upward to more exotic versions of DSD denoted by the terms DSD128, DSD256, DSD512, etc.  What are these, why do they exist, and what are the challenges faced in playing them?  I thought a post on that topic might be helpful.

Simply put, these formats are identical to regular DSD, except that the sample rate is increased.  The benefit in doing so is twofold.  First, you can reduce the magnitude of the noise floor in the audio band.  Second, you can push the onset of undesirable ultrasonic noise further away from the audio band.

DSD is a noise-shaped 1-bit PCM encoding format (Oh yes it is!).  Because of that, the encoded analog signal can be reconstructed simply by passing the raw 1-bit data stream through a low-pass filter.  One way of looking at this is that at any instant in time the analog signal is very close to being the average of a number of consecutive DSD bits which encode that exact moment.  Consider this: the average of the sequence 1,0,0,0,1,1,0,1 is exactly 0.5 because it comprises four zeros and four ones.  Obviously, any sequence of 8 bits comprising four zeros and four ones will have an average value of 0.5.  So, if all we want is for our average to be 0.5, we have many choices as to how we can arrange the four zeros and four ones.
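To make the counting argument concrete, here is a minimal Python sketch.  The helper names `mean` and `arrangements` are mine, purely for illustration.  It checks that the example sequence averages to 0.5 and enumerates every 8-bit pattern with four ones and four zeros.

```python
from itertools import combinations

def mean(bits):
    """Average value of a sequence of 0/1 bits."""
    return sum(bits) / len(bits)

def arrangements(n_bits, n_ones):
    """Yield every n_bits-long sequence containing exactly n_ones ones."""
    for ones_at in combinations(range(n_bits), n_ones):
        yield [1 if i in ones_at else 0 for i in range(n_bits)]

# The sequence quoted in the text.
example = [1, 0, 0, 0, 1, 1, 0, 1]

# Every way of arranging four ones and four zeros in 8 bits.
all_patterns = list(arrangements(8, 4))

print(mean(example))        # average of the quoted sequence
print(len(all_patterns))    # how many patterns share that average
```

Every one of those patterns has the same average, and that freedom of choice is exactly what a noise shaper gets to exploit.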

That simplistic illustration is a good example of how noise shaping works.  In effect we have a choice as to how we can arrange the stream of ones and zeros such that passing it through a low pass filter recreates the original waveform.  Some of those choices result in a lower noise floor in the audio band, but figuring out how to make those choices optimally is rather challenging from a mathematical standpoint.  Theory, however, does tell us a few things.  The first is that you cannot just take noise away from a certain frequency band.  You can only move it into another frequency band (or spread it over a selection of other frequency bands).  The second is that there are limits to both how low the noise floor can be depressed at the frequencies where you want to remove noise, and how high the noise floor can be raised at the frequencies you want to move it to.
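For the curious, here is a rough sketch of the simplest possible noise shaper, a first-order sigma-delta loop, written in Python.  It is only an illustration under my own assumptions: real DSD modulators are far more elaborate, and the moving average stands in for a proper low-pass filter.  The function names are hypothetical.

```python
def sigma_delta_1st_order(samples):
    """Encode samples in the range [-1, 1] as a stream of 0/1 bits."""
    integrator = 0.0
    feedback = 0.0          # previous output level, +1.0 or -1.0
    bits = []
    for x in samples:
        # Accumulate the error between the input and the last output.
        integrator += x - feedback
        bit = 1 if integrator >= 0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits

def moving_average(bits, window):
    """Crude low-pass filter: average the +1/-1 levels over a window."""
    levels = [1.0 if b else -1.0 for b in bits]
    return [sum(levels[i:i + window]) / window
            for i in range(len(levels) - window + 1)]

# A constant input of 0.25 becomes a bit pattern whose average is 0.25.
bits = sigma_delta_1st_order([0.25] * 64)
recovered = moving_average(bits, 16)
print(bits[:8], recovered[0])
```

The loop never outputs anything but ones and zeros, yet the average of the stream tracks the input, with the error pushed into the rapid bit-to-bit flipping.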

Just like digging a hole in the ground, what you end up with is a low frequency area where you have removed as much of the noise as you can, and a high frequency area where all this removed noise has been piled up.  If DSD is to work, the low frequency area must cover the complete audio band, and the noise floor there must be pushed down by a certain minimum amount.  DSD was originally developed and specified to have a sample rate of 2,822,400 samples per second (2.8MHz) as this is the lowest convenient sample rate at which we can realize those key criteria.  We call it DSD64 because 2.8224MHz is exactly 64 times the standard sample rate of CD audio (44.1kHz).  The downside is that the removed noise starts to pile up uncomfortably close to the audio band, and it turns out that all the optimizing in the world does not make a significant dent in that problem.

This is the fundamental limitation of DSD64.  If we want to move the ultrasonic noise further away from the audio band we have to increase either the bit depth or the sample rate.  Of the two, there are, surprisingly enough, perhaps more reasons to want to increase the bit depth than the sample rate.  However, these are trumped by the great advantages in implementing an accurate D/A converter if the ‘D’ part is 1-bit.  Therefore we now have various new flavours of DSD with higher and higher sample rates.  DSD128 has a sample rate of 128 times 44.1kHz, which works out to about 5.6MHz.  Likewise we have DSD256, DSD512, and even DSD1024.
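The naming arithmetic is simple enough to put in a few lines of Python.  This is just a sketch of the multiplication described above, and the names are mine.

```python
CD_RATE_HZ = 44_100  # standard CD sample rate

def dsd_rate_hz(multiple):
    """Sample rate of DSD<multiple>, e.g. multiple=64 for DSD64."""
    return multiple * CD_RATE_HZ

rates = {f"DSD{m}": dsd_rate_hz(m) for m in (64, 128, 256, 512, 1024)}

for name, rate in rates.items():
    print(f"{name}: {rate:,} samples per second ({rate / 1e6:.4f} MHz)")
```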

Of these, perhaps the biggest bang for the buck is obtained with DSD128.  Already, it moves the rise in the ultrasonic noise to nearly twice as far from the audio band as it was with DSD64.  Critical listeners – particularly those who record microphone feeds direct to DSD – are close to unanimous in their preference for DSD128 over DSD64.  The additional benefits in going to DSD256 and above seem to be real enough, but definitely fall into the realms of diminishing returns.  However, even though the remarkably low cost and huge capacity of hard disks today makes the storage of a substantial DSD library a practical possibility, if this library were to be DSD512 for example, this would start to represent a significant expense in both disk storage and download bandwidth costs.  In any case, as a result of all these developments, DSD128 recordings are now beginning to be made available in larger and larger numbers, and very occasionally we get sample tracks made available for evaluation in DSD256 format.  However, at the time of writing I don’t know where you can go to download samples of DSD512 or higher.

In the Apple World where BitPerfect users live, playback of DSD requires the use of the DoP (“DSD over PCM”) protocol.  This dresses up a DSD bitstream in a faux PCM format, where a 24-bit PCM word comprises 16 bits of raw DSD data plus an 8-bit marker which identifies it as such.  Windows users have the ability to use an ASIO driver which dispenses with the need for the 8-bit marker and transmits the raw DSD data directly to the DAC in its “native” format.  ASIO for Mac, while possible, remains problematic.
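Here is a minimal sketch of that packing for a single channel, in Python.  The marker values (0x05 and 0xFA on alternating words) and the placement of the marker in the top 8 bits are my recollection of the published DoP standard, so please treat them as assumptions rather than gospel.  The function names are hypothetical.

```python
# Assumption: standard DoP alternates these two marker bytes, word by word.
DOP_MARKERS = (0x05, 0xFA)

def pack_dop(dsd_bytes):
    """Pack one channel's DSD data (8 bits per byte) into 24-bit DoP words."""
    if len(dsd_bytes) % 2:
        raise ValueError("need an even number of DSD bytes (16 bits per word)")
    words = []
    for i in range(0, len(dsd_bytes), 2):
        marker = DOP_MARKERS[(i // 2) % 2]
        # Marker in the top 8 bits, 16 bits of raw DSD data underneath.
        words.append((marker << 16) | (dsd_bytes[i] << 8) | dsd_bytes[i + 1])
    return words

def unpack_dop(words):
    """Strip the markers and recover the raw DSD bytes."""
    out = bytearray()
    for w in words:
        out.append((w >> 8) & 0xFF)
        out.append(w & 0xFF)
    return bytes(out)

raw = bytes([0xAA, 0x55, 0x0F, 0xF0])
words = pack_dop(raw)
print([f"{w:06X}" for w in words])
```

The point to take away is that the DSD bits themselves pass through untouched.  The DAC spots the markers, discards them, and is left with the original bitstream.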

As mentioned, DoP encoding transmits the data to the DAC using a faux PCM stream format.  For DSD64 the DAC’s USB interface must provide 24-bit/176.4kHz support, which is generally not a particularly challenging requirement.  For DSD128 the required PCM stream format is 24-bit/352.8kHz which is still not especially challenging, but is less commonly encountered.  But if we go up to DSD256 we now have a requirement for a 24-bit/705.6kHz PCM stream format.  The good news is that your Mac can handle it out of the box, but unfortunately, very few DACs offer this.  Inside your DAC, if you prise off the cover, you will find that the USB subsystem is separate from the DAC chip itself.  USB receiver chipsets are sourced from specialist suppliers, and if you want one that will support a 24/705.6 format it will cost you more.  Additionally, if you are currently using a different receiver chipset, you may have a lot of time and effort invested in programming it, and you will have to return to GO if you move to a new design (do not collect $200).  The situation gets progressively worse with higher rate DSD formats.
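Those PCM rates follow directly from the fact that each 24-bit word carries 16 DSD bits per channel.  A small Python sketch of the arithmetic, with a function name of my own choosing:

```python
CD_RATE_HZ = 44_100
DSD_BITS_PER_DOP_WORD = 16  # each 24-bit PCM word carries 16 DSD bits

def dop_pcm_rate_hz(dsd_multiple):
    """PCM sample rate the DAC must accept to receive DSD<multiple> via DoP."""
    return dsd_multiple * CD_RATE_HZ // DSD_BITS_PER_DOP_WORD

for m in (64, 128, 256):
    print(f"DSD{m} needs 24-bit/{dop_pcm_rate_hz(m) / 1000:.1f}kHz PCM")
```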

Thus it is that we see examples of DSD-compatible DACs such as the OPPO HA-1 which offers DSD256 support, but only in “native” mode.  What this means is that if you have a Mac and are therefore constrained to using DoP, you need access to a 24/705.6 PCM stream format in order to deliver DSD256, and the HA-1 has apparently been designed with a USB receiver chipset that does not support it.  It may not be as simple as that, and there may be other considerations at play, but if so I am not aware of them.

Interestingly, the DoP specification does offer a workaround for precisely this circumstance.  It provides for an alternative to a 2-channel 24/705.6 PCM format using a 4-channel 24/352.8 PCM format.  The 8-bit DoP marker specified is different, which enables the DAC to tell 4-channel DSD128 from 2-channel DSD256 (they would otherwise be identical).  Very few DAC manufacturers currently support this variant format.  Mytek is the only one I know of – as I understand it their 192-DSD DAC supports DSD128 using the standard 2-channel DoP over USB, but using the 4-channel variant DoP over FireWire.

Because of its negligible adoption, BitPerfect currently does not support the 4-channel DoP variant.  If we did, it would require some additional configuration options in the DSD Support window.  I worry that such options are bound to end up confusing people.  For example, despite what our user manual says, you would not believe the number of customers who write to me because they have checked the “dCS DoP” checkbox and wonder why DSD playback isn’t working!  Maybe they were hoping it would make their DACs sound like a dCS, I dunno.  I can only imagine what they will make of a 2ch/4ch configurator!!!

As a final observation, some playback software will on-the-fly convert high-order DSD formats which are not supported by the user’s DAC to a lower-order DSD format which is.  While this is a noble solution, it should be noted that format conversion in DSD is a fundamentally lossy process, and that all of the benefits of the higher-order DSD format – and more – will as a result be lost.  In particular, the ultrasonic noise profile will be that of the output DSD format, not that of the source DSD format.  Additionally, DSD bitstreams are created by Sigma-Delta Modulators.  These are complex and very challenging algorithms which are seriously hard to design and implement successfully, particularly if you want anything beyond modest performance out of them.  The FPGA-based implementation developed for the PS Audio DirectStream DAC is an example of a good one, but there are some less-praiseworthy efforts out there.  In general, you will obtain audibly superior results pre-converting instead to PCM using DSD Master.

In mathematics, the word ‘convolution’ describes a very important class of manipulations.  If you want to know more about it, a pretty good treatment is shown on its Wikipedia page.  And even if you don’t, I am going to briefly summarize it here before going on to make my point 🙂

A convolution is an operation performed on two functions, or on two sets of data.  Typically (but not always) one is the actual data that we are trying to manipulate, and the other is a weighting function, or set of weights.  Convolution is massively important in the field of signal processing, and therefore is something that anybody who wants (or needs) to talk knowledgeably about digital audio needs to bone up on.  The most prominent convolution processes that you may have heard of are Fourier Transforms (which are used to extract from a waveform its audio spectrum) and digital filtering.  It is the latter of those that I want to focus on here.

In very simple terms, a filter (whether digital or analog) operates as a convolution between a waveform and an impulse response.  You will have heard of impulse responses, and indeed you may have read about them in some of my previous posts.  In digital audio, an impulse response is a graphical representation of the ‘weights’ or ‘coefficients’ which define a digital filter.  Complicated mathematical relationships describe the way in which the impulse response relates to the key characteristics of the filter, and I have covered those in my earlier posts on ‘Pole Dancing’.
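If you would like to see what that convolution looks like in practice, here is a bare-bones Python sketch.  The three coefficients are an arbitrary smoothing filter I made up for the illustration, not anything taken from a real product.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of a signal with a filter's impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            # Each input sample contributes a scaled copy of the whole
            # impulse response, shifted to start at that sample's position.
            out[n + k] += x * h
    return out

# A made-up three-tap smoothing filter.
coefficients = [0.25, 0.5, 0.25]

# A single unit sample in a field of zeros simply reproduces the coefficients.
impulse = [0, 0, 1, 0, 0]
print(convolve(impulse, coefficients))

# A short step gets its edges softened instead.
print(convolve([1, 1, 1, 1], [0.5, 0.5]))
```

Notice that the output for any real signal is the sum of many overlapping, scaled copies of the impulse response.  No single copy appears in the output on its own.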

Impulse responses are therefore very useful.  They are nice to look at, and easy to categorize and classify.  Unfortunately, it has become commonplace to project the aesthetic properties of the impulse response onto the sonic properties arising from the filter which uses it.  In simple language, we see a feature on the impulse response, and we imagine that such a feature is impressed onto the audio waveform itself after it comes out of the filter.  It is an easy mistake to make, since the convolution process itself is exactly that – a mathematical impression of the impulse response onto the audio waveform.  But the mathematical result of the convolution is really not as simple as that.

The one feature I see misrepresented most often is pre-ringing.  In digital audio, an impulse is just one peak occurring in a valley of flat zeros.  It is useful as a tool to characterize a filter because it contains components of every frequency that the music bit stream is capable of representing.  Therefore if the filter does anything at all, the impulse is going to be disturbed as a result of passing through it.  For example, if you read my posts on square waves, you will know that removing high frequency components from a square wave results in a waveform which is no longer square, and contains ripples.  Those ripples decay away from the leading edge of the square wave.  This is pleasing in a certain way, because the ripples appear to be caused by, and arise in response to, the abrupt leading edge of the square wave.  In our nice ordered world we like to see effect preceded by cause, and are disturbed by suggestions of the opposite.

And so it is that with impulse responses we tend to be more comfortable seeing ripples decaying away after the impulse, and less comfortable when they precede the impulse, gathering in strength as they approach it.  Our flawed interpretation is that the impulse is the cause and the ripples the effect, and if these don’t occur in the correct sequence then the result is bound to be unnatural.  It is therefore common practice to dismiss filters whose impulse response contains what is termed “pre-ringing” because the result of such filters is bound to be somewhat “unnatural”.  After all, in nature, effects don’t precede their cause, do they?

I would like you to take a short break, and head over to your kitchen sink for a moment.  Turn on the tap (or faucet, if you prefer) and set the water flow to a very gentle stream.  What we are looking for is a smooth flow with no turbulence at all.  We call this ‘laminar’ flow.  What usually happens, if the tap outlet is sufficiently far above the bottom of the sink, is that the laminar flow is maintained for some distance and then breaks up into a turbulent flow.  The chances are good that you will see this happening, but it is no problem if you don’t – so long as you can find a setting that gives you a stable laminar stream.  Now, take your finger, and gently insert it into the water stream.  Look closely.  What you will see are ripples forming in the water stream **above** your finger.  If you don’t, gradually move your finger up towards the tap and they should appear (YTMV/YFMV).  What you will be looking at is an apparently perfect example of an effect (the ripples) occurring before, or upstream of, the cause (your finger).

What I have demonstrated here is not your comfortable world breaking down before your eyes.  What is instead breaking down is the comfort zone of an over-simplistic interpretation of what you saw.  Because the idea of the finger being the cause and the ripples being the effect is not an adequate description of what actually happened.

In the same way, the notion of pre-ringing in the impulse response of a filter resulting in sonic effects that precede their cause in the resultant audio waveform is not an adequate description of what is happening.  However, the misconception gains credence for an important, if inconvenient, reason, which is that filters which exhibit pronounced pre-ringing do in fact tend to sound less preferable than those which don’t.  This sort of thing happens often in science – most notably in medical science – and when it does it opens a door to misinformation.  In this case, the potential for misinformation lies in the reason given for why one filter sounds better than another – that the one with pre-ringing in its impulse response results in sounds that precede the things which caused them.  By all means state your preference for filters with a certain type of impulse response, but please don’t justify your preference with flawed reasoning.  It is OK to admit that you are unclear as to why.

I want to finish this with an audio example to make my point.  The well-known Nyquist-Shannon theory states that a regularly sampled waveform can be perfectly recreated if (i) it is perfectly sampled; and (ii) it contains no frequency components at or above one half of the sample rate.  The theory doesn’t just set forth its premise, it provides a solid proof.  In essence, it does this by convolving the sampled waveform with a Sinc() function, in a process pretty much identical to the way a digital filter convolves the waveform with an Impulse Response.  Nyquist-Shannon proves that this convolution results in a mathematically perfect reconstruction of the original waveform if – and only if – the two stipulations I mentioned are strictly adhered to.  This is interesting in the context of this post because the Sinc() function which acts as the Impulse Response exhibits an infinitely long pre-ring at a significantly high amplitude.  Neither Nyquist nor Shannon, nor the entire industry which their theory spawned, harbour any concerns about causality in reconstructed waveforms!
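Here is a small Python sketch of that Sinc() reconstruction, for anyone who wants to try it.  It is an approximation under my own assumptions: a true reconstruction sums over infinitely many samples, whereas this one uses a finite run of 2001 samples of a sine wave, so the result is close rather than perfect.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, t):
    """Estimate the waveform at time t (measured in sample periods)."""
    # Every sample, both before and after t, contributes to the value at t.
    return sum(s * sinc(t - n) for n, s in enumerate(samples))

FREQ = 0.1  # cycles per sample, comfortably below the limit of 0.5
samples = [math.sin(2 * math.pi * FREQ * n) for n in range(2001)]

# Evaluate halfway between two samples, in the middle of the run.
midpoint = reconstruct(samples, 1000.5)
exact = math.sin(2 * math.pi * FREQ * 1000.5)
print(midpoint, exact)
```

The Sinc() function is symmetric about zero, so samples that come later in time contribute to the reconstructed value just as much as earlier ones do.  That is the pre-ring at work, and the result is still the right answer.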

Here is a short video that should give pause to those who have asked that question with the confident skepticism of someone who has never tried to actually make a pair themselves. This person has made his own pair of B&W 800 Diamond loudspeakers. Has he succeeded? We will never know, but it sure looks most impressive.

In practice, he has restricted himself to making his own set of elaborate cabinets, as it looks as though he has bought all the drive units from B&W. But even so, the overwhelming impression is of the expensive resources he has had to bring to bear to realize the project. OK, he has done the grunt work himself, but the project has clearly taken a HUGE amount of time and effort. Aside from some initial consternation, I imagine that the executives at B&W are having a good chuckle over it.

Presumably his motivation was purely the satisfaction of creating his own work of art. Think about it. How much money can he possibly have saved by doing it himself? Do you think you could do it yourself for less, without sacrificing at least some of the core design objectives?

Whatever, as I contemplate my own B&W 802 Diamonds, I am sure glad I bought mine!

The SACD format was built around DSD right from the start.  But since DSD takes up about four times the amount of disk space of a 16/44.1 equivalent this meant that a different physical disc format was going to be required.  Additionally, SACD was specified to deliver multi-channel content, which increases the storage requirement by another factor of 3 or more, depending on how many channels you want to support.  The only high-capacity disc format that was on the horizon at the time was the one eventually used for DVD, and even this was going to be inadequate for the full multi-channel capability required for SACD.
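The factor of four, and the further factor of three for multi-channel, can be checked with a few lines of Python.  The six-channel figure is my own assumption of a 5.1 layout, used only to illustrate the "factor of 3 or more".

```python
CD_RATE_HZ = 44_100

# Bits per second, per channel.
cd_bits_per_sec = 16 * CD_RATE_HZ      # 16-bit PCM at 44.1kHz
dsd64_bits_per_sec = 64 * CD_RATE_HZ   # 1-bit DSD at 64 x 44.1kHz

dsd_vs_cd = dsd64_bits_per_sec / cd_bits_per_sec

# Assumption: a 5.1 multi-channel program has six channels versus two for stereo.
multichannel_vs_stereo = 6 / 2

print(dsd_vs_cd, multichannel_vs_stereo)
```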

The solution was to adopt a lossless data compression protocol to reduce the size of a multi-channel DSD master file so that it would fit.  The protocol chosen was called DST, and is an elaborate DSP-based method derived from the way MP3 works.  Essentially, you store a bunch of numbers that represent the actual data as a mathematical function which you can later use to try to re-create the original data.  You then store a bunch of additional numbers which represent the differences between the actual data and the attempted recreation.  If you do this properly, the mathematical function numbers, plus the difference data, take up less space than the original data.  On a SACD the compression achieved is about 50%, which is pretty good, and permits a lot of content to be stored.
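To show the general idea of "prediction plus differences", here is a toy Python sketch.  I want to be clear that this is not the DST algorithm.  The predictor is a deliberately crude one of my own invention (it guesses that each bit is the opposite of the previous one), and the step that actually shrinks the data, namely storing the mostly-zero difference stream compactly, is left out.

```python
def predict(history):
    """Toy predictor: guess that the next bit is the opposite of the last one."""
    return 1 - history[-1] if history else 0

def encode(bits):
    """Return the difference stream: 1 wherever the prediction was wrong."""
    residual, history = [], []
    for b in bits:
        residual.append(b ^ predict(history))
        history.append(b)
    return residual

def decode(residual):
    """Rebuild the original bits from the same predictor plus the differences."""
    bits = []
    for r in residual:
        bits.append(r ^ predict(bits))
    return bits

original = [0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0]
residual = encode(original)
print(residual)
print(decode(residual) == original)
```

Because the decoder runs the identical predictor, the round trip is exact.  The better the predictor, the more zeros in the difference stream and the less space it needs.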

Given that DST compression is lossless, it is interesting that the SACD format allows discs to be mastered with your choice of compressed or non-compressed data.  And, taking a good look at a significant sample of SACDs, it appears that a substantial proportion of those discs do not use compression.  Additionally, if you look closely, you will see that almost all of the serious audiophile remasters released on SACD are uncompressed.  So the question I have been asking is – is there any reason to believe that DST-compressed SACDs might sound worse than uncompressed ones?

First of all, let me be clear on one thing.  The DST compression algorithm is lossless.  This means that the reconstructed waveform is bit-for-bit identical to the original uncompressed waveform.  This is not at issue here.  Nor is the notion that compressing and decompressing the bits somehow stresses them so that they don’t sound so relaxed on playback.  I don’t buy mumbo jumbo.  The real answer is both simpler than you would imagine (although technically quite complicated), and at the same time typical of an industry which has been known to upsample CD content and sell it for twice the price on a SACD disc.

To understand this, we need to take a closer look at how the DSD format works.  I have written at length about how DSD makes use of a combination of massive oversampling and noise shaping to encode a complex waveform in a 1-bit format.  In a Sigma-Delta Modulator (SDM) the quantization noise is pushed out of the audio band and up into the vast reaches of the ultrasonic bandwidth which dominates the DSD encoding space.  The audio signal only occupies the frequency space below 20kHz (to choose a number that most people will agree on).  But DSD is sampled at 2,822kHz, which means it can encode frequencies up to half of that, or 1,411kHz, so there is a vast amount of bandwidth between 20kHz and 1,411kHz available, into which the quantization noise can be swept.

One of the key attributes of a good clean audio signal is that it have low noise in the audio band.  In general, the higher quality the audio signal, the lower the noise it will exhibit.  The best microphones can capture sounds that cannot be fully encoded using 16-bit PCM.  However, 24-bit PCM can capture anything that the best microphones will put out.  Therefore if DSD is to deliver the very highest in audio performance standards it needs to be able to sustain a noise floor better than that of 16-bit audio, and approaching that of 24-bit audio.

The term “Noise Shaping” is a good one.  Because quantization noise cannot be eliminated, all you can hope to do is to take it from one frequency band where you don’t want it, and move it into another where you don’t mind it – and in the 1-bit world of DSD there is an awful lot of quantization noise.  This is the job of an SDM.  The design of the SDM determines how much noise is removed from the audio frequency band, and where it gets put.  Mathematically, DSD is capable of encoding a staggeringly low noise floor in the audio band.  Something down in the region of -180dB to -200dB has been demonstrated.  What good DSD recordings achieve is nearer to -120dB, and the difference is partly due to the fact that practical real-world SDM designs seriously underperform their theoretical capabilities.  But it also arises because better performance requires a higher-order SDM design, and beyond a certain limit high-order SDMs are simply unstable.  A workmanlike SDM would be a 5th-order design, but the best performance today is achieved with 8th or 9th order SDMs.  Higher than that, and they cannot be made to work.

So how does a higher-order SDM achieve superior performance?  The answer is that it packs more and more of the quantization noise into the upper reaches of the ultrasonic frequency space.  So a higher-performance higher-order SDM will tend to encode progressively more high-frequency noise into the bitstream.  A theoretically perfect SDM will create a Bit Stream whose high frequency content is virtually indistinguishable from full-scale white noise.

This is where DST compression comes in.  Recall that DST compression works by storing a set of numbers that enable you to reconstruct a close approximation of the original data, plus all of the differences between the reconstructed bit stream and the original bit stream.  Obviously the size of the compressed (DST-encoded) file will be governed to a large degree by how much data is needed to store the difference signal.  It turns out that the set of numbers that reconstruct the ‘close approximation’ do a relatively good job of encoding the low frequency data, but a relatively poor job of encoding the high frequency data.  Therefore, the more high frequency data is present, the more additional data will be needed to encode the difference signal.  And the larger the difference signal, the larger the compressed file will be.  In the extreme, the difference signal can be so large that you will not be able to achieve much compression at all.

This is the situation we are in with today’s technology.  We can produce the highest quality DSD signal and be unable to compress it effectively, or we can accept a reduction in quality and achieve a useful degree of (lossless) compression.

So what happens when we have a nice high-resolution DSD recording all ready to be sent to the SACD mastering plant?  What happens if the DSD content is too large to fit onto a SACD, and cannot be compressed enough so that it does?  The answer will disappoint you.  What happens is that the high quality DSD master tape is remodulated using a modest 5th-order SDM, in the process producing a new DSD version which can now be efficiently compressed using DST compression.  Most listeners agree that a 5th order SDM produces audibly inferior sound to a good 8th order SDM, but with real music recordings it is essentially impossible to inspect a DSD data file and determine unambiguously what order of SDM was used to encode it.  So it is easy enough to get away with.

How do you tell if a SACD is compressed or not?  Well, if you have the underground tools necessary, you can rip it and analyze it definitively.  For the rest of us there is no sure method except for one.  You simply add up the total duration of the music on the disc, and calculate 2,822,400 bits of data per second, per channel.  If the answer amounts to more than 4.7GB then the data must be compressed.  If it adds up to less, there is no guarantee that it won’t be DST-compressed, but the chances are pretty good that it is not.  After all, if the record company wants to compress it, they’d have to pay someone to do that, and that probably ain’t gonna happen.  The other simple guideline is that if it is multi-channel it is probably compressed, but if it is stereo it probably is not.
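That back-of-the-envelope test is easy to script.  Here is a Python sketch, using the 4.7GB figure quoted above and converting bits to bytes along the way.  The function names are mine, and the example durations are hypothetical.

```python
DSD64_BITS_PER_SECOND = 2_822_400   # per channel
DISC_CAPACITY_BYTES = 4.7e9         # the 4.7GB figure quoted in the text

def uncompressed_bytes(duration_seconds, channels):
    """Size of raw DSD64 audio for the given duration and channel count."""
    return duration_seconds * channels * DSD64_BITS_PER_SECOND // 8

def must_be_compressed(duration_seconds, channels):
    """True if the raw data could not possibly fit on the disc."""
    return uncompressed_bytes(duration_seconds, channels) > DISC_CAPACITY_BYTES

one_hour = 60 * 60
print(uncompressed_bytes(one_hour, 2), must_be_compressed(one_hour, 2))
print(uncompressed_bytes(one_hour, 6), must_be_compressed(one_hour, 6))
```

If a disc carries both a stereo and a multi-channel program, add the two sizes together before comparing against the capacity.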

Of course, none of this need apply to downloaded DSD files.  If produced by reputable studios these will have been produced using the best quality modulators they can afford, and since DST encoding is not used on commercial DSF and DFF* files this whole issue need not arise.  However, if the downloaded files are derived from a SACD (as many files are which are not distributed by the original producers), then the possibility does exist that you are receiving an inferior 5th-order remodulated version.  The take-away is that not all DSD is created equal.  Yet another thing for us to have to bear in mind!

[* Actually, the DFF file format does allow for the DSD content to be DST compressed, because this format is used by the mastering house to provide the final distribution-ready content to the SACD disc manufacturing plant.  However, for commercial use, I don’t think anybody employs DST compression.]

Today, BitPerfect 2.0.2 has been released to the App Store.  It may take up to 48hrs before it shows up in all regions.  V2.0.2 contains several minor bug fixes, plus some minor enhancements to the audio engine to improve stability.

BitPerfect 2.0.2 is a free upgrade for existing BitPerfect users.

Once again LINN Records are announcing a “24-Bits Of Christmas” promotion, where they are offering a free high-resolution download every day from December 1st through Christmas.  The doors are opening early this year, and the first track is already available.  Check it out!

I noticed a peculiar thing the other day.  At least, I thought it was peculiar at the time.  Let me tell you all about it.

One of the tricks you can easily use to fine-tune the positioning of your loudspeakers is to move your head instead.  Moving your head from side-to-side, from back-to-front, and up-and-down, you can listen to how dependent your system’s sonic balance and imaging are on listening position.  These things are not so much governed by your system as by your system’s interaction with your listening room.  This is why identical systems can sound radically different when installed in different listening rooms.

Changes in sound balance as you move your head are usually caused by the inescapable fact that sound generated by a loudspeaker driver is not uniformly distributed into the room.  Most of it propagates straight ahead out of the loudspeaker, but as you move away from the straight-ahead position the output starts to fall off.  Complicating this behaviour is the fact that as the frequency rises, so this off-centre drop-off gets progressively worse.  We refer to this phenomenon using the term ‘dispersion’, and it is a natural consequence of the fact that the loudspeaker driver is not infinitely small.  One consequence of this dispersion is that the listener’s perceived frequency balance will depend to some extent on where in the room he is located.

The perception of a good ‘holographic’ spatial image is a much more complex matter, and, if we are honest about it, is not fully understood.  The spatial image is a construct that our brains create for us, rather than a specific property of the system, and so is very much a matter that dwells within the realm of psychoacoustics.  Having said that, there are a number of things that we do know to have a positive impact on a system’s ability to generate a holographic spatial image.  Chief among those is timing coherence.  It seems that the more extreme the measures taken to improve timing coherence, the better the imaging we end up with.  The real problem arises because we cannot actually measure this ‘timing coherence’ at all.  Frankly, we can only wave our arms in attempting to define what it actually is.

The best way to rationalize timing coherence is to think of a loudspeaker.  Modern speaker design theory takes great pains to minimize cabinet resonance.  These days even the most budget-friendly designs from the better manufacturers have non-resonant cabinets that respond with a dead thud when you rap them with your knuckles, a property that was evident only on the best of high-end designs as little as 20 years ago.  A resonant cabinet will store energy and release it as sound waves a fraction of a second later.  This, after all, is what you hear when rapping a cabinet with your knuckles produces a distinctive sound.  By contrast, rapping the cabinets of my B&W 802 Diamonds produces nothing more than sore knuckles.

Understanding these concepts in loudspeakers is quite simple, but extending them to electronic components is less so.  Even so, some concepts are well understood.  Removing capacitors from the signal path is one such example.  Mechanically isolating the chassis, less so.  But if you get the chance to listen to Nordost’s Sort Füts and Sort Kones it can be very instructive.

Anyway, all this is to say that if your audio components are well designed they can generate that holographic sense of image that many of us crave from our systems.  But you still need to set the system up correctly in the listening room in order to make it happen.  This is because the sound that reaches the listening position is a composite of direct sound and a combination of different reflected sounds.  If you ever get to hear a high-end loudspeaker inside an anechoic chamber – which I recognize very few of you ever will – you would be amazed as to how awful it sounds.  It will sound so dry you’ll need to take a bottle of water in there with you.  When you come out, you’ll feel like you have cotton wool in your ears.  So it is important to recognize the dominant effect of the room interaction on how your system actually sounds.

It also explains how the concept of a ‘sweet spot’ actually arises.  There is usually only one place in your listening room where the combination of direct and reflected sound comes together to generate the optimum image.  When you set up your listening room, your challenge is to make it such that this optimum spot coincides with where you place your listening chair.  Usually, when things are close to ideal, the optimum spot will move with the loudspeakers, so if it is two feet in front of your listening chair, you can correct the situation by moving the speakers two feet forward.  Or you could just move your chair.

I have one last observation to make here, and it is quite an important one.  Think about your listening chair.  If it has a high back, then reflections off the back of the chair will tend to dominate the sound field, and you may find that regardless of where you position it, you just don’t get a good image.  In general, you should always strive to use a listening chair with a low back.  In my own listening room, therefore, I have a rather stylish Italian white leather sectional sofa with a low back that comes below my shoulder line.

When a new component comes along which makes a significant change to your system, such as my new DirectStream DAC, its contribution may be such as to require a reassessment of where that optimal listening position is located.  It is quite an easy process – or at least it should be.  Sitting in your favourite listening position, you move your head from side-to-side, then back-and-forth, and finally up-and-down, until you locate the new optimal position.  You then adjust your speaker position, and/or move your listening chair, to correct for the offset.

It should be easy, but in my case it has proven not to be so.  You see, regardless of the adjustments I make, the optimum position is always about 10-12 inches higher than where I am sitting.  I have come to realize that the culprit is my much-loved sofa.  Even though its back doesn’t even come up to my shoulders, it appears that it still manages to contribute significant reflections up from its seat cushions.  Also, as I sit on it with my palms lightly touching the seat cushions, I can plainly pick up vibrations from the leather surface.  These are not at all evident if I instead place my hands on fabric cushions.  Right now I have co-opted a pair of seat cushions from another of my sofas to raise my listening position by about 10 inches.  It will do for some listening tests, but of course I now have no back support whatsoever. I have a bad back, so that is not the basis of a long-term solution.

So is my problem down to reflections from the leather surfaces, or re-radiation from the vibrating surface?  I am working on the notion of the former, because reflections tend to disrupt imaging, whereas vibrations tend to disrupt tonal neutrality, and in any case are surely too heavily damped.  For reasons of practicality (and in the interests of sustaining a 36-year marriage which is worth more than my stereo) the sofa needs to stay.  I am contemplating a solution to damp the source of these reflections by judicious placement of an absorptive panel on the ceiling above the sofa.  Last year I placed one on the ceiling above and between my speakers to great effect, so I am thinking along the lines of something similar.

Meanwhile, I plan to experiment with covering the sofa’s leather surfaces with some absorptive material just to see what that does.  Such are the joys of the high end.  Your system and your room are like two top drivers on the same Formula One team.  Getting them to cooperate can be a challenge.

BitPerfect user Stefan Leckel has come up with a useful solution to the Yosemite Console Log problem.  In case you are unaware, under Yosemite, when you use BitPerfect, iTunes fills the Console Log with a stream of entries – several per second – which rapidly fills the Console Log to capacity.  At that point, the oldest messages are deleted.  In effect, this renders the Console Log pretty useless as a diagnostic tool.

Stefan’s ingenious solution is a simple script file which, in effect, sets up the Console App so that it ignores these specific messages.  However, because the script works at the system level, using it requires a level of comfort with working on OS X using tools that are capable of wreaking havoc, although hopefully the instructions below are easy enough for most people to use with a degree of comfort.  As with anything that involves tinkering at the system level, YOU USE THIS TOOL ENTIRELY AT YOUR OWN RISK, AND WITH NO EXPRESS OR IMPLIED WARRANTY.  If in doubt, channel Nancy Reagan, and “Just Say No”:)

First, you need to download a special script file, which you can do by clicking here.  It doesn’t matter where you place this file.  Your downloads folder would do fine.  If you are concerned about the authenticity of this file, or what it might be doing to your Mac, the contents are reproduced below for you to inspect and compare.

To use the script file, you need to first open a Terminal window.  Inside the terminal window type the following: “sudo bash ” – don’t type the quote marks, and be sure to leave a space after the bash – and DON’T press the ENTER key.  Next, drag and drop the file that you just downloaded into the Terminal window.  This action will complete the “sudo bash ” line with the full path of the file.  Now you can press ENTER.  You will be prompted to enter your system password.  Enter it (nothing will show in the Terminal as you type), and hit ENTER.

That’s it.  The Console Log should now work fine.  If you want to reset it back to how it was, just re-run the same sequence.  The same command is designed to toggle the modification on and off.
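Since the same command toggles the modification on and off, it can be handy to check which state you are in before re-running it.  What follows is only a minimal sketch, not part of Stefan’s script: the function name asl_rule_state is my own invention, and it assumes that the ignore rule always contains the text “[= Sender BitPerfect]”, as it does in the listing reproduced below.  It only reads the file, so it does not need sudo.

```shell
# Hypothetical helper (not part of Stefan's script): reports whether the
# BitPerfect ignore rule is currently present in an ASL configuration file.
# Usage: asl_rule_state [path]    (the path defaults to /etc/asl.conf)
asl_rule_state() {
    local conf="${1:-/etc/asl.conf}"
    # -q: print nothing, just set the exit status; -F: match a fixed string.
    # Assumption: the rule always contains this exact sender clause.
    if grep -qF "[= Sender BitPerfect]" "$conf" 2>/dev/null; then
        echo "on"
    else
        echo "off"
    fi
}
```

Paste the function into a Terminal window and then type asl_rule_state.  It prints “on” if the modification is currently in place, and “off” if it is not (or if the file cannot be read).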

Thank you Stefan!

Below, for reference, I have reproduced the content of the script in full:

# !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
# use at your own risk, no warranty
# !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
# checks if asl.conf is already modified
set -x
cat /etc/asl.conf|grep -F "? [= Facility] [= Sender BitPerfect] [S= Message] ignore" > /dev/null

if [ $? -eq 0 ]
then
    echo "removing bitperfect modification from /etc/asl.conf file"
    cat /etc/asl.conf|grep -v -F "? [= Facility] [= Sender BitPerfect] [S= Message] ignore" > /etc/asl.bitperfect
else
    echo "adding bitperfect modifications to /etc/asl.conf file"
    echo "? [= Facility] [= Sender BitPerfect] [S= Message] ignore" > /etc/asl.bitperfect
    cat /etc/asl.conf >> /etc/asl.bitperfect
fi

echo "backup /etc/asl.conf to /etc/asl.conf.bitperfect"
cp /etc/asl.conf /etc/asl.conf.bitperfect

echo "activating new config"
mv /etc/asl.bitperfect /etc/asl.conf

echo "restarting syslogd daemon"
killall syslogd

echo "done."
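If you would like to see what the script is going to do to your configuration before letting it loose, the same toggle logic can be run as a dry run.  The sketch below is my own illustration rather than anything Stefan supplied: the function name asl_toggle_preview is hypothetical, and the rule text is copied exactly as it appears in the listing above, so it is an assumption that it matches the rule in the file you actually downloaded.  It writes nothing to disk and restarts nothing.

```shell
# Hypothetical dry-run helper (not part of Stefan's script): prints what the
# toggled configuration would look like, without modifying any file.
# Usage: asl_toggle_preview [path]    (the path defaults to /etc/asl.conf)
asl_toggle_preview() {
    local conf="${1:-/etc/asl.conf}"
    # Assumption: rule text copied verbatim from the listing in this post.
    local rule='? [= Facility] [= Sender BitPerfect] [S= Message] ignore'
    if grep -qF "$rule" "$conf"; then
        # Rule already present: show the file with the rule removed.
        grep -vF "$rule" "$conf"
    else
        # Rule absent: show the rule placed ahead of the existing contents.
        printf '%s\n' "$rule"
        cat "$conf"
    fi
}
```

Typing asl_toggle_preview | diff /etc/asl.conf - will then show only the line that would be added or removed.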

Roger has been somewhat shunted unceremoniously to one side in the modern world.  We seem to have forgotten why he was ever there in the first place, and the important role he used to play.  Without Roger, our world today is a less friendly place, one in which misunderstandings are easy to come by.  Personally, I miss him, but then again I suppose I am just another old fart.

In the early days of person-to-person radio communications, Roger played a critically important role.  If you are flying an aeroplane, and you want to announce to the control tower that you’re commencing your takeoff roll, you want to be sure that the control tower is aware of that, otherwise all sorts of unpredictable outcomes could potentially result, some of them dire.  That’s where Roger comes in.  The control tower responds “Roger” and now you know your message has been received and, by extension, that the control tower knows you are rolling.  It is part of what we today recognize as a handshaking protocol, something that ensures the effectiveness of a two-way communication.  Handshaking is a tool to ensure that a message has been received, that it has been understood, and that both parties know either who is expected to speak next, or that they are agreed that the conversation is over.

When speaking to someone face-to-face, or over the telephone, there are implied cues to which we tend to adhere in order to provide this handshaking element.  These can be turns of phrase, vocal inflections, gestures, and the like.  They often vary among cultures.  How we communicate with a person has important ramifications as to how the other person perceives us, and how we in turn perceive the other person.  We may perceive that person to be brusque, friendly, rude, gregarious, or to have any of a number of attributes.  If, as a person, your inter-personal communications cause others in the world to perceive you wrongly, it is well-understood that you could have problems in your life.

Generally, it is important in our day-to-day inter-personal communications that we understand how the subtext of our communication is being received.  If you ask someone if they want to have a beer with you after work, there is a world of difference between “No” and “Gee, I’m sorry, but my daughter has soccer practice”.  Most of us, when we speak with someone face-to-face or on the telephone, understand the subtext, even as we recognize that the understanding itself is sometimes in error.

Roger’s absence first became a problem with the widespread introduction of e-mail into mostly business correspondence.  If I send an e-mail inviting a colleague out for dinner when I’m in town next week, many people will find it acceptable to reply “No” in an e-mail when they really mean “Gee, I’m sorry, but I’m out of town that day”, even when they would never dream of responding with the terse “No” in a face-to-face situation.  It is part of a complex issue, one on which I don’t propose to write a treatise, but a major contributory factor is that, for most of us, it takes far longer to compose an e-mail message that properly encapsulates the subtext with which we wish to endow our response, and often we just don’t feel we have that time.  Personally, I find that excuse to be a lazy one, and if not, then disrespectful towards the recipient.

In today’s world, for many people the text message has replaced the e-mail, particularly for one-on-one conversations.  Partly by their nature, and partly due to the hardware typically used to send them, text messages tend to be terse by default.  Additionally, text message conversations tend to replace telephone conversations for many people.  They want to multi-task.  They will fit your text conversation in as they find time during the course of their day.  And so will you.  Consequently, the alternative of a phone call takes on something of the aura of an intrusion.  Which is rather frustrating, since, in the grand scheme of things, a phone conversation is always many, many times more effective.  But that too is another discussion.

This is where Roger comes in.  The lingua franca of texting is the curt message.  I don’t know about you, but I really feel the need to know that a message has been received and understood.  If I send a text that says something to the effect of “I need to see your report by the end of the day”, I feel unhappy if I don’t get a response.  It’s like if I said the same thing to that person in my office, and he just walked out without responding.  It is very clear to me how I would interpret such an action.  And shortly thereafter it would be equally clear to the other person.  What is missing is Roger.  All you need is a text response that says “Will do”.  Or even “K”, if that’s your thing.  Sadly, and frustratingly, I am finding that Roger is very much the exception rather than the rule these days.

I said earlier that inter-personal “handshaking” protocols are to a large degree cultural, and maybe that’s what’s going on here.  A texting culture is arising – or has arisen – in which subtext is no longer being conveyed at all – even emoticons, as best as I can tell, are en route to being passé.  If so, that is a counter-productive development.  I do converse with a few people routinely whose preferred mode of communication is the text message, and have done for some time.  I still find it a frustrating medium, and mainly because I dearly miss Roger, which I still give, but rarely receive.

It has been several months now since I concluded my review of the PS Audio DirectStream DAC, and a pretty positive review it was.  Since that time the unit has continued to very gradually break in, and there have also been a couple of firmware updates (all of which are available on PS Audio’s web site at no charge), each an improvement, but not sufficient to justify adding substantially to the gist of my review.  But recently there has been a major firmware update – version 1.2.1 is its designation – which is a sufficient game-changer that it warrants a whole update of its own.

So why should something as perfunctory as firmware change the sound of a DAC?  Normally when we think of firmware updates we think of functionality rather than performance.  And indeed there are functionality issues which are addressed here – the DirectStream now fully supports 24/352.8 PCM, which it did not do with the original firmware.  But in a DAC in particular, a large part of what it does performance-wise lies in the processing of the incoming data in the digital domain, and those processes are often under the control of the firmware.  Particularly in the DirectStream, where all that processing happens on in-house PS Audio firmware rather than within the proprietary workings of a third-party chipset.  What goes on under the aegis of its firmware is to a large degree the heart and soul of what the DirectStream is all about.

I have communicated at length with Ted Smith, designer of the DirectStream, about the nature and effect of the changes he has made.  I’m not sure how open those discussions were intended to be, and so I will not share them in detail with you, but there are two areas in which his attention has been primarily focussed.  The first is on the digital filters, and how their optimal implementation is found to affect jitter, something which initially surprised me.  The second is on the Delta-Sigma Modulators which generate the single-bit output stage, always an area ripe for improvement, in which Ted has reined in the attack dogs which stand guard to protect against the onslaught of instabilities.  Together, the effect of these significant updates has been transformative, and that is not a word I use lightly.

The simple description of the sound of the 1.2.1 firmware update is that it has opened up the sound.  Everything has more space and air around it.  Sonic textures have acquired a more tactile quality.  The music just communicates more freely.  It would be easy to sit back and characterize the sound as more “Analog-” or “Tube-” like.  These are words the audiophile community likes to use as currency for sound that is quite simply easy to listen to.  It is interesting that we audiophiles admire and value attributes such as sonic transparency, detailed resolving power, and dynamic response, and yet how often is it that when we are able to bring them together the result is painfully unlistenable?  It is these Yin and Yang elements that are foremost in my mind as I listen to the 1.2.1 version of the DirectStream.

So, without further ado, what am I listening to, and how is it sounding?

First up is “Bending New Corners” by the French jazz trumpeter Erik Truffaz.  This is a curious infusion of early-‘70s Miles Davis, ambient groove jazz, and trip-hop, which brings to mind the sort of music that might have deafened you in the über-trendy restaurant scene of the 1990s.  I first heard it on LP at the Montreal high-end dealer Coup de Foudre, and today I’m playing the CD version.  The mix is a relatively simple one involving trumpet, bass, keyboards and drums, plus the occasional vocal stylings of a rapper called ‘Nya’.  The music is set in an atmospheric ambience, and is quite simple in its sonic palette, but nevertheless I have always had trouble separating out the individual instruments.  I was keen to know what the additional resolving power of the 1.2.1 DS would make of it.

What the additional clarity brought was the realization that I have been hearing the limits of this recording all along.  The trumpet has a very rich spectrum of harmonics which overlay most of the audio spectrum, and when it plays as a prominent solo instrument those harmonics can intermodulate with the sounds of many other instruments, making it difficult to hear through the trumpet and follow the rest of the mix.  If the intermodulation is baked into the recording, then no degree of fidelity in the playback chain is going to solve that problem.  This is what I am plainly hearing with the 1.2.1 DS.  This recording, far from being a clean and atmospheric gem waiting for an extraordinary DAC to liberate its charms, is a bit of a digital creation.  The extraordinary DAC instead reveals its ambience as a digital artifact.  The lead trumpet and vocals can be heard to have a processed presence about them.

Once you have heard something, you can never “un-hear” it again.  It’s a bit like skiing, in that once you’ve mastered it, it becomes impossible to ski like you did when you were still learning.  At best, all you’ll manage is a caricature of a person skiing like a novice.  I can now go back to the CD of “Bending New Corners” on a lesser system and will recognize its flaws for what they are, even though previously I would have interpreted what I was hearing differently.

My experience with Bending New Corners was to be repeated many times.  As I type, I am listening to Ravel’s Bolero with the Minnesota Orchestra conducted by Eiji Oue on Reference Recordings, ripped from CD.  It begins with a pianissimo snare drum some 20 feet behind the speakers and slightly to the right of center.  This recording has always been one of which I have thought highly.  The solo pianissimo snare is a good test for system imaging.  However, I now hear the snare as living in a slightly smeared space.  I perceive its sonic texture differently – more plausibly accurate if you will (a layer of sonic mush hovering around the instrument itself has evaporated away like the early mist on a spring morning) – but I somehow cannot place the image more accurately than a few feet.  I surmise that, because my brain is more confident that it is hearing the sound of a pianissimo snare drum, it therefore also expects to hear that sound more accurately localized in space.  But it is unable to do that.  As a consequence, although I never previously thought that the stereo image was wanting, I now appreciate that in fact it is, and I wonder how a higher-resolution version of this recording would compare.

Here is a song my wife likes.  It is “Hollow Talk” from the CD “This is for the White in your Eyes” by the Danish band Choir of Young Believers.  My wife had me track it down because it is the theme tune on a Danish/Swedish TV show we have been watching on Netflix called The Bridge (Bron/Broen).  It is another example of how the DS 1.2.1 can render a studio’s clumsy machinations clearly manifest.  The echo applied to the vocal adds atmospherics but is just unnatural.  As the track proceeds, the production gets layered on and layered on – and then layered on some more.  The effect is all very nice when heard on TV, but on my reference system driven by the DS 1.2.1 it just calls out for a lighter touch.  For example, at the beginning I heard a faint sound off to the left like someone getting into or out of a car and closing the door.  I don’t see why they wanted to include that – I can’t imagine it is particularly audible unless you have a highly resolving system such as a DS 1.2.1, one which makes clear the dog’s breakfast nature of the recording.

Next up is another old favourite of mine, “Unorthodox Behaviour” by 1970s fusion combo Brand X.  I saw the band live at Ronnie Scott’s club in London back in 1975 (or thereabouts) and bought the album on LP as soon as it came out.  Today, I’m playing a 24/96 needle-drop.  I just love the opening track, Nuclear Burn.  Percy Jones’ bass lick is original and memorable, and extremely demanding of technique.  DirectStream 1.2.1 lets me hear the bass line more clearly than I have ever heard it before.  I had always thought it to have a slightly muddy texture – not surprising, given that playing it would tie most people’s fingers into inextricable knots – but now I hear just how extraordinarily skilled Jones’ bass chops really were.  And below it, Phil Collins’ kick drum has acquired real weight.  Not that it sounds any louder, or deeper.  It is more like the pedal mechanism has had an extra 5lb of lead affixed to it.

Now to a lonely corner of your local music store, where the Jazz, Folk, and Country aisles peter out.  This is where you’ll find Bill Frisell’s 2000 CD “Ghost Town”, a finely recorded ensemble of mostly acoustic guitar and banjo music with Frisell playing all the instruments.  Despite the album’s soulful and contemplative mood, due at least in part to the sparse arrangements and absence of a drum track, I keep expecting it to break out suddenly into ‘Duelling Banjos’.  The track list comprises mostly Frisell original compositions together with a handful of well-chosen covers.  Apart from enjoying the music, the idea here is to play Spot The Guitar.  On a rudimentary system this involves telling which are the guitars and which the banjos.  As the system gets better, you start to be able to tell how many different models of each instrument are being played.  With the DS 1.2.1 I suspect you could go further and identify the brands (Klein, Anderson, Martin, etc.).  Me, I’m not a guitar head, and can’t do that (although, back in the day, I used to be able to reliably tell a Strat from a Les Paul, even on the most rudimentary systems), but I do hear the different tonalities and sonorities very clearly.

Gil Scott-Heron is credited in some circles as being the father of rap.  He was a soulful yet extremely cerebral poet-musician with a strong sense of a social message.  His 1994 CD “Spirits” was a bit of a swan song, and contains a track “Work for Peace” which is a political rant against the ‘military and the monetary’, who ‘get together whenever it’s necessary’.  I kinda like it – it is, I imagine, great doper music … yeah, man.  But the mostly spoken voice is very soulfully and plausibly captured.  You can imagine the man himself, in the room with you.  I would just love to hear the original master tape transferred to DSD.

“I Remember Miles” is a 1998 CD from Shirley Horn.  It’s a terrific recording, and won the Grammy for Best Jazz Vocal Performance.  But really, it is an all-round wonderful album.  And the standout track is an absolute classic 10-minute workout of Gershwin’s “My Man’s Gone Now” from Porgy and Bess.  It begins with Ron Carter’s stunning, ripely textured, ostinato-like bass riff which underpins the track.  It has always sounded to me like two basses – one electric and one acoustic – but with the latest DS 1.2.1 the electric bass tones now sound more and more like an expertly played and finely recorded acoustic bass, and in addition I’m beginning to think there’s just the one bass – perhaps even double-tracked.  I’d love to know what you think.  Aside from the tasty bass, the rest of the recording is revealed to have a smooth but slightly congested, slightly coloured sound, a bit like what I hear when I listen to SETs played through horn speakers (I know, I know, heresy.  Kill me now.).  The immediacy and sheer presence of a fine DSD recording is just not there.  Unfortunately, this has not been released on SACD either.  Perhaps a DSD remaster will finally put the bass conundrum to bed?

Which brings me to the nub of this review.  Finally, the DirectStream is delivering on its huge promise as a DSD reference.  With the 1.2.1 firmware, it is opening up a clear gap between its performance with DSD and PCM source material, along the exact same lines as my previous experience with the Light Harmonic Da Vinci Dual DAC.  The DSD playback just adds that extra edge of organic reality to the sound.  It just sounds that little bit more like the actual performer in the room with you.  Sure, CD sounds great on it – probably as good as I’ve ever heard it sound – but the DS 1.2.1 consistently shows CD at its limits.  Great sound requires more than CD can deliver across the board, and in my view the DS 1.2.1 – through its excellent performance – makes this about as clear as it’s ever going to be.

In Part II of my review I mentioned the CD of Acoustic Live by Nils Lofgren.  I recently came across a SACD of music from the TV series “The Sopranos”, and it contains “Black Books” from the Lofgren album.  The CD is a pretty special recording, but the DSD ripped from the SACD just blows it clean out of the water, if you can imagine such a thing.  The vocal has incredible in-the-room-in-front-of-you presence.  All of the acoustics, which were already pretty open, really open up.  The pair of tom-toms I mentioned take on individual tonality, texture, and weight.  And the guitar work, which I previously characterized as being ‘aggressively picked’ comes across with a much more natural and plausible sound.  You just cannot go back to the CD and hear it the same way.  DAMN!  Someone needs to release this whole album on SACD, and preferably as a DSD download.

Another great SACD is MFSL’s remastering of Stevie Ray Vaughan’s “Couldn’t Stand The Weather”, with its perennial audiophile favourite “Tin Pan Alley”.  Beginning with a solid kick drum thwack, it launches into a cool, laid-back, 12-bar blues.  Vaughan’s guitar has just the right combination of restraint and blistering finger work, and his vocal is very present and stable, just to the left of centre.  The rhythm section lays down a fine metronomic beat, playing the appropriate foundational role upon which SRV builds his performance.  By contrast, in their uncomplicated take on Hendrix’s “Voodoo Chile”, the drums are given full rein to pound out a tight and impactful rhythm, and SRV gives his guitar hero chops a good airing.  If you’re unfamiliar with SRV and want to know what the man was about, this would be the place to start.  It is a fantastic recording, and one that has been expertly transferred to SACD.

The Japanese Universal Music Group has remastered and released many classic albums in their SHM-SACD series, all of which are both hard to come by outside of Japan, and ruinously expensive.  Their work on Dire Straits’ “Brothers in Arms” is interesting.  To the best of my knowledge the original recording was on 20-bit 44.1kHz digital tape (but there are people around that know more than me about those things).  Anyway, the fact is that there is no obvious reason why a remastered SACD should sound significantly better than the original CD, unless, of course, the latter was not well mastered.  However, the conventional wisdom is that Mark Knopfler was particularly anal about the recording and mastering quality, and so maybe that argument doesn’t hold water.  Additionally, the Universal SHM-SACD can be compared with a contemporary remastering by MFSL, and both can be compared to the original CD.

Right away, both SACDs come across as superior to the CD in all the important ways.  The title track, Brothers in Arms, is one of my all-time go-to tracks.  On both remasterings, with the DS 1.2.1 the vocal has that signature SACD presence, and Knopfler’s guitar work sounds more organic, more like a real instrument in the room with you – just like with the Nils Lofgren.  I puzzled over how and why two SACD remasters from impeccable digital sources could sound different.  But they do, and maybe someone could enlighten me about that.  The two remasters sound almost stereotypical (there’s gotta be a pun in there somewhere) of how we think of Japanese and American musical tastes.  The Japanese SHM-SACD is massively detailed, but with slightly flat tonal and spatial perspectives compared to the American MFSL.  The latter’s tonal bloom fills the acoustic space in a more immediately appealing manner, but at the apparent cost of some of that delicious detail.  If one is right, then the other must be wrong, so they say.  You pays your money, and you takes your choice.  But the bottom-line is that with a DAC of the resolving power of the DS 1.2.1 considerations such as these are going to weigh more heavily than might otherwise be the case.

So there you have it.  The 1.2.1 firmware update will transform your DirectStream from a great product into a game-changing product.  I concluded my last review by comparing the DirectStream, with its original firmware, to my all-time reference, the Light Harmonic Da Vinci Dual DAC.  I felt, based on my aural memory (since I no longer have the Da Vinci to hand), that the DirectStream was not quite up to the latter’s lofty standards.  With the 1.2.1 firmware I am no longer so sure about that.  I would need to have both DACs side-by-side in order to be certain.  But this time around my aural memory tells me that the DirectStream in its 1.2.1 incarnation could very well give the Da Vinci a good run for its money.  And in some areas, such as its bass performance, I even wonder if the DirectStream might not come out on top.  Let’s bear in mind the price difference – $6k vs $31k.  That’s an extraordinary achievement.