BitPerfect user Eugene Vivino writes: “As you may be aware, PS Audio is releasing the DirectStream – a product that upscales all signals to 2xDSD. They say they do this because all PCM processors mask the sound in some way, while DSD outputs a more realistic and believable signal. And reviewers say that it sounds ‘right’. Now that BitPerfect supports DSD, would it be possible to add real time DSD conversion?”
Actually, we will soon be taking delivery of a DirectStream. We are getting one of the first batch of production units. We continue to hear great things about this product, from both public and private sources, and as the hype has been quite intentionally built up I feel justified in expecting great things of it. But does DirectStream’s contribution to the state-of-the-art derive from the fact that it upscales to 2xDSD? I have received both private and public communications from Paul McGowan and his team on this subject, and I have no intention of commenting on any aspect of any information communicated privately. But I am prepared to comment on some of the publicly-acknowledged elements of DirectStream’s design.
One of the defining aspects of the DirectStream design lies in the fact that it does not use anything that you would describe as a conventional DAC chip. So I need to recap briefly on what a ‘conventional DAC chip’ actually does. What these devices do is convert the incoming PCM signal to something that is often described as DSD, primarily for the entirely laudable motive of avoiding customer confusion. The description is erroneous because, strictly speaking, DSD is 1-bit 2.8224MHz, and nothing else. To create DSD it is necessary to pass an incoming signal – whether analog or digital – through a thing called a Sigma-Delta Modulator. These SDMs can be configured to produce all sorts of different outputs, and 1-bit 2.8224MHz (DSD) is just one possibility. More specifically, SDMs can produce a multi-bit output as well, with as many bits as you like. It turns out that there is at least one seriously good reason to use an SDM with a multi-bit output. This is that SDMs can be unstable, and, in particular, a 1-bit SDM can be seriously unstable. SDM design is not for the faint of heart. A typical commercial DAC chip will take the incoming PCM and pass it through an SDM to create a multi-bit, even higher sample rate, version of “DSD”, which it then converts to analog.
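To make the idea of a Sigma-Delta Modulator a little more concrete, here is a minimal sketch of the simplest possible one: a first-order, 1-bit modulator. It is a textbook illustration only, and it is emphatically not the kind of SDM found in a commercial DAC chip or in DirectStream, which are higher-order designs of far greater sophistication. The function names and the choice of 64 x 44.1kHz (2.8224MHz, the standard DSD rate) are my own assumptions for the example.

```python
import math

DSD_RATE_HZ = 64 * 44_100  # 2,822,400 Hz, the standard 1-bit DSD rate


def sigma_delta_1bit(samples):
    """First-order, 1-bit sigma-delta modulator (illustrative only).

    Takes floats in the range [-1.0, 1.0] and returns a list of 0/1 bits.
    """
    integrator = 0.0
    feedback = 0.0  # analog value of the previous output bit (0.0 before the first bit)
    bits = []
    for x in samples:
        # Accumulate the error between the input and what was last output.
        integrator += x - feedback
        # The 1-bit quantizer: only the sign of the accumulated error is kept.
        bit = 1 if integrator >= 0.0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def mean_level(bits):
    """Average of the bitstream with 1 read as +1.0 and 0 read as -1.0."""
    return sum(1.0 if b else -1.0 for b in bits) / len(bits)


def sine_wave(freq_hz, n_samples, amplitude=0.5, rate_hz=DSD_RATE_HZ):
    """Helper for experiments: a sine wave sampled at the 1-bit rate."""
    return [
        amplitude * math.sin(2.0 * math.pi * freq_hz * n / rate_hz)
        for n in range(n_samples)
    ]
```

Feeding a constant level such as 0.5 into `sigma_delta_1bit` yields a stream whose density of ones tracks that level, which `mean_level` recovers. A multi-bit SDM would replace the sign test with a quantizer having more than two output levels.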
What Paul’s team have done is to focus on one of the fundamental benefits of single-bit SDM-based DACs. With a multi-bit output – such as PCM itself – the DAC has to create an output voltage which has a magnitude determined by the bit pattern. The more bits, the more different possible output levels there are. With 1-bit, there are only two levels – encoded as 1 and 0 – and from an electrical perspective, all a DAC has to do is switch its output between two fixed voltage sources represented by those numbers. These voltage sources can be, for example, +1V and -1V, and can be controlled and regulated with fantastic precision, and with extremely low noise. The job of the DAC is then simply to switch the output signal line between one voltage source and the other. This is something you don’t need a chip to do, and is furthermore something to which you can apply a lifetime of audio electronics circuit design experience in order to realize it in the best possible manner.
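The two-level switching described above can be sketched in a few lines. This is only a numerical model under simple assumptions: the +1V and -1V levels are the example values from the text, and the moving average is a crude stand-in for the analog low-pass filtering a real DAC would use. The function names are hypothetical.

```python
def one_bit_dac(bits, v_high=1.0, v_low=-1.0):
    """Switch the output between two fixed voltage sources, one per bit value."""
    return [v_high if b else v_low for b in bits]


def moving_average(values, window):
    """Crude low-pass filter: the mean of each run of `window` consecutive values."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        if i >= window - 1:
            out.append(running / window)
    return out
```

For instance, a bitstream that repeats the pattern 1, 1, 1, 0 spends three quarters of its time at +1V and one quarter at -1V, so averaging over each group of four bits gives a steady 0.5V.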
Of course, DirectStream has some other quite unique elements to its design, such as its approach to jitter rejection. But all that aside, the only thing that counts is what it sounds like, and I am as intrigued as the next guy on that score, and quite impatient to boot!
So when Eugene Vivino observes that DirectStream upscales all incoming PCM to 2xDSD, the truth is that virtually all DACs do something very similar, except that it isn’t necessarily 2xDSD. And there is nothing particularly special about 2xDSD, other than that it is a bit better than 1xDSD. Unless you are using what is termed a ladder DAC, a vanishing beast with only a few examples still in production, the process of Digital-to-Analog conversion fundamentally involves converting the PCM data to one of the “DSD-like” SDM-produced formats. This in and of itself is unlikely to be the source of any differentiating performance attributes that might emerge from DirectStream.
Now, as I mentioned earlier, SDM design is not for the faint of heart. It is based on some massively complicated signal-processing mathematics. A breakthrough in SDM design will often earn you a PhD. There is also a lot of untapped potential in SDM design waiting patiently in the wings for computing power – always inexorably moving forward – to reach a level necessary for these process-intensive SDM designs to be brought into real-world implementation.
Yes, BitPerfect could implement PCM-to-DSD conversion on-the-fly to enable your PCM content to be delivered to your DAC in DSD form. But simply converting PCM to DSD is no panacea. It will not suddenly release trapped information in your PCM data stream. The only reason you would want to do that is if BitPerfect’s PCM-to-DSD conversion algorithm would produce a significantly better result than the equivalent functionality inside your DAC’s DAC chip. Make no mistake about it, these chips contain algorithms that were designed by teams of dedicated mathematicians who know their stuff, and that are implemented on silicon designed expressly for signal processing. So, at BitPerfect, we do not have an SDM of that quality, and with all respect to the other fine player Apps out there, some of which offer this capability, none of them do either.
The SDM technology we are working on requires seriously impressive amounts of processing power, so when we reach the point where we are able to implement it, it will not be introduced in a real-time system, but likely as an option in a forthcoming version of DSD Master.
They say “seeing is believing”, but nobody ever proposes that “hearing is believing”. Yet how our brains make sense of what we see, and how our brains make sense of what we hear, seem to be accomplished more or less the same way. In each case, the brain starts by trying to figure out what it might be seeing/hearing, and tries to correlate what it actually sees/hears with its pre-determined idea. If the correlation is good, then our brains are able to conclude convincingly what it is that we are seeing/hearing. So, in order to see or hear something accurately, we don’t so much have to actually see it, or actually hear it. Rather what we need is an ensemble of evidence that allows our brains to make the necessary correlation in order for us to feel confident that we know what we are looking at or listening to.
There are many examples of this in action. The most obvious ones are ambiguous images, such as the one that is either a black table lamp against a white background or the white silhouettes of two faces looking at each other against a black background. I’m sure you can think of many other examples. When we look at such a picture, we cannot see both interpretations simultaneously. When we consider it to be a table lamp, we don’t perceive the faces. And when we see the faces, we don’t perceive the table lamp. For us to switch the way we see the image, we must consciously switch from one mode to the other. The more complex the image, the more effort is needed to switch our perception from one perspective to the other.
Over the last couple of weeks this has been illustrated to me in an interesting way. During this time, the hour I have been waking up in the morning has coincided with the time the sun rises, and starts to illuminate my bedroom. I have quite heavy curtains that do a pretty good job of keeping the sun out. But nonetheless the sun works its way past the cracks. In doing so, it starts to illuminate the ceiling above my bed, throwing shadows of my curtain rail as it does so, making a pattern of dark lines across the ceiling. My ceiling is a flat white, and has an all-white 5-blade ceiling fan in the middle. At this time of year the fan is turned off. The angle of the fan blades is such that some of the blade faces are in the shadow of the encroaching sunlight, whereas one in particular faces it directly.
OK, you get the idea. Now, as the sun gradually rises, and illumination level rises, the fan blade which faces the sun just happens to take on the exact same shade of white as the ceiling behind it, and I cannot tell the two surfaces apart. I know exactly where that blade should be, because I can see the rest of the fan outlined quite clearly by its shadows. But over the course of about ten minutes, as the light gradually improves, what I see on my ceiling is a five-bladed fan, with one blade clearly missing. It is just invisible, as though it had been removed. My eyes do not detect the difference in tone of the white fan blade and the white ceiling behind it, so my brain tries to interpret the image as best as it can, and the best correlation it can come up with is the one with the missing fan blade. I know it is there, but try as I might I cannot perceive any indication of the presence of the apparently missing blade.
As the sun continues to rise, and the room gets brighter, eventually the optical illusion is replaced with the reality of the five-bladed fan, but at the point of transition an interesting thing happens. At a certain moment, if I concentrate hard enough I can see the emergence of some contrast around the edges of the “missing” fan blade. My brain can lock onto that and suddenly can “see” all five blades. However, like the aforementioned optical illusions, with some effort I can switch between perception modes. The fan can have either four blades or five.
This is where it gets interesting, though. I mentioned at the start that the curtain rail casts a shadow across the ceiling comprising a number of thin dark lines. Bear in mind that this is still a low-light situation, so the thin shadow lines are only just visible, but visible nonetheless. One of those shadow lines happens to run right through the disappearing fan blade. So when my brain sees the five-bladed fan, it sees that the shadow runs across the ceiling behind the fan blade but not across the fan blade itself. In other words, my brain recognizes that the fan blade obscures the shadow, so I see a shadow line interrupted by the 6 inches or so of the fan blade. Am I making myself clear? Good.
So what happens when I switch my perception mode to that of the four-bladed fan? In that case, my brain’s model has constructed no blade to obscure the shadow line, and so it expects to see the shadow line pass uninterrupted across the portion of the ceiling no longer obscured by the apparently missing fan blade. In short, when I visualize the five-bladed version of the fan, I see the shadow line interrupted by the fan, but when I visualize the four-bladed fan, I see the continuous shadow line. This is not a sub-conscious artifact – I can consciously switch my perception between the two versions of the apparent optical illusion. I am fully aware of the apparent contradiction which is that something which is quite obviously visible in one version becomes equally obviously invisible in the other version. Seeing is believing, indeed.
This holds some lessons for us in interpreting how we hear, if we are willing to accept this model of cognitive perception. It tells us that in order to be satisfied that we are hearing a certain thing, it is not sufficient to determine whether that thing is in and of itself audible. What we need is for that thing to make sense in the light of everything else that the brain is hearing at the same time. And I should say perceiving, rather than hearing.
As an example, humans have an uncanny ability to locate sounds in three dimensions. Not only can we locate something from left to right, but we can also locate it in terms of distance, and also in terms of height. How we are able to do this remains a topic of active research. The simple picture of how we hear is that we have two ears, and that our brains can infer the direction from which a sound is coming based on the time delay between the arrival of those sounds at our two ears. But this does not explain, for example, how we can perceive any vertical component to the localization.
This area of research has also shown up some remarkably interesting results. In the right conditions, test subjects can reliably differentiate between sounds originating from two points in space which are remarkably close together. When you crunch the numbers, the time difference between the arrival of the signals from those two locations is absolutely minuscule – of the order of 10 microseconds. This generates some tough problems, since in order to achieve that degree of temporal resolution, it is more or less a requirement that whatever is doing the detecting must have a bandwidth of around 250kHz. Considering that our hearing has a measurable upper limit of the order of 20kHz, this is problematic, and no theories yet exist to physically account for it.
However, some interesting observations do suggest that humans can perceive audio signals at frequencies considerably higher than 20kHz. By connecting a person up to a brain scanning device of some type, researchers have shown that the human brain can show a measurable response to audio signals at frequencies as high as 45kHz, even though the subject reports that they don’t hear a thing.
So if we base our models of audio reproduction theory solely upon simple stereo signals with a bandwidth of 20kHz, we may find ourselves unable to account for everything that practical experience throws at us. There are new things that we need to learn, but the fact that we don’t know what they are is not an excuse for assuming they don’t exist.
Most of you will be very familiar with Moore’s Law, formulated by Gordon E. Moore, a co-founder of Intel, way back in 1965. Imagine, if you can, the state of electronics components technology back then. Integrated circuits were in their infancy, and indeed few people today would look at the first-ever 1961 Fairchild IC and recognize it as such. This was the state-of-the-art when Moore formulated his law which states that the number of transistors in an IC would double every two years. Considering the infancy of the industry at the time Moore made his prediction, it is astonishing that his law continues to hold today. In 1965, commercial ICs comprised up to a few hundred transistors. Today, the biggest commercial ICs have transistor counts in the billions. Also, every ten years or so, sage observers can be counted on to pronounce that Moore’s Law is bound to slow down over the coming decade due to [fill-in-the-blanks] technology limitations. I can recall at least two such major movements, one in the early 1990s, and again about 10 years later. The movers and shakers in the global electronics industry, however, continue to base their long-range planning on the inexorable progress of Moore’s Law.
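As a quick sanity check on those numbers, the doubling rule is easy to put into code. This is only a back-of-the-envelope sketch: the starting figure of 100 transistors stands in for "a few hundred", and 2014 is my assumed value for "today". Both are illustrative assumptions, as is the function name.

```python
def projected_transistors(start_count, start_year, end_year, doubling_years=2.0):
    """Transistor count projected by doubling once every `doubling_years` years."""
    doublings = (end_year - start_year) / doubling_years
    return start_count * 2 ** doublings


# 49 years at one doubling every two years is 24.5 doublings.
estimate = projected_transistors(100, 1965, 2014)
print(f"{estimate:.2e} transistors")
```

Starting from a hundred or so transistors in 1965, 24.5 doublings lands in the low billions, which is consistent with the transistor counts quoted above.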
Last night I attended a profoundly illuminating talk given by John La Grou, CEO of Millennia Media. John showed how Moore’s law applies in similar vein to a number of core technologies that relate to the electronics industry. He touched on the mechanisms that underlie these developments. However, what was most impressive was how he expressed the dry concepts such as transistor counts in more meaningful terms. The one which particularly caught my attention was a chart that expressed the growth in computer power. Its Y axis has units like the brainpower of a flea, the brainpower of a rat, the brainpower of a human, and the combined brainpower of all humans on earth. In his chart, today’s CPU has slightly more than the brainpower of a rat, but falls massively short of the brainpower of a human. However, by 2050, which will be within the lifetimes of many of you reading this, your average computer workstation will be powered by something approaching the combined brainpower of every human being on earth.
I wonder if, back in 1965, Gordon Moore ever paused to imagine the practical consequences of his law. I wonder if he contemplated the possibility of having a 2014 Mac Pro on his office desk, a computer possessed of processing power equivalent to the sum total of every computer ever built up to the time Apple introduced their first ever PC. Now Moore was a smart guy, so I’m sure he did the math, but if he did, I wonder if he ever asked himself what a person might ever DO with such a thing. I don’t know if posterity records his conclusions. In the same way, I wonder (and I most assuredly do not stand comparison to Moore) what a person might do in 2050 with a computer having at its disposal the combined brainpower of every human being on the planet. As yet, posterity does not record my conclusions.
La Grou’s talk focussed on audio-related applications. In particular he talked about what he referred to as immersive applications: in effect, wearable technology that would immerse the wearer in a virtual world of video and audio content. He was very clear indeed that the technology roadmaps being followed by the industry would bring about the ability to achieve those goals within a remarkably short period of time. He talked about 3D video technology with resolution indistinguishable from reality, and audio content to match. He was very clear that he did not think he was stretching the truth in any way to make these projections, and expressed a personal conviction that these things would come to fruition quite a lot faster than the already aggressive timescales he was presenting to the audience. He showed some really cool video footage of unsuspecting subjects trying out the new Oculus Rift virtual reality headsets, made by the company acquired yesterday by Facebook. I won’t attempt to describe it, but we watched people who could no longer stand upright. La Grou has tried the Oculus Rift and spoke of its alarmingly convincing immersive experience.
At the start of La Grou’s talk, he played what he described as the first ever audio recording, made by a Frenchman some 30 years before Edison. Using an approach similar to Edison’s, his recording was made by a needle which scratched the resultant waveform on a piece of (presumably moving) inked paper. This recording was made without the expectation that it would ever be replayed; in fact the object was never to listen to the recorded sound, but rather to examine the resultant waveforms under a microscope. By digitizing the images, however, we can replay that recording today, more than 150 years after the fact. We can hear the Frenchman humming rather tunelessly over a colossal background noise level. One imagines he never rehearsed his performance, or even paused to consider what he might attempt to capture as history’s first ever recorded sound. Anyway, the result is identifiable as a man humming tunelessly, but not much more than that.
At the end of the talk we watched the results of an experiment where researchers were imaging the brains of subjects while they (the subjects, that is, not the researchers) were watching movies and other visual stimuli. They confined themselves to imaging only the visual cortex. In doing so, there was no pattern to how particular images caused the various regions within the cortex to illuminate, but computers being the powerful things they are (i.e. smarter than the average rat), they let the computer attempt to correlate the images being observed with the patterns being produced. If I understand correctly, they then showed the subjects some quite unrelated images, and asked the computer to come up with a best guess for what the subject was seeing, based on the correlations previously established. There is no doubt that the images produced by the computer corresponded quite remarkably with the images which the subject was looking at. In fact, the computer was making as good a reproduction of the image that the subject was looking at, as the playback of the 150-year old French recording was to what one might imagine was the original.
I couldn’t help but think that it would be something less than – quite a lot less than – 150 years before this kind of technology advances to a practically useful level, one with literally mind-bending ramifications.
Here is a tip from BitPerfect user Jim Brower: “When ripping CDs, do not auto rip and eject, which lets iTunes determine the ripping order. Instead, use show CD, which allows you to view the ripping order. Where you see the track numbers begin, there is an arrow. You can click the arrow until it points up. This organizes the track order in the correct sequence from 1 to the end of the album. The first track will be on top and the normal linear sequence will be from top to bottom. This only has to be done once, iTunes will remember this setting. I have rerecorded about 50 live CDs with no gaps using this method.”
Thank you, Jim!
BitPerfect 2.0.1 is a bug-fix release containing several fixes and no new features. This version only runs on OS X 10.7 (Lion) and up. We may return to supporting Snow Leopard in the future if we are able to do so.
BitPerfect 2.0.1 is a free upgrade for existing BitPerfect users.
Apple has now got back to us, and while they acknowledge that the iTunes bug which we reported does exist, they do not intend to devote any resources to fixing it. This is disappointing. In effect, Snow Leopard users are apparently no longer being supported by Apple.
So where do we go from here? Well, we do have a couple of ideas. They should work well for Snow Leopard users from a functional perspective, but the question is entirely whether Apple will approve them for sale on the App Store. There are reasons to be pessimistic on that front. However, we will continue to plug away at it until we get an answer one way or the other.
In the meantime we are about to release an update on the App Store, version 2.0.1, containing numerous bug fixes. Version 2.0.1 will be marked for OS X 10.7 (Lion) and upwards, so Snow Leopard users will not have access to it.
Until the situation resolves itself, Snow Leopard users are advised to contact BitPerfect via our support line.
There is a major bug in BitPerfect 2.0 for some Snow Leopard users. It turns out this is actually a bug in iTunes, not BitPerfect. We have reported it to Apple, and it remains to be seen if they will act upon it.
In the meantime, we are actively beta testing a workaround. Snow Leopard users are asked to contact BitPerfect Support for further assistance.
It appears there is a minor bug in BitPerfect 2.0. We will issue an update with a fix pretty soon, but in the meantime, the work-around is very simple.
If you enable DSD support in BitPerfect 2.0, then every time you select a new DAC as your Audio Output Device, BitPerfect will ask if it supports DSD. However, this dialog is never executed for the Audio Output Device that is already selected at the moment you enable DSD support, and so that device remains flagged as not supporting DSD. This is only a problem if the device DOES support DSD, because BitPerfect will not then play DSD through it. Comprende?
The solution is to go into BitPerfect’s Preferences window and select another device – AirPlay or the Built-In Output will do – and then go back and select your main DAC. Now, if this DAC has the basic capability needed to support DSD (i.e. it supports 24/176 or 24/352), BitPerfect will ask whether it supports DSD, and you will be good to go.
We need to execute this dialog on the currently selected Audio Output Device when DSD Support is first enabled.
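For the curious, the behaviour described above can be modelled in a few lines. This is a hypothetical sketch of the logic only, not BitPerfect’s actual code: the class, its methods, and the `prompt_current` switch are all invented for illustration, and the check for 24/176 or 24/352 capability is left out for brevity.

```python
class OutputSettings:
    """Hypothetical model of the DSD prompt logic (not BitPerfect's real code)."""

    def __init__(self, current_device, ask_user):
        self.current_device = current_device
        self.ask_user = ask_user   # callback: device name -> True/False answer
        self.dsd_enabled = False
        self.dsd_capable = {}      # device name -> the user's answer

    def select_device(self, device):
        """Selecting a device triggers the 'does it support DSD?' dialog."""
        self.current_device = device
        if self.dsd_enabled:
            self.dsd_capable[device] = self.ask_user(device)

    def enable_dsd(self, prompt_current=True):
        """prompt_current=False models the bug; True models the intended fix."""
        self.dsd_enabled = True
        if prompt_current:
            self.dsd_capable[self.current_device] = self.ask_user(self.current_device)

    def can_play_dsd(self):
        # A device that was never asked about is treated as not supporting DSD.
        return self.dsd_enabled and self.dsd_capable.get(self.current_device, False)
```

With `prompt_current=False`, the already-selected DAC is never asked about, so `can_play_dsd()` stays false until you select another device and then come back, which is exactly the work-around described above.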
BitPerfect Sound Inc is pleased to announce the release of version 2.0 of its music player for Mac, BitPerfect. The new version introduces DSD playback support, but also incorporates a new and improved 3rd-Generation 64-bit audio engine, and has support for future plug-in features.
The biggest new feature of BitPerfect v2.0 is its ability to support DSD playback. BitPerfect supports DSD playback in a new and unique way.
BitPerfect uses the iTunes App as both its music database manager and command/control interface. Many users appreciate this paradigm, either because they like iTunes, or because they have grown comfortable with it and prefer not to change. Either way, it has been a popular approach for BitPerfect, with over 20,000 active users worldwide. When it comes to DSD, though, this does present a problem, since iTunes does not recognize DSD files and will not import them.
The preferred solution adopted by others to date has been the use of proxy files. Proxy files are typically Apple Lossless files which contain highly-compressed digital silence of the exact same duration as the desired DSD file, plus a link to the location of the related DSD file. If iTunes tries to play the file, all it plays is silence. But if instead the proxy file’s related player plays the file, it is able to follow the link to the original DSD file and play that instead. Proxy files work fine, but they have limitations. Many installations use the same music database to serve multiple playback systems within the same home network. Only playback systems that support DSD will be able to play music from proxy files; everyone else will hear only silence. The solution to this is to use a converter App to make a PCM version of the DSD original. This PCM version can be loaded into the iTunes database, and playback systems which do not support DSD can play this version instead. Again, this works fine, but has limitations. In this case your music database will contain duplicate versions of all of your DSD tracks – the DSD originals represented by proxy files, and the PCM conversions. These each have to be carefully labelled so as to enable every user to identify which is which. And then all these users need an element of “training” so as to know which version is to be played on which system. At BitPerfect, we have never liked the path down which proxy files lead us.
Our solution involves the introduction of a new file format we term “Hybrid-DSD”. This is an extension of the Apple Lossless file format. A Hybrid-DSD file is an Apple Lossless file. It can be imported into iTunes (and any other software that supports ALAC) and will play normally like any other Apple Lossless file, with or without BitPerfect. However, it also contains a hidden copy of the native DSD file, and when BitPerfect is controlling playback it can read this DSD content and, if the audio output device supports DSD, will play it back in its native DSD format. So if the audio output device supports DSD, BitPerfect will play it natively. If it does not, it will play the PCM content (as in fact will any other playback software that supports Apple Lossless). None of this need ever concern the user. No special instruction is required. For installations where the same music library is shared among multiple playback stations this is the most elegant and user-friendly approach we can think of.
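The playback decision this implies can be sketched as follows. This models only the choice between the two kinds of content; the way the DSD data is actually stored inside the Apple Lossless file is not described here, so the `Track` class and its field names are purely hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Track:
    """Hypothetical stand-in for a file in the music library."""
    pcm_audio: bytes                    # the ordinary Apple Lossless content
    dsd_audio: Optional[bytes] = None   # present only in a Hybrid-DSD file


def choose_stream(track, player_reads_dsd, device_supports_dsd):
    """Return "dsd" only when the file, the player and the DAC can all handle it."""
    if track.dsd_audio is not None and player_reads_dsd and device_supports_dsd:
        return "dsd"
    # Every other combination falls back to the PCM content, so the same
    # file plays everywhere without the user having to choose a version.
    return "pcm"
```

The point of the design is visible in the fallback: there is no combination of player and device for which the file plays as silence, which is the situation proxy files create on systems that do not support DSD.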
The creation of Hybrid-DSD files requires BitPerfect’s new companion App “DSD Master”, also available from the Mac App Store. Not only does DSD Master produce Hybrid-DSD files, it also produces conventional PCM files in Apple Lossless, AIFF, WAV, and FLAC formats. As an added bonus, DSD Master’s conversion algorithm is the best in the industry and we invite you to compare for yourself.
Aside from DSD support, BitPerfect v2.0 introduces a new and improved audio engine which enables us to provide a plug-in based structure. We will be able to use this structure to offer a range of future plug-ins which will provide various specialist functionalities which would not justify being included in the baseline package. After all, BitPerfect’s paradigm is to keep things simple and stay out of the user’s way as much as possible, even while doing highly sophisticated and uncompromised state-of-the-art work.
With BitPerfect v2.0 we drop our support for the very oldest 32-bit Intel CPUs which were able to run OS X 10.6. Henceforth, BitPerfect will require a 64-bit CPU. BitPerfect v2.0 is available at the same low price as before, and the upgrade from previous versions of BitPerfect remains free of charge.
I have today installed the 10.9.2 update to OS X and the 11.1.5 update to iTunes. I have been running both with BitPerfect most of the day and have not encountered any problems on either of my main test systems.