[FekLar]'s MP3 Clipping Tutorial Page 1

Important Notice:

When you check for clipped files, make sure WinAMP's Equalizer is turned off, and don't have any DSP plugins running !!!

The sound level display shows a song that was ripped with the recording levels set perfectly.

(Image: WinAMP's main window)

Here's an image you are all familiar with: WinAMP's main window.

Pay special attention to the VU level (fire) display. Notice that the image is animated, but at a much slower speed than real time. This is just what is needed for the purpose of this explanation.

This image is displaying the playback of an mp3 that was made with the record levels set properly when the file was ripped and encoded.

Many people, including even the programmers of some of the available mp3 rippers, have been making a serious error in the ripping process, in that the record levels are set too high, and the mp3 file that results is overdriven and distorted.

The animated image above is not important now, but you will want to refer back to it later, after you examine images of files that weren't so well recorded, and you have a better understanding of what clipping is. If you want to take a shortcut, now would be the time to do it. You can always come back and read the details of clipping here later. A detailed description of clipping follows this paragraph, but if you just want the basics, scroll down now to the images of Cool Edit below, examine them, and then scroll to the end and go on to the second page. Recording an mp3 with the signal levels set too high results in clipping and distortion. For the solution, after you have seen the other three pages, read what follows the Cool Edit images below.

Most of the phenomenon of clipping has its roots in common cassette tape recording procedures. In the past, when recording music onto a cassette tape, it was a common and recommended procedure to set the record levels high enough so that the VU meters went repeatedly into the red zone (past 100 percent signal level), at about 120 percent signal level. If this was not done, tape hiss would be very noticeable on the finished recording, and the recording would have to be redone.

The tape was actually physically capable of properly storing a recording made at 120 percent signal level, because some of the signal was lost in the circuits after it passed through the VU meters on its way to the tape heads, and more was lost in the magnetic field between the tape head and the tape. The VU meters, in other words, except in very expensive high-end decks, were usually not a valid indicator of the signal that was being delivered to the tape, only of the signal that was passing through the VU meters. The recording circuit acted as a simple amplifier circuit, and was capable of going well over 120 percent without producing distortion, because the amplifier (the source) was not being overdriven, only the targets were (the VU meters and the tape media). In other words, the VU meters did not indicate amplifier distortion, but the distortion which was to be expected to be present on the tape when it was played back.

In addition, many tape oxide formulations had enough headroom to record signal levels in excess of the standard capacity. VU meters on standard tape recorders were calibrated to the lowest common denominator, that is, the least expensive tape. Most standard bias tapes (for example, TDK D-90s) were better quality than this, and could withstand an extra 10 percent past the standard oxide saturation ratings for 100 percent saturation without suffering any clipping or distortion. Chrome oxide (CrO2) bias tapes were a vast improvement, and could withstand as much as 40 percent beyond 100 percent. With metal tape, one could go up to 160 percent, and still get no level-induced distortion. The peaks could be in the red zone on the VU meters half the time, and there would be no distortion.

This is somewhat simplified, and there are other reasons why tape could sustain higher levels, but it would complicate the matter to discuss them.

Suffice it to say, that's the way everyone did it, because that's the way it worked best. Most people used chrome tape, and recorded at about 130 percent signal level.

Then came digital recording. Gone were the numerous hard to quantify variables associated with recording to tape.

100 percent signal level to the hard disk meant precisely 100 percent back during playback, not the 80 percent back from tape had the recording been made to a cassette tape.

Professional digital recording is entirely noiseless, so the problem of tape hiss disappeared. Although your sound card probably generates a small amount of electronic noise that it picks up from the PCI bus and the CPU, it is usually less than -45 decibels, far below the noise and hiss levels found on tape. In professional digital studio recording, they do use virtually noiseless equipment. This is what gets transferred to the audio CD that you end up ripping to mp3. Except in a very rare case where a professional band or musician wants distortion for an effect, all studio tracks have been recorded with volume levels of between 0 and 100 percent.

In 16 bit digital recording, each 16 bit value in the (wave file) data stream contains a value between -32768 and 32767, with 0 mapped to zero signal level, -32768 mapped to -100 percent signal level, and 32767 mapped to 100 percent signal level. This allows 65536 possible values (or technically, 32768 volume levels on each side of zero).
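To make the mapping concrete, here is a minimal Python sketch of it. It is only an illustration of the arithmetic described above, not code from any ripper, and the function name is made up for this example.

```python
FULL_SCALE_POS = 32767    # largest signed 16 bit sample value
FULL_SCALE_NEG = -32768   # smallest signed 16 bit sample value


def sample_to_percent(sample):
    """Map a signed 16 bit sample to a signal level between -100 and 100 percent."""
    if not FULL_SCALE_NEG <= sample <= FULL_SCALE_POS:
        raise ValueError("not a signed 16 bit sample value")
    if sample >= 0:
        return 100.0 * sample / FULL_SCALE_POS
    # Negative samples are scaled against -32768 so that -32768 maps to -100.
    return -100.0 * sample / FULL_SCALE_NEG
```

Anything that would land outside this range simply has no representation in the file, which is the root of the clipping problem.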

The problem of clipping shows up in the ripper, not in the encoder. A digital CD track, even in a direct CD track to mp3 recorder, is first ripped to a temporary .WAV file before getting compressed to an .mp3 file.

This is where there is potential for trouble, because the WAV file recorder can act exactly as a tape deck. Actually, it's worse than a tape deck. The tape at least attempts to assimilate the extra signal level beyond the point of saturation, but digital recording outright discards the excess values.

Record levels past 100 percent signal level can be sent into the wav file generator program by the ripper. Clipping is the result.

Clipping is defined as the tops and the bottoms of the sound wave being cut off, and any subtleties or music information in the clipped parts of the waveform are lost. If you know what a sine wave looks like, this flattens the tops and bottoms of the wave. Clipping also happens when a tape is overdriven with too high recording levels. The result is saturation of the tape oxide, and the oxide cannot carry any signal past its saturation level. The result is the same, the tops and bottoms of the waveforms are clipped off. Tape recording is where the common use of the term clipping originated. Before cassette and 8 track tape, clipping was a term known only to broadcast and recording engineers, and to a few musicians.

When levels greater than 100 percent are fed into the wave file generator, any amount past 100 percent is simply discarded, and the largest value that fits, 32767 (100 percent), or -32768 (-100 percent) on the negative side, is placed into the file at that point to represent that value. It might, at first glance, appear that a better approach would have been to generate an error condition and exit, but the issue is not as simple as that.
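The effect is easy to reproduce in a few lines of Python. This is a minimal sketch under the assumption that the wave file writer simply clamps each value to the signed 16 bit limits of -32768 and 32767; the function names are invented for the example and are not taken from any real ripper.

```python
import math


def clip16(value):
    """Clamp a value to the signed 16 bit range; the excess is thrown away."""
    return max(-32768, min(32767, round(value)))


def overdriven_sine(level_percent, n=16):
    """One cycle of a sine wave at the given level, stored as 16 bit samples."""
    peak = 32767 * level_percent / 100.0
    return [clip16(peak * math.sin(2 * math.pi * i / n)) for i in range(n)]


# At 100 percent only the very top of the wave touches full scale.
# At 200 percent the top of the wave becomes a flat run of 32767s.
print(overdriven_sine(100))
print(overdriven_sine(200))
```

With the default of 16 samples per cycle, the 200 percent wave stores five samples in a row at 32767 and five in a row at -32768. Whatever shape the wave had in those stretches is gone.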

Although significant clipping is rare in studio produced music, almost all studio produced music has some small degree of clipping. This is a result of a tradeoff between the least amount of distortion and the greatest degree of dynamic bandwidth. More dynamic bandwidth can offset tiny amounts of distortion that are the inevitable price of increasing the dynamic range to a large amount. This is not a bad thing, and music sounds best when it has the greatest degree of dynamic range, but a small amount of clipping is the inevitable result. So, since almost all CD music contains a very small amount of clipping, if the .wav file generator shut down every time it encountered clipping, it would never be able to write a wave file from a CD track.

The end result of a large amount of clipping is distortion, because any signal level beyond 100 percent is lost. They say a picture is worth a thousand words, so let [FekLar] save you a few thousand words.

Take a graphic look at the consequences of clipping:

(Image: Cool Edit waveform display of the original, unclipped file)

Here is a sound wave, a part of a standard 16 bit 22 kHz stereo WAV file (a Windows MechWarrior II Desktop sound theme file). Note the easily discernible variations in the signal level and the well defined wave peaks. Especially note the differences in amplitude of the various peaks. This type of waveform display graphically displays the entire dynamic range of the wave. The red horizontal line through the center of each channel represents zero signal level, and the wave radiates towards the positive (above the line) and negative (below the line) voltage or signal values. Time is read from left to right along the line.

This file displays no clipping whatsoever. This file was recorded almost correctly. It could actually have been recorded about 3 decibels higher, and still no clipping would have taken place. If recorded 6 decibels higher, there would be a small amount of clipping, similar to a dynamic range enhanced, recording studio produced file, and it would still be acceptable. Only the highest peaks of the waveform would be clipped off. At 12 decibels of amplification, the clipping becomes severe, affecting almost every peak in the waveform. After this screen shot was taken, the file was amplified 12 decibels, using the amplify function of Cool Edit, and then reduced in volume by 3 decibels so that the resultant clipping is more obvious.
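For those who want the arithmetic behind those decibel figures, here is a minimal Python sketch. It assumes the usual amplitude convention, where the gain factor is 10 raised to the power of (decibels / 20), and it assumes the amplified values are clamped to the 16 bit range. It is not Cool Edit's actual code, and the function names are made up.

```python
def db_to_gain(db):
    """Convert a level change in decibels to an amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def amplify(samples, db):
    """Scale 16 bit samples by the given number of decibels, clamping the result."""
    gain = db_to_gain(db)
    return [max(-32768, min(32767, round(s * gain))) for s in samples]


# A peak at about 30 percent and one at about 60 percent of full scale.
original = [10000, -20000]
louder = amplify(original, 12)    # both peaks hit the limit
backed_off = amplify(louder, -3)  # quieter, but the peaks stay the same size
print(louder, backed_off)
```

By this formula 6 decibels roughly doubles the signal and 12 decibels roughly quadruples it. In the example, two peaks that started out a factor of two apart end up identical after the 12 decibel boost, and lowering the volume afterwards does not bring the difference back.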

The same file, amplified 12 decibels, and then reduced in volume by 3 decibels:

(Image: Cool Edit waveform display of the same file after amplification, showing severe clipping)

This is the result of clipping: All the details beyond the 100 percent signal level were simply chopped off, and lost permanently. In this severe case, over 80 percent of the original signal and original dynamic range has been lost or mutated. This waveform is obviously very different from the original waveform. This wave has literally been flattened.

Especially notice how the variations of amplitude between the various peaks have been 'averaged out', and how the distinctions between the amplitudes of various peaks are not as clear as before. The only place where the wave has not sustained truly severe damage is near the beginning, where the signal level was originally very small. Although this is a severe example, it graphically demonstrates the waveform having its edges clipped off; hence the term 'clipping'.

You can easily repeat this demonstration yourself, by getting a copy of Cool Edit, loading a small wave file, amplifying (choose the macro '3 decibels louder' two, three, or four times, depending on the original amplitude of your wave file), and then choosing amplify, and choosing the macro '3 decibels quieter' once. This last operation simply backs off the wave peaks from the 100 percent levels, so that the peaks are easier to view. It is not required.

There is enough signal differentiation left so that the original signal is still recognizable, but there is no way for WinAMP (or in this case, Cool Edit) to know how to reconstruct the clipped portion of the waveform. The result in this grossly abused example is severe and very noticeable audio distortion.

A CD-ripped mp3 waveform damaged by clipping has its quality reduced to FM radio quality, and in many cases, much lower. If you have a $1000 stereo, playing a clipped file makes it sound like you only paid $199 for it at K-Mart.

The damage can become painfully obvious if you try to write a CD, or worse, record a cassette tape for use in your automobile's stereo, with its reduced audio capabilities vs. home stereo. The damage is most apparent at higher frequencies, such as a high keyboard range or high female human voice. Electric piano usually suffers the worst, although the distortion can be obvious with many singers' voices as well. Turn down the bass, and listen.

The above images also graphically demonstrate why simply calling up MP3 Workshop and using it to lower the volume of a clipped, distorted, overdriven mp3 will accomplish nothing. Once the tips of the waveform are lost, they are lost for good.

It is conceivable that someone skilled enough could write a program to repair mp3s with clipping damage, by reconverting the mp3 to a wave file, and attempting to synthesize the lost parts of the wave based on wave phase and vector, analyzing the upslope and downslope angles, using this geometry to take a best guess as to what the tip of the wave originally looked like, and replacing the lost part of the wave. The resultant repair would not be precise or perfect, but could save many rare mp3s that exhibit only minor cases of clipping. Most human ears can tolerate some small degree of clipping, up to 105 or 110 percent or so, without noticing. However, no one has done this as yet, and if it were done, the program would inherently be a slow, processor and RAM intensive resource hog, easily taking 4 times as long as it took to rip and encode the file in the first place.
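To give a rough idea of what such a best guess might look like, here is a toy Python sketch. It is purely hypothetical and is not taken from any existing program: it assumes each flat, clipped run can be approximated by a parabola whose curvature is estimated from the upslope just before the run and the downslope just after it. The function name and the approach are this example's own.

```python
def declip(samples, limit=32767):
    """Replace each clipped run with a parabolic guess based on the slopes
    entering and leaving the run. Returns floats, which may exceed the
    16 bit range, so the result must be lowered in volume before saving."""
    out = [float(s) for s in samples]
    i, total = 0, len(samples)
    while i < total:
        if abs(samples[i]) < limit:
            i += 1
            continue
        start = i
        sign = 1 if samples[i] > 0 else -1
        while i < total and sign * samples[i] >= limit:
            i += 1
        end = i - 1
        if start < 2 or end + 2 >= total:
            continue  # not enough neighbors to measure the slopes
        y0, y1 = out[start - 1], out[end + 1]
        slope_in = samples[start - 1] - samples[start - 2]   # upslope
        slope_out = samples[end + 2] - samples[end + 1]      # downslope
        width = (end + 1) - (start - 1)
        curve = (slope_in - slope_out) / (2.0 * width)
        for t in range(1, width):
            guess = y0 + (y1 - y0) * t / width + curve * t * (width - t)
            if sign * guess > limit:  # only ever move samples outward
                out[start - 1 + t] = guess
    return out
```

Even on a perfectly smooth test wave this only comes within a few percent of the true peak, and on real music the guess would be cruder still, which fits the point above that such a repair could never be precise.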

It is better and easier to re-rip the damaged file. It is best to know enough to get it right the first time, so that there never needs to be a second or third time. The easiest way to do this is simply to avoid the use of normalization, compression, and level adjustment in mp3 ripping.

The main source of clipping-distorted mp3 files is from CD rippers with defective normalization and signal level functions. Most of these programs first go through the entire file, taking samples of the signal level at random locations, and then try to calculate the average and peak signal levels based on this sampling. Many programs are superb in this respect, remaining very faithful to the original, and only activating their normalization level adjustments in the case of a defective original recording. In these cases, these normalizers can bring up low signal levels, and even increase the dynamic range.

However, there are other ripper programs, and these appear to be the majority of these types of program, that attack the levels of every single file they process, whether the normalization is actually needed or not. I assume it must be a selling point for the ignorant, a 'more bells and whistles must mean a better product' approach. After all, clipping is not a widely known or recognized phenomenon, or there wouldn't be a need for this page. In the vast majority of cases, the musicians and studio engineers got the original recording as good as it can be, and have achieved the greatest possible balance between wide dynamic range and low distortion, and the original levels do not need to be touched.

For this reason, DO NOT CHOOSE NORMALIZE as the DEFAULT setting for the files that you rip!!! ONLY if a ripped file shows very low volume levels should you re-rip and normalize the file.

WinAMP has two great plugins, Audiostocker and RockSteady, that can perform the normalization function when it is desired, in real time, rather than risking permanent distortion being introduced into your files. If a WinAMP plugin doesn't do a song justice, it can be turned off, but normalizing a rip is permanent, and most rips do not need to be normalized.

Although these plugins process in real time, neither of these plugins is a CPU hog. In fact, turning on WinAMP's 'scroll songname in the taskbar' option eats up an order of magnitude more CPU clock cycles than either of these plugins. Under Waterfall, my machine (Win98 on Cyrix MII-300, 40 megs RAM) runs WinAMP at about 40 percent CPU usage. This increases to 41 percent with RockSteady, and 43 percent with AudioStocker. (Strangely, turning on 'scroll songname in the taskbar' makes the CPU usage rise to over 55 percent; this has to be a bug in WinAMP.)

If for some reason you feel that you must normalize, you should pay attention to the levels in the resulting files, and if they are too high then set the normalization level lower, say from 80 percent to 70 percent.
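What a normalization level means can be sketched in a few lines of Python. This assumes the simplest kind of normalizer, one that finds the single highest peak in the file and scales everything so that this peak lands on the chosen percentage. Real rippers may work from average levels or from sampled estimates instead, so treat this as an illustration only; the function name is invented.

```python
def normalize_peak(samples, target_percent=80.0):
    """Scale 16 bit samples so the highest peak sits at target_percent."""
    if not samples:
        return []
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # pure silence, nothing to scale
    gain = (32767 * target_percent / 100.0) / peak
    # The clamp only matters if target_percent is set above 100.
    return [max(-32768, min(32767, round(s * gain))) for s in samples]
```

As long as the target stays at or below 100 percent and the true peak was found, no sample can be pushed past the limit. Clipping creeps in when the target is set above 100 percent, or when the peak is only estimated and the real peak turns out to be higher than the estimate.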

[FekLar] recommends Audiograbber. Audiograbber is a straight ripper, with linkage to almost any encoder you can find and download. This software author knows the right way to rip, and [FekLar] has always gotten decent results with this product. Try it sometime.

Jackie Franck, the author of Audiograbber, has this to say:

"What you say about clipping is basically correct but I do not know of any rippers that clip the peaks of the tracks. A ripper basically copies the track as it already is on the disc, if it has clipped peaks on the disc then there is not much to do about it. So, clipping occurs in the normalizing part and not the ripping part. But Audiograbber and most rippers only allows normalizing up to 100% so then there is no risk for clipping. (Clipping will only occur if normalizing goes over 100%).

Clipping can of course happen with line in sampling in Audiograbber if the user set the recording level too high. It can also happen if the user uses Analog as sample method under general settings."

Judging by the sheer numbers of clipped mp3s floating around on IRC, either a lot of people are using analog or line-in recording and setting the levels too high, or there are some rippers that do normalize past 100 percent. There are a lot of older machines out there yet, with older CD drives that don't support DAE, and analog is the only option in these cases. Analog ripping is usually at least passable, so long as there is no clipping. The sheer numbers of these bad files makes [FekLar] think that there are at least a few rippers that do routinely normalize past 100 percent.

Whatever the cause of these large numbers of defective files (the RIAA?), the Audiograbber product does do digital (DAE) rips properly. If for some bizarre reason you want to pay more than the $25 chump change for Audiograbber and use a different ripper, at least get the shareware version of Audiograbber and the free LAME encoder, and make a few rips with it to compare to the ones your other software product makes. If your other software product makes clipped rips, toss it and use Audiograbber.

For the purposes of comparison, don't use WinAMP's graphic display: use Cool Edit's. Cool Edit's display is far more precise and obvious than WinAMP's more simplistic display.

Use the WinAMP .WAV file writer plugin or some other means to convert an mp3 that you ripped to a wave file. (Warning: a 5 meg mp3 can produce a WAV file as large as 60 megabytes or larger, usually about 50 megabytes.) The images on these pages taken from WinAMP and Cool Edit show why Cool Edit is the better tool to use for this purpose. If you load a clipped wave file into Cool Edit, the fact that the file is clipped becomes immediately apparent.
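If you would rather have a number than a picture, a rough check can also be scripted. The following Python sketch uses only the standard library's wave and array modules and counts runs of consecutive full-scale samples in a 16 bit WAV file. The function name and the three-sample threshold are arbitrary choices for this example, not an established standard.

```python
import array
import sys
import wave


def count_clipped_runs(path, min_run=3):
    """Count runs of at least min_run consecutive full-scale samples."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16 bit WAV files")
        channels = wav.getnchannels()
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h", frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    runs = 0
    for ch in range(channels):
        run = 0
        for s in samples[ch::channels]:  # step through one channel at a time
            if s >= 32767 or s <= -32768:
                run += 1
            else:
                if run >= min_run:
                    runs += 1
                run = 0
        if run >= min_run:
            runs += 1
    return runs
```

A handful of short runs is normal for studio produced music, as explained earlier. A badly clipped file will show runs on almost every peak, so the count climbs into the thousands over the length of a song.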

For Audiograbber: (and links to LAME and other free mp3 encoder downloads)

For Cool Edit:

When ripping, using a compression option is also almost always not a good idea. To use this type of option when ripping implies that you have complete faith in the software author's capability to outguess the original studio recording engineer's talents.

Again, the Audiostocker and RockSteady WinAMP plugins can perform compression when it is desired, in real time. This is far preferable to the risk of permanent distortion being introduced into your files. Again, there is no major CPU cycle usage when using these plugins. Exceptions to this rule might be oldies, say a rare old Motown 45 you convert to a wave file, run a pop and noise remover filter on, and convert to simulated stereo. The older equipment that was used to produce some of these early recordings had its limitations, and compression can significantly increase the quality of some of these types of files.

Judging by the quality of files routinely downloaded from the Dalnet and Undernet mp3 channels, I estimate that at least half the files that are available there have clipping problems ranging from significant to terminal. In many cases, one has to download the file from three or four different servers before a copy of acceptable quality is obtained. Because of ignorance of clipping distortion, many users did not even realize their files were defective. As a result, many of the users served the files they downloaded, and the channels have become highly polluted with low quality files.

You can tell the degree of clipping in an mp3 file just by looking at WinAMP as it plays the file. This pretty much concludes the description of the clipping problem and its solutions. The remaining link below is to three short graphics display pages.

Before we proceed to these images of poorly recorded mp3 files, you might want to set up your WinAMP the same as mine was set when the images were recorded, so that you have an identical comparison base. This way you can more easily check your own mp3 files and compare them to the images. My WinAMP was set to use the base skin, and the preferences were set this way:

(Image: WinAMP preferences settings)

Go to [FekLar]'s MP3 Clipping Tutorial Page 2