Investigating the quality of lossy algorithms: Advanced Audio Coding
(AAC), MPEG Audio Layer 3 (MP3) and Yamaha's SoundVQ, an implementation
14 - 19 December 2000 Please note:
There is a highly significant listening test report from the EBU in June 2000 on a variety of algorithms, including AAC and MP3. http://www.ebu.ch/trev_dolby_frm.html Proper listening tests are very difficult and expensive to conduct. I recommend you read this report in its entirety before bothering too much with what I wrote below, in December 1998.
The report and a separate file with the results in greater graphic detail are both .PDF files. Current Acrobat plugins are a menace in terms of not caching the file when re-viewing it or printing it, and are often too dumb to save to disk with the original file name. Here are the URLs of the main report and two sub-reports which contain graphs in a larger format. If you shift click on them, you should be able to save them to disk and read them at your leisure. They are about 1.3 Megs in total.
The EBU report tests the following codecs:
- Microsoft Windows Media 4.
- AAC - implementation by FhG-IIS.
- MP3 - or close to it, by Opticom.
- Q-Design Music Codec 2 - prototype version of that for Quicktime.
- Real Networks 5.0.
- Real Networks G2. Newer, widely used system based on "DolbyNet".
- Yamaha Sound VQ.
These were tested at:
- 16 kbps mono. Q-Design gets special mention for music, but not for speech.
- 20 kbps stereo. Lower subjective results than 16 kbps mono. Ditto the Q-Design special mention.
- 32 kbps stereo. AAC leads.
- 48 kbps stereo. AAC leads with MP3 close behind. Windows Media gets special mention for a folk music test for being indistinguishable from the reference. Q-Design is not much better than at 20 kbps.
- 64 kbps stereo. AAC wins by a country mile averaging 80 points. At this data rate, AAC was the only codec which evaluated in the "excellent" range for all items tested.
This report also discusses the codecs specifically. The Microsoft and Q-Design codecs show highly variable results on different test material at 48 and 16 kbps respectively.
While the report does not give the complete breakdown of results, by codec, by test item, my interpretation of this is:
It's horses for courses!
- Forget TwinVQ.
- The Windows and Q-Design codecs were very fussy about what material they encoded. With some items they were better and others much worse. Q-Design shows no significant improvement as the data rate increases.
- Real Audio G2 is solid at all rates, except 20kbps stereo where Real Audio 5 is better. G2 rates a fraction better than AAC at 16 kbps mono.
- MP3 trails slightly behind AAC as the data rate increases, except at 64 kbps, where AAC is very significantly better than both MP3 and Real Audio G2, which have about the same score.
Unfortunately, while AAC is widely regarded as being better than MP3 (as good at 96 kbps as MP3 at 128 kbps), MP3 is good enough and so established that the more tightly licensed AAC is unlikely to displace it for a while. Think Beta vs. crappy, widely marketed VHS, except with VHS coming first - and as before, the average user not being fussy enough to care. Fortunately, with decoders in software on PCs, we aren't stuck with the fixed hardware and media investments which make only one kind of video cassette system viable, even if it is not the best. Portable MP3 players, including CD players, embed decoders which cannot be updated as PC software can.
I think Real Audio G2 is here to stay for a few years for streaming applications, and for archived files. Its ability for a single file on disc to generate multiple streams, including via HTTP, for different players, is very snappy.
AAC licensing is apparently tied up with attempts to keep music "secure" - which I think is a waste of time.
Here are some other important new URLs:
http://www.commvergemag.com/commverge/extras/P178673.htm Extensive analysis and links regarding lossy (MP3 and WMA at least) compression and some lossless codecs. Be sure to check this site! When I looked at it, the page was corrupt and would only display properly on MS Internet Explorer. There are many interesting things here, including a link to his listening tests of a watermarking system (Hiss!!) which was clearly audible and is apparently to be used on DVD audio discs. Watermarks are a waste of time, for too many reasons to explain here, but see what I wrote in 1997 about them: http://www.cni.org/Hforums/cni-copyright/1997-02/1005.html .
http://CodecReview.com/ Dave Weekly's specialist site with many links, some tests of lossless codecs and plans for a much more extensive and interactive codec comparison.
http://privatewww.essex.ac.uk/~djmrob/mp3decoders/ David J M Robinson tests 24 MP3 decoders with a variety of encoders, including VBR (variable bit rate) and finds that only five pass all his tests. Salute!
A Dolby AAC site is: http://www.aac-audio.com . They announce that Music-Match Jukebox will support AAC. I had a suspicion that AAC or some related Dolby approach is used in Real Audio, which I think achieves remarkable results in stereo at only 20 kbps. The music lacks top-end detail, and speech sounds a little odd, but the music is still well worth listening to, for instance, from the archives or real-time source at fab community music station WMNF in Florida. However, the EBU report mentioned above distinguishes between AAC and Real Audio. CodecReview.com states that Real Audio 3 to 5 is based on DolbyNet/AC-3 http://www.dolby.com/tech/ac3flex.html . But what technology is behind Real Audio G2?
The MP3 Encoder's Mailing List is at: http://geek.rcc.se/mp3encoder/ .
This page documents my own investigation of the audio quality provided by AAC (an early, unlicensed and non-optimised encoder / decoder), MP3 and TwinVQ/SoundVQ. These are not full-blooded double-blind listening tests. They are for my own interest and concentrate on finding musical sounds which are most likely to cause audible differences in the decoded signal. These tests show the performance of particular encoders and decoders, and do not necessarily show the maximum possible performance of the algorithm.
This site also contains links to other sites regarding these three compression algorithms.
I am particularly interested in the applicability of these compression algorithms to music delivery - as part of my interest in music marketing, which is the subject of a separate page: musicmar .
Note that this is not an investigation of low bit-rate schemes suitable for streaming (real-time delivery) of music via 33.6 or 56 kbps modems. Although I tested some lower bit rates, I didn't really investigate them. My question was: "What algorithm and bit rate can be relied upon to encode a very wide range of music so it is audibly indistinguishable from the original, including with demanding listeners and listening environments?"
6 July 2000 Please note:
- This work was done in late 1998 and I am not attempting to keep up with developments in this rapidly changing field. I can't keep this as an up-to-date link farm for lossy compression either.
- See the following sites for more recent developments and links:
- http://www.mp3-tech.org/ Lots of up-to-date analysis.
- http://users.belgacom.net/gc247244/ Detailed testing of MP3 encoders, showing that open-source LAME is the way to go!
- LAME is now available as executables for Windows ( http://www.mp3-tech.org/encoders_win.html but you might be violating patents to use it) as well as in source versions for Linux, Windows etc. LAME is an intense collaborative effort and no longer relies on ISO code. Salute! http://www.sulaco.org/mp3/ .
13 September 1999 Please note:
- This work was done in late 1998 and I am not attempting to keep up with developments in this rapidly changing field. I can't keep this as an up-to-date link farm for lossy compression either.
- My aim was not to find the best MP3 encoder or decoder, but to find out roughly how good the various algorithms were, or could be.
- Most of the things I tested have now been superseded by later versions - for instance MusicMatch http://www.musicmatch.com/ is now (Sept 99) up to version 4.1 – totally different from the demo 2.50.005 version I used.
- I am currently using LAME http://www.sulaco.org/mp3/ on my Linux machines for MP3 encoding.
AAC is a most impressive compression algorithm. According to carefully conducted listening tests, at 128 kbps, it seems to be superior to MP3 at 192 kbps. This is reported by David Meares, Kaoru Watanabe and Eric Scheirer in their February 98 paper which is in a Word 6 file, zipped at: http://www.cselt.it/mpeg/public/w2006.zip . I have quoted some of the results below, in the AAC section.
I found that the audio quality of the Yamaha SoundVQ encoder (2.54eb1) and decoder (2.51eb1) is noticeably inferior to MP3 or AAC at the available bit rates of 96 and 80 kbps for stereo. Its performance on simple slowly swept-frequency sine-waves in the 3 to 6 kHz range is really bad. Amongst TwinVQ users, these problems are generally well recognised and accepted - with the argument that TwinVQ's artefacts are not too unpleasant, that its lower bit rate (80 or 96 kbps) is attractive and that it copes well with a wide variety of music, including tracks which work badly with MP3 joint stereo (for instance those from analogue master tapes which have significant L - R phase differences).
Test sound files, and some of the decoded files are provided in .WAV format. I have included some graphic frequency analysis images as well.
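For readers who want to make a comparable test signal themselves, here is a minimal sketch of a slowly swept sine wave written out as a 16 bit mono .WAV file. The function name, the 10 second duration and the half-scale amplitude are my own assumptions for illustration; they are not the parameters of the test files provided on this page.

```python
import math
import struct
import wave


def write_sweep(path, f_start=3000.0, f_end=6000.0, seconds=10.0,
                rate=44100, amplitude=0.5):
    """Write a mono 16-bit WAV holding a linear sine sweep; return sample count."""
    n = int(seconds * rate)
    frames = bytearray()
    phase = 0.0
    for i in range(n):
        # Instantaneous frequency rises linearly from f_start towards f_end.
        freq = f_start + (f_end - f_start) * i / n
        # Accumulating phase keeps the waveform continuous as frequency changes.
        phase += 2.0 * math.pi * freq / rate
        sample = int(round(amplitude * 32767 * math.sin(phase)))
        frames += struct.pack("<h", sample)  # little-endian signed 16 bit
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(bytes(frames))
    return n
```

Calling `write_sweep("sweep.wav")` should produce a file that can be fed to any encoder and decoder pair and then compared with the original by ear or with a spectrum analyser.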
I don't believe that the term "CD quality" should be applied to any lossy algorithm. That said, I believe that for the majority of music and listening conditions, MP3 when properly implemented at 128 kbps (though it seems that joint stereo will fail with some out of phase material) and AAC when properly implemented at 128 and probably 96 kbps will probably reproduce virtually all music in such a way that the degradation is inaudible to virtually all listeners.
Personally, if I was buying music, I would want a delivery system that wasn't teetering on the edge of human perception. My tests of lossless algorithms (See here.) suggest that for pop, rock and techno, music can only be compressed losslessly to about 55 to 75% of its normal size.
Until Internet bandwidth and costs improve, MP3 and soon AAC will play a vital role in the discovery and delivery of music for commercial and non-commercial purposes.
I do not have a lot of experience with these algorithms. This was an attempt to find whatever it took to trip MP3, AAC and TwinVQ up. TwinVQ trips up on the most fundamental component of sound - the sine wave - and so I cannot take it seriously. Nor do I think claims that "music does not contain sine waves" are valid. (Think of the Theremin in the Dr Who theme.) Accepting its limitations, it does cope remarkably well with a wide range of music. Lots of people like TwinVQ, and a lively discussion about it can be found at the VQF.COM discussion forum: http://www.vqf.com/bbs/?board=VQF.comForum , particularly starting with my post.
This field is changing rapidly. I may not be able to keep this page up-to-date. Be sure to check with the sites mentioned below for the latest developments.
There are many MP3 encoders and decoders, and it is evident that depending on the combination of encoder/decoder, the data rate, the type of music, the choice of stereo or joint-stereo for encoding (if you can choose), and characteristics of the original material which can cause joint-stereo encoding to sound bad, the audible results may vary considerably. To test all the combinations would be a mammoth task. Please let me know if you find anyone doing this even partially.
I can't keep up with all the developments in lossy audio compression, but I will attempt to update this page - primarily by linking to more up-to-date sites.
One set of updates is flagged in the text as: up990424 for 24 April 1999. If you search for this, you will see what has changed.
Another set is flagged in the text as: up990606 for 6 June 1999.
I believe that if the Analogue to Digital Conversion (ADC) and Digital to Analogue Conversion (DAC) are performed properly, then the 44.1 kHz sampling rate and linear 16 bit resolution system established by Sony in the early 1980s for the audio CD is entirely adequate for reproducing stereo signals which are to be heard by humans in any "ordinary" listening environment. (This includes the highest quality headphones and speakers with the most exquisite music. It does not involve hiding a safe distance from the speakers when the cannons in the 1812 overture go off, and then running up to the speaker to hear quantisation noise as the track fades out.)
Achieving the potential of 16 bit 44.1 kHz digital audio is a challenging task - it only became possible around 1990 as far as I am aware. It can best be accomplished with oversampling ADCs followed by linear-phase digital decimation filters to bring the sampling rate down to 44.1 kHz, whilst rejecting frequencies outside the audio range without the need for high-Q analogue filters. For instance see the Delta-Sigma ADCs of Crystal Semiconductor. (The mathematical and electronic principles of these delta-sigma ADCs are partially beyond me.)
With the existence of the CD, the DAT recorder and the CD-R, these extraordinary ADCs which Crystal and AKM pioneered have, as far as I can see, solved the problems of audio recording and storage.
So why do some people want 96 kHz sampling? Maybe to keep their canine friends happy or to impress those, including themselves, who believe that 44.1 kHz is inadequate? (There are some people who work professionally in audio who are very keen about 96 kHz sampling. Check the Seneschal site for material on 96 kHz audio.) I agree that 20 bit resolution is highly desirable for recording, mixing and editing, but I still think that a properly edited (with dither) recording in a form suitable for playback on headphones or loudspeakers can contain a perfectly adequate signal to noise+distortion ratio with a 16 bit signal resolution at 44.1 kHz. (Dither extends the resolution in the most audible frequencies by several bits - to 18 or 19 bits or so. The playback is probably best done with 4 or 8 times oversampling digital filters and 18 bit current switching DACs (the extra bits are output by the filters and should be used) so that only a very gentle analogue low-pass filter is required.)
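To make the dither remark a little more concrete, here is a rough sketch of reducing 20 bit samples to 16 bits with plain triangular (TPDF) dither. This is an assumption-laden illustration only: the function name is hypothetical, and it applies flat, unshaped dither, whereas gaining resolution specifically in the most audible frequencies would need noise shaping, which is not shown.

```python
import random


def requantise_with_dither(samples_20bit, rng=None):
    """Reduce 20-bit integer samples to 16-bit integers using TPDF dither."""
    rng = rng or random.Random(0)
    step = 16  # one 16-bit step spans 2**4 steps of the 20-bit scale
    out = []
    for s in samples_20bit:
        # The difference of two uniform values has a triangular distribution,
        # here spanning plus or minus one 16-bit step.
        d = (rng.random() - rng.random()) * step
        q = int(round((s + d) / step))
        out.append(max(-32768, min(32767, q)))  # clamp to the 16-bit range
    return out
```

A constant 20 bit input of 8 sits exactly half way between two 16 bit levels. Plain rounding would turn it into the same level every time, while the dithered output flips between the two neighbouring levels so that its average stays close to one half. That is the sense in which detail below one 16 bit step can survive.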
Lossless compression (compression is here used as a synonym for "data-reduction") algorithms for 16 bit 44.1 kHz stereo signals (1,411,200 bits per second) seem to reduce most music by only about 30% - so they are not very widely used. It looks like a daunting task to do much better than this.
So why are people saying that MPEG Audio Layer 3 compression to 128,000 bits per second (128 kbps - a compression ratio of 11.025 to 1) is "CD Quality"? Because, they want to believe it is true, or they can't tell the difference. (But see later - when I found it hard to tell the difference too.) "CD Quality" should rightfully mean any lossless form of conveying the full 44.1 kHz 16 bit stereo bitstream - but the term has been so widely misused now that I think it is best avoided.
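The figures quoted above can be checked with a few lines of arithmetic. This is only a sketch; the function names are hypothetical and chosen for illustration.

```python
def pcm_bit_rate(sample_rate_hz=44100, bits_per_sample=16, channels=2):
    """Bit rate of uncompressed linear PCM, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def compression_ratio(compressed_bps, source_bps=None):
    """How many times smaller the compressed stream is than the source."""
    if source_bps is None:
        source_bps = pcm_bit_rate()
    return source_bps / compressed_bps


def megabytes_per_minute(bps):
    """Storage needed for one minute of audio, in units of 10**6 bytes."""
    return bps * 60 / 8 / 1_000_000


cd_rate = pcm_bit_rate()                 # 1,411,200 bits per second
mp3_ratio = compression_ratio(128_000)   # about 11.025 to 1
```

On the same basis, one minute of CD audio takes about 10.6 megabytes, against about 0.96 megabytes at 128 kbps.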
MPEG Audio Layer 3 (hereafter referred to as "MP3") and perhaps AAC (MPEG Advanced Audio Coding) are shaping up as the preferred form of distributing and storing music via the Internet. In general the bit rate of 128 kbps is used at present - so I am concerned that we are taking a serious step backwards in audio quality from the potentially pristine and transparent 16 bit 44.1 kHz system established by Toshi Doi and his colleagues at Sony in the late 1970s.
These two algorithms - and TwinVQ (Yamaha calls it SoundVQ) - all work by breaking the sound into short time segments, filtering those segments into separate frequency bands, encoding the signal in each frequency band, and then, using a mathematical model of human hearing, sending the most audible parts of the signal to the output stream. With enough bits in the output stream, the result may be lossless - the decoded file is bit-for-bit identical with the original. However at the data-rates of interest to Internet users, these compression algorithms are certainly lossy. With a lot of music, on the crappy speakers that many people listen to music on, in the imperfect listening conditions (computer, car and other background noises), this loss in the compression system may not be audible at all.
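The general idea described above can be sketched as a toy transform coder. This is emphatically not how MP3, AAC or TwinVQ are actually implemented: real codecs use overlapping windowed filterbanks, quantisation and entropy coding, and a genuine psycho-acoustic model. In this sketch, "keep the loudest few frequency bins in each frame" is my own crude stand-in for the hearing model, and all function names are hypothetical.

```python
import cmath
import math


def analyse(frame):
    """Real-input DFT: return bins 0..n//2 for one frame of samples."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))
            for k in range(n // 2 + 1)]


def synthesise(bins, n):
    """Inverse of analyse() for a frame of n real samples."""
    out = []
    for t in range(n):
        acc = bins[0].real
        for k in range(1, len(bins)):
            term = (bins[k] * cmath.exp(2j * math.pi * k * t / n)).real
            # Every bin except DC and (for even n) Nyquist has a mirror image.
            acc += term if (n % 2 == 0 and k == n // 2) else 2.0 * term
        out.append(acc / n)
    return out


def toy_codec(samples, frame_len=64, keep=8):
    """Split into frames, keep only the `keep` strongest bins, and decode."""
    decoded = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        bins = analyse(frame)
        # Crude stand-in for a hearing model: rank bins by magnitude.
        loudest = sorted(range(len(bins)), key=lambda k: abs(bins[k]),
                         reverse=True)[:keep]
        kept = [b if k in loudest else 0j for k, b in enumerate(bins)]
        decoded.extend(synthesise(kept, len(frame)))
    return decoded
```

With `keep` set to the full number of bins the round trip is exact to within floating point error. Lowering `keep` discards progressively more of the signal, which is the lossy regime.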
So for general use, with lots of boisterous music, I think these algorithms are likely to be fine at 128 kbps - assuming the encoding (compression) is performed optimally, which may not always be done due to not all encoders (or decoders) being perfectly written and due to the CPU-intensive nature of filtering, analysis and of the recursive approaches to figuring out the best way to pack the data into the output stream.
However this is not to say that the losses in the compression algorithms are insignificant or should be ignored. Sound and human hearing involve very subtle processes - and having come all this way to the point where we can record and reproduce stereo audio without any significant degradation, I don't believe we should put up with lossy compression algorithms if we are purchasing music for keeps.
This page links to some sites of interest regarding compression, and then documents my attempt to find the weaknesses of MP3, AAC and VQ.
In the future I may have some links regarding "digital watermarking" or "fingerprinting". For now, let me say that I think watermarking is doomed to failure for a number of technical and business reasons.
AAC: The AAC compression algorithm is documented at http://mp3tech.cjb.net and www.mp3.com has a list of AAC software. From that list I found the site of the enigmatic Astrid/Quartex (up990424 it was at http://www.geocities.com/ResearchTriangle/Facility/2141/ but see the AAC links section below on where to get it) - who has a Windows based AAC encoder and decoder. Thanks to firstname.lastname@example.org for making this software available! The files I got were called aacdec01.zip and aacenc02.zip. These contain version 0.1 of the decoder and 0.2 of the encoder. The encoder zip file contained an executable and an aacenc.txt file which were dated 12 October 1998.
Be sure to check at Astrid's site above, and at the AAC sites listed below for later versions - but here are the zip files in case you find them hard to get. aacdec01.zip aacenc02.zip
According to the Fraunhofer AAC FAQ, any software (such as Astrid/Quartex's) which is based on the MPEG source code will not be of the highest quality, and any AAC implementation must be licensed by the patent holders. In case Astrid/Quartex's site disappears, you may wish to search AltaVista for "aacenc" or "aacdec" (or with "02" or "03" etc. after that name, such as "aacenc02"), or refer to some of the sites in the AAC links section below. There is another AAC encoder/decoder from Homeboy as well. See the AAC links section below for more sites for the Astrid/Quartex encoder/decoder.
MP3: The Munich based Fraunhofer Institut for Integrated Circuits IIS-A is in many respects the home of MP3 - they did a lot of the work on developing the standard: http://www.iis.fhg.de/amm/techinf/layer3/ They are not so popular in MP3 circles at present (November 1998) because of claims they are making regarding patents and pressure they have successfully exerted on a number of authors of freely available and/or shareware MP3 programs. I used their Windows demo-edition encoder and decoder for these experiments. The versions I used are: WinPlay 3 Version 2.3 beta 5 from http://www.iis.fhg.de/amm/download/mp3player/index.html and the command-line encoder program "mp3encdemo31.exe" which identifies itself as "MPEG Layer-3 Encoder V3.1 Demo (build Sep 23 1998)" and which comes in the file: mp3encdemo_3_1_win32.zip. The encoder is available for various Unices - including x86 Linux - and Windows at http://www.iis.fhg.de/amm/download/mp3enc/index.html .
VQ: Yamaha has a freely available VQ (more properly TwinVQ) encoder and decoder for Windows - which I used in these tests: http://www.yamaha-xg.com/english/xg/SoundVQ/index.html . The versions I used are: encoder 2.54eb1 and decoder 2.51eb1.
AAC
AAC will be part of the forthcoming MPEG-4 standard, so "AAC", "MPEG-4" and "MP4" may be used interchangeably at some sites.
There are three "profiles" for AAC in the MPEG-2 data stream. "Main" is the fully fledged AAC. "LC" (Low Complexity) and "SSR" (Scalable Sample Rate) are lower quality options for restricted CPU power implementations. I think that all AAC software mentioned here is not mucking around with the lower quality profiles.
- The definitive reference for MPEG Audio, including AAC (AKA MP4, of which it is a subset) is the MPEG Audio FAQ by D. Thom, H. Purnhagen, and the MPEG Audio Subgroup: http://www.cselt.it/mpeg/faq/faq-audio.htm. (Note this server is also known as drogo.cselt.stet.it ) It mentions that Dolby Laboratories should be contacted for AAC licensing - they and associated companies have some of the AAC technologies covered by patents.
Dolby Laboratories has an email address, listed at: http://www.dolby.com/trademark/ for AAC licensing. In a letter to MP3.com's CEO Michael Robertson http://www.mp3.com/news/135.html (23 November 1998) Dolby Laboratories states that they are "the licensing administrator for a new compression technology called AAC". The AAC patent rights apparently belong to AT&T, Dolby, Fraunhofer and Sony. Dolby asked Robertson to remove links from his www.filez.com to unlicensed AAC software. "These companies take the unlicensed use of their technology very seriously, and are presently in the process of communicating with each of your linked sites. Our goal is to provide them inexpensive licensing arrangements so that they can continue to utilize AAC technology." Following on from the letter to Michael Robertson, there is a lively discussion board (as there is for each MP3.com news item) at: http://bboard.mp3.com/ubb/Forum4/HTML/000148.html .
At present, the only AAC encoders and decoders which are generally available are the Homeboy and Astrid/Quartex pairs. While recognising that these are far from optimal, I consider their availability to be vital for those such as myself who are interested in foreseeing the development of music marketing - and probably for quite a few other purposes. If I become convinced that these authors are materially and negatively affecting whoever owns the patents for these principles, or if I think they are lowering the standard of audio and software development, then I might take a dim view of them. At present, they are the only place you can get an AAC encoder or decoder without paying very large up-front license fees - and they are doing it for free because of their interest in audio. Hopefully Dolby Laboratories will succeed in making this excellent technology available to all those who can use it at a price which makes sense. For a long time they did it with Dolby B, and maintained standards at the same time. Now it's software and a very different marketing model.
Astrid/Quartex (up990424) used to have a site with the command line Windows AAC encoder/decoder I used: http://www.geocities.com/ResearchTriangle/Facility/2141/ . The programs are mirrored here , here and at this site (see the AAC section above). According to the MPEG Audio FAQ V9, referring to the publicly available reference software on which the Astrid/Quartex 0.2 encoder is based: "The encoder software is not yet a general multi-channel encoder, and does not yet make use of all AAC coding tools." Therefore, this early software does not provide the full performance which is possible with AAC.
K+K Research in Denmark(up990424) has a new AAC encoder and decoder: http://kk-research.hypermart.net/ . I have not tried it.
KM (up990424) http://cad-audio.fsn.net/ (who is associated with K+K) has extensive and up-to-date pages on audio compression in general and on AAC in particular: http://cad-audio.fsn.net/aacinfo.htm .
Homeboy Software http://www.eotd.com/hbsaudio/default.htm are the other people who have gone to the trouble of writing and freely releasing AAC encoders/decoders in late November 1998. They seem to have an AAC player plug-in for WinAmp, and AAC encoder (aacenc05.zip) which has known problems - they are working on a new version - and soon an AAC player for the Macintosh. Who are these dudes? One of the directors apparently posted to the AAC discussion at: http://bboard.mp3.com/ubb/Forum4/HTML/000148.html .
CSELT's Official MPEG web site http://www.cselt.it/mpeg/ has a Word 6 file w2006.doc, pkzipped, containing a detailed February 1998 report from David Meares, Kaoru Watanabe and Eric Scheirer comparing AAC and MP3 at various bit rates with carefully conducted listening tests. The title is: "Report on the MPEG-2 AAC Stereo Verification Tests". http://www.cselt.it/mpeg/public/w2006.zip . I quote some of the results below in the AAC section - they are most impressive.
mp3.com has a list of AAC software. See the next section for a link to MP3Tech.
A site dedicated to AAC is the Advanced Audio Coding Homepage http://nedhosting.com/users/aac/ . They have a discussion section.
The Fraunhofer Institut has some excellent technical material, including an encoder block diagram on AAC http://www.iis.fhg.de/amm/techinf/aac/ See also the FAQ at this site.
Forbidden Donut Unlimited has a site http://www.forbiddendonut.com which includes a copy of the Astrid/Quartex aacenc02.zip file.
There was a Windows player for AAC, MP3 and VQ files, called KJofol 0.402 The site used to be at: http://www.audioforge.net/kjofol/ but seems to be gone now . . . but see below for new sites. There was a letter.txt there requesting the authors stop distributing the program, because of a claimed patent violation. Take a look at the screenshot. For some reason, this audio compression field leads programmers to create interfaces they think are exquisitely beautiful and easy to use, but which I think just the opposite! MP3 players FreeAmp, Sonique and quite a few others are really non-standard and focused on circles and curves and trying to be like a piece of hi-fi equipment, rather than a plain, easy-to-use program.
A new site for KJofol (Windows player for MP3, AAC and VQ files) is http://kjofol.org . On 26 November 1998, this has the v0.42 and promises v0.5 soon. Mirrors are here, here and here.
A company called Mayah plans an editor for AAC files: http://www.mayah.com/english/n980918e.html .
A site called AAC Net http://www.worldzone.net/ss/aacnet/ has some AAC information and also seems to be available from: http://come.to/justmp3 (In the Tongan domain!). They have a discussion section.
MP4 Central http://people.goplay.com/MP4Central/ or http://come.to/mp4central concentrates on AAC audio files.
Liquid Audio has a commercial program, Liquefier Pro for Windows, which will soon encode AAC (AKA MP4) files. http://www.liquidaudio.com/products/liquifier.html However, I think the output is probably a proprietary format - or at least optionally so. Liquid Audio promote the use of watermarking and encryption in an attempt to stop people copying music. I think this is a waste of time.
Two Japanese AAC sites: http://ha2.seikyou.ne.jp/home/tlswosk/comp/aac.html and http://www.moemoe.gr.jp/~hibari/aacjapan.html .
AAC is also used, together with encryption and proprietary file formats, by AT&T's http://www.a2bmusic.com with their "a2b" player and music control system. Files purchased from their site reside on the user's computer and are supposedly unplayable on any other computer. See my music marketing material for why I think this approach to hang onto old certainties about uncopyable music is doomed to failure.
MP3 and other algorithms
See above for links to new sites in 2000.
Quite a few of these sites concern AAC, TwinVQ and PAC compression too.
MP3Tech http://www.mp3tech.org/ has information regarding MP3, AAC, TwinVQ, Dolby AC-3, listening tests, patents etc. There is a web discussion forum and an mp3tech mailing list. There are also results of a limited but interesting listening test of MP3 at different bit rates.
A significant development is LAME http://www.sulaco.org/mp3/ This is an open source patch for the publicly available ISO encoder source file to correct errors in the algorithms, improve sonic performance and make it run faster. Distribution of this patch should be free of the patent restrictions concerning functional MP3 encoders (executables or source). This is a very promising development! The LAME crew are working intensively on all this and have a deep understanding of the psycho acoustics and the encoding algorithms. There is also a link to an MP3 Encoders mailing list. (up990606)
There are zillions of MP3 sites and a vast range of software. I won't attempt to keep up with it - the biggest activist site in the MP3 universe is http://www.mp3.com . There you will find extensive discussion of the technical, legal, moral, industry and political aspects of audio compression, electronic delivery of music and of copying and copyright. They also have an extensive set of links to all the relevant MP3 play, decode, encode etc. software. An essential starting point!
GoodNoise - soon to be Emusic - is a pioneering company (together with Nordic Communications) in selling music with discovery, delivery and payment via the Net, in an open standards format (MP3) and without attempts at preventing listener copying. There are many other important music marketing sites - so see www.mp3.com and my music marketing material on another page of this web site: musicmar .
Cedric Amand's http://mp3bench.com has a variety of interesting technical, performance and popularity material regarding MP3 software, and AAC as well.
The Fraunhofer Institut has some excellent technical material: http://www.iis.fhg.de/amm/techinf/ .
The Moving Picture Experts Group site http://www.mpeg.org has lots of information on MP3 and AAC - and on the data streams they can be put into. MPEG numbering and terminology is a mess - I won't get into it here.
Karsten Madsen <email@example.com> has a site: http://cad-audio.fsn.net/ reviewing the Liquefier Pro encoder's AAC (Proper Dolby/Fraunhofer encoder, I believe), Astrid/Quartex's AAC software, PAC, MP3 and VQF.
Not related to the audio quality, but relevant to the way people organise large numbers of MP3 files, is the ID3v2 tagging specification: http://www.lysator.liu.se/id3v2/ . This is an informal and evolving standard, and I think the web site is beautifully organised and presented. From their explanation: "ID3v2 is a new tagging system that lets you put enriching and relevant information about your audio files within them. In more down-to-earth terms, ID3v2 is a chunk of data prepended to the binary audio data. Each ID3v2 tag holds one or more smaller chunks of information, called frames. These frames can contain any kind of information and data you could think of such as title, album, performer, website, lyrics, equalizer presets, pictures etc."
(Update 8 Jan 1999.) Leonardo Maffi has some detailed material, mainly in Italian, testing the performance of lossy audio and other compression algorithms: http://computer.digiland.it/1609/ .
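As a rough illustration of the "chunk of data prepended to the binary audio data" in the ID3v2 explanation quoted above, here is a sketch which reads just the ten byte tag header: the letters "ID3", two version bytes, one flags byte, and a four byte "synchsafe" size in which only the low seven bits of each byte are used. The function name is hypothetical, and parsing of the individual frames is deliberately left out.

```python
def parse_id3v2_header(data):
    """Return (major, revision, flags, tag_size), or None if no ID3v2 header."""
    if len(data) < 10 or data[:3] != b"ID3":
        return None
    major, revision, flags = data[3], data[4], data[5]
    size_bytes = data[6:10]
    # Synchsafe integers never set the top bit of any byte.
    if any(b & 0x80 for b in size_bytes):
        return None
    size = 0
    for b in size_bytes:
        size = (size << 7) | b  # seven useful bits per byte
    return major, revision, flags, size
```

The size counts the tag contents after the header, so the header length has to be added to it to find where the audio data begins.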
- The Official MPEG site is at CSELT in Torino, north-west Italy: http://www.cselt.it/mpeg/. They have a version 2 draft of the MPEG-4 work: http://www.cselt.it/mpeg/standards/mpeg-4/mpeg-4.htm This mentions that AAC and TwinVQ will be part of the forthcoming MPEG-4 standard. MPEG-4 covers a bewildering array of concepts beyond direct compression of audio and video. One relatively straightforward aspect is SAOL - Structured Audio Orchestra Language http://sound.media.mit.edu/~eds/mpeg4-old/ This is a portable and flexible approach to digital synthesis of sound with software - based on Csound: http://mitpress.mit.edu/e-books/csound/frontpage.html or http://www.firstpr.com.au/csound/ The bewildering stuff is when they start talking about compressed coding for facial, head and body animation! Apparently, rather than compressing a video of a person, they are planning on analysing them according to facial structure, expression, skin texture etc and synthesising an image based on these parameters at the receiving end. These images would then be merged together with MPEG-2 video or some VRML nonsense. Propellerhead zone!
TwinVQ
"TwinVQ" is the proper term. But I use "VQ" at this site. "SoundVQ" is Yamaha's term for this compression system, and files are normally stored with an extension of "VQF".
TwinVQ will also be a part of MPEG-4.
TwinVQ (Transform-domain Weighted Interleave Vector Quantization) was developed by NTT Human Interface Laboratories: http://www.hil.ntt.co.jp/top/index_e.html. The English version of the TwinVQ home page is: http://music.jpn.net/ . Yamaha's site is: http://www.yamaha-xg.com/english/xg/SoundVQ/index.html . A big activist site for TwinVQ is VQF.COM: http://www.vqf.com . They have a discussion area, which I posted to regarding these tests. My posting is: http://www.vqf.com/bbs/display.php3?board=VQF.comForum&DISP=2436 . Follow this link for alternative viewpoints to my negative assessment of TwinVQ! Search for "twinvq" with AltaVista by clicking here!
Other related algorithms
- Dolby AC-3 is a highly respected, proprietary, multi-channel compression system which is also introduced at this site. DVD uses it at 384 kbps, and cinemas use it at 640 kbps. I don't know of any easy to obtain encoders or decoders for it, so have not investigated it further. http://www.dolby.com/tech/
A relatively low loss system used for broadcasting is Audio Processing Technology's 4:1 fixed rate apt-X system. http://www.aptx.com This is a real-time, high quality, low delay system (2.76ms for encode and decode combined) - which does not rely on psycho-acoustic models etc. The FAQ describes it:
ADPCM as used by APT for its apt-X 4:1 compression algorithm
takes the digital signal and breaks it into four frequency sub bands
by means of a QMF digital filter. Each of these sub bands is
subsequently coded by means of predictive analysis; the coder predicts
what the next digital sample in the audio signal will be and subtracts this
prediction from the actual sample. The resulting, small error signal is
transmitted to the decoder which then adds back in the prediction from
identical tables stored in the decoder. NO psycho-acoustic auditory
mask is used to throw away any of the original audio signal resulting in
a near lossless compression system.
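The predict-and-subtract idea in the quoted FAQ can be sketched as a bare-bones DPCM coder. This is not apt-X itself: there is no split into four sub-bands, the step size is fixed rather than adaptive, the predictor is simply the previously decoded sample, and the codes are not limited to the fixed word length a 4:1 system would need. All names and the default step size are my own assumptions.

```python
def dpcm_encode(samples, step=256):
    """Encode integer samples as quantised prediction errors."""
    codes = []
    prediction = 0
    for s in samples:
        error = s - prediction           # subtract the prediction
        code = int(round(error / step))  # coarse version of the error signal
        codes.append(code)
        # Follow the decoder's reconstruction so both sides stay in step.
        prediction += code * step
    return codes


def dpcm_decode(codes, step=256):
    """Rebuild samples by adding each error back onto the running prediction."""
    out = []
    prediction = 0
    for code in codes:
        prediction += code * step
        out.append(prediction)
    return out
```

Because the encoder predicts from what the decoder will actually reconstruct, the quantisation error does not accumulate: each decoded sample lands within half a step of the original. With a step of 1 the round trip is exact.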
In March 2001, a chap from APT wrote to me that the algorithm is available on a demo basis as a Windows DLL.
Microsoft (up990424) has developed a new low-bit-rate audio compression system: http://www.microsoft.com/windows/windowsmedia/ . The encoder is available here. An article and discussion of its merits is at MP3.COM: http://www.mp3.com/news/230.html . As always, keep an eye on http://www.mp3.com for the latest news.