Audio compression (simplified)

in #engineering, 7 years ago (edited)

Audio compression


As we all know, sound is a vibration that travels through a medium and is heard once it reaches your ear. For a long time we recorded audio in analog form, for example on wax (Rob Scallon did a rather nice video on this), but when computers came along we wanted this data in digital form. There was one issue in the early years of computers (and the internet): storage and bandwidth were scarce, and uncompressed, raw audio was rather large in file size. This was mainly solved by lossy “audio compression”, which reduces the file size dramatically; think of your mp3s. But as technology advanced and more storage and bandwidth came at our disposal, the demand for “uncompressed” audio could grow…
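To get a feel for how large raw audio is, here is a rough back-of-the-envelope sketch. The CD-style parameters (44.1 kHz, 16 bit, stereo), the 4-minute track length and the 128 kbps mp3 bitrate are assumptions chosen for illustration; they do not come from the post.

```python
# Assumed CD-style parameters, for illustration only.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def raw_bitrate_kbps(sample_rate_hz=SAMPLE_RATE_HZ,
                     bits=BITS_PER_SAMPLE,
                     channels=CHANNELS):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits * channels / 1000


def size_megabytes(bitrate_kbps, seconds):
    """File size in megabytes (10^6 bytes) for a bitrate and a duration."""
    return bitrate_kbps * 1000 * seconds / 8 / 1_000_000


raw = raw_bitrate_kbps()
print(raw)                       # uncompressed bitrate in kbps
print(size_megabytes(raw, 240))  # a 4-minute track, uncompressed
print(size_megabytes(128, 240))  # the same track as a 128 kbps mp3
```

Under these assumptions the uncompressed track is roughly eleven times larger than the mp3, which is why lossy formats were so attractive when storage and bandwidth were tight.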

What compression?


Data compression basically encodes data in such a way that the output uses fewer bits than the input. We are not going to go in depth into the maths behind specific compression methods, so do not worry about the complexity. There are two types of compression: lossy and lossless. With lossy compression you lose pieces of data that are “not very useful” according to the algorithm that was used; mp3 files are an example. In other words, the output of lossy compression cannot be converted back into its original input. With lossless compression, meanwhile, you do not lose any data while compressing; think of flac files!
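The difference between the two types can be sketched in a few lines of Python. This is only an illustration: `zlib` from the standard library stands in for "some lossless compressor", and rounding numbers stands in for "some lossy step". Neither is what flac or mp3 actually do, and the sample data is made up.

```python
import zlib

# Lossless: compress, decompress, and get exactly the same bytes back.
text = b"the quick brown fox " * 50
compressed = zlib.compress(text)
restored = zlib.decompress(compressed)

print(len(text), "->", len(compressed), "bytes")
print(restored == text)  # every byte comes back

# Lossy: detail is thrown away on purpose and cannot be recovered.
readings = [0.1234, 0.5678, 0.9012]
lossy = [round(x, 1) for x in readings]
print(lossy)  # the extra digits are gone for good
```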

A basic example of lossy compression


Let’s say we have a string/sentence, “And I was thinking to myself”. We can decide that the “to myself” part is useless information for us and say “And I was thinking” instead. If we count the number of bytes the string uses (UTF-8), we went from 28 to 18. But we lost some data that we can never get back…

A basic example of lossless compression

Let’s say we have a string with repetitive parts/words.

Such a lovely place (such a lovely place)
Such a lovely face.

(Kudos if you get the reference)

In this string we can easily “replace” the “biggest common part” with some sort of identifier: “uch a lovely ” could become “$A”, giving us

S$Aplace (s$Aplace)
S$Aface.

We can even go further and replace “ace” with “$B”.

S$Apl$B (s$Apl$B)
S$Af$B.

If we compare the number of bytes the strings take up (without taking into account the table that stores $A=“uch a lovely ” and $B=“ace”, which we can call “overhead”), we can see that some data is being saved. The original string (UTF-8, counting the line break) was 61 bytes long, the second 28 and the last one 25. But we can easily replace $A and $B with the stored text again and poof, we get the original input back!
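The substitution above can be reproduced with a small Python sketch. The function names `compress` and `decompress` and the `table` layout are made up for this illustration, and the trick only works because the “$” character never appears in the original text.

```python
original = "Such a lovely place (such a lovely place)\nSuch a lovely face."

# Substitution table, applied in this order when compressing.
# "$A" and "$B" are the identifiers used in the example.
table = [("uch a lovely ", "$A"), ("ace", "$B")]


def compress(text, table):
    """Replace each repeated phrase with its short identifier."""
    for phrase, token in table:
        text = text.replace(phrase, token)
    return text


def decompress(text, table):
    """Undo the substitutions in reverse order."""
    for phrase, token in reversed(table):
        text = text.replace(token, phrase)
    return text


packed = compress(original, table)
print(packed)
print(len(original.encode("utf-8")), len(packed.encode("utf-8")))  # 61 25
print(decompress(packed, table) == original)  # True

# The "overhead": bytes needed to store the table itself.
overhead = sum(len(phrase) + len(token) for phrase, token in table)
print(overhead)  # 20
```

Even with the 20 bytes of table overhead added, the packed text is still smaller than the original here, but on a short string with little repetition the overhead could easily outweigh the savings.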

These are of course simple and “stupid” examples, but they give a rough idea of how compression works. Real compression techniques are far more complex than these examples.

So what’s the impact on audio?


To keep this from getting too complex, I am going to limit this to one lossy compression type, mp3, and one lossless compression type, flac.

Put simply, what an mp3 encoder such as LAME does is remove some “inaudible” bands (for us humans) from the audio source. This is not an issue for regular use on regular equipment. But since that data was removed from the file, we cannot use it if we want to mix or resample the audio…

As a quick example, I imported a FLAC (lossless) file into Audacity and exported it as mp3 (lossy) at 64 kbps (yes, this low; it is just for demonstration purposes).

FLAC:

MP3:

The “difference” (data lost):

As you can see, a fair bit of data is lost in the conversion.
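The idea behind such a “difference” track can be sketched in Python. Note the assumptions: this is not how mp3 works (a real encoder decides what to drop in the frequency domain), the signal is a synthetic sine wave rather than a song, and the `quantize` helper with its step size is made up purely to have some lossy step to compare against.

```python
import math

SAMPLE_RATE = 8000  # assumed, for illustration only

# One hundredth of a second of a 440 Hz sine wave, samples in [-1, 1].
original = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(80)]


def quantize(sample, step=0.125):
    """Snap a sample to the nearest multiple of `step` (a lossy operation)."""
    return round(sample / step) * step


lossy = [quantize(s) for s in original]

# The "difference" track: original minus lossy, i.e. what was thrown away.
difference = [o - l for o, l in zip(original, lossy)]

largest_error = max(abs(d) for d in difference)
print(largest_error)  # nonzero: some information was lost
```

Subtracting the lossy signal from the original sample by sample leaves only what the lossy step removed, which is the same idea as the third picture.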

You might be thinking “my mp3s are never encoded at such a low bitrate”. This might be true, but there is definitely a difference; you might not hear it because your ears are not “trained” for it or because your equipment is not suited for it.

Note


I do not prefer any compression method over another (of the same type), nor am I going to debate whether FLAC is better than ALAC (Apple’s version) or the like.

Appendix


You can find some demos of a song at various mp3 bitrates here. The song is not copyrighted and may be used; it was not obtained in any illegal manner. I cannot, however, share the song from the screenshots, as that one is copyrighted… (and sadly I could not find any lossless or raw non-copyrighted audio)

https://soundcloud.com/thelonely-dev/sets/steemit-audio-compression-demo/s-k363M

A cool video example can be found on YouTube (not by me)



Can I think of lossy compression as some kind of dimensionality reduction? Like in PCA.

That's a nice view on lossy compression! I guess you could say so to some degree, as both methods try to reduce the amount of "random" data. (I have read a paper before on dimensionality reduction in image processing if you are interested, http://www.ijritcc.org/download/browse/Volume_5_Issues/August_17_Volume_5_Issue_8/1502352051_10-08-2017.pdf)

@thelonelydev: your content is really professional. I am resteeming this. Following too :-)

Thank you for your feedback & support :)

Great article, really enjoyed the read! I'll be taking a closer look at compression in an upcoming university course of mine, this was a great introduction. It's also pretty incredible how much MP3 cuts off, yet so many people don't really care about lossless audio.

Thanks! Yeah, lossy compression gets rid of a lot, but at least most streaming services use a decent bitrate (160kbps-320kbps for Spotify, https://support.spotify.com/us/article/high-quality-streaming/). Most people will not hear the difference in quality anyway, because of e.g. poor laptop/smartphone speakers or a lack of high-quality headphones (I don't have any either). But it certainly does matter when editing and when using a better setup! And I actually wrote this post because this was part of a class in my uni course :)


Extraordinary post, very deserving of trending. Hopefully one day my posts can trend like yours.

Great informative post. What will be the standard encoding for audio in the future? Is it going to be AAC?

Thanks!

While AAC is an improvement over mp3 (better sound for the same bitrate) it is worth noting that not every music player can play the AAC format (although it is not too bad) and that it is a proprietary format again... But it certainly would be a nice direction to head for...

But (as mentioned in the post) as bandwidth & storage become less of an issue, it would be even better to use some lossless compression to get even better sound!

As an amateur Audacity user, I've always wondered what is "lost" upon certain types of compression. Thanks for sharing!