In my last post, Plotting a Sound Wave in Flash AS3, I detailed a method for displaying audio data. The method itself works great, but because of Flash’s frame-based code execution and event processing, the user loses input capabilities while Flash Player chugs through millions of numbers, adding, rounding and comparing. To make displaying an audio waveform easier on both the programmer and the user, I wrote a class that analyzes a Sound object progressively and dispatches a special event containing the analyzed data. The class constructs a left and a right channel Vector, each containing one data point [a number between 0 and 1] per window, for a given number of windows between two positions in the sound. The left and right positions are measured in samples, and two types of analysis are offered. Here is a demo of the class in action:
Screen Capture of Waveform Plot
The calculated data can be reached incrementally through the WaveformEvent object, which is dispatched every frame, or, once all analysis has finished, through the Waveform object’s leftChannel and rightChannel properties. The details are listed in the documentation below.
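To make the "progressive" part concrete, here is a rough sketch of the idea in Python rather than AS3. It is not the class itself: the function name, its parameters, and the choice of peak absolute amplitude as the analysis type are all assumptions made for illustration. Each value the generator yields stands in for the data one per-frame event would carry.

```python
from typing import Iterator, List, Sequence, Tuple


def analyze_progressively(
    left: Sequence[float],
    right: Sequence[float],
    start: int,
    end: int,
    num_windows: int,
    windows_per_step: int,
) -> Iterator[Tuple[List[float], List[float]]]:
    """Yield (left_points, right_points) for a few windows at a time.

    Each yield stands in for one frame's worth of work, so the caller can
    handle input or redraw between steps. Peak absolute amplitude is an
    assumed analysis type, not necessarily one the original class offers.
    """
    span = end - start  # positions are measured in samples
    for first in range(0, num_windows, windows_per_step):
        last = min(first + windows_per_step, num_windows)
        left_points: List[float] = []
        right_points: List[float] = []
        for w in range(first, last):
            # Integer window bounds that tile [start, end) without gaps.
            lo = start + (w * span) // num_windows
            hi = start + ((w + 1) * span) // num_windows
            # One data point per window, between 0 and 1 for PCM input.
            left_points.append(max((abs(s) for s in left[lo:hi]), default=0.0))
            right_points.append(max((abs(s) for s in right[lo:hi]), default=0.0))
        yield left_points, right_points
```

Collecting every yielded batch in order gives the same thing as reading the finished left and right channel lists at the end, which mirrors the two ways of getting at the data described above.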
I’ve always been really into wave editors. I used to make songs in Amadeus by piecing together samples from other songs. Tedious, but very rewarding. In a post I made not long ago I detailed a little Theremin project which included some wave data visualization. In this post I’ll be going further into detail about plotting sound data.
Digital audio in its rawest form [PCM wave data] is a long list of numbers between -1 and 1, which represent the sound’s amplitude. Another way of thinking about this is that each number represents your speaker’s distance away from its rest position. At 1 the speaker is fully extended, blowing out your ears and scaring your cats, while at -1 it is fully retracted, blowing out your ears and scaring your cats. To make a meaningful visual out of this we just set up a graph where time is plotted on the horizontal axis and amplitude on the vertical. So at a really high resolution, that might look like this:
Okay, maybe it would look like that if you were living in the ’80s. Or if you were really into oscilloscopes. In reality, with most popular songs being two to five minutes long, we’d be looking at HUGE graphs. One three-minute song sampled at 44.1 kHz comes out to about eight million samples per channel [180 seconds × 44,100 samples per second = 7,938,000]. Per channel. Since most modern music is in stereo, we’re looking at two graphs now. So how do we compress this data and view it in a meaningful way? We cheat a little. We kinda scrap the whole graph/function thing. Well, kinda. Let’s say you have a window 1000 pixels wide and a sound 3 minutes long. We have to squeeze enough samples together to represent them with a single pixel [about 7,938 samples per pixel]. Averaging doesn’t work because, with values oscillating between -1 and 1, the mean is usually close to zero. We could take one sample every so often to represent an entire chunk and plot that, but that’s just resampling at a much lower resolution, which results in aliasing and all sorts of bogus stuff. Take this video for instance:
Except instead of helicopter blades standing still, it’ll be your data points. No, instead what we do is scan every single sample and pick out the largest and the smallest numbers from each chunk, which in our case means running through each chunk of 7,938 samples and picking the biggest and smallest ones. Then we plot them on top of each other, maybe with a line connecting them. Flash gets a little slow working with lists and arrays this big, but I’m sure we’ll figure out some neat tricks to get this stuff working fast. That said, here’s a little demo. If you can figure out the song I’ll give you a million bucks. [Let it chug for a while, Flash is a slow beast]:
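For anyone who wants the min/max idea spelled out, here is a small sketch in Python rather than AS3. It is not the code behind the demo; both function names are made up for illustration, and it assumes a single channel already sitting in memory as a list of floats between -1 and 1.

```python
from typing import List, Sequence, Tuple


def samples_per_pixel(duration_s: float, sample_rate: int, width_px: int) -> int:
    """How many samples each pixel column has to stand for."""
    return int(duration_s * sample_rate) // width_px


def min_max_columns(
    samples: Sequence[float], width_px: int
) -> List[Tuple[float, float]]:
    """Return one (lowest, highest) pair per pixel column."""
    total = len(samples)
    columns: List[Tuple[float, float]] = []
    for x in range(width_px):
        # Integer chunk bounds that cover every sample exactly once.
        lo = (x * total) // width_px
        hi = ((x + 1) * total) // width_px
        if hi == lo:
            # More pixels than samples: nothing to scan for this column.
            columns.append((0.0, 0.0))
            continue
        chunk = samples[lo:hi]
        columns.append((min(chunk), max(chunk)))
    return columns
```

Calling `samples_per_pixel(180, 44100, 1000)` gives the 7,938 figure from earlier. Each pair from `min_max_columns` then becomes one vertical line on screen, drawn from the lowest value to the highest at that x position.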