Audio Synthesis

Hardware:

The 8-bit DAC drives the audio output through an audio jack. The DAC needs more than a single supply rail: it requires +5 V, ground, +12 V, and -12 V connections. The analog output is sent directly to an internally amplified computer speaker. The end result sounded really good. Note that no op-amps or other analog circuitry were needed in the signal path.

Software:

The software was fairly complex, but it had to be kept short to meet the speed requirements of producing audio in real time.

At bootup, the UART (if connected properly) prompts the user to enter the first sample. The easiest way to upload the samples is to open the sample file in Notepad, copy the text, and use 'Paste to Host' in HyperTerminal. The sample format is simple ASCII hex data that looks like the following: 80 8D 7F 00 FF and so on. It does not matter where the spaces or carriage returns are. The end of each sample's data is marked by an 'X'. After four Xs it is assumed that all the samples have been received and the upload process is over.
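The upload format described above can be parsed one character at a time. The following is only a sketch of that idea, not the project's actual code: the `upload_t` structure, the function names, and the flat destination buffer are all assumptions made for illustration (the real firmware stores samples in external SRAM).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SAMPLES 4  /* four 'X' markers end the upload */

/* Hypothetical upload state; all names here are illustrative. */
typedef struct {
    uint8_t *mem;               /* destination buffer */
    size_t   capacity;          /* size of mem in bytes */
    size_t   write_pos;         /* next free byte in mem */
    size_t   end[NUM_SAMPLES];  /* end offset (exclusive) of each sample */
    int      samples_done;      /* number of 'X' markers seen so far */
    int      nibbles;           /* hex digits collected for the current byte */
    uint8_t  current;           /* partially assembled byte */
} upload_t;

/* Convert one ASCII hex digit to its value, or -1 if it is not a hex digit. */
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

void upload_init(upload_t *u, uint8_t *mem, size_t capacity) {
    u->mem = mem;
    u->capacity = capacity;
    u->write_pos = 0;
    u->samples_done = 0;
    u->nibbles = 0;
    u->current = 0;
    for (int i = 0; i < NUM_SAMPLES; ++i) u->end[i] = 0;
}

/* Feed one received character. Returns true once all four samples are in. */
bool upload_feed(upload_t *u, char c) {
    if (u->samples_done >= NUM_SAMPLES) return true;

    if (c == 'X') {
        /* End of one sample: drop any half-finished byte, record the end. */
        u->nibbles = 0;
        u->current = 0;
        u->end[u->samples_done++] = u->write_pos;
        return u->samples_done >= NUM_SAMPLES;
    }

    int v = hex_value(c);
    if (v < 0) return false;  /* spaces, CR, LF and anything else are ignored */

    u->current = (uint8_t)((u->current << 4) | v);
    if (++u->nibbles == 2) {
        /* Two hex digits make one byte; store it if there is room. */
        if (u->write_pos < u->capacity) u->mem[u->write_pos++] = u->current;
        u->nibbles = 0;
        u->current = 0;
    }
    return false;
}
```

Because whitespace is simply skipped, line breaks inserted by the terminal program do not affect the stored data, which matches the behavior described above.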

In real-time mode the processor has several tasks to perform to create each audio sample. The following things need to happen 22050 times per second:

Scan 4 input lines and determine if a voice needs to start playing.
Determine if a voice is playing at all.
Determine if a voice is at the end of its sample area and should stop playing on the next loop.
Load all sample data from SRAM for each active voice.
Create the output waveform using a noise-dither technique.
Wait until the interrupt is called to output the wave to hardware.
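To get a feel for how tight this loop is, the cycle budget per output sample can be estimated as below. The clock frequency is not stated in this write-up, so any value passed in is purely an assumption, and `cycles_per_sample` is a hypothetical helper.

```c
#include <stdint.h>

/* Whole CPU cycles available between two output samples.
 * cpu_hz is an assumed clock frequency, not a value from the project. */
static inline uint32_t cycles_per_sample(uint32_t cpu_hz, uint32_t sample_rate_hz) {
    return cpu_hz / sample_rate_hz;
}
```

For example, at an assumed 16 MHz clock, `cycles_per_sample(16000000, 22050)` gives 725 cycles in which all of the steps listed above must finish for all four voices.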

The above scheme has the following audio properties. Any incoming MIDI data starts the corresponding waveform, which plays to completion. If a MIDI packet is received for a voice that is already playing, that voice is cut off and the new one is started. If a packet is received on a different channel, the old voice continues to play. Using this technique, up to 4 voices can be played simultaneously.
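The trigger and cut-off behavior can be sketched with a small per-voice state record. This is an illustration under assumptions, not the project's code: `voice_t`, `voice_trigger`, and `voice_next` are hypothetical names, and the sample data is read from an ordinary array rather than from SRAM.

```c
#include <stdbool.h>
#include <stdint.h>

#define SILENCE 127  /* contribution of a voice that is not playing */

/* Hypothetical per-voice state. */
typedef struct {
    const uint8_t *data;    /* start of this voice's sample area */
    uint16_t       length;  /* number of bytes in the sample */
    uint16_t       pos;     /* index of the next byte to play */
    bool           playing; /* true while the voice is active */
} voice_t;

/* Start a voice. If it is already playing, this cuts it off and
 * restarts it from the beginning. Other voices are not touched. */
void voice_trigger(voice_t *v) {
    v->pos = 0;
    v->playing = (v->length > 0);
}

/* Return this voice's contribution for the current tick and advance it. */
uint8_t voice_next(voice_t *v) {
    if (!v->playing) return SILENCE;
    uint8_t s = v->data[v->pos++];
    /* At the end of the sample area, stop so the next tick is silent. */
    if (v->pos >= v->length) v->playing = false;
    return s;
}
```

Keeping one record per voice means a retrigger only resets that voice's position, which is why a packet on another channel leaves the voices already playing undisturbed.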

A sample which is not playing adds 127 to the output wave data. Thus, silence on all 4 channels corresponds to 127+127+127+127=508. A random 2-bit number is then added, which improves the reduction to 8 output bits. The resulting 10-bit number is then shifted 2 bits to the right, producing an 8-bit output. In complete silence, because of dithering, the output will fluctuate between 127 and 128. A full-scale output on all 4 voices will produce 255 at the output, and the minimum on all the channels will produce 0. The details of the theory behind the dither are complicated, but you can just assume that the noise created by the dither masks the errors that would be created if we had simply truncated the last 2 bits of the signal, resulting in a much more pleasing sound to the ear.
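The mixing arithmetic can be written out as follows. This is a sketch of the steps as literally described, not the original firmware: `mix_dithered` and `dither2` are hypothetical names, and the 16-bit LFSR used as a noise source is an assumption, since the text does not say how the random 2-bit number is generated.

```c
#include <stdint.h>

#define NUM_VOICES 4

/* Sum four 8-bit voice contributions, add a 2-bit dither value,
 * and reduce the 10-bit result to 8 bits by shifting right by 2. */
uint8_t mix_dithered(const uint8_t v[NUM_VOICES], uint8_t dither) {
    uint16_t sum = 0;
    for (int i = 0; i < NUM_VOICES; ++i) sum += v[i];
    sum += dither & 0x03u;      /* at most 4*255 + 3 = 1023, fits in 10 bits */
    return (uint8_t)(sum >> 2);
}

/* Assumed noise source: one step of a 16-bit Galois LFSR.
 * *state must be non-zero. Returns the low 2 bits of the new state. */
uint8_t dither2(uint16_t *state) {
    uint16_t lsb = *state & 1u;
    *state >>= 1;
    if (lsb) *state ^= 0xB400u;
    return (uint8_t)(*state & 0x03u);
}
```

With this exact arithmetic, the dither changes the output only when the low 2 bits of the sum are non-zero: a sum of 510 comes out as 127 or 128 depending on the dither value, whereas a sum of exactly 508 comes out as 127 for every dither value. The original firmware may differ in detail from this sketch.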

For the details of how the loop is implemented, see our code.