According to our measurements, the audio signal driven by a computer sound card typically ranges from -2V to 2V. However, our ADC can only sample within its rail-to-rail range of 0V to 3.3V. It is therefore necessary to construct a circuit that decreases the amplitude of the input signal and applies a DC offset to it. In other words, we need an attenuator with a positive DC offset. We decided to use a non-inverting op amp with a summing circuit on the positive pin.
As shown in Figure 10 in Appendix A, R1 and R2 form a summing circuit that adds Vref to the input signal. The sum is then passed into the non-inverting circuit, and the output is low-pass filtered to remove noise. Let A be the desired gain and b be the desired DC offset. Then the following relations can be obtained:
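A reconstruction of these relations is sketched here, assuming the input signal drives R1, Vref drives R2, and Rf and Rg set the gain of the non-inverting stage. The assignment of R1 and R2 is an assumption and should be checked against Figure 10.

```latex
\begin{aligned}
V_{+} &= \frac{R_2\,V_{in} + R_1\,V_{ref}}{R_1 + R_2} \\
V_{out} &= \left(1 + \frac{R_f}{R_g}\right) V_{+} = A\,V_{in} + b \\
A &= \frac{R_2}{R_1 + R_2}\left(1 + \frac{R_f}{R_g}\right) \\
b &= \frac{R_1}{R_1 + R_2}\left(1 + \frac{R_f}{R_g}\right) V_{ref}
\end{aligned}
```

Under these expressions, R1 = R2 = 10kΩ, Rf = 20kΩ, Rg ≈ 30769Ω and Vref = 2V give A = 0.5 × 1.65 = 0.825 and b = 0.5 × 1.65 × 2V = 1.65V.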
In order to convert the signal range from (-2V, 2V) to (0V, 3.3V), we need to set A = 3.3 / (2 - (-2)) = 0.825 and b = 3.3 / 2 = 1.65V. Picking R1 = 10kΩ, Rf = 20kΩ, and Vref = 2V gives R2 = 10kΩ and Rg ≈ 30769Ω. In the end, we have decided on the following values:
Lastly, we have included a first-order low-pass filter to remove noise before sampling.
The PIC32 on-chip ADC does not have enough resolution to provide CD quality, and our ADC needs to support two channels in order to capture stereo sound. As such, we have chosen to use the LTC1865L, a 16-bit, dual-channel ADC. This chip supports up to 150k samples per second, more than enough for the 88.2k samples per second needed to sample two channels at 44.1kHz each.
Similar to our ADC constraints, our DAC also needs to support two channels and CD quality. From our research, DACs that achieve CD quality often require finer control and more configuration. As a result, we aimed for a chip that is simple to work with, and chose the CS4354 under these constraints. This chip requires the I2S protocol for digital data input, which the PIC32 conveniently provides. It also requires a separate 5V source, which we have provided using USB power. The output of each DAC channel is then low-pass filtered to remove noise.
Software Design
Transmitter Software Architecture
There are four main components within the transmitter program: ADC reading, compression, the ring buffer, and data transmission. The program consists of a main loop and an ISR that handles audio input. The ISR is timer driven and reads the current audio value from the ADC at a 44.1kHz sampling rate per channel. Meanwhile, the main loop waits until a contiguous run of 256 samples from both the left and right channels is available, and then transmits them as a batch.
Receiver Software Architecture
Similar to the transmitter side, there are four main components within the receiver program: data reception, ring buffer, decompression, and DAC output. Data reception is executed within the main loop, and blocks on SPI read. Once a data packet is read, it is then unpacked and pushed into the ring buffer. A separate I2S-based ISR handles decompressing the data and writing to the DAC.
Adaptive Differential Pulse Code Modulation
Our ADPCM code is based heavily on the sample code in Microchip's AppNote. As mentioned in the math background, we keep track of the predictor and the quantization step separately for the left and right channels. This is because the two audio streams are independent, and we do not want the state of one to influence the other.
On the encoding side, we take in a 16-bit audio sample and generate a 4-bit encoded value. The most significant of the four bits determines the sign of the delta. If it is 1, then our current audio sample is smaller than the previously predicted sample. Otherwise, the audio sample is higher than the previously predicted sample. The remaining 3 bits represent how many quantization steps the difference spans. If the remaining 3 bits are greater than or equal to 0b100, then we increase our quantization step. Otherwise, we decrease the quantization step.
Figure 5: ADPCM Encoding
The decoding side does the same as the encoding side but in reverse. It takes in a 4-bit encoded value and outputs the 16-bit decoded audio sample. It achieves this by recalculating the delta from the encoded value and adding it to the predicted value. The quantization step change is calculated in the same way as in encoding.
Figure 6: ADPCM Decoding
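As a minimal sketch of the encode and decode steps described above, the following assumes the standard IMA ADPCM step-size and index tables. It is not the project's actual code; the type and function names (adpcm_state_t, adpcm_apply, adpcm_encode, adpcm_decode) are hypothetical. One state struct would be kept per channel.

```c
#include <stdint.h>

/* Standard IMA ADPCM step sizes (assumed variant, not taken from the report). */
static const int16_t STEP_TABLE[89] = {
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

/* Magnitudes below 0b100 shrink the step; 0b100 and above grow it. */
static const int8_t INDEX_TABLE[8] = { -1, -1, -1, -1, 2, 4, 6, 8 };

typedef struct {
    int32_t predictor; /* previously predicted sample */
    int8_t  index;     /* current position in STEP_TABLE */
} adpcm_state_t;

/* Apply a 4-bit code to the state; shared by encoder and decoder so
 * that both sides track exactly the same prediction. */
static int16_t adpcm_apply(adpcm_state_t *s, uint8_t code)
{
    int32_t step = STEP_TABLE[s->index];
    int32_t diff = step >> 3;
    if (code & 4) diff += step;
    if (code & 2) diff += step >> 1;
    if (code & 1) diff += step >> 2;

    if (code & 8) s->predictor -= diff;   /* sign bit set: sample is lower */
    else          s->predictor += diff;
    if (s->predictor > 32767)  s->predictor = 32767;
    if (s->predictor < -32768) s->predictor = -32768;

    int idx = s->index + INDEX_TABLE[code & 7];
    if (idx < 0)  idx = 0;
    if (idx > 88) idx = 88;
    s->index = (int8_t)idx;
    return (int16_t)s->predictor;
}

/* 16-bit sample in, 4-bit code out. */
uint8_t adpcm_encode(adpcm_state_t *s, int16_t sample)
{
    int32_t step = STEP_TABLE[s->index];
    int32_t diff = (int32_t)sample - s->predictor;
    uint8_t code = 0;

    if (diff < 0) { code = 8; diff = -diff; }
    if (diff >= step)        { code |= 4; diff -= step; }
    if (diff >= (step >> 1)) { code |= 2; diff -= step >> 1; }
    if (diff >= (step >> 2)) { code |= 1; }

    adpcm_apply(s, code);
    return code;
}

/* 4-bit code in, 16-bit reconstructed sample out. */
int16_t adpcm_decode(adpcm_state_t *s, uint8_t code)
{
    return adpcm_apply(s, code & 0x0F);
}
```

Because the encoder updates its own state through the same routine the decoder uses, the two sides stay in step as long as no codes are lost in transit.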
We used the SPI protocol to communicate with our ADC, with the transmitter MCU as master and the LTC1865L as slave. As the LTC1865L provides stereo input, we alternate between requesting the left and right audio signals. In order to tell the ADC which side to read from, we send two configuration bits and then read the response. As per the datasheet, 0b10 is sent to request the left channel, while 0b11 is sent to request the right channel.
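The channel selection above can be sketched as follows. This assumes 16-bit SPI frames sent MSB first, so that the two configuration bits sit at the top of the transmitted word; that framing, and the names adc_channel_t, ltc1865_command and next_channel, are assumptions for illustration. The SPI transfer itself is omitted because it depends on the PIC32 peripheral setup.

```c
#include <stdint.h>

typedef enum { ADC_LEFT = 0, ADC_RIGHT = 1 } adc_channel_t;

/* Build the word shifted out to the ADC: 0b10 selects the left channel,
 * 0b11 selects the right channel, placed in the two most significant bits. */
uint16_t ltc1865_command(adc_channel_t ch)
{
    uint16_t config = (ch == ADC_RIGHT) ? 0x3u : 0x2u;
    return (uint16_t)(config << 14);
}

/* Alternate sides on each request. */
adc_channel_t next_channel(adc_channel_t ch)
{
    return (ch == ADC_LEFT) ? ADC_RIGHT : ADC_LEFT;
}
```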
We read audio data on a timer interrupt set to 88.2kHz. The timer runs at twice the audio sampling rate in order to service both the left and right channels. On each timer interrupt, we alternate sides, then compress the audio sample and store it in a ring buffer. We compress before storing in order to reduce the amount of memory required. One caveat of this approach is that the left and right audio signals are offset from each other by one timer period, about 11.3µs (1/88.2kHz). However, such a small lag is completely undetectable by the human ear.
We used the I2S protocol in order to communicate with the DAC. The PIC32 supports this protocol as an extra setting on the SPI channels that it provides. We use the receiver MCU as master and provide SCLK, MCLK, and LRCLK entirely from the MCU. While the CS4354 chip is able to produce an internal SCLK, we discovered that the internal SCLK often caused desynchronization issues.
As per the I2S specifications, we used an LRCLK with a frequency of 44.1kHz, equal to the sampling rate. We also used the REFCLK output of the PIC32 to act as MCLK, generating an 11.2896MHz clock. On every LRCLK edge, we pull the next audio sample from the ring buffer, decompress it, and send it to the DAC. Once again, we alternate left and right, resulting in the same channel offset as on the ADC side. Because the DAC we use supports up to 24 bits of resolution, we left shift our 16-bit audio sample by 8 bits in order to present it as a 24-bit number.
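The clock relationship and the 16-to-24-bit widening can be illustrated with a short sketch. The macro and function names are hypothetical, and the ratio of 256 is simply what the quoted 11.2896MHz and 44.1kHz figures imply. Multiplication is used instead of a left shift because shifting a negative signed value is undefined behaviour in C.

```c
#include <stdint.h>

#define SAMPLE_RATE_HZ 44100u                        /* LRCLK frequency */
#define MCLK_RATIO     256u                          /* 11.2896 MHz / 44.1 kHz */
#define MCLK_HZ        (SAMPLE_RATE_HZ * MCLK_RATIO) /* 11,289,600 Hz */

/* Widen a signed 16-bit sample to a signed 24-bit value held in an int32_t.
 * Equivalent to a left shift by 8, but well defined for negative samples. */
int32_t to_i2s_24bit(int16_t sample)
{
    return (int32_t)sample * 256;
}
```

The full 16-bit range maps onto the top of the 24-bit range, so the lowest 8 bits of every output word are zero.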
We use a ring buffer to continuously buffer the audio input and output of the chip. The ring buffer acts as a FIFO queue, which preserves the ordering of the audio data stream. It also allows us to accumulate a large run of samples before transmitting them all in a single packet. On the receiving side, the ring buffer holds the incoming packets until they are output to the DAC.
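A minimal single-producer, single-consumer ring buffer along these lines might look like the sketch below. It is not the project's ring buffer code; the capacity and the names ring_buffer_t, rb_count, rb_push and rb_pop are hypothetical. The indices are free-running counters, so the capacity is assumed to be a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

#define RB_CAPACITY 1024u /* hypothetical size; must be a power of two */

typedef struct {
    volatile uint16_t head;    /* total bytes written (producer side) */
    volatile uint16_t tail;    /* total bytes read (consumer side) */
    uint8_t data[RB_CAPACITY];
} ring_buffer_t;

/* Number of bytes currently stored; unsigned wraparound keeps this
 * correct even after the counters overflow. */
uint16_t rb_count(const ring_buffer_t *rb)
{
    return (uint16_t)(rb->head - rb->tail);
}

/* Append one byte; returns false if the buffer is full. */
bool rb_push(ring_buffer_t *rb, uint8_t value)
{
    if (rb_count(rb) == RB_CAPACITY) return false;
    rb->data[rb->head & (RB_CAPACITY - 1u)] = value;
    rb->head++;
    return true;
}

/* Remove the oldest byte; returns false if the buffer is empty. */
bool rb_pop(ring_buffer_t *rb, uint8_t *out)
{
    if (rb_count(rb) == 0u) return false;
    *out = rb->data[rb->tail & (RB_CAPACITY - 1u)];
    rb->tail++;
    return true;
}
```

With one writer (for example the ISR) and one reader (for example the main loop), each index is modified by only one side, which is why no locking appears in this sketch.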
We used the SPI protocol for data transmission between the transmitter and receiver. The transmitter MCU is set as master, and the receiver MCU is set as slave. Each data packet contains 256 bytes, where each byte corresponds to a single unit of compressed data containing information from both the left and right audio channels. In order to identify packet boundaries, we standardized on a 4-byte sync word, which is constructed by taking a single-byte constant and repeating it four times. The sync word is not strictly necessary in our current setup, since the receiver can pick up any consecutive 256 bytes and treat them as a packet. However, we designed the system with potential wireless usage in mind, and the sync word very roughly mimics a UDP packet header. The transmitter continuously transmits packets without any acknowledgement from the receiver, since waiting for acknowledgements could accumulate delay over time without much performance benefit. The receiver, on the other hand, does not push to the ring buffer until a packet header is recognized.
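The packet layout can be sketched as follows. Several details are assumptions: the sync constant 0xA5, the left sample occupying the high nibble, and the sync word being sent in addition to the 256 payload bytes rather than counted among them. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SYNC_BYTE   0xA5u /* hypothetical constant, repeated four times */
#define SYNC_LEN    4u
#define PAYLOAD_LEN 256u
#define PACKET_LEN  (SYNC_LEN + PAYLOAD_LEN)

/* Combine one 4-bit left code and one 4-bit right code into a byte. */
uint8_t pack_stereo(uint8_t left, uint8_t right)
{
    return (uint8_t)(((left & 0x0Fu) << 4) | (right & 0x0Fu));
}

/* Prefix the payload with the sync word. */
void build_packet(uint8_t out[PACKET_LEN], const uint8_t payload[PAYLOAD_LEN])
{
    memset(out, SYNC_BYTE, SYNC_LEN);
    memcpy(out + SYNC_LEN, payload, PAYLOAD_LEN);
}

/* Scan a byte stream for four consecutive sync bytes. Returns the offset
 * of the first byte after the sync word, or -1 if none is found. */
long find_payload(const uint8_t *stream, size_t len)
{
    size_t run = 0;
    for (size_t i = 0; i < len; i++) {
        run = (stream[i] == SYNC_BYTE) ? run + 1u : 0u;
        if (run == SYNC_LEN) return (long)(i + 1u);
    }
    return -1;
}
```

At one byte per stereo sample pair, a 256-byte payload covers 256 / 44100 ≈ 5.8ms of audio. A scanner this simple can be misled if the compressed data itself contains four sync bytes in a row, so a more robust design would add a length field or a checksum.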
Sound Quality and Accuracy
In order to isolate the performance of our compression scheme, we must ensure that the audio input circuit produces a signal with nearly the same waveform as the audio input, differing only in amplitude and offset. Figure 7 shows that the circuit operates as expected for both left and right channels while audio is playing, and Figure 8 demonstrates the expected DC offset when there is no audio.
Figure 7: Typical audio signal comparison for left channel (left) and right channel (right). CH1 is the audio source, and CH2 is the output of the op-amp audio input circuit.
Figure 8: Signal comparison for left channel (left) and right channel (right), with no sound. CH1 is the audio source, and CH2 is the output of the op-amp audio input circuit.
Figure 9 shows the comparison between the audio input and the final output of the system. Note that ADPCM very frequently causes the output to overshoot before quickly returning to the correct value. This is the adaptive part of the algorithm at work: the system tries to maintain the previous increment, but when the input changes drastically, the algorithm changes the increment by a large amount to compensate. Another observation is that the output is displayed at 100mV per division instead of 500mV per division, so amplitude is not preserved.
In terms of sound quality, the output overall very closely resembles the audio input. There is a slight detectable noise in the background, which may be due to oscillations in the power line. Another bug is that the output is very quiet when the system starts, and the volume then jumps up suddenly at sporadic times. This is related to the difference in amplitude scale mentioned above. Due to time constraints, we were unable to investigate this issue, but we suspect that it could be related to overflow or underflow in the ADPCM algorithm implementation.
Figure 9: Typical audio signal comparison for left channel (left) and right channel (right). CH1 is the audio source, and CH2 is the final system output.
Speed of Execution
There is no audible delay between the audio input and audio output, so we were able to reach our performance goals. We also let the system run for a long period of time in order to test whether small delays accumulate into larger delays over time, but there was still no audible discrepancy. As shown in Figure 9, the right channel has a slight delay on the order of a millisecond, but this is not audible.
Our final result met our original expectation of being able to transmit compressed audio. We were also able to meet the goal of transmitting CD quality sound in stereo at 44.1kHz with 16-bit resolution. Our compression scheme yields a compression ratio of 4:1 with very little noticeable degradation of sound quality. However, we were unable to extend the transmission protocol to the wireless domain due to time constraints.
Further work that can be done in this project includes exploring different audio compression methods. We want to be able to compress audio in real time on a chip with relatively low processing power, so other real-time audio compression methods are worth examining. For example, Opus is currently one of the leading real-time audio codecs. Boasting a low algorithmic delay of around 25ms, Opus is used in many commercial VoIP programs such as Skype, Mumble, and TeamSpeak. Additionally, Opus uses two subcodecs, CELT and SILK, to compress music and speech audio, allowing it to maintain quality for both. If Opus proves to be too computationally intensive to run on a PIC32, we can look further into either CELT or SILK on its own.
Because our transmission protocol utilizes a packet based transmission scheme, we can easily extend this to work across other types of channels. For this project, we transmit packets over SPI to another PIC32, but we could swap the SPI protocol out for either Bluetooth or WiFi. These protocols would enable us to transmit our audio data wirelessly between two microcontrollers, creating an end to end wireless audio communication protocol.
Intellectual Property Considerations
Excluding the ADPCM code, the code used for this project was entirely written by us. While we used a few pieces of sample code from Microchip as reference, we ultimately wrote everything ourselves.
A portion of the ADPCM code was taken from the AppNote from Microchip. However, in order to get it to work on the PIC32 we were using, we had to modify the sample code. The AppNote for ADPCM has a Software License Agreement stating that the sample code is intended to be used only with products manufactured by Microchip. We have abided by this license, as the PIC32 is manufactured by Microchip.
We designed this project keeping in mind the IEEE Code of Ethics.
We did not use any harmful materials in the construction of our project. Additionally, our final product was constructed with human safety in mind. Neither the transmitter nor the receiver was designed to cause any harm to humans. All of our hardware runs at low current at either 3.3V or 5V, so the current will not harm a user even if they touch the exposed wiring. If we were to build this device for sale rather than as a prototype, we would remove the whiteboards to reduce the wiring exposed to the end user. Additionally, as the receiver and transmitter modules are self-contained, we could add a casing for both modules for further safety.
Our project is designed to be able to be used by anyone. This means that even people with disabilities will be able to utilize our project.
All work in this project stated as independent work was done entirely by ourselves. We have also included references to all of the datasheets, figures, and additional information we have used in the Appendices and References sections. We certify that we have openly disclosed anyone else's work we have used in this project.
There are no legal considerations as far as we know. There are no legal regulations associated with either the SPI or I2S protocols.
Figure 10: Audio input schematic
Figure 11: ADC Schematics. Source: LTC1865L datasheet. Note: the LTC1865L schematic is identical to the LTC1864, with the exception of the IN+ and IN- pins replaced by CH0 and CH1.
Figure 12: DAC Schematics. Source: CS4354 datasheet. Actual component values differ slightly due to lab supply limits.
B. Cost & Parts List
MCP6242 Op Amp
5V Power Connector (USB)
Assorted Resistors, Capacitors and LEDs
C. Distribution of Work
Hardware Design & Implementation | Software Design & Implementation
Audio Input Design & Implementation | ADPCM Research & Implementation
D. Code Listing
Ring Buffer code:
We would like to thank Bruce Land for teaching this fantastic class. We would also like to thank the lab TAs for helping with debugging and keeping the lab open for us throughout the semester.