ECE 4760 16-bit Stereo Wave Player

Introduction

This project implements a cost-effective wave player based on an AVR microcontroller (ATmega / ATtiny series) with CD-audio quality. It plays standard 8-bit/16-bit mono/stereo RIFF (Resource Interchange File Format) wave files. Possible applications include automatic announcement systems for buses and subways and elevator voice indication systems.

Current solutions for this kind of "announcing system" are limited by OTP (One-Time Programmable) voice chips, which have small capacity (they normally use EPROM as the storage medium) and are relatively expensive: the price of an OTP voice IC is determined by its capacity, that is, its voice recording time (normally 10 s to 200 s). For instance, the OTP chip AP89010 (manufactured by APlus) costs $0.48 (10 seconds), and the AP89341 costs $2.43 (150 seconds). One announcing system may use multiple voice chips (2 to 10 pieces), which also complicates the hardware design. In contrast, standard SD cards keep getting cheaper as storage technology develops; a 512MB SD card (Kingston) costs only $3.65. Compared to an OTP chip, 512MB of storage can hold as many as 128 songs in MP3 format or 10 lossless songs in WAV format. The key point is that an SD card can be easily reformatted, and songs or recorded files can be modified at will, as long as you have a PC that supports the FAT file system.

As mentioned above, an announcing system built on OTP chips has a complicated circuit: different announcements need to be played from different chips, and the control logic is a burden that increases circuit complexity and PCB area. In contrast, our system needs only 6 main wires (SPI bus: 4 wires, PWM output channels: 2 wires) to play songs or any recorded wave files. An AVR chip can be purchased for as little as $4 (ATmega328), so the whole system can be very cost-effective.


High Level Design

Project Idea

There are many MCU-based MP3 players. However, this kind of solution needs not only an extra hardware decoder but also a DAC chip, and the circuit is complicated. How can we design a cheap music player while preserving the audio quality? To replace the MP3 decoder, we can simply use WAVE files, whose format is simpler than MP3 and can be easily decoded by an MCU. To replace the DAC chip, PWM with a simple R-C filter can be used, as in the Cricket Call Generator. From this idea, the AVR-based WAVE player was born.

Background Knowledge

Since an SD card is used as the storage medium, the only way to communicate with it, apart from SDIO, is SPI. In general, SPI has four working modes.

To operate an SD card over SPI, the working mode must be specified. In "SD Specifications Part 1 Physical Layer Simplified Specification", Section 7.2 describes part of the SPI protocol but does not specify these details. After several experiments, the SPI working modes found suitable for SD card communication were MODE0 and MODE1.
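
The four SPI modes differ only in clock polarity (CPOL) and clock phase (CPHA). The following is a minimal sketch of that mapping, assuming the standard AVR SPCR register layout (CPOL in bit 3, CPHA in bit 2). The helper name `spi_mode_to_spcr_bits` is hypothetical and not taken from the project source.

```c
#include <stdint.h>

/* Bit positions of CPOL and CPHA in the AVR SPCR register. */
#define SPCR_CPOL_BIT 3
#define SPCR_CPHA_BIT 2

/* Hypothetical helper: map SPI mode 0..3 to the CPOL/CPHA bits of SPCR.
 * mode = (CPOL << 1) | CPHA, so mode 0 is CPOL=0/CPHA=0 and mode 3 is 1/1. */
static uint8_t spi_mode_to_spcr_bits(uint8_t mode)
{
    uint8_t cpol = (uint8_t)((mode >> 1) & 1u);
    uint8_t cpha = (uint8_t)(mode & 1u);
    return (uint8_t)((cpol << SPCR_CPOL_BIT) | (cpha << SPCR_CPHA_BIT));
}
```

On the target, the returned bits would be OR-ed with the enable, master and clock-rate bits when SPCR is written.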

Meanwhile, instead of reading RAW data from the SD card, a FAT16/32 file system is implemented, so that WAVE files can be easily stored on the SD card from any PC with an SD card reader. The FAT file system is the open-source "Petit FatFs", a simpler version of the popular "FatFs" developed by ChaN.

With the knowledge above, I still need to get familiar with the WAVE file format, which is a subset of the "RIFF" format developed by Microsoft. The RIFF audio format is introduced in detail in the software section.

Petit FatFs Open-Source File System Module

Logical Structure

At a high level, the logical structure is simple. It contains a "Storage Module" and a "PWM-DAC Module". With the embedded hardware SPI interface and the Petit FatFs module, WAVE files stored on the SD card can be retrieved. The ATmega1284 then parses the WAVE file and extracts the pure audio data, which drives the PWM outputs (4 channels). The four 8-bit PWMs are divided into two groups (corresponding to the left and right audio channels), and each group is combined into one 16-bit "PWM-DAC" output. These two outputs are finally connected to the loudspeaker.

System Logical Structure

User Experience (UE), Hardware (HW) and Software (SW) Trade-offs

There are always trade-offs among UE, HW and SW. This project shows how to handle them appropriately.

UE and SW

There are two methods to retrieve audio data from the SD card. We can write RAW audio data onto the SD card with one of many third-party software tools. In this case the software design is easy: all the MCU needs to do is read audio data via the SPI interface and forward the audio data stream to the PWM channels. However, replacing the audio data on the SD card becomes very inconvenient.

The other method is to use a standard file system so that the SD card can be used normally in any PC with FAT support. With this method, it is very convenient to replace or update audio files. The difficulty is that a lightweight file system must be implemented so that the MCU can retrieve data via standard FAT16/32.

HW and SW

MP3 is a popular audio format with small file size and good audio quality. If we used MP3 files as the audio source, a hardware decoder would be required: the ATmega1284, like any other low-cost 8-bit MCU, cannot guarantee playback quality while decoding the complex compressed MP3 format. In contrast, the WAVE format is simple: it consists of a file header followed by RAW audio data, which can be output directly to the PWM channels (16-bit audio data needs only a simple subtraction).

Another advantage of WAVE files is that they are lossless. As long as we can guarantee the playback timing (outputting audio data at exactly the sample rate), we can in theory achieve very high audio quality. A WAVE file is much bigger than an MP3 file, but that is not a problem because SD cards are so cheap these days.

Standards

SD Specifications Part 1 Physical Layer Simplified Specification (version 4.10);
Microsoft Extensible Firmware Initiative FAT32 File System Specification (version 1.03);
Microsoft New Multimedia Data Types and Data Techniques (version 3.0);
ANSI-C Specification (ISO / IEC 9899);
GCC Manual (version 4.9.2).

Relevant Copyrights

Petit FatFs is a lightweight FAT file system module that supports FAT12/16/32, developed by ChaN. It is specially designed for MCUs with little RAM.

Hardware

Hardware Overview

As mentioned before, the hardware is pretty simple. The whole schematic is shown below.

System Schematic

Hardware Design: SD Card SPI Interface

As mentioned in the schematic note, the SD card only accepts a 3.3V operating voltage level. Even though the ATmega1284P can also work at 3.3V, it may not be able to run at 16MHz at that voltage ("Atmel-8272-8-bit-AVR-microcontroller-ATmega164A_PA-324A_PA-644A_PA-1284_P_datasheet", p. 336, Figure 28-1). To ensure system stability, the ATmega1284P is powered by 5V while the SD card is powered by 3.3V from a cheap linear regulator, the AMS1117-3V3 (not shown in the schematic).

To interface with the SD card correctly, resistor dividers (3.3K and 1.8K) guarantee that the AVR's SPI output signals never exceed 3.3V. Note that the MISO signal, which the card drives at 3.3V, is accepted by the AVR as a logic high, so it does not need a resistor divider.
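
The divider level can be checked with a short calculation. This is a sketch under one assumption that the text does not state: the 1.8K resistor is in series with the AVR pin and the 3.3K resistor goes to ground, which is the arrangement that keeps a 5V signal just below 3.3V. The function name `divider_out_mv` is illustrative only.

```c
#include <stdint.h>

/* Output of an unloaded resistor divider, in millivolts.
 * r_top_ohm is the series resistor from the driving pin,
 * r_bottom_ohm is the resistor from the output node to ground. */
static uint32_t divider_out_mv(uint32_t vin_mv, uint32_t r_top_ohm, uint32_t r_bottom_ohm)
{
    return (vin_mv * r_bottom_ohm) / (r_top_ohm + r_bottom_ohm);
}
```

With these values, `divider_out_mv(5000, 1800, 3300)` gives 3235 mV, so a 5V logic high from the AVR reaches the card at about 3.24V.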

AVR Maximum frequency vs. VCC

Hardware Design: PWM Combination

A single PWM output can generate an analog signal with 8-bit resolution on one output pin (as we learned in LAB2). We can create a second analog signal on another PWM output channel and, by selecting the correct summing ratio, make this second signal represent the lower-order bits.

In this project, the PWM combination circuit is composed of two resistors: 1M and 3.9K. The summing ratio is about 256:1 (1M / 3.9K ≈ 256), which corresponds to a weight difference of 8 bits. The larger resistor is connected to the LSB PWM output channel while the smaller resistor is connected to the MSB PWM channel. When the MSB PWM outputs 0V and the LSB PWM outputs 5V, we can calculate that the output is about 0.01942V, which is close to one step of the MSB channel (5V / 256 ≈ 0.0195V). Assuming a resistor tolerance of 1%, the actual output lies in the range 0.0191V < V < 0.0199V, a deviation range of only 800uV. With a resistor tolerance of 0.3%, the circuit should be able to achieve full 16-bit resolution.
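
The 0.01942V figure can be reproduced with the usual formula for two voltage sources summed through resistors. This is a sketch that treats each filtered PWM pin as an ideal voltage source and ignores any load on the output node. The names `pwm_sum_volts`, `R_MSB_OHM` and `R_LSB_OHM` are illustrative.

```c
/* Resistor values from the text: 3.9K on the MSB channel, 1M on the LSB channel. */
#define R_MSB_OHM 3900.0
#define R_LSB_OHM 1000000.0

/* Unloaded voltage at the summing node, in volts.
 * v_msb and v_lsb are the average output levels of the two PWM channels. */
static double pwm_sum_volts(double v_msb, double v_lsb)
{
    return (v_msb * R_LSB_OHM + v_lsb * R_MSB_OHM) / (R_MSB_OHM + R_LSB_OHM);
}
```

`pwm_sum_volts(0.0, 5.0)` evaluates to 5 × 3900 / 1003900 ≈ 0.01942V, matching the value quoted above. The weight of the MSB channel relative to the LSB channel is R_LSB / R_MSB ≈ 256.4.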

Software

Software Overview

The whole software design is implemented using AVR Studio 6.2.1502 - Service Pack 1. The project architecture is designed with clear classifications: "app" (Application) layer, "bsp" (Board Support Package) layer, "drv" (Driver) layer and "PetitFatFs" (File System) layer.

Software Design: Top Level

From a top-level perspective, the whole system can be divided into 7 states, and a simple state machine guarantees the system functionality (see figure below). All 7 states, plus one state for debugging, are clearly organized, but the low-layer Petit FatFs interface still has to be changed to cooperate with this state machine. Specifically, we must change the low-layer data retrieving function in order to implement special data processing.

Top Level Software Flow Chart

Software Design: Data Retrieve and Consume

The low-layer read function is named "disk_readp". The file system layer calls this function to initialize the FAT, read directories, read specified files and so on. The function checks the type of the SD card to see whether the address must be converted (logical block address to byte address). Then the CMD17 (READ SINGLE BLOCK) command is sent to the SD card. Once the data start token (0xFE) is received, the data stream is forwarded into the assigned buffer. Finally, the function skips the part of the block that was not requested (the requested amount of data may not be an integer multiple of 512 bytes). All of the supported SPI commands of the SD card are listed in Appendix C.
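
The address conversion and the CMD17 command frame can be sketched as follows. This only illustrates the SD SPI command format (a 6-byte frame with the argument sent most significant byte first) and is not the project's "disk_readp" code. The name `sd_build_cmd17` and the `block_addressed` flag are assumptions made for the example.

```c
#include <stdint.h>

#define SD_CMD17 17u  /* READ_SINGLE_BLOCK */

/* Build the 6-byte SPI command frame for CMD17.
 * block_addressed: nonzero for cards that take block addresses,
 * zero for byte-addressed cards, where the sector number is scaled by 512. */
static void sd_build_cmd17(uint8_t frame[6], uint32_t sector, int block_addressed)
{
    uint32_t arg = block_addressed ? sector : sector * 512u;

    frame[0] = (uint8_t)(0x40u | SD_CMD17);  /* start bit 0, transmission bit 1, command index */
    frame[1] = (uint8_t)(arg >> 24);         /* argument, most significant byte first */
    frame[2] = (uint8_t)(arg >> 16);
    frame[3] = (uint8_t)(arg >> 8);
    frame[4] = (uint8_t)arg;
    frame[5] = 0x01u;                        /* dummy CRC7 plus end bit; CRC checking is off by default in SPI mode */
}
```

After sending this frame, the host would poll for the command response and then for the 0xFE data token before clocking in the 512 data bytes and the 2 CRC bytes.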

Since "disk_readp" needs a pointer to the target buffer, we can use this pointer to hook in our own function. If the pointer is valid, the function performs the requested read into the buffer. If it is invalid, that is, equal to 0 (a NULL pointer), the function executes our specified function instead. In this way, we can steer the data flow into our own functions without disturbing the normal operation of Petit FatFs. With this modification, the core functionality becomes buffer data flow control.
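
The NULL-pointer dispatch described above can be sketched on a host machine as follows. The SPI receive routine is replaced by a stand-in that reads from an array, and all names (`read_bytes`, `byte_sink_fn`, `sim_spi_read`, `sum_sink`) are hypothetical rather than taken from the project or from Petit FatFs.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*byte_sink_fn)(uint8_t);   /* hypothetical callback type */

/* Stand-in for the SPI receive routine: returns bytes from a test array. */
static const uint8_t *sim_src;
static uint8_t sim_spi_read(void) { return *sim_src++; }

/* Example sink that just accumulates the bytes it receives. */
static uint16_t sink_sum;
static void sum_sink(uint8_t d) { sink_sum = (uint16_t)(sink_sum + d); }

/* Deliver `count` bytes: copy them into `buff` when it is a valid pointer,
 * otherwise (buff == NULL) hand each byte to `sink` for streaming. */
static void read_bytes(uint8_t *buff, uint16_t count, byte_sink_fn sink)
{
    while (count--) {
        uint8_t d = sim_spi_read();
        if (buff) {
            *buff++ = d;
        } else if (sink) {
            sink(d);
        }
    }
}
```

File system calls that pass a real buffer behave as before, while a playback call that passes NULL streams every byte straight to the audio handler.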

Since all the necessary information can be obtained by parsing the header of the wave file, we can set hardware parameters based on the parsing result, such as the configuration of the "Sampling TIMER" that forwards audio data to the PWM channels. The file system layer thus retrieves data from the SD card into the audio buffer, while the "Sampling TIMER" consumes data from the audio buffer. A ring buffer is implemented to balance the retrieve / consume relationship. The diagram of this idea is shown below.

As shown in the diagram, the ISR is designed to be short and simple in order to guarantee an accurate interrupt interval.

Diagram of Retrieving and Consuming Data Stream

Software Design: Wave File Parsing

As mentioned above, the format of a WAVE file is easy to parse. As a subset of RIFF, it has exactly the structure defined in "Microsoft New Multimedia Data Types and Data Techniques (version 3.0)". According to this specification, the RIFF WAVE file format is as shown below.

RIFF WAVE File Format

Based on this format, we can design appropriate software to parse the wave file and extract the essential information. Note that the "Chunk ID" and "Chunk Size" fields always total 8 bytes, and the upper-layer function "pf_read" internally increments the data pointer, so we can fetch 8 bytes at a time and decide how to proceed based on the "Chunk ID" (this kind of identifier is called a FourCC, or four-character code). Note that the characters of a Chunk ID are stored in reading order, so the ID looks "Big Endian" when it is loaded as a 32-bit word on the little-endian AVR. We can either convert the original data or convert the normalized ID for comparison. For instance, we can load the four bytes directly as one word, which is efficient, and compare this word to the byte-reversed Chunk ID.

This file parsing is implemented in the function "App_Wave_ParseHeader". The software diagram of this function is shown below. Please note that checks for the "LIST" (play list), "DISP" (display) and "fact" chunks are also added to guarantee compatibility; these chunks are simply skipped when they occur. Meanwhile, for both debugging and software robustness, different return values are defined. Based on the audio information read from the file, we can check whether the file is corrupted and return the corresponding error information.
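
The chunk walk can be sketched as follows. This is not "App_Wave_ParseHeader": it parses a header held in memory instead of calling "pf_read", and it compares chunk IDs with `memcmp` rather than with byte-reversed 32-bit words. The type `wave_info_t`, the functions `wave_parse`, `rd16` and `rd32`, and the error codes are all assumptions made for the example.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint16_t channels;
    uint32_t sample_rate;
    uint16_t bits_per_sample;
    uint32_t data_offset;   /* offset of the first audio byte */
    uint32_t data_size;     /* audio payload size in bytes */
} wave_info_t;

/* RIFF numeric fields are little-endian. */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, a negative code on a malformed or unsupported header. */
static int wave_parse(const uint8_t *buf, uint32_t len, wave_info_t *out)
{
    uint32_t pos = 12;
    int have_fmt = 0;

    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0)
        return -1;                                  /* not a RIFF/WAVE file */

    while (pos + 8 <= len) {                        /* 8 bytes: chunk ID + chunk size */
        const uint8_t *id = buf + pos;
        uint32_t size = rd32(buf + pos + 4);
        pos += 8;

        if (memcmp(id, "fmt ", 4) == 0) {
            if (size < 16 || pos + 16 > len) return -2;
            if (rd16(buf + pos) != 1) return -3;    /* only uncompressed PCM */
            out->channels        = rd16(buf + pos + 2);
            out->sample_rate     = rd32(buf + pos + 4);
            out->bits_per_sample = rd16(buf + pos + 14);
            have_fmt = 1;
        } else if (memcmp(id, "data", 4) == 0) {
            if (!have_fmt) return -4;               /* "data" found before "fmt " */
            out->data_offset = pos;
            out->data_size   = size;
            return 0;
        }
        /* "LIST", "DISP", "fact" and any other chunk are skipped here;
         * chunks are padded to an even number of bytes. */
        pos += size + (size & 1u);
    }
    return -5;                                      /* no "data" chunk found */
}
```

Distinct negative return values mirror the idea of reporting why a file was rejected, which helps both debugging and robustness.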

Diagram of Wave File Parsing Function

Now that we know the format of the file header, we still need to know how the audio data is arranged in different WAVE files. There are four kinds of WAVE files: 8-bit mono, 8-bit stereo, 16-bit mono and 16-bit stereo. Only with this understanding can we design an efficient buffer structure and data retrieving functions. Keeping the ISR code simple means extra work must be done outside the ISR. Based on the RIFF format of the WAVE file, the data arrangement is shown below.

Audio Data Arrangement

Software Design: Audio Buffer

Based on the audio data layout of the different wave files, four independent buffers are created following the 16-bit stereo format, since it needs 4 bytes per sample. All buffers have the same size, 256 bytes, and use the same 8-bit "Head" and "Tail" pointers. With 8-bit pointers, ring buffer operations become simpler: the head or tail pointer is simply incremented without checking the buffer size limit, since it wraps to zero once its value exceeds 255. Another reason is that 8-bit addition is faster than 16-bit addition on the AVR. This is why four independent 256-byte buffers are preferable to one really large buffer.
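
The wrap-around trick can be sketched with a single 256-byte buffer. The project uses four such buffers, and the names `ring256_t`, `ring_put` and `ring_get` are assumptions for the example. Keeping one slot empty, so that head == tail always means "empty", is also a choice made here and not something the text specifies.

```c
#include <stdint.h>

typedef struct {
    uint8_t data[256];
    volatile uint8_t head;   /* next write position (producer) */
    volatile uint8_t tail;   /* next read position (consumer, e.g. the timer ISR) */
} ring256_t;

/* Returns 1 if the byte was stored, 0 if the buffer is full. */
static int ring_put(ring256_t *r, uint8_t v)
{
    if ((uint8_t)(r->head + 1u) == r->tail) return 0;  /* full */
    r->data[r->head] = v;
    r->head++;                                         /* 8-bit index wraps 255 -> 0 by itself */
    return 1;
}

/* Returns 1 and stores the oldest byte in *v, or 0 if the buffer is empty. */
static int ring_get(ring256_t *r, uint8_t *v)
{
    if (r->head == r->tail) return 0;                  /* empty */
    *v = r->data[r->tail];
    r->tail++;
    return 1;
}
```

No modulo or comparison against the buffer size is needed anywhere, which keeps the consumer side short enough for an ISR.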

Software Design: Preparation Before Playing

Before playing a wave file, we need to configure the sample TIMER correctly and assign appropriate variables so the data is processed efficiently. As described in the "Software Design: Data Retrieve and Consume" section, a function pointer is used to call the function that forwards the data stream into the corresponding buffers. Based on the data arrangement of the different wave files, all we need to do is make sure each byte is put into the buffer location that matches the data arrangement. For mono wave files, we simply put the same value into the buffers for the left and right channels. The function pointer prototype is "typedef void (*memProcFunc)(uint8_t)", and a variable of type "memProcFunc" is defined. This function pointer is initialized once the audio information has been correctly parsed from the header.

Four independent functions are implemented, one for each data format. The audio buffer arrangement of these four functions is shown below.

Audio Buffer Arrangement
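
A sketch of the per-format handlers follows, using the "memProcFunc" prototype quoted above. Only two of the four formats are shown. The buffer names, the handler names and `select_proc` are hypothetical, and plain arrays stand in for the ring buffers. The sketch assumes the standard WAVE conventions that 8-bit samples are unsigned and 16-bit samples are signed little-endian, so the signed-to-unsigned step is done by flipping the top bit of the high byte (equivalent to adding 32768).

```c
#include <stdint.h>

typedef void (*memProcFunc)(uint8_t);    /* prototype quoted in the text */

/* Four byte lanes, as in the 16-bit stereo layout. */
static uint8_t buf_l_lo[256], buf_l_hi[256], buf_r_lo[256], buf_r_hi[256];
static uint8_t wr;        /* shared 8-bit write index, wraps at 256 */
static uint8_t phase;     /* byte position inside the current sample frame */

/* 8-bit mono: one unsigned byte per sample, duplicated to both channels. */
static void proc_8bit_mono(uint8_t d)
{
    buf_l_hi[wr] = d;  buf_r_hi[wr] = d;
    buf_l_lo[wr] = 0;  buf_r_lo[wr] = 0;
    wr++;
}

/* 16-bit stereo: bytes arrive as L low, L high, R low, R high. */
static void proc_16bit_stereo(uint8_t d)
{
    switch (phase) {
    case 0:  buf_l_lo[wr] = d;                    phase = 1; break;
    case 1:  buf_l_hi[wr] = (uint8_t)(d ^ 0x80u); phase = 2; break;
    case 2:  buf_r_lo[wr] = d;                    phase = 3; break;
    default: buf_r_hi[wr] = (uint8_t)(d ^ 0x80u); phase = 0; wr++; break;
    }
}

/* Pick a handler from the parsed header fields. The two remaining formats
 * (8-bit stereo, 16-bit mono) would follow the same pattern. */
static memProcFunc select_proc(uint16_t bits, uint16_t channels)
{
    if (bits == 8 && channels == 1)  return proc_8bit_mono;
    if (bits == 16 && channels == 2) return proc_16bit_stereo;
    return 0;
}
```

Because the format decision is made once when the header is parsed, the per-byte path contains no format checks.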

Software Design: Play Wave File and Key Control

The Wave Play function is simple. It first processes the audio data that follows the header, since this amount of data is less than 512 bytes. After that, it processes the remaining audio data in units of 1024 bytes until less than 1024 bytes remain.

The key scan subroutine is embedded in the Wave Play function. To remove key glitches, the keys must be scanned repeatedly. Since we read 1024 bytes every time (except for the data following the header), we can roughly calculate the time interval between read operations from the speed of the SPI interface. We then use this processing interval as the time base of the key scan routine.

Time Calculation
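
As a rough illustration of the timing involved, the sketch below computes the minimum SPI transfer time of one 1024-byte read and the time the sampling timer needs to consume the same amount of audio. It assumes the 8MHz SPI clock mentioned in the test section and ignores command and file system overhead, so it may not match the author's own calculation. The function names are illustrative.

```c
#include <stdint.h>

/* Minimum SPI transfer time for `bytes` bytes, in microseconds
 * (8 clock cycles per byte, protocol overhead ignored). */
static uint32_t spi_transfer_us(uint32_t bytes, uint32_t sck_hz)
{
    return (uint32_t)(((uint64_t)bytes * 8u * 1000000u) / sck_hz);
}

/* Time the sampling timer needs to consume `bytes` bytes of audio,
 * in microseconds. bytes_per_frame is 1, 2 or 4 depending on the format. */
static uint32_t playback_us(uint32_t bytes, uint32_t sample_rate, uint32_t bytes_per_frame)
{
    return (uint32_t)(((uint64_t)bytes * 1000000u) /
                      ((uint64_t)sample_rate * bytes_per_frame));
}
```

For 16-bit stereo at 44100Hz, 1024 bytes last about 5.8ms of playback while the raw transfer takes at least about 1.0ms, so successive reads, and therefore key scans, are spaced a few milliseconds apart.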

The software flow charts of "App_Wave_Play" and "App_Ctrl_KeyScan" are shown below.

Software Flow Chart of Wave Play Function

Software Flow Chart of Key Scan Function

Testing and Results

Hardware Test

The key point of the hardware test is the accuracy of the 16-bit combination circuit built from high-precision resistors. It is difficult to measure directly: a high-precision multimeter is required for rigorous validation. To verify the circuit logically, a test firmware was created. It generates PWM in Fast PWM mode and increments the PWM value every 200ms. With Proteus (version 8.1 SP1) and the schematic described in "Hardware Design: PWM Combination", the oscilloscope output is shown below. Note that the horizontal scale was adjusted until the "climbing steps" could be clearly observed.

Proteus Simulation Result

From the waveform we can see that the 16-bit PWM output has nearly invisible "steps" compared to the 8-bit PWM output. However, software simulation is not a good way to evaluate the circuit, and this test needs improvement.

Software Test

It is also hard to test audio quality because it is very subjective, and the human ear is not as accurate as a machine. There are two ways to evaluate audio quality, even though neither is very professional. One is to record the audio output (via the LINE-IN audio port), save it in the same wave format as the original file, and then use "Adobe Audition" to compare the two audio files. The other is to measure the play time and compare it with that of the original wave file.

Software Test: Sample Rate

For each wave file format, a different song with the same sample rate (44100Hz) is chosen for this simple test. A simple VC program captures the song start marker (a "play!" string) output from the hardware USART interface. Once the string is captured, a timer is started to record the elapsed time until the end marker (an "over!" string) is captured. The result is shown below.

The 16-bit TIMER1 is used as the sampling timer, and the value of the output compare register is 16000000/44100 + 0.5 - 1 = 362 (truncated to an integer). Adding 0.5 rounds to the nearest integer, which is important for precision. Even so, the resulting sample frequency is 16000000/363 = 44077.135 Hz rather than 44100 Hz, which is why the deviation exists.

From the result we can also see that the deviation becomes smaller as the number of bytes per sample increases, because the audio data is consumed faster in the 16-bit stereo format. If the sample rate of the wave file were somewhat higher, the current system would very likely be unable to play it properly because of the SPI speed limit (8MHz), and overclocking the CPU might be necessary for higher audio quality.
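
The compare value and the resulting sample rate can be reproduced with integer rounding. This sketch assumes the timer runs without a prescaler in a mode where it counts from 0 up to the compare value (as in CTC mode), which the text implies but does not state. The function names are illustrative.

```c
#include <stdint.h>

/* Compare value for a timer clocked directly at f_cpu:
 * round f_cpu / sample_rate to the nearest integer, then subtract 1
 * because the timer counts from 0 to the compare value inclusive. */
static uint16_t timer_compare_value(uint32_t f_cpu, uint32_t sample_rate)
{
    return (uint16_t)((f_cpu + sample_rate / 2u) / sample_rate - 1u);
}

/* Sample rate actually produced by a given compare value, in Hz. */
static double actual_rate_hz(uint32_t f_cpu, uint16_t compare)
{
    return (double)f_cpu / (double)(compare + 1u);
}
```

`timer_compare_value(16000000, 44100)` gives 362, and `actual_rate_hz(16000000, 362)` gives about 44077.13Hz, roughly 0.05% below the nominal rate. With an 11289600Hz clock the same functions give a compare value of 255 and exactly 44100Hz.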

Wave Play Time Test Result

Software Test: Audio Quality Analysis

It is difficult to compare two audio files intuitively, even with Adobe Audition. However, to test mono wave files we can merge the recorded wave file with the original one. For instance, in Adobe Audition we can put the original wave file in the left channel and the recorded wave file in the right channel, and Adobe Audition is then able to analyze them together.

For 8-bit stereo wave files, we need to extract the audio data of each channel and repeat the procedure above. For 16-bit stereo wave files, we can simply use the high-order byte as the data source for analysis. This is safe because Adobe Audition only accumulates the audio samples at different frequency points, independent of the amplitude. In this project, I focused on 16-bit stereo analysis; I was not able to analyze all four kinds of wave files due to time limitations. Still, this test method can support cross-analysis to obtain more information.

The FFT diagram generated by Adobe Audition is shown below. All the data of FFT analysis can be exported for further analysis.

The analysis result is then imported into Excel to calculate the deviation. Part of the result is shown below. In the FFT waveform we can still see some noise, which causes the deviation from the original data. With a good low-pass filter, the result should be better. Even so, the average deviation is 0.46%, which indicates that the audio quality is still good.

FFT Analysis

Safety

This design currently has no electrical components in direct skin contact, nor do we expect high current draw beyond a few microamps. A further implementation will add an ESD protection circuit to avoid damage caused by electrostatic discharge from the human body.

Usability

The system is easy to operate with two simple keys. A further implementation will embed a display module into the system to improve user experience.

Conclusions

Accomplishments and Further Extensions

The core functionality was designed and worked as planned. The system is reliable and able to play standard wave files with CD audio quality (sample rate <= 44100Hz). Even though it has no hardware filter, the audio quality is still good, even with an ordinary headset.

As mentioned in the sections above, the following ideas are worthwhile further extensions.

UE (User Experience) Enhancements
A display module together with a simple low-pass filter will be implemented to enhance operability and user experience. Considering the size of the circuit, a small OLED module may be used to display essential play information.
Audio Quality Enhancements
To increase audio quality, the 16MHz clock source may be replaced with a standard 11.2896MHz oscillator, which is an exact multiple of 44100Hz (256 × 44100), to guarantee the accuracy of the sample rate. A well-designed low-pass filter can also help to improve audio quality.
Test Improvements
As mentioned in the Test section, cross-analysis of different wave files is very useful for further research. Meanwhile, measurements with a high-precision digital multimeter (FLUKE 8846A) could strongly support the practicability of the PWM combination circuit as a substitute for a 16-bit DAC.

Intellectual Property Considerations

Except for the open-source Petit FatFs developed by ChaN, the rest of the code is my own.

Ethical Considerations

There are no known ethical considerations regarding the design of this project.

Legal Considerations

There are no known legal considerations regarding the design of this project, since it is not implemented for commercial purposes. If this project were re-purposed for commercial use, it should still be acceptable, since the author of Petit FatFs authorizes free usage (see Appendix B).

Appendices

Appendix A: Parts List and Costs

Part	Vendor	Cost/Unit	Quantity	Total Cost
Atmega 1284	Lab Stock	$4.67	1	$4.67
Button	Lab Stock	$0.00	2	$0.00
High-Precision Resistors	Lab Stock & AMAZON	$0.34	4	$1.36
SD Card (1GB)	Kingston	$4.95	1	$4.95
USB-USART module	Personal Stock	$6.65	1	$6.65
LT1117-3V3	Personal Stock	$4.52	1	$4.52
			TOTAL:	$22.15

Appendix B: Petit-FatFs Author Claims

/*----------------------------------------------------------------------------/
/ Petit FatFs - FAT file system module R0.03 (C)ChaN, 2014
/-----------------------------------------------------------------------------/
/ Petit FatFs module is a generic FAT file system module for small embedded
/ systems. This is a free software that opened for education, research and
/ commercial developments under license policy of following trems.
/
/ Copyright (C) 2014, ChaN, all right reserved.
/
/ * The Petit FatFs module is a free software and there is NO WARRANTY.
/ * No restriction on use. You can use, modify and redistribute it for
/ personal, non-profit or commercial products UNDER YOUR RESPONSIBILITY.
/ * Redistributions of source code must retain the above copyright notice.
/
/-----------------------------------------------------------------------------/

Appendix C: SD Command List (SPI Mode)

SD SPI Command List

Appendix D: Source Code

Please feel free to contact me via e-mail.

References

Acknowledgements

I would like to thank Bruce Land and the TA (Eileen Liu) for all of their assistance through this tough semester. Bruce gave me a lot of suggestions and guidance on the labs, the ECE 4760 final project and the M.Eng project. Eileen Liu helped me a lot with writing lab reports and inspired me to move forward with useful suggestions. Also, I want to thank Cameron, Eilen Chuang and Julie Wang for making this beautiful web site layout.

ECE 4760: Final Project

AVR 16bit Stereo Wave Player

Simple, Cost-Effective, CD Audio Quality

Wancheng Zhou (wz233@cornell.edu)