ece5760 Final Project:
NES Music Player
Andre Heil (avh34)
Scott Zhao (sz296)
For our final project we implemented an audio system that emulates the Nintendo Entertainment System’s Audio Processing Unit, or APU. By combining our Verilog implementation of the APU with an SD card reader and a software emulator of the 6502 processor, we use our system to read and play video game soundtracks and original chiptune covers. Using an incremental design and testing strategy, we developed a system that produces a sound accurate to the original NES and captures its charm.
2. High-level design
The motivation for this project comes from our fondness for NES music, which is easily recognized by its pulse wave melodies, triangle wave bass lines, and static-like percussion. With these limited sounds to work with, video game music composers were compelled to alter these sounds in creative and expressive ways. The resulting tunes are memorable enough to evoke instant nostalgia in those who grew up with the console, and influential enough to inspire sounds in today’s electronica.
Our APU implementation is composed of four sound channels: two pulse wave channels, a triangle wave channel, and a random noise channel. It’s worth noting that the NES APU also contains a fifth channel, the delta modulation channel (DMC), which is used to play low-quality delta-encoded samples. Although we began work on this sound channel, we opted not to finish implementing it due to complications with reading from memory as well as the fact that many songs do not utilize the DMC.
Meanwhile, we use a NiosII processor to emulate the NES CPU in software. The NES CPU is based on the 6502 processor, which uses a variable-length instruction set with 56 different instructions. It is necessary to emulate the CPU in full, since Nintendo Sound Format (NSF) files contain raw CPU instructions as opposed to encoded audio. Specifically, the CPU emulator is responsible for reading through Nintendo Sound Format files and accordingly writing values to the APU’s registers.
Also implemented on the NiosII processor is the SD card reader. To make the Nintendo Sound Format files available to the NES CPU we use an SD card. Using the SD card slot on the FPGA board, we forward the SD card pins to the NiosII processor through PIOs and implement the SD card reader in C. Using a C implementation of an SD card reader capable of reading the FAT16 format, we can select which NSF file we want to play and tell the NES CPU emulator to use that file.
To give the user a convenient interface for managing which songs to play, we created a simple terminal program. This program allows the user to list the available NSF files and select which file from the SD card they want to play. Once a valid file has been selected, they can select which song in the NSF file to start from and begin playback. On the FPGA board itself there are four buttons that allow the user to skip to the next track, skip to the previous track, pause/play, and exit from the current NSF file.
Our original design plan was to implement just the NES APU. However, we found that in order to read an NSF file we would need a 6502 processor, since the NSF file data is just a stream of 6502 instructions, and those instructions set the APU register values that produce the music. Since it would take a long time to implement and test all of the 6502 instructions in hardware, we decided it would be easiest to take an existing software implementation of the 6502 processor and adapt it to our needs.
A block diagram depicting the flow of our audio system is included below.
3. Design details
Our implementation of the APU closely follows implementation details available on the Nesdev wiki. For this reason, the discussion below focuses on the high-level behavior of each channel and functional unit, and on the differences between a faithful design and our own, rather than repeating the detailed information given on the wiki.
The APU runs on an 894 kHz clock and consists of three unique channel types that run independently of each other: (1) the pulse wave channel, (2) the triangle wave channel, and (3) the noise channel. Each channel reads from three or four dedicated registers and is composed of several functional blocks.
Common functional blocks are the (1) timer, (2) length counter, and (3) envelope unit.
Additionally, each channel takes lower-frequency clocks from the frame counter, which generates slower 120 and 240 Hz clocks for time-sensitive functional blocks like the length counter and envelope unit.
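The clock division performed by the frame counter can be sketched in C as follows. This is a minimal model rather than our Verilog: the divider value of 3728 is an assumption chosen so that an APU clock of roughly 894,886 Hz yields about 240 Hz, and the names frame_counter_t and frame_counter_tick are hypothetical. The assignment of blocks to the two rates follows the Nesdev wiki.

```c
#include <stdint.h>

/* Assumed constants: ~894.886 kHz APU clock divided down to about 240 Hz. */
#define APU_CLOCK_HZ      894886u
#define QUARTER_FRAME_DIV 3728u   /* 894886 / 3728 is about 240.04 */

typedef struct {
    uint32_t divider;   /* counts APU clock ticks */
    uint8_t  quarter;   /* counts 240 Hz ticks, used to derive 120 Hz */
} frame_counter_t;

/* Flags returned by frame_counter_tick(). */
enum { FC_QUARTER = 1, FC_HALF = 2 };

/* Advance by one APU clock and report which slow clocks fire on this tick. */
static int frame_counter_tick(frame_counter_t *fc)
{
    int flags = 0;
    if (++fc->divider == QUARTER_FRAME_DIV) {
        fc->divider = 0;
        flags |= FC_QUARTER;          /* 240 Hz: e.g. envelope unit */
        if (++fc->quarter % 2 == 0)
            flags |= FC_HALF;         /* 120 Hz: e.g. length counter */
    }
    return flags;
}
```

Each functional block then only acts on the ticks where its flag is set, instead of keeping its own slow counter.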
Pulse Wave Channel
Generates a pulse wave output with variable duty cycle (12.5%, 25%, 50%, or 75%). It contains a sweep unit which optionally produces a continuous bend from one pitch to another.
Notably, our implementation of the sweep unit ignores a corner case that the original NES checks for: when the target period overflows, the channel is silenced, regardless of whether the sweep unit is enabled. We excluded these bounds checks based on the observation that the vast majority of songs do not rely on this behavior to silence the pulse channels.
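The variable duty cycle can be modeled as an eight-step pattern per duty setting. The sketch below is a software illustration, not our Verilog; the pattern table follows the one documented on the Nesdev wiki, and the names pulse_t, pulse_clock, and pulse_output are hypothetical.

```c
#include <stdint.h>

/* Eight-step duty patterns, one row per duty setting. */
static const uint8_t DUTY_TABLE[4][8] = {
    {0, 1, 0, 0, 0, 0, 0, 0},   /* 12.5% */
    {0, 1, 1, 0, 0, 0, 0, 0},   /* 25%   */
    {0, 1, 1, 1, 1, 0, 0, 0},   /* 50%   */
    {1, 0, 0, 1, 1, 1, 1, 1},   /* 75% (the 25% pattern inverted) */
};

typedef struct {
    uint8_t duty;    /* 0..3, selects a row of DUTY_TABLE */
    uint8_t step;    /* 0..7, current position in the pattern */
    uint8_t volume;  /* 0..15, supplied by the envelope unit */
} pulse_t;

/* Called each time the channel's timer expires. */
static void pulse_clock(pulse_t *p)
{
    p->step = (uint8_t)((p->step + 1) & 7);
}

/* Current 4-bit output: the volume while the pattern is high, else 0. */
static uint8_t pulse_output(const pulse_t *p)
{
    return DUTY_TABLE[p->duty & 3][p->step] ? p->volume : 0;
}
```

Because the pattern is a table lookup, changing the duty setting mid-note simply selects a different row without disturbing the step position.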
Triangle Wave Channel
Generates a 4-bit triangle wave output. The waveform loops through a sequence which counts down from 15 to 0, and from 0 back up to 15. Accordingly, the waveform sequencer is implemented as a lookup table with 32 entries.
An issue that came up during development was the observation of popping in the triangle wave output, which we resolved using a solution on the Nesdev forums: the sequencer was being restarted for every new note, when instead the sequencer should be kept constant between notes. This way, the triangle wave output transitions smoothly between notes and does not make sudden jumps in output value.
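A minimal C sketch of the 32-entry sequencer, including the fix for the popping, is shown below. The names triangle_t, triangle_set_period, and triangle_clock are hypothetical; the point it illustrates is that starting a new note changes only the timer period and leaves the sequencer position alone.

```c
#include <stdint.h>

/* 32-entry lookup table: 15 down to 0, then 0 back up to 15. */
static const uint8_t TRI_TABLE[32] = {
    15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0,
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
};

typedef struct {
    uint16_t period;  /* timer period of the current note */
    uint8_t  step;    /* 0..31, position in TRI_TABLE */
} triangle_t;

/* Start a new note: only the period changes; step is deliberately
 * NOT reset, so the output continues from its current level. */
static void triangle_set_period(triangle_t *t, uint16_t period)
{
    t->period = period;
}

/* Called each time the channel's timer expires. */
static void triangle_clock(triangle_t *t)
{
    t->step = (uint8_t)((t->step + 1) & 31);
}

/* Current 4-bit output. */
static uint8_t triangle_output(const triangle_t *t)
{
    return TRI_TABLE[t->step];
}
```

With this arrangement, consecutive output values never differ by more than one step, even across a note change.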
Noise Channel
Generates pseudo-random noise at 16 different frequencies. The raw timer period entries read from the register are mapped to 16 preset timer periods using a lookup table. The pseudo-random sequence is produced by a 15-bit linear feedback shift register. In mode 0, feedback is calculated as the XOR of bits 0 and 1. In mode 1, feedback is calculated as the XOR of bits 0 and 6. The output of the noise channel is taken from bit 0 of the shift register.
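The shift register update described above can be written in a few lines of C. This is an illustrative sketch with hypothetical names (noise_t, noise_clock, noise_bit); it assumes the register shifts right with the feedback bit entering at bit 14, and that it is seeded with a nonzero value such as 1, since an all-zero register would never change.

```c
#include <stdint.h>

typedef struct {
    uint16_t shift;  /* 15-bit shift register; must never be 0 */
    uint8_t  mode;   /* 0: taps at bits 0 and 1, 1: taps at bits 0 and 6 */
} noise_t;

/* Called each time the noise channel's timer expires. */
static void noise_clock(noise_t *n)
{
    uint16_t other = n->mode ? (uint16_t)(n->shift >> 6)
                             : (uint16_t)(n->shift >> 1);
    uint16_t fb = (uint16_t)((n->shift ^ other) & 1u);   /* XOR of the two taps */
    n->shift = (uint16_t)((n->shift >> 1) | (fb << 14)); /* shift right, feed bit 14 */
}

/* Bit 0 of the register drives the channel output. */
static uint8_t noise_bit(const noise_t *n)
{
    return (uint8_t)(n->shift & 1u);
}
```

In mode 0 this register steps through every nonzero 15-bit state before repeating, while mode 1 produces a much shorter loop, which is why it sounds more tonal.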
Delta Modulation Channel
As mentioned in the High-level Design section, we also began work on the delta modulation channel but did not finish implementing it. It is used to play 1-bit delta encoded samples, commonly used for percussion in combination with the noise channel or for other sound effects. Although we succeeded in getting the DMC to work in isolation and play a preloaded sample, when fully integrated the DMC should instead read samples from ROM byte-by-byte. We realized that integrating the DMC into the rest of our system would require some tedious signaling between the APU and NiosII in order to accomplish the necessary reads from memory. This would also potentially cause some issues with timing, since the 6502 emulator is barely fast enough to satisfy the 60 Hz play frequency for more elaborate songs.
The 4-bit outputs of the four sound channels are scaled and combined to produce a 16-bit signed output, which is played through the Wolfson WM8731 audio codec at a 48 kHz sampling rate. We experimented with nonlinear mixing as well, following the lookup table implementation described on the Nesdev wiki. However, the resulting triangle and noise channel outputs sounded tinny, as if high-pass filtered. We opted to go back to our shift-and-add mixing scheme both because it is significantly simpler to implement and because the resulting output sounds better.
While the APU itself is implemented in hardware, we also need a way to control the APU so that it can be used to play back NES music. The APU is controlled by the NiosII CPU, an IP core that we generate and configure using Qsys. We program our custom NiosII system to control the APU through PIOs.
Several PIOs provide the connection between hardware and software. The PIOs used in this project carry the SD card pins, the button pins, and the address, data, and write-enable signals for the APU.
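As a rough illustration of shift-and-add mixing, the C sketch below combines four 4-bit channel values into one signed 16-bit sample. The equal weighting, the shift amount of 9, and the centering offset are assumptions made for this example and are not necessarily the scale factors used in our Verilog; the name mix_linear is hypothetical.

```c
#include <stdint.h>

/* Assumed scaling: every channel is weighted equally by a left shift of 9,
 * so four channels at full volume sum to 4 * 15 * 512 = 30720. */
#define MIX_SHIFT  9
#define MIX_CENTER 15360   /* half of the maximum sum, to center around 0 */

/* Combine four 4-bit channel outputs into one signed 16-bit sample. */
static int16_t mix_linear(uint8_t pulse1, uint8_t pulse2,
                          uint8_t triangle, uint8_t noise)
{
    int32_t sum = ((int32_t)(pulse1   & 15) << MIX_SHIFT)
                + ((int32_t)(pulse2   & 15) << MIX_SHIFT)
                + ((int32_t)(triangle & 15) << MIX_SHIFT)
                + ((int32_t)(noise    & 15) << MIX_SHIFT);
    return (int16_t)(sum - MIX_CENTER);
}
```

Since only shifts and additions are involved, the same structure maps directly onto a few adders in hardware, in contrast to the lookup tables needed for nonlinear mixing.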
The NiosII processor that is synthesized onto the FPGA is capable of executing C code. Using this we can implement several different components of this project in C. The NiosII in this project is responsible for two main functions, reading the SD card and emulating the 6502 processor.
SD Card Reader
The SD card reader is used to get the NSF files from the SD card into local memory. Storing the NSF files on an SD card lets us write whatever NSF files we want onto the card using a computer and then insert the card into the FPGA board, where the SD card reader can read them. The SD card pins are available to the NiosII because we forwarded them to the NiosII using PIOs.
To implement the software for the SD card reader, we used an example that we found on the DE2-115 CD. This example included an SD card reader library that we adapted to be used for this project. This library allows us to do several operations to the SD card such as browsing for a file, opening a file, reading a file, etc. Using this library we can implement all of the functions that we need for this project.
Nintendo Sound Format
The SD card reader software gives us a way to read data from the SD card. The files we are interested in are NSF files. An NSF file mainly consists of 6502 instructions which, when executed, write to the NES APU register addresses. The values written to the APU registers play back the music in the NSF file.
NSF files also contain a header which consists of all of the metadata about that file. It includes information such as the title, artist, copyright, etc. It includes the total number of songs on that particular file. Typically NSF files will include the entire soundtrack from a game.
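To make the header contents concrete, the following C sketch extracts a few fields from the 128-byte NSF header. The field offsets follow the publicly documented NSF layout; the struct and function names (nsf_info_t, nsf_parse_header) are hypothetical and are not taken from our player.

```c
#include <stdint.h>
#include <string.h>

#define NSF_HEADER_SIZE 128

typedef struct {
    uint8_t  total_songs;     /* number of songs in the file */
    uint8_t  starting_song;   /* 1-based index of the default song */
    uint16_t load_addr;       /* where the program data is loaded */
    uint16_t init_addr;       /* routine called once per song */
    uint16_t play_addr;       /* routine called once per frame */
    char     title[33];       /* 32 bytes in the file plus terminator */
    char     artist[33];
    char     copyright[33];
} nsf_info_t;

/* Read a little-endian 16-bit value. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Parse a 128-byte header; returns 0 on success, -1 on a bad signature. */
static int nsf_parse_header(const uint8_t *h, nsf_info_t *out)
{
    static const uint8_t magic[5] = { 'N', 'E', 'S', 'M', 0x1A };
    if (memcmp(h, magic, sizeof magic) != 0)
        return -1;
    out->total_songs   = h[0x06];
    out->starting_song = h[0x07];
    out->load_addr     = rd16(h + 0x08);
    out->init_addr     = rd16(h + 0x0A);
    out->play_addr     = rd16(h + 0x0C);
    memcpy(out->title,     h + 0x0E, 32); out->title[32]     = '\0';
    memcpy(out->artist,    h + 0x2E, 32); out->artist[32]    = '\0';
    memcpy(out->copyright, h + 0x4E, 32); out->copyright[32] = '\0';
    return 0;
}
```

A player would typically call the init routine once with the selected song number and then call the play routine at the frame rate.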
Once we have used the SD card reader to open an NSF file and load it into local memory, we need a 6502 processor to execute its instructions. To do this we emulated the 6502 processor in C. After some research online, we found several C implementations of the 6502 processor and used the one that most closely matched our needs. In this borrowed implementation all of the instructions were already implemented; the only functions left for us to implement were the memory read and write functions. The read function returns data from the NSF file that we loaded in, or from the RAM that the NES provides to the 6502 processor. The write function acts similarly, but since the NSF file is treated as ROM, it cannot be written to. If the 6502 processor writes to an address at which an APU register is mapped, we forward the address and data to the hardware through the PIOs. It’s through this connection that the 6502 processor is able to control the APU hardware.
Now that we have a feasible way to read the NSF files and to control the APU registers, we need a way to control the playback of the NSF files. To do this we implemented a simple terminal program that allows a user to list the available NSF files and select which file from the SD card they want to play. Once a valid file has been selected, they can select which song in the NSF file to start from and begin playback. On the FPGA board itself there are four buttons that allow the user to skip to the next track, skip to the previous track, pause/play, and exit from the current NSF file. The buttons are connected to the NiosII using PIOs and are handled with interrupts: each button PIO is configured to send an interrupt to the NiosII on any positive edge.
Each of the APU’s sound channels and the functional blocks composing them were developed and tested individually following an incremental design approach. Each channel function (e.g. the pulse channel’s sweep unit, the triangle channel’s linear counter, the noise channel’s different modes and frequencies) was tested by writing explicit values to the corresponding APU registers. Once a channel had been tested in isolation, we integrated it with the NSF player in order to observe its behavior during song playback.
This latter method is actually the most reliable way to test each channel, since fast writes from the 6502 to the APU registers and diverse combinations of notes are the most telling test of a sound channel’s accuracy. It also requires a good ear and some creative debugging, since a single line of a song functions like a combination of many different test cases, let alone a song in its entirety.
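The memory functions described above might look like the following C sketch. It is illustrative only: the names mem_read, mem_write, and apu_write are hypothetical, the NSF data is assumed to be mapped at $8000, and apu_write records the write in variables where the real system would drive the address, data, and write-enable PIOs. The RAM range ($0000 to $07FF) and the APU register range ($4000 to $4017) follow the standard NES memory map.

```c
#include <stdint.h>

#define RAM_SIZE  0x0800u   /* 2 KB of NES work RAM */
#define APU_FIRST 0x4000u   /* first APU register address */
#define APU_LAST  0x4017u   /* last APU register address */
#define ROM_BASE  0x8000u   /* assumed load address of the NSF data */

static uint8_t ram[RAM_SIZE];
static uint8_t rom[0x10000u - ROM_BASE];  /* NSF program data from the SD card */

/* Hypothetical stand-in for the PIO interface: the real system would put
 * addr and data on the PIOs and pulse write-enable. */
static uint16_t last_apu_addr;
static uint8_t  last_apu_data;
static int      apu_write_count;

static void apu_write(uint16_t addr, uint8_t data)
{
    last_apu_addr = addr;
    last_apu_data = data;
    apu_write_count++;
}

/* Called by the 6502 emulator for every memory read. */
static uint8_t mem_read(uint16_t addr)
{
    if (addr < RAM_SIZE)  return ram[addr];
    if (addr >= ROM_BASE) return rom[addr - ROM_BASE];
    return 0;   /* unmapped in this sketch */
}

/* Called by the 6502 emulator for every memory write. */
static void mem_write(uint16_t addr, uint8_t value)
{
    if (addr < RAM_SIZE)
        ram[addr] = value;
    else if (addr >= APU_FIRST && addr <= APU_LAST)
        apu_write(addr, value);   /* forward to the APU hardware */
    /* Writes to ROM and unmapped addresses are silently dropped. */
}
```

Keeping the address decoding in these two functions means the emulator core itself needs no knowledge of the APU or the NSF file.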
The sound outputs of each channel are accurate in nearly all respects, particularly pitch, note duration, and waveform shape. These results can be confirmed not only by listening to our system’s sound output, but also by inspecting waveform captures of each channel’s output:
Pulse Wave Channel
This pulse wave channel output corresponds to a note frequency of 782.2 Hz. From this waveform capture we count periods to measure a frequency of 780.4 Hz, which is within 0.3% of the desired pitch. By counting the number of samples we obtain a note duration of 113.7 ms, which is within 3% of the desired duration of 116.7 ms. Note also the decreasing envelope that incrementally ramps down the sound channel’s volume output.
Triangle Wave Channel
The triangle wave channel output above corresponds to a note frequency of 165.0 Hz. Our measured frequency of 163.9 Hz is within 0.7% of ideal. By counting the number of samples we obtain a note duration of 133.6 ms, which is within 0.3% of the desired duration, 133.3 ms.
Noise Channel
The noise channel output above corresponds to a playback rate of 1174.4 Hz. Given the duration of the note (83.3 ms), this should correspond to roughly three periods. Although this is difficult to confirm, we at least show that the note duration we measure, 79.8 ms, is within 5% of the desired duration. Observe also that, qualitatively, the noise waveform is visually random.
The real demonstration of our system’s accuracy and performance involves playing full-fledged songs. Included below are line-in captures of our system’s sound output during the playback of three different songs. Simply by listening to these captures, it is apparent that song playback adheres to the intended pitches, note durations, special effects, and overall sound of the original compositions.
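The target pitches quoted for the pulse and triangle channels above can be related to timer periods with the standard NES frequency formulas, as in the C sketch below. The CPU clock constant is the NTSC value (the 894 kHz APU clock is half of it). The timer periods 142 and 338 in the usage comment are inferred here because they reproduce the quoted targets; they were not read back from our captures. All function names are hypothetical.

```c
/* NTSC CPU clock in Hz; the APU clock is half of this value. */
#define NES_CPU_HZ 1789773.0

/* Pulse channel: 8-step sequence clocked every other CPU cycle. */
static double pulse_freq_hz(unsigned timer)
{
    return NES_CPU_HZ / (16.0 * (timer + 1));
}

/* Triangle channel: 32-step sequence clocked every CPU cycle. */
static double triangle_freq_hz(unsigned timer)
{
    return NES_CPU_HZ / (32.0 * (timer + 1));
}

/* Relative error of a measurement against its target, in percent. */
static double percent_error(double measured, double target)
{
    double diff = measured - target;
    if (diff < 0.0)
        diff = -diff;
    return 100.0 * diff / target;
}

/* Usage (assumed timer periods):
 *   pulse_freq_hz(142)     is about 782.2 Hz
 *   triangle_freq_hz(338)  is about 165.0 Hz
 *   percent_error(780.4, 782.2) is below 0.3
 *   percent_error(163.9, 165.0) is below 0.7
 */
```

The same percent_error helper applies to the note durations, for example 113.7 ms measured against a 116.7 ms target.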
This project was a lot of fun to work on, and we were able to meet almost all of our expectations and even reach some milestones that we didn’t anticipate. We set out to implement the NES APU and were able to do so with the exception of the DMC. We were also able to successfully emulate the NES CPU, which we did not anticipate creating in the first place. In the end we were able to successfully play the NES soundtracks we enjoy so much.
Even though we went beyond our expectations in some areas, there is still a lot of future work to be done on this project. Given more time, we would have enjoyed implementing the 6502 processor in hardware rather than emulating it in software. Moving the SD card reader into hardware would also be beneficial, as it would allow us to phase out the NiosII processor entirely and offer an easy way to supply the DMC with sample data. There are also several expansion chips that provide even more sound channels, such as additional pulse wave and sawtooth channels. These expansion chips give composers more flexibility when creating soundtracks.
As well, some implementation details would benefit from changes that bring our system closer to the original NES. To list a few: (1) Currently, each functional block in the APU polls the channel’s registers continuously in order to detect a new write. However, this causes a bug when two notes with exactly the same pitch, duration, and effects are played back-to-back. In this case, our method of detecting writes would fail to detect the new note. (2) Some functional blocks like the sweep unit currently overlook some corner cases that should be implemented in order to produce a more accurate sound. During playback of certain songs, these simplifications can cause unintended audio glitches, particularly when the composer intentionally exploits some known corner case of a sound channel. (3) The playback rate for songs is currently slightly faster than ideal: 62.5 Hz instead of 60 Hz. This increase in tempo is perceivable over an extended period of time, especially when the system plays a piece side-by-side with another NSF player. The changes above would all help our system to produce a sound more faithful to the original NES.
Intellectual Property Considerations
All registered trademarks used, including “Nintendo Entertainment System” and the “Nintendo” brand, are the property of Nintendo. We are in no way associated with Nintendo.
Patents associated with the NES have expired in the past decade and so are not a potential concern for infringement.
Copyrights associated with the NES are valid until about 2090. However, our implementation of the NES APU falls under the fair use doctrine for nonprofit educational purposes.
Similarly, the compositions we use are the properties of their respective artists and are used solely for demonstration purposes and not personal gain.
The list below credits each of the songs used in the preceding demonstration:
Attached below is the source code for our project, including the Verilog implementation of the APU and its sound channels as well as the C implementations of the SD Card Reader, NSF Player, and 6502 Emulator.
Division of labor
This project involved heavy collaboration between the two of us, but each one of us focused on a different aspect of the project. Andre focused mainly on the software side of the project while Scott mainly worked on the hardware.