"An interactive virtual alto saxophone that uses direct digital synthesis, pushbuttons, and a microphone to play music."
Our ECE 4760 final project was to create a virtual saxophone that uses Direct Digital Synthesis (DDS) to synthesize the output notes. Pushbuttons are mounted on a PVC pipe to mimic the saxophone's mechanical structure, and a microphone that detects noise is used to determine whether or not the user is blowing into the device.
We decided to do this project because we enjoyed the ECE 4760 Lab on DDS and wanted to explore it further. In addition, we were intrigued by some of the previous ECE 4760 projects that created virtual instruments, and decided to create a saxophone because we have previous experience playing this instrument. This project was very enjoyable because it combined the knowledge gained in previous labs with our musical interests.
High Level Design
The rationale for this project design is to facilitate an easy, user-friendly mechanism for users to learn or practice playing the saxophone on this virtual device. The desire to learn to play a musical instrument is common for people of all ages, but the significant cost of purchasing an instrument can be a deterrent for doing so. The cost of a new alto saxophone can range from five hundred dollars to thousands of dollars, in addition to the cost of maintenance and any necessary repairs. This device provides a fun alternative, allowing interested musicians to learn a new instrument at a fraction of the cost. This idea was inspired by a previous final project, “Recorder Hero”, in which buttons were used to form the key fingerings and a microphone was used to detect input. Although our project is similar in spirit, the virtual saxophone involved new Direct Digital Synthesis (DDS) to create a saxophone voice, a more complex input system that allows the user to accent each note rather than just control the volume, and a unique hardware system to resemble an alto saxophone in both look and feel.
The saxophone voice was created using additive synthesis with ten harmonics, where the amplitude, rise time, and fall time of each harmonic were individually tuned to create the appropriate voice. Nineteen different buttons were glued onto our device to allow the user to control the frequency of the note. Although a standard alto saxophone usually has twenty buttons, the button we excluded is used as an alternate fingering for some notes, so it does not affect the frequency range of the device. The saxophone is a member of the woodwind family of instruments, which means sound is created when the player blows air through a wooden reed. In order to simulate this process, a microphone was used to detect whether or not the user is blowing into the instrument. The noise level picked up by the microphone while the user blows is used to adjust the volume of the output. In addition to simply blowing into the instrument, a saxophonist is able to modify the articulation of each note through a process called tonguing. Tonguing refers to modifying the syllables used when blowing into the instrument. During this process, the tongue momentarily comes into contact with the wooden reed, blocking airflow into the instrument. When the tongue loses contact with the reed, air again enters the instrument in a continuous stream. In order to simulate this process, any time the user’s tongue momentarily blocks the flow of air, the reduction in noise at the microphone is detected, allowing the sound to naturally decay. Once the tongue loses contact and airflow resumes, the noise at the microphone increases, causing the note to be reset to maximum amplitude. This produces a sound similar to that of the alto saxophone, allowing the user to put an accent on each note if desired.
The logic structure of the overall design was relatively straightforward. The Analog-to-Digital Converter (ADC) is used to read in noise from the microphone. If the noise level is above a specified threshold, the current button configuration is polled to output the correct frequency. Once the noise level goes below the specified threshold, the note naturally decays. The output is filtered and sent to portable speakers to play the synthesized note. The DDS process is described in much greater detail in the software design section, and the hardware involved, including the buttons, microphone, amplifiers, and filters, is described in the hardware design section.
Although there are no patents relevant to this project, instrument synthesis is a complex technique, and some of the methods in use are not shared with the public. The DDS used in this project is based on the Wind Instruments Synthesis Toolbox, included in the references section. This is a Matlab / GNU Octave toolkit that uses a complex additive synthesis technique to read music score files and synthesize the output using ten different possible instruments. The synthesis used to create a saxophone voice was greatly simplified and was modified to work on the microcontroller by replacing any built-in Matlab functions with corresponding C code, and replacing computationally expensive operations such as divides with faster operations such as shifts. While the provided toolbox uses over thirty harmonics for some notes, our DDS was reduced to ten terms without sacrificing the voice quality. The amplitudes in the Matlab toolbox were used as a basis for choosing the amplitude of each term in our synthesis process.
The hardware portion of this project can be broken down into two main subcomponents: the physical design of the saxophone and the analog circuitry. The physical design of the saxophone involved obtaining a PVC pipe that resembled the shape and size of a saxophone. Buttons were carefully attached to the PVC pipe to represent the different keys of the saxophone. The spacing and location of the buttons were made as close to those of a real alto saxophone as possible so that an experienced player could pick it up and start playing immediately. The mouthpiece was built by extending the microphone wires through the PVC pipe and attaching the microphone to a kazoo. A bottle cap was added below the octave pushbutton on the back of the saxophone as a thumb rest. A hook was also added to attach a neckstrap to the device. A pouch was used to store the whiteboard, the microcontroller, and the rest of the analog circuitry. The speaker was attached to the PVC by simply inserting it into the open end of the pipe.
The analog circuitry of this project consisted of the microphone input, the PWM output, and the pushbutton circuit. The microphone input first needed to be passed through an amplifier circuit to amplify the signal from millivolts to volts so that the signal was easier to work with. This amplifier circuit was built by feeding the microphone signal into the positive input terminal of the LM358 and building a feedback loop from the output to the negative terminal. The circuit schematic for the amplifier can be seen below.
Microphone Amplifier Circuit
The output of the amplified microphone signal was then passed through an op-amp comparator to generate fast, logic-level output swings. This basically pushed the microphone signal to the VCC and ground rails whenever noise that passed a certain threshold was detected on the microphone. The schematic of this op-amp comparator circuit can be seen in the figure below.
The square-wave signal produced by the op-amp comparator, however, needed to be modified since it featured very fast high-low transitions whenever an input on the microphone was detected. Instead, a consistent high-signal was desired so that the device would be “ON” when the microphone was blown into. To do this, the circuit was extended to pass the signal through a diode and then a resistor and capacitor in parallel. This allowed the signal to be maintained as high whenever fast high-low transitions were detected. This filtered signal was then finally sent to pin A0 of the Mega644 for an analog to digital conversion. To protect A0 (the analog input), a 1k resistor and two 1N914 diodes were connected, as shown below, to constrain the voltage between 0 and 5V.
Protection for ADC
Pin B.3 was used as the pulse width modulation (PWM) output from the Mega644. This needed to be connected to the 3.5mm audio input of the speakers. The PWM signal was passed through a low-pass filter before reaching the speakers in order to reduce noise. The low-pass filter was built using a simple RC circuit. Since the sampling frequency of the synthesized output would be about 8kHz, the cutoff frequency should be no higher than 2kHz; any frequency content above 4kHz (half the sampling frequency) would cause aliasing. The resistor value had to be greater than the Thevenin resistance of port B.3 but smaller than the input impedance of the sound amplifier, so a good choice was 10kOhms. Using the frequency constraint given above and choosing the resistor to be 10kOhms, the following inequality was used to determine the capacitance: f_c = 1/(2*pi*τ) = 1/(2*pi*R*C) < 2000 Hz, where τ = R*C. A 0.1uF capacitance satisfied this inequality since 1/(2*pi*10000*0.1E-6) = 159Hz. A picture of the PWM connection to the 3.5mm socket via the low-pass filter is given below.
Lowpass Filter for PWM Output
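The capacitor calculation above can be checked with a few lines of C. This is only a sketch: the helper names rc_cutoff_hz and rc_min_capacitance are made up for illustration and do not appear in the project code.

```c
/* Sketch only: helper names are hypothetical, not taken from final_project.c. */

#define PI 3.14159265358979323846

/* Cutoff frequency in Hz of a first-order RC low-pass filter: 1 / (2*pi*R*C). */
double rc_cutoff_hz(double r_ohms, double c_farads)
{
    return 1.0 / (2.0 * PI * r_ohms * c_farads);
}

/* Smallest capacitance in farads that keeps the cutoff at or below f_max_hz. */
double rc_min_capacitance(double r_ohms, double f_max_hz)
{
    return 1.0 / (2.0 * PI * r_ohms * f_max_hz);
}
```

With R = 10 kOhm and C = 0.1 uF, rc_cutoff_hz returns about 159 Hz, and rc_min_capacitance shows that any capacitor larger than roughly 8 nF keeps the cutoff under 2 kHz.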
The pushbutton circuit was simple. One end of the pushbutton was connected to ground while the other was connected to a 10k pull-up resistor. The output of the pushbutton was read at the node between the resistor and the pushbutton. This formed an active-low switch, as the output was high when the button was not pressed (pulled up to VCC through the resistor) and low when the button was pressed (direct connection to ground).
Direct Digital Synthesis Outline
The core of the software design involves synthesizing each note using additive synthesis. The sample code provided for additive synthesis for Lab 3 was used as a starting point, and the code slowly evolved into our final design. Timer 0 runs at the full rate without interrupts in fast PWM mode, causing the timer to count from 0 to 255 by incrementing by one every cycle. This essentially lets Timer 0 generate PWM at 16,000,000 / 256 = 62,500 Hz. Timer 1 was set to run at 8000 Hz, and the clear-on-match interrupt was enabled. During the ISR for Timer 1, the amplitudes of the exponential and sine terms are updated to slowly decay each note. In order to speed up the execution time of the ISR, a discrete difference equation was used to approximate the exponentials. Likewise, a lookup table for the sine values was created in the initialization code to avoid another lengthy calculation. Once the final value for the output was produced, it was assigned to OCR0A, which corresponds to output pin B.3 and produces the sound.
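The per-sample work described above can be sketched for a single sine term as follows. This is not the project's ISR: the 16-bit phase accumulator, the 256-entry table, the shift-based decay, and all names (dds_voice, dds_init, dds_phase_increment, dds_next_sample) are assumptions chosen for illustration. The table is filled with a rotation recurrence only so the sketch needs no math library; calling sin() during initialization would work equally well.

```c
#include <stdint.h>

#define SAMPLE_RATE_HZ 8000UL   /* Timer 1 interrupt rate from the text */
#define TABLE_SIZE     256

static int8_t sine_table[TABLE_SIZE];

typedef struct {
    uint16_t phase;      /* top 8 bits index the sine table */
    uint16_t increment;  /* phase step per sample */
    uint16_t amplitude;  /* 8.8 fixed point, 0xFF00 = full scale (assumed) */
} dds_voice;

/* Fill one sine period by rotating a unit vector 2*pi/256 radians per step. */
void dds_init(void)
{
    const double step_cos = 0.9996988186962042;   /* cos(2*pi/256) */
    const double step_sin = 0.024541228522912288; /* sin(2*pi/256) */
    double s = 0.0, c = 1.0;
    for (int i = 0; i < TABLE_SIZE; i++) {
        sine_table[i] = (int8_t)(s * 127.0 + (s >= 0.0 ? 0.5 : -0.5));
        double s_next = s * step_cos + c * step_sin;
        c = c * step_cos - s * step_sin;
        s = s_next;
    }
}

/* Phase step that makes the 16-bit accumulator wrap freq_hz times per second. */
uint16_t dds_phase_increment(uint32_t freq_hz)
{
    return (uint16_t)((freq_hz * 65536UL + SAMPLE_RATE_HZ / 2) / SAMPLE_RATE_HZ);
}

/* One sample: advance phase, look up sine, scale, then decay the amplitude.
   amp -= amp >> k is the difference-equation form of an exponential decay. */
uint8_t dds_next_sample(dds_voice *v, uint8_t decay_shift)
{
    v->phase += v->increment;                     /* wraps at 2^16 */
    int16_t s = sine_table[v->phase >> 8];        /* -127..127 */
    int32_t scaled = ((int32_t)s * (v->amplitude >> 8)) / 256;
    v->amplitude -= v->amplitude >> decay_shift;
    return (uint8_t)(128 + scaled);               /* value suitable for OCR0A */
}
```

On the microcontroller the returned byte would be written to OCR0A inside the Timer 1 ISR; the additive voice repeats the lookup for each harmonic and sums the results before scaling.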
In the provided sample code, a button push sets a variable pluck, which resets the exponentials to their original (maximum) value in the ISR. This essentially causes a new note to be played. To modify the existing framework, we keep track of two variables: pluck and play_note. The variable pluck functions in a similar manner to the provided code, resetting the amplitude of each term to its maximum value. The value of play_note is equal to one if a note is currently being played and zero otherwise. Keeping track of this prevents pluck from being set repeatedly while the user is blowing a constant stream of air, as this would create a drumming noise.
Once the switches were wired up to the PVC, a basic test was performed to ensure each key was working. The challenging part was to determine which pin each button was connected to since we had over twenty wires coming off of our device. Rather than attempt to follow each wire back to the source button, we used the UART to print which pin was activated when each button was pressed, and used this information to map the buttons to their pin number in software. This test allowed us to catch a few wiring mistakes, such as a missing ground connection on one button.
After the buttons were configured to the appropriate microcontroller pins, every possible note frequency and the corresponding button configuration was encoded. The thirty-two notes supported by this device range from A#2 to F5, the same frequency range as a typical alto saxophone. A mapping from each note to the corresponding frequency is shown below. Some notes have duplicate fingerings, so there were actually thirty-six total fingerings to detect. Since each button has two states (either pushed or not pushed) and there are nineteen total buttons, there are a total of 2^19 = 524,288 possible states of the system. Checking every possible state is unreasonable, so we used a binary encoding to check for the thirty-six key configurations corresponding to the notes supported by the virtual saxophone. This was done by setting each of three char variables to an eight-bit binary number, where each bit corresponds to whether or not a particular key is pressed. This allowed us to quickly compare three char variables in a conditional statement to determine which frequency to output, rather than explicitly checking every pin value in every conditional statement.
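The three-byte encoding can be sketched as a small lookup table. The bit patterns and frequencies below are placeholders rather than the real fingerings, and the names fingering, pressed_mask, and lookup_frequency are hypothetical; the original code is described as using conditional statements, so a table is a substitution made here for brevity.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t  keys_a;   /* buttons 0-7 */
    uint8_t  keys_b;   /* buttons 8-15 */
    uint8_t  keys_c;   /* buttons 16-18, upper five bits unused */
    uint16_t freq_hz;  /* note to synthesize for this fingering */
} fingering;

/* Placeholder entries only: the real patterns depend on the wiring. */
static const fingering fingering_table[] = {
    { 0x07, 0x00, 0x00, 440 },
    { 0x03, 0x00, 0x00, 494 },
    { 0x03, 0x10, 0x00, 494 },  /* alternate fingering for the same note */
};

/* Buttons are active-low, so invert the port reading: 1 now means pressed. */
uint8_t pressed_mask(uint8_t port_reading)
{
    return (uint8_t)~port_reading;
}

/* Returns the frequency for an exact fingering match, or 0 if none matches. */
uint16_t lookup_frequency(uint8_t a, uint8_t b, uint8_t c)
{
    size_t n = sizeof fingering_table / sizeof fingering_table[0];
    for (size_t i = 0; i < n; i++) {
        const fingering *f = &fingering_table[i];
        if (f->keys_a == a && f->keys_b == b && f->keys_c == c) {
            return f->freq_hz;
        }
    }
    return 0;
}
```

Comparing three bytes per candidate keeps the check to thirty-six comparisons at most, instead of reasoning about all 524,288 button states.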
Once the fingerings were tested and verified to be correct, we began creating a task to interpret the microphone input and set the value of pluck accordingly. The output of the microphone circuit was fed to Pin A.0, corresponding to channel zero of the ADC. The analog voltage output was measured and compared to the reference voltage of the board, which was set to 5V. In this design, 8-bit accuracy was sufficient in measuring the noise on the microphone, so the raw ADC output was a number in the range 0-255. After printing the raw ADC output to the console and verifying the measurements were functioning properly, we experimentally determined an appropriate threshold value for which the device can be considered off or on. The microphone input is constantly compared to this threshold value, and as long as the input is above the threshold, the synthesized note will be output. In addition, in order to allow the user to accent their notes, if the ADC input quickly drops below the threshold and then immediately back above, the amplitudes are reset to their maximum value. This causes the note to sound as if it was articulated, as it quickly gets louder and then decays normally. An example of the raw microphone output and the final signal fed into the ADC is shown below.
Microphone Signal Fed to ADC
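The interaction between the ADC threshold, pluck, and play_note can be sketched as a small state update. The threshold is passed in as a parameter because the experimentally determined value is not given here; the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool play_note;  /* true while the user is blowing */
    bool pluck;      /* one-shot request to reset amplitudes to maximum */
} breath_state;

/* Call once per 8-bit ADC reading of the filtered microphone signal. */
void breath_update(breath_state *st, uint8_t adc_reading, uint8_t threshold)
{
    bool blowing = adc_reading > threshold;
    if (blowing && !st->play_note) {
        /* Breath just started, or resumed after the tongue blocked the air. */
        st->pluck = true;
        st->play_note = true;
    } else if (!blowing) {
        /* Below threshold: the note is left to decay on its own. */
        st->play_note = false;
    }
}

/* Called from the sample ISR: returns true exactly once per pluck request. */
bool breath_take_pluck(breath_state *st)
{
    bool requested = st->pluck;
    st->pluck = false;
    return requested;
}
```

Because pluck is only set on the transition from not blowing to blowing, a steady breath produces one attack, while a brief dip below the threshold followed by a return above it produces a new attack, which is the accent behavior described above.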
The last phase of the software design was to refine the DDS process to ensure the amplitude, rise, and fall parameters were reasonable. The final DDS voice is composed using additive synthesis with ten harmonics. As mentioned earlier, the synthesis was based on the saxophone voice produced by the Matlab code in the Wind Instruments Synthesis Toolbox. Although the code was drastically simplified and modified in order for it to work on the microcontroller, the amplitudes for each sine term in the provided Matlab code served as a solid basis for generating our voice. For most instruments, the contributions of higher-order harmonics are usually negligible as their amplitudes are small compared to the lower-order terms. However, we discovered that for the alto saxophone, the eighth, ninth, and tenth harmonics actually had weights similar in magnitude to the first and second harmonics, so it was necessary to use at least ten harmonics to produce an accurate saxophone voice. In addition, we noticed that the amplitudes must change for different frequency ranges, so we chose from four different sets of amplitude vectors when synthesizing the output based on the current key configuration.
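Choosing one of four amplitude vectors and summing ten weighted harmonics might look like the sketch below. The range boundaries and every weight in the table are placeholders invented for illustration (the first row only mimics the observation that harmonics eight through ten can be as strong as the first two); the names select_range and additive_sample are hypothetical.

```c
#include <stdint.h>

#define NUM_HARMONICS 10
#define NUM_RANGES    4

/* Placeholder upper frequency bound (Hz) of the three lowest ranges. */
static const uint16_t range_upper_hz[NUM_RANGES - 1] = { 200, 400, 700 };

/* Placeholder weights: one row per frequency range, one column per harmonic. */
static const uint8_t harmonic_amps[NUM_RANGES][NUM_HARMONICS] = {
    { 64, 56, 20, 12,  8,  8, 16, 48, 52, 60 },
    { 64, 48, 24, 12,  8,  8, 12, 40, 44, 48 },
    { 64, 40, 24, 16,  8,  8,  8, 32, 36, 40 },
    { 64, 32, 24, 16, 12,  8,  8, 24, 28, 32 },
};

/* Pick the amplitude set whose frequency range contains freq_hz. */
uint8_t select_range(uint16_t freq_hz)
{
    uint8_t r = 0;
    while (r < NUM_RANGES - 1 && freq_hz > range_upper_hz[r]) {
        r++;
    }
    return r;
}

/* Weighted sum of the current sine sample of each harmonic (-127..127 each). */
int32_t additive_sample(const int8_t sines[NUM_HARMONICS], uint8_t range)
{
    int32_t sum = 0;
    for (int k = 0; k < NUM_HARMONICS; k++) {
        sum += (int32_t)harmonic_amps[range][k] * sines[k];
    }
    return sum;
}
```

The sum is deliberately kept in a 32-bit integer; it still has to be scaled down before it can be written to an 8-bit PWM register.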
When assigning the output value to OCR0A to synthesize each note, any overflow in this value would result in an unpleasant clipping noise. To avoid this, the output was divided down by an experimentally determined scaling factor to prevent the possibility of assigning OCR0A a value higher than 255. Additional work was done to accurately control the volume of the device for varying frequencies. If every note is scaled by the same amount when computing the output, higher frequency notes will naturally sound softer. In order to give the appearance of a constant volume as the user plays notes from the bottom to the top of the scale, the higher frequency notes were scaled with a higher multiplier than lower frequency notes in order to compensate for the natural volume decay for higher frequency notes.
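A minimal sketch of the scaling step, assuming the summed harmonics are centered on zero and the PWM midpoint is 128. The gain and divisor stand in for the experimentally determined factors, which are not listed here, and the final clamp is an extra safeguard added in this sketch rather than something the original code is known to do.

```c
#include <stdint.h>

/* Map a signed mixed sample to an 8-bit PWM duty value.
   gain    : per-note multiplier (larger for higher notes, per the text)
   divisor : overall scaling factor that keeps the result within 0..255 */
uint8_t scale_to_pwm(int32_t mixed, int16_t gain, int32_t divisor)
{
    int32_t out = 128 + (mixed * gain) / divisor;
    if (out < 0) {
        out = 0;      /* clamp instead of wrapping around */
    }
    if (out > 255) {
        out = 255;
    }
    return (uint8_t)out;
}
```

For example, scale_to_pwm(1000, 1, 10) and scale_to_pwm(500, 2, 10) both give 228, showing how a larger gain can bring a weaker high note up to the same level as a stronger low note.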
Overall, the virtual saxophone was a success: it synthesized notes at all of the correct alto saxophone frequencies, and the synthesized voice clearly resembled that of an alto saxophone. The microphone also worked well in determining whether or not the user was blowing into the device. The sampling frequency of 8 kHz was sufficient for generating high quality notes, and ten harmonics were enough to tune the voice to sound like a saxophone. Timer 2 was used to measure the CPU load, which came to about 75% utilization. This shows that more harmonics could have been added without affecting the execution of the program, but doing so would probably have had a negligible effect on the output.
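To put the 75% figure in context, the cycle budget per sample can be worked out from the 16 MHz clock and the 8 kHz sample rate. How Timer 2 was actually read is not described here; the sketch below assumes the common approach of counting the cycles spent in the sample ISR, and the function names are hypothetical.

```c
#include <stdint.h>

/* CPU cycles available between two consecutive sample interrupts. */
uint32_t cycles_per_sample(uint32_t f_cpu_hz, uint32_t sample_rate_hz)
{
    return f_cpu_hz / sample_rate_hz;
}

/* Percentage of the per-sample budget consumed by isr_cycles. */
uint8_t cpu_load_percent(uint32_t isr_cycles, uint32_t f_cpu_hz,
                         uint32_t sample_rate_hz)
{
    uint32_t budget = cycles_per_sample(f_cpu_hz, sample_rate_hz);
    return (uint8_t)((isr_cycles * 100u) / budget);
}
```

At 16 MHz and 8 kHz the budget is 2000 cycles per sample, so under this assumption a 75% load corresponds to about 1500 cycles of work per sample.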
Safety was an important consideration, as users are directly interacting with the device. Since longer wires were soldered to the wires coming off the buttons, metal contacts were exposed on the device. To mitigate this problem, all the wires at the contacts were snipped as short as possible and electrical tape was used to secure the wires to the PVC pipe and prevent the wires from coming in contact with the user when playing. A neck strap was added so the user doesn’t need to support the weight of the device while playing and risk dropping it.
The usability of the device was reasonable, as it was easy to get the correct fingerings on the device and trigger an output by blowing into the microphone. The buttons were aligned as closely as possible to the actual alignment on an alto saxophone, even accounting for the slight curve in one’s hand. This allows experienced players to quickly transition to the virtual saxophone without much of a learning curve. The kazoo was added for users to blow into rather than a straw because the shape of the kazoo resembles that of a saxophone mouthpiece. At times, the microphone was a little too sensitive to blowing, so it takes a few minutes for a new user to get accustomed to how much air is required to trigger an output. One usability item that could have been improved was the transitioning between notes. Since the button fingering is constantly polled, some transitions between notes are not as clean, because an intermediate fingering might be activated while transitioning between one note and another. This could be improved by polling the button configuration only every 50 ms or so, preventing any intermediate fingerings from producing an incorrect note during a transition. However, in some songs (especially jazz pieces), this effect is desirable, so limiting how often the buttons are polled could hinder the performance of the device. This is a tradeoff that we explored thoroughly when designing the device, and we determined that it was best to constantly poll the buttons, even if it does cause undesirable transitions between a few select notes.
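The 50 ms polling alternative, which was considered but not adopted, could be sketched as a counter driven by the 8 kHz sample interrupt. The names key_poller and poll_keys are hypothetical, and the 19 key bits are assumed to be packed into a single 32-bit word for simplicity.

```c
#include <stdint.h>

#define SAMPLE_RATE_HZ    8000u
#define POLL_PERIOD_MS    50u
#define POLL_PERIOD_TICKS (SAMPLE_RATE_HZ * POLL_PERIOD_MS / 1000u)  /* 400 */

typedef struct {
    uint16_t ticks;    /* samples since the fingering was last latched */
    uint32_t latched;  /* fingering currently used to choose the note */
} key_poller;

/* Call once per sample with the raw key bits. The latched fingering only
   changes every POLL_PERIOD_TICKS calls, hiding brief intermediate states. */
uint32_t poll_keys(key_poller *p, uint32_t raw_keys)
{
    if (++p->ticks >= POLL_PERIOD_TICKS) {
        p->ticks = 0;
        p->latched = raw_keys;
    }
    return p->latched;
}
```

The cost of this approach is up to 50 ms of added latency on every note change, which is the tradeoff against fast passages discussed above.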
One of the biggest user-friendly aspects of the device is its portability. Once most of the hardware of the device was complete and the software was thoroughly tested, we decided to extend our original design by allowing our device to be carried anywhere by the user, rather than being limited to where AC power is located for the power supply and the speakers. A 9V battery was added to power the microcontroller and portable speakers powered by rechargeable batteries were used to output the sound. A pouch was used to hold the whiteboard and protoboard so they can be carried with the device. Lastly, the switch for turning on and off the microcontroller was extended so the user does not need to reach inside the pouch to power the device.
Below is a selection of pictures and a video clip of our final product.
Final Circuit Design
Chris Playing the Sax
Sury Playing the Sax
The Final Design
Overall, our final product met our expectations in both sound quality and usability. We were very pleased with the final synthesis output, and the device was easy to use for both new and experienced players. In addition, we were happy with the portability of the device so we are not constrained to using it where there is AC power.
If doing this project again, there are a few things we would have done differently to improve the final design. Rather than solder short wires to the buttons and solder longer wires to extend them to the flat wire, we would have removed the short wiring and directly connected longer wires to the flat wire. This intermediate step was not necessary and added additional time to the mechanical design process. Another problem we encountered was that hot gluing the buttons to the PVC pipe did not always produce a reliable connection, as some buttons consistently fell off. A more reliable technique would be to drill holes into the PVC pipe and send the wires down through the center of the pipe. We avoided this process during our original design to avoid the risk of cracking the PVC pipe, but this approach would have been ideal.
Another item that could have been improved was controlling the volume of the device. Our design had two different threshold values to determine the output value, but the difference was hard to perceive. Rather than use only one channel to feed into the analog converter, multiple channels could be used with different capacitor and resistor values to create input channels with different thresholds. These could be used to get a more accurate representation of the air flow against the microphone and adjust the volume of the device with greater precision.
As described earlier, the Direct Digital Synthesis code was based off a combination of Bruce Land's additive synthesis code and the additive synthesis saxophone voice provided by the Wind Instruments Synthesis Toolbox. In addition, the idea was inspired by Recorder Hero, a previous ECE 4760 project that used pushbuttons and a microphone to create a virtual recorder.
Ethical and Legal Considerations
All design decisions and actions taken in this project adhered to the IEEE Code of Ethics. In terms of safety, we followed all recommended lab protocols and always asked for help if we were unsure whether our circuit had a chance of harming the microcontroller or others. If other members of the lab asked for help during the project, we assisted them to the best of our ability and referred them to others if we could not assist them. We continuously promoted the learning process and aided in the development of others while embodying the IEEE Code of Ethics in all of our actions during the design process.
We believe this project was an appropriate level of difficulty, such that we were able to continuously learn and overcome challenges throughout the design process without attempting tasks beyond our technical competence. If we felt we could not complete a task with our level of training and experience, assistance was sought. The final design does not discriminate against any persons, including by race, gender, age, or disabilities. All documentation in this report is accurate to the best of our knowledge. Lastly, we’ve appropriately acknowledged and referenced all previous works that contributed to our final design.
A. Source Code
final_project.c (23KB) – code containing ISRs, initialization, and main program logic, including video reading, processing, and sound generation
All other aspects of the final project were done together, including the majority of the physical design. This includes building the protoboard, all necessary soldering, and wiring the buttons to the correct microcontroller pins. The testing phase was also done in unison in order to quickly identify and resolve any errors found in the design.
This section provides links to external reference documents, code, and websites used throughout the project.
We would like to thank Bruce Land and all of the 4760 TAs for their assistance and continuous support throughout the project. Specifically, we'd like to thank Pavel for his guidance throughout the semester and his ability to help us overcome various technical issues.
We'd also like to thank Ben Harris and David Bu for providing the PVC pipe, and Derek Lougee for providing several essential parts for our project, including the pouch, 3M hook, and neckstrap.
Lastly, we'd like to thank Gautam Kamath and Dominick Grochowina for the final project website template.