Results
Our pitch shifter changed the pitch of the input audio signal accurately when shifting up, and less accurately when shifting down. When speaking into the microphone, the voice was played back at the modified pitch with no noticeable delay. The result of humming a steady tone (approximately a sine wave) into the microphone is shown below. The input signal is the top waveform and the output is the bottom waveform.
Figure 8: Pitch shifting up by a factor of 2
Figure 9: Pitch shifting down by a factor of 2
Our technique does introduce some artifacts, which can be seen in the figures above. These result from the discontinuities created by mirroring or truncating the signal during time stretching. Our 3-point averaging filter reduced these artifacts somewhat, but they still compromised the quality of our output. The percent error for each shift is shown below:
| Shift Factor | Input Frequency | Expected Frequency | Measured Frequency | Percent Error |
| --- | --- | --- | --- | --- |
| None | 318 Hz | 318 Hz | 321 Hz | 0.94% |
| 2x increase | 345 Hz | 690 Hz | 710 Hz | 2.90% |
| 4x increase | 320 Hz | 1280 Hz | 1270 Hz | 0.78% |
| 2x decrease | 318 Hz | 159 Hz | 137 Hz | 13.8% |
| 4x decrease | 561 Hz | 140 Hz | 182 Hz | 30.0% |
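The percent errors follow directly from the expected and measured frequencies. As a quick check, the minimal Python sketch below recomputes them from the table values. The function and variable names are illustrative only and are not part of the hardware design.

```python
# Rows copied from the results table: (shift factor, expected Hz, measured Hz).
measurements = [
    ("None", 318, 321),
    ("2x increase", 690, 710),
    ("4x increase", 1280, 1270),
    ("2x decrease", 159, 137),
    ("4x decrease", 140, 182),
]


def percent_error(expected_hz, measured_hz):
    """Absolute deviation from the expected frequency, as a percentage."""
    return abs(measured_hz - expected_hz) / expected_hz * 100.0


for label, expected_hz, measured_hz in measurements:
    print(f"{label:12s} {percent_error(expected_hz, measured_hz):6.2f}%")
```

The computed values agree with the table to the precision shown. Note that the 4x decrease row uses the rounded expected frequency of 140 Hz (561 Hz / 4 is 140.25 Hz).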
The table shows that performance is much worse when downshifting. This is caused by our time stretching algorithm, which truncates the input blocks when we are shifting down. Truncation discards part of each block, which loses data and often creates discontinuities in the output. Because no data is lost when repeating an input block, performance is better for upshifting. Note that since it was difficult to measure the exact frequency using the oscilloscope, the percent error may vary significantly from trial to trial.
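To illustrate why truncation loses data while repetition does not, here is a minimal Python sketch of one plausible reading of the block-based approach for a factor of 2. It is an assumption for illustration, not the actual Verilog design: the block length, the function names, the mirrored form of repetition, and the sample-and-hold readout are all hypothetical. It also includes a 3-point averaging filter of the kind mentioned earlier.

```python
def shift_up_2x(block):
    """Raise pitch by 2: extend the block by mirroring, then read at 2x rate."""
    stretched = block + block[::-1]  # 2N samples; every input sample is kept
    return stretched[::2]            # keep every other sample: N samples out


def shift_down_2x(block):
    """Lower pitch by 2: truncate the block, then read at half rate."""
    kept = block[: len(block) // 2]  # second half of the block is discarded
    out = []
    for sample in kept:
        out.extend([sample, sample])  # hold each sample for two output slots
    return out


def smooth3(samples):
    """3-point moving average; the two end samples are passed through as-is."""
    out = list(samples)
    for i in range(1, len(samples) - 1):
        out[i] = (samples[i - 1] + samples[i] + samples[i + 1]) / 3
    return out


block = list(range(8))        # stand-in for one block of audio samples
print(shift_up_2x(block))     # [0, 2, 4, 6, 7, 5, 3, 1]
print(shift_down_2x(block))   # [0, 0, 1, 1, 2, 2, 3, 3]
```

In this sketch both paths return a block of the original length, so the output stays in step with the input. Both can leave jumps or slope reversals at block boundaries, which is consistent with the artifacts described above, but only the downshift path throws input samples away.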
Our pitch shifter is easy to use. The user only has to toggle 3 switches to change the shift factor. The hardware takes care of the rest.
Conclusions
Overall, we are pleased with the results of our project. We are able to pitch shift a stream of audio data from the microphone and play it back at a fairly high quality, especially when upshifting. Pitch shifting in real time is inherently difficult because of the timing constraints and computational limitations. Initially, we attempted to fully implement the time stretching algorithm detailed in the "Theory" section, which is also explained in DAFX: Digital Audio Effects. As we progressed, we felt that we would be unable to successfully use this algorithm and instead decided to use our own simplified version. While the simplified algorithm is not ideal, we made a lot more progress once it was implemented. Time stretching the signal proved to be the biggest challenge by far.
There are a few improvements that we would have liked to make if we had more time. Allowing the user to shift by finer amounts instead of just powers of two would have been useful, and we could have created a better user interface to support this. We could also have spent more time trying to eliminate some of the artifacts introduced by our time stretching technique.
We learned a great deal about audio processing techniques and the difficulties of performing DSP in real time. In particular, working strictly in hardware made complex mathematical calculations difficult. We were often forced to compromise and use "tricks" to process the audio signal in real time. There is of course room for improvement, as it would likely be possible to implement the original time stretching algorithm by optimizing it for use on an FPGA. This algorithm can be performed easily in MATLAB, but does not translate well to Verilog. Lastly, a frequency domain approach could also be taken, but this is again quite complex.