Cornell University ECE4760
FM sound synthesis

Pi Pico RP2040

FM sound synthesis -- serial input, video parameter display, spectrogram
FM synthesis is a fairly simple method of producing interesting sounds, including sounds that approximate musical instruments. The sound can be string-like, or drum-like, or just plain strange. You basically have to experiment to find the sound you want. The basic waveform generation equation is:
wave = envelopemain * sin(Fmain*t + envelopefm*(sin(Ffm*t)))

For each of the amplitude envelopes, I used a simple attack, sustain, decay system, with linear amplitude increase during the attack, constant amplitude during the sustain, and quadratic decrease during the decay of the main wave, and with linear decay for the modulation. In the code, the FM envelope is called modulation depth because the envelope scales the size of the frequency changes. The actual sine waves are generated by direct digital synthesis (DDS). All arithmetic during synthesis is carried out in s19x12 fixed point. The way I scaled amplitudes required being able to represent integer values up to about 500,000, which the 19 integer bits cover (maximum 524,287). Sound is produced via a 12-bit DAC.
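The waveform equation and the DDS description above can be sketched in C as follows. This is a minimal sketch under stated assumptions, not the project code: the 40 kHz sample rate, the 256-entry table, the names `fm_voice`, `dds_increment` and `fm_sample`, and the choice to express the modulation depth in cycles of phase (rather than radians) are all illustrative. The sine table is filled by a rotation recurrence with hard-coded constants so that the example needs no math library.

```c
#include <assert.h>
#include <stdint.h>

#define FS 40000.0      /* assumed sample rate in Hz */
#define TABLE_SIZE 256  /* sine table length, indexed by the top 8 phase bits */

typedef int32_t fix;    /* s19x12: sign, 19 integer bits, 12 fraction bits */
#define FLOAT_TO_FIX(x) ((fix)((x) * 4096.0))
#define FIX_MUL(a, b)   ((fix)(((int64_t)(a) * (int64_t)(b)) / 4096))

static fix sine_table[TABLE_SIZE];

/* Fill one sine cycle by rotating the point (1, 0) in steps of 2*pi/256. */
void build_sine_table(void) {
    const double c = 0.99969881869620422;  /* cos(2*pi/256) */
    const double s = 0.024541228522912288; /* sin(2*pi/256) */
    double x = 1.0, y = 0.0;
    for (int i = 0; i < TABLE_SIZE; i++) {
        sine_table[i] = FLOAT_TO_FIX(y);
        double xn = x * c - y * s;
        y = x * s + y * c;
        x = xn;
    }
}

/* DDS phase increment: the 32-bit accumulator wraps once per output cycle. */
uint32_t dds_increment(double freq_hz) {
    assert(freq_hz >= 0.0 && freq_hz < FS / 2.0);
    return (uint32_t)(freq_hz * 4294967296.0 / FS);
}

typedef struct {
    uint32_t phase_main, phase_fm; /* phase accumulators */
    uint32_t inc_main, inc_fm;     /* phase increments per sample */
} fm_voice;

/* One sample of envelope_main * sin(main phase + envelope_fm * sin(fm phase)). */
fix fm_sample(fm_voice *v, fix env_main, fix env_fm) {
    v->phase_fm += v->inc_fm;
    fix mod = FIX_MUL(env_fm, sine_table[v->phase_fm >> 24]);
    /* Assumption: mod is in cycles, so 1.0 (4096) maps to 2^32 phase units. */
    uint32_t offset = (uint32_t)((int64_t)mod * 1048576);
    v->phase_main += v->inc_main;
    return FIX_MUL(env_main, sine_table[(uint32_t)(v->phase_main + offset) >> 24]);
}
```

Call `build_sine_table()` once, set the two increments with `dds_increment()`, then call `fm_sample()` once per audio sample with the current envelope values. The return value would still need scaling and offsetting to fit the 12-bit DAC range.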

To use this program, you need to hook up a video monitor, an SPI DAC plus speakers, and a serial terminal. Connections are given in the lengthy comment at the start of the source code. The serial input supports eight settable parameters for a sound, plus help, play, and stop commands:

Two examples are shown below. The top of each frame shows the parameters, the bottom the spectrogram.
The first image is a string-like sound, the second a drum-like sound.

There is no systematic way to compute a set of parameters from a desired sound, although there are some guidelines. Strings have a fast attack if plucked, and slow if bowed. Bigger modulation depth tends to make harsher, more metallic sounds, or noise. There are several online guides: Digitone, Javalin Art, U Wash.

Code, ZIP of project


Older examples:

Simple serial input and PWM output
The default serial output of gpio 0,1 is used.
To use the program you specify nine parameters using the serial interface:
Fout -- the main frequency in Hz.
The main waveform attack, sustain, decay (ASD) in seconds.
Fmod -- the fm modulation frequency in Hz.
The modulation depth, a float, typically in the range of 0-1.0
The modulation attack, sustain, decay in seconds.
The program will then repeat the sound when you hit return (twice).
This program uses a PWM channel on gpio3, with output filtered with a 2K resistor and 10 nF capacitor.

The number of sounds you can make is bewildering, but a good place to start is a string-like sound by
setting Fout=220, main ASD to {0.001, 0.001, 2.0}
then Fmod=660, depth=0.25 and modulation ASD to {0.001, 0.001, 1.5}
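
The nine parameters and the string-like starting point can be held in one structure, as in the sketch below. The struct layout, the field names, and the 40 kHz sample rate used to convert seconds to sample counts are assumptions made for illustration; only the numeric values come from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define FS 40000.0 /* assumed sample rate in Hz */

/* The nine serial-input parameters (field names are hypothetical). */
typedef struct {
    double fout_hz;                                     /* main frequency */
    double main_attack_s, main_sustain_s, main_decay_s; /* main ASD */
    double fmod_hz;                                     /* modulation frequency */
    double depth;                                       /* modulation depth */
    double mod_attack_s, mod_sustain_s, mod_decay_s;    /* modulation ASD */
} fm_params;

/* The string-like starting point suggested above. */
static const fm_params string_like = {
    .fout_hz = 220.0,
    .main_attack_s = 0.001, .main_sustain_s = 0.001, .main_decay_s = 2.0,
    .fmod_hz = 660.0,
    .depth = 0.25,
    .mod_attack_s = 0.001, .mod_sustain_s = 0.001, .mod_decay_s = 1.5
};

/* Convert an envelope time in seconds to a whole number of audio samples. */
uint32_t seconds_to_samples(double seconds) {
    assert(seconds >= 0.0);
    return (uint32_t)(seconds * FS + 0.5); /* round to nearest sample */
}
```

At the assumed rate, the 1 ms attack is 40 samples and the 2 second main decay is 80,000 samples. The modulation frequency here is three times the main frequency.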

Code, Zip of project


FM synthesis -- with musical scale -- better quadratic decay --
This example adds a tuned musical scale output to better evaluate the sounds. I chose to implement eight notes from C3 to C4 but in the code below, you can pick the octave. The code also adds a smoother decay function option. The linear decay is a very crude approximation (the first term of the Taylor expansion of a decaying exponential). Linear was used because its duration is finite and easy to specify, however the relatively slow decay, coupled with the sudden termination, sounds strange for some sounds.

Exponential decay is easy to calculate with one multiply, but suffers from an infinite duration, so getting a clean cutoff is hard. The long, slow exponential tail also leads to some numerical accuracy problems. A quadratic decay can be arranged so that it approaches zero amplitude with zero slope and falls more rapidly than linear initially. The image shows amplitude vertically, and audio sample number horizontally. The attack phase is blue, sustain is yellow, linear decay is red, exponential decay is purple, and quadratic decay is green. Attack in this example is 50 samples, sustain is 100 and desired decay time is 500 samples.

If you adjust the exponential time constant to one-fifth of the decay cutoff, you get the first plot. The amplitude has decreased about 150 fold but the initial fall rate is fast. If you adjust the exponential time constant to match the initial slope of the quadratic, then the exponential has dropped to about 14% at the desired cutoff time, while quadratic and linear have reached zero. It is much easier to use the quadratic for a desired decay cutoff time, with a reasonably smooth sound near cutoff.
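
The comparison of the three decay shapes can be checked numerically with a sketch like the one below. The function names are illustrative, and the exponential uses a per-sample factor of (1 - 1/tau), a first-order stand-in for exp(-1/tau) chosen so that the decay costs one multiply per sample and needs no math library.

```c
#include <assert.h>

/* Linear decay from start_amp to zero at decay_samples. */
double linear_decay(double start_amp, int n, int decay_samples) {
    assert(decay_samples > 0);
    if (n >= decay_samples) return 0.0;
    return start_amp * (1.0 - (double)n / decay_samples);
}

/* Exponential decay, one multiply per sample.
   Assumption: the factor (1 - 1/tau) approximates exp(-1/tau). */
double exponential_decay(double start_amp, int n, double tau_samples) {
    assert(tau_samples > 1.0);
    double k = 1.0 - 1.0 / tau_samples;
    double amp = start_amp;
    for (int i = 0; i < n; i++) amp *= k;
    return amp;
}

/* Quadratic decay with zero value and zero slope at decay_samples:
   start_amp * (1 - n/decay_samples)^2. */
double quadratic_decay(double start_amp, int n, int decay_samples) {
    assert(decay_samples > 0);
    if (n >= decay_samples) return 0.0;
    double r = 1.0 - (double)n / decay_samples;
    return start_amp * r * r;
}
```

With a 500 sample decay, a time constant of 100 samples leaves the exponential at roughly 1/150 of its starting amplitude at the cutoff. A time constant of 250 samples, which matches the initial slope of the quadratic, leaves it between 13% and 14%.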

Actually computing the quadratic envelope requires fitting three constraints to the general quadratic form a*x^2 + b*x + c to determine a, b and c. If we require that the quadratic be zero, with zero slope, at the decay cutoff time, and have the sustain amplitude when it starts, then a little algebra gives:
a = (sustain_amplitude)/(decay_time)^2
b = -2*(sustain_amplitude)/(decay_time)
c = sustain_amplitude

Since we are computing one audio sample at a time, we can do a numerical integration to simplify the required arithmetic at each time step. Specifically, we are adding the derivative of the quadratic to the current amplitude to perform Euler integration of the quadratic:
new_amplitude = current_amplitude + 2*a*(fall_time) + b
Some care is needed with operation order to maintain accuracy in fixed point. When running in an ISR, the worst case timing is 3.8 uSec with quadratic decay and 2.5 uSec with linear decay.
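
The per-sample update can be sketched as below. The sketch uses doubles to keep the arithmetic readable, whereas the project runs in fixed point, where operation order matters. The struct and function names, the choice to start fall_time at zero, and the clamp at zero are assumptions of the sketch.

```c
#include <assert.h>

typedef struct {
    double amp;     /* current envelope amplitude */
    double a, b;    /* quadratic coefficients; c is the starting amplitude */
    int fall_time;  /* samples elapsed since the decay began */
    int decay_time; /* total decay length in samples */
} quad_decay;

/* Set the coefficients so the curve has zero value and zero slope at decay_time. */
void quad_decay_start(quad_decay *d, double sustain_amplitude, int decay_time) {
    assert(decay_time > 0);
    d->a = sustain_amplitude / ((double)decay_time * decay_time);
    d->b = -2.0 * sustain_amplitude / decay_time;
    d->amp = sustain_amplitude;
    d->fall_time = 0;
    d->decay_time = decay_time;
}

/* Advance one audio sample by adding the derivative 2*a*fall_time + b. */
double quad_decay_step(quad_decay *d) {
    if (d->fall_time >= d->decay_time) {
        d->amp = 0.0;
        return 0.0;
    }
    d->amp += 2.0 * d->a * d->fall_time + d->b;
    d->fall_time++;
    if (d->amp < 0.0) d->amp = 0.0; /* the Euler sum can dip just below zero */
    return d->amp;
}
```

With fall_time starting at zero, the running sum sits slightly below the exact quadratic, so it reaches zero a few samples before the cutoff and the clamp holds it there. For a 500 sample decay from amplitude 1, the value at the halfway point is about 0.249 rather than 0.25.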

Example scales:
The program plays eight notes, at four different fm depths. The depth values are
{0, 0.25, 0.5, 0.75}*(maximum_fm_envelope_depth)
The program inputs are now:
-- An octave number from 1 to 6, e.g. choosing 2 plays C2 to C3.
-- Amplitude attack, sustain, and decay times in seconds (floating values)
-- Choose linear or quadratic decay -- enter 1 for linear, 0 for quadratic
-- the ratio of the modulation frequency to the main frequency
-- the maximum fm envelope depth (a float between 0 and 2)
-- fm envelope attack, sustain, and decay times in seconds (floating values)
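
The note frequencies and the four depth values can be generated as in the sketch below. It assumes a C major scale in equal temperament with A4 = 440 Hz, which the text does not state, and the function names are illustrative. The semitone ratios are tabulated so that no math library is needed.

```c
#include <assert.h>

#define NOTES_PER_SCALE 8
#define C0_HZ 16.3516 /* C0 in equal temperament with A4 = 440 Hz */

/* 2^(k/12) for k = 0, 2, 4, 5, 7, 9, 11, 12 (assumed major scale degrees). */
static const double major_ratio[NOTES_PER_SCALE] = {
    1.000000, 1.122462, 1.259921, 1.334840,
    1.498307, 1.681793, 1.887749, 2.000000
};

/* Fill freq_hz with the eight notes from C(octave) to C(octave + 1). */
void scale_frequencies(int octave, double freq_hz[NOTES_PER_SCALE]) {
    assert(octave >= 1 && octave <= 6);
    double c = C0_HZ;
    for (int i = 0; i < octave; i++) c *= 2.0; /* each octave doubles the frequency */
    for (int i = 0; i < NOTES_PER_SCALE; i++) freq_hz[i] = c * major_ratio[i];
}

/* The four depths played for each scale: {0, 0.25, 0.5, 0.75} * max_depth. */
void depth_steps(double max_depth, double depth[4]) {
    static const double fraction[4] = {0.0, 0.25, 0.5, 0.75};
    for (int i = 0; i < 4; i++) depth[i] = fraction[i] * max_depth;
}
```

For octave 3 this gives about 130.8 Hz for the first note and 261.6 Hz for the last. The modulation frequency for each note would then be the entered ratio times that note's frequency.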

Code, zip of project


FM synthesis with spectrogram and joystick-driven menu interface
With all the parameters necessary to specify the synthesis, I decided to build a menu system for changing the ten, or so, parameters. Also, the sounds produced are so complex that it is nice to have a spectrum and spectrogram (see the top of this page) so that the synthesis can be compared to real sounds visually. Adding a cursor and joystick to Hunter's VGA driver was an interesting exercise. The joystick is a HiLetgo Game Joystick Sensor Game Controller from Amazon. It consists of two potentiometers (x and y stick position) and a push-click button attached to the stick. The button only works when the stick is centered. Reading the state of the joystick is easy, but the potentiometers are not very well calibrated, so I decided to use the position of the joystick to determine cursor x and y movement speed, as opposed to position. Also, because of noise, and for ease of control, the speed in each direction was quantized to a few speeds in each direction. In this example cursor movement is limited to the menu region of the screen.
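
The mapping from stick position to a quantized cursor speed might look like the sketch below. The thresholds, the two speed values, and the function names are hypothetical. The only hardware fact relied on is that the RP2040 ADC returns 12-bit readings.

```c
#include <assert.h>
#include <stdint.h>

#define ADC_CENTER 2048 /* nominal mid-scale of a 12-bit ADC reading */
#define DEAD_ZONE   300 /* hypothetical: offsets smaller than this mean no motion */
#define FAST_ZONE  1500 /* hypothetical: offsets at or beyond this use the fast speed */
#define SLOW_SPEED    1 /* pixels per pass through the graphics loop */
#define FAST_SPEED    4

/* Quantize one potentiometer reading to a signed cursor speed. */
int joystick_speed(uint16_t adc_reading) {
    assert(adc_reading < 4096);
    int offset = (int)adc_reading - ADC_CENTER;
    int magnitude = offset < 0 ? -offset : offset;
    int speed = 0;
    if (magnitude >= FAST_ZONE) speed = FAST_SPEED;
    else if (magnitude >= DEAD_ZONE) speed = SLOW_SPEED;
    return offset < 0 ? -speed : speed;
}

/* Add the speed to the old position, keeping the cursor inside [min_pos, max_pos]. */
int move_cursor(int position, int speed, int min_pos, int max_pos) {
    int next = position + speed;
    if (next < min_pos) next = min_pos;
    if (next > max_pos) next = max_pos;
    return next;
}
```

The dead zone absorbs both noise and poor centering of the potentiometers, and the limits passed to `move_cursor()` would be the edges of the menu region.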

On every pass through the main graphics thread loop, the event loop, the cursor is erased, then redrawn at the current location specified by adding the joystick speed to the old position. Erasing means copying the saved image pixels back to the display buffer. Drawing means saving the image pixels and replacing them with the cursor color. The cursor position is then decoded into a menu item location. A single joystick click in a menu item enables the x potentiometer to control the menu value, as determined by a couple of constants for that item. The menu-change condition is indicated by a small red dot in the menu item. Another click, or movement of the y potentiometer, cancels the menu change mode. Double-clicking the joystick freezes the GUI and allows new value input from the serial command line. The serial menu-change mode is indicated by a small green dot, and is canceled when the user hits <enter> on the keyboard.
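
The erase and draw steps can be sketched as below. The one-byte-per-pixel `framebuffer`, the square cursor, and all names are assumptions for illustration and do not reflect the actual pixel layout of the VGA driver.

```c
#include <assert.h>
#include <stdint.h>

#define SCREEN_W 640
#define SCREEN_H 480
#define CURSOR_SIZE 5
#define CURSOR_COLOR 15

static uint8_t framebuffer[SCREEN_H][SCREEN_W]; /* hypothetical display buffer */
static uint8_t saved[CURSOR_SIZE][CURSOR_SIZE]; /* pixels hidden by the cursor */
static int saved_x, saved_y;

/* Draw: save the image pixels, then replace them with the cursor color. */
void cursor_draw(int x, int y) {
    assert(x >= 0 && x + CURSOR_SIZE <= SCREEN_W);
    assert(y >= 0 && y + CURSOR_SIZE <= SCREEN_H);
    for (int r = 0; r < CURSOR_SIZE; r++) {
        for (int c = 0; c < CURSOR_SIZE; c++) {
            saved[r][c] = framebuffer[y + r][x + c];
            framebuffer[y + r][x + c] = CURSOR_COLOR;
        }
    }
    saved_x = x;
    saved_y = y;
}

/* Erase: copy the saved image pixels back to the display buffer. */
void cursor_erase(void) {
    for (int r = 0; r < CURSOR_SIZE; r++) {
        for (int c = 0; c < CURSOR_SIZE; c++) {
            framebuffer[saved_y + r][saved_x + c] = saved[r][c];
        }
    }
}
```

Each pass through the loop would call `cursor_erase()` first and then `cursor_draw()` at the new position, so the image under the cursor is never lost.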

The spectrum analysis takes place at the sound synthesis rate of 40 kHz. The FFT buffer size is 2048 samples, so the FFT window size is about 50 mSec and the FFT fundamental frequency is 19.5 Hz. This resolution cannot resolve individual musical notes until about 523 Hz (the C one octave above middle C), but seems like a good compromise between time resolution and frequency resolution.
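
The window length and frequency resolution follow directly from the buffer size and sample rate, as this small sketch shows. The function names are illustrative, and the semitone spacing assumes equal temperament.

```c
#include <assert.h>

/* Duration of one FFT window in seconds. */
double fft_window_seconds(int n_samples, double fs_hz) {
    assert(n_samples > 0 && fs_hz > 0.0);
    return n_samples / fs_hz;
}

/* Spacing between adjacent FFT bins in Hz. */
double fft_bin_hz(int n_samples, double fs_hz) {
    assert(n_samples > 0 && fs_hz > 0.0);
    return fs_hz / n_samples;
}

/* Distance in Hz from a note to the next semitone up: f * (2^(1/12) - 1). */
double semitone_spacing_hz(double note_hz) {
    return note_hz * 0.059463;
}
```

For 2048 samples at 40 kHz the window is 51.2 ms and the bin spacing is 19.53 Hz. At 523 Hz, adjacent semitones are about 31 Hz apart, or roughly 1.6 bins.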

The image below shows the FM synthesis menu, with the FM depth item highlighted for change. Top trace is the current short-term spectrum, middle trace is log(spectrum), and at the bottom is the spectrogram. You can see the frequency steps of the scale, with a couple of harmonics. The harmonics taper off before the fundamental because the FM envelope is shorter in duration. The fast attack time, tapering harmonics, and harmonic spacing tend to make a string-like sound.

This is a big example, so it needs an outline of the code:

There are connections to the RP2040/PICO for VGA, SPI DAC, and the joystick.
The summary is:

Code, ZIP of project


FM synthesis with spectrogram and rotary-encoder driven menu interface
It turns out that using the joystick, above, for menu control is a little too sensitive and hard to control. Using an Adafruit rotary encoder with direction arrows and a center-button is easier. In the image below, each of the four direction pushbuttons is indicated by a small triangle near the edges. The center button is a slightly raised circle in the middle. Pushing anywhere on the circle of small dots engages the rotary encoder. Dragging clockwise and counter-clockwise changes the value of a counter. To run the FM synthesis menu, the up/down buttons change the item to be selected, and the center-button selects/deselects an item. When selected, the value of an item can be changed by scrolling the encoder. Pushing the center-button again deselects the item. Double-clicking the center button freezes the GUI and shifts value entry to the serial interface. Serial entry returns control to the GUI when you hit the <enter> key. All other functionality of the above example is the same. There is one more thread running on core0 compared to the previous code, which watches for an edge on the encoder switches and determines rotation. And, of course, the joystick is disconnected and replaced with encoder connections to seven gpio ports, plus two ground connections. The Adafruit breakout board connections are on the left of the list, the PICO connections on the right.
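
One common way to turn edges on two encoder switches into a rotation count is a quadrature transition table, sketched below. This assumes the encoder presents two phase-shifted switch outputs, here called A and B. The names are hypothetical, and which direction counts as positive depends on the wiring.

```c
#include <assert.h>
#include <stdint.h>

/* Indexed by (previous_state << 2) | current_state, where state = (A << 1) | B.
   +1 and -1 are single steps in opposite directions; 0 is no change or an invalid jump. */
static const int8_t quad_table[16] = {
     0, +1, -1,  0,
    -1,  0,  0, +1,
    +1,  0,  0, -1,
     0, -1, +1,  0
};

typedef struct {
    uint8_t state; /* last sampled (A << 1) | B */
    int32_t count; /* accumulated steps */
} encoder;

/* Call whenever an edge is seen on either switch, passing the new levels. */
void encoder_update(encoder *e, int a, int b) {
    assert(e != 0);
    uint8_t current = (uint8_t)(((a ? 1 : 0) << 1) | (b ? 1 : 0));
    e->count += quad_table[(e->state << 2) | current];
    e->state = current;
}
```

A full cycle of the sequence 00, 01, 11, 10, 00 adds four to the count, and the reverse sequence subtracts four. Jumps that skip a state are ignored, which gives some tolerance to switch bounce.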

Code, ZIP of project


Copyright Cornell University November 18, 2023