Michael Karpelson, Julian Chang, ECE 476 Spring 2004

Rationale

Anyone who has ever visited Wisconsin’s House on the Rock has seen its stunning array of musical machines – intricate electromechanical assemblies that incorporate almost every type of musical instrument. Though slightly less awe-inspiring, the musical machine developed in this project draws its roots from such old creations.

We set out to create a vaguely enjoyable programmable device to play back melodies, and eventually settled on a xylophone due to the straightforward nature of the mechanical component of the system (hit the keys). Surprisingly, neither of us knew much about music, relying on the consultations of friends to discover the basics of music theory, and inventing the rest.

Wishing to use the microcontroller to its full potential, we also decided to explore the possibility of a pitch detection mechanism for either recording melodies or interfacing with the device, and a random melody generation algorithm.



Logical Structure

The system incorporates a mechanical component, consisting of a frame that holds the xylophone and positions the solenoids underneath the proper keys. The solenoids require appropriate control circuitry, isolated from the MCU circuit to protect the latter from inductive spikes. Power for the solenoids is provided by a step-up switching voltage regulator, which boosts the input voltage to ~16V.

The pitch detection mechanism consists of a microphone power circuit, a low-noise precision op-amp to amplify the signal, and an off-chip 8-bit ADC (the onboard ADC of the Mega32 microcontroller used in this project is not sufficiently fast for the high frequencies produced by xylophone keys). Continuously sampling with the ADC, the pitch detection algorithm waits for the input to exceed a certain threshold, at which point it records a number of samples and performs zero-crossing analysis.

The MCU code, written in C, handles the user interface, the serial programming interface, the pitch detection mechanism, the solenoid control code, and an algorithm for random melody generation. With all of that in place, only a very few bytes of RAM are left over.

Hardware and software block diagrams are given below.



Hardware block diagram


Software block diagram


Hardware/Software Tradeoffs

While the actuation provided by solenoids is highly convenient when impacting the keys of a xylophone, some problems are also introduced. The positioning of the solenoids relative to the keys is critical, and even a 1mm misalignment causes vast differences in the produced sound (an important consideration when mounting the solenoids using a precision tool known as a power drill). Moreover, the solenoids are noisy, with the cores falling loudly on the wood supports after being pulled up to strike a xylophone key from below. Costlier spring-loaded solenoids should present a better alternative.

The output of the microphone used in the pitch detection mechanism is amplified by a factor of about 500 to ensure accurate waveform sampling. Unfortunately, this has the effect of making the microphone overly sensitive and causing it to respond to sounds not produced by the xylophone. Although we have tried to address the problem of accidental activation by unrelated noise both in software and hardware (see detailed design notes), this effect could still occur. A better microphone with a more defined pickup pattern would solve this problem, but would unfortunately exceed the budget.

In terms of software, the pitch detection algorithm stores 1500 samples in RAM when it is operating, leaving little memory for everything else. Other algorithms which make use of large arrays, such as the random melody generator, are forced to share memory with the pitch detector. As a direct result of this, the serial interface currently requires the user to enter one note at a time when programming the device, since low RAM space makes the implementation of a large input storage buffer difficult.

Storing the programmed melodies in EEPROM is useful, since they are preserved when power to the MCU is shut off. However, the life of the device is limited by the EEPROM's endurance of roughly 100,000 write cycles, after which the MCU may require replacing.



Relevant Standards

The device makes use of the RS232 serial communication standard to enable a user to program a melody using a PC. This is an asynchronous serial protocol with two possible line voltage states (marking and spacing, or on and off). Transmission of a byte is initiated by a start bit (spacing state), followed by the bits of the byte, starting with the least significant bit, and terminated by a stop bit (marking state).
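The framing described above can be sketched in C as follows. This is only an illustration: the function names frame_byte and line_state are hypothetical, and the 8N1 format (8 data bits, no parity, one stop bit) is an assumption, since the serial settings are not stated here.

```c
#include <stdint.h>

/* Build one frame as a 10-bit word, assuming 8 data bits, no parity and
 * one stop bit (8N1). Bit 0 of the result is the first bit on the wire.
 * A 1 stands for the marking state and a 0 for the spacing state. */
uint16_t frame_byte(uint8_t data)
{
    uint16_t frame = 0;                          /* bit 0: start bit (spacing) */
    frame |= (uint16_t)((uint16_t)data << 1);    /* bits 1-8: data, LSB first  */
    frame |= (uint16_t)(1u << 9);                /* bit 9: stop bit (marking)  */
    return frame;
}

/* Line state (1 = marking, 0 = spacing) during bit slot 0..9 of a frame. */
uint8_t line_state(uint16_t frame, uint8_t slot)
{
    return (uint8_t)((frame >> slot) & 1u);
}
```

For the character 'A' (0x41), the first data bit sent after the start bit is a 1, because the least significant bit goes first.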



Patent and Copyright Considerations

We employ Hyperterm, which is a proprietary piece of software, in our serial interface scheme. To avoid licensing issues in the final product, we will look to the open source community for a similar program to achieve the same goals. Any terminal program will suffice.






Mechanical Component

The mechanical system consists of a simple wood frame with adjustable supports to hold the xylophone. Solenoids are affixed to the sides of the frame so that they are able to impact the keys of the xylophone from underneath. Foam padding is attached beneath each solenoid, so that the core does not make a noise when it falls back down on the wood. The solenoids are wired to a standard 40-pin IDE cable, which plugs into a header on the solderboard. The microphone is positioned in the frame underneath the xylophone using a high-precision fixture (duct tape). Pictures of the frame and the frame with the xylophone inside are given below.





Serial Interface

We chose to use a command line user interface to program the device through a serial connection. A user can interact with the system through a terminal program such as Hyperterm. This serial communication interface allows the user to program up to five songs of 200 notes each into the system EEPROM.

The UART interrupt is used to call the command line interface function. This way, connecting a serial cable to the device and hitting the elusive “any” key in a terminal program will initiate the interface. However, since the program is in either playback mode or program mode, and never both at the same time, we chose polled rather than interrupt-driven serial communication within the interface itself. As the system calls the command line interface, it turns off all interrupts, turning them back on as it exits.

The command line interface provides the user with several commands, including program, display, and quit. In program mode, the user chooses the slot to program into, and then enters the song note by note. The program mode terminates when an empty string is encountered. The system validates each note as it is entered, returning an error message for each invalid note. It ignores each invalid note, allowing the user to continue entering notes. In display mode, the system prints out the song stored in a slot specified by the user. Finally, the system terminates the command line interface on the quit command. The device has the capability to play both single and multiple notes (chords). The user enters notes into the system using the following encoding, with several pre-defined chords available to the user (defining new chords is trivial and requires modifying a lookup table, but requires Flash reprogramming).

Chord              Duration            Note
M – Major Chord    F – Full Note       A – Lower Octave A
N – No Chord       H – Half Note       B – Lower Octave B
                   Q – Quarter Note    C – Lower Octave C
                   E – Eighth Note     D – Lower Octave D
                                       D# – Lower Octave D Sharp
                                       F – Lower Octave F
                                       F# – Lower Octave F Sharp
                                       G – Lower Octave G
                                       G# – Lower Octave G Sharp
                                       a – Upper Octave A
                                       b – Upper Octave B
                                       c – Upper Octave C
                                       _ – Pause

After the user enters a song, each note is converted into an unsigned char. The program then stores this null-terminated array of unsigned chars into one of five EEPROM arrays. Although reprogramming the MCU causes the arrays to be cleared, this disadvantage in using EEPROM arrays is only present during development. For a final product that no longer requires programming, EEPROM arrays are able to store the melodies indefinitely.
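A minimal sketch of how an entered note might be validated and packed into a single unsigned char is given below. The bit layout, the token format (chord letter, duration letter, then note name, as in "NQD#"), and the function name encode_note are all assumptions made for illustration; the actual encoding is defined in the program listing. The note list follows the 13 solenoid notes plus the pause symbol. Reserving the value 0 keeps an encoded note from being mistaken for the array terminator.

```c
#include <stdint.h>
#include <string.h>

/* Note names in an assumed index order. The stored value is index + 1,
 * so that an encoded note is never 0 (0 terminates the array). */
static const char *const NOTE_NAMES[] = {
    "A", "B", "C", "D", "D#", "E", "F", "F#", "G", "G#", "a", "b", "c", "_"
};
#define NUM_NOTES (sizeof NOTE_NAMES / sizeof NOTE_NAMES[0])

/* Encode a token such as "NQD#" (chord, duration, note).
 * Hypothetical layout: bit 6 = chord flag, bits 5-4 = duration,
 * bits 3-0 = note index + 1. Returns 0 for an invalid token. */
uint8_t encode_note(const char *token)
{
    static const char durations[] = "FHQE";   /* full, half, quarter, eighth */
    const char *d;
    uint8_t chord;
    size_t i;

    if (token == NULL || strlen(token) < 3) return 0;

    if (token[0] == 'M') chord = 1;           /* major chord */
    else if (token[0] == 'N') chord = 0;      /* no chord */
    else return 0;

    d = strchr(durations, token[1]);
    if (d == NULL) return 0;                  /* unknown duration letter */

    for (i = 0; i < NUM_NOTES; i++) {
        if (strcmp(token + 2, NOTE_NAMES[i]) == 0) {
            return (uint8_t)((chord << 6) |
                             ((uint8_t)(d - durations) << 4) |
                             (i + 1));
        }
    }
    return 0;                                 /* unknown note name */
}
```

A return value of 0 corresponds to the error case described above, in which the invalid note is ignored and the user continues entering notes.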

Upon exiting, the command line interface returns MCU control to the main program.



Solenoid Control and Power

Thirteen solenoids are used in the project to play the notes A, B, C, D, D#, E, F, F#, G, G#, A, B, C on the xylophone (the final A, B, and C belong to the upper octave). Each solenoid power circuit involves a TIP31C transistor and a power diode to sink inductive spikes, and each is isolated from the MCU output using the ILQ5 phototransistor output optoisolator from Vishay Semiconductor.

The push-type solenoids acquired for this project are rated at 24VDC continuous duty. In practice, at least 12V were found to be desirable for sufficient impact strength. The UC2577-ADJ adjustable switching voltage regulator was used in conjunction with a high-current 100mH inductor to step up the input voltage from a 9V power supply to ~16.1V. The MCU is powered by a separate 9V battery through an LM323 voltage regulator. For more details, see the circuit diagram and the related data sheets.

On the MCU side, I/O ports A and C were dedicated to solenoid control. Functions playnote(), playchord(), and playmusic() were defined. The playnote() and playchord() functions take a note encoded in the format described in the Serial Interface section and convert it to an I/O port output using a simple lookup table. The playmusic() function parses one of the EEPROM or RAM arrays containing a stored piece or generated random melody, and plays one note at a time. The timer0 ISR is used to control note timing. For more details, consult the commented program listing.
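The lookup-table idea can be sketched as below. The wiring (solenoids 0 to 7 on port A, solenoids 8 to 12 on port C, active-high outputs) and the name solenoid_mask are assumptions for illustration only; the real pin assignment is given in the circuit diagram and program listing.

```c
#include <stdint.h>

#define NUM_SOLENOIDS 13

/* Hypothetical wiring: low byte maps to port A bits 0-7,
 * high byte maps to port C bits 0-4. One bit per solenoid. */
static const uint16_t SOLENOID_MASK[NUM_SOLENOIDS] = {
    0x0001, 0x0002, 0x0004, 0x0008, 0x0010, 0x0020, 0x0040, 0x0080,
    0x0100, 0x0200, 0x0400, 0x0800, 0x1000
};

/* Return the combined port pattern for one note index (0-12).
 * Any other index yields 0, i.e. all solenoids released (a pause). */
uint16_t solenoid_mask(uint8_t note)
{
    return (note < NUM_SOLENOIDS) ? SOLENOID_MASK[note] : 0;
}

/* On the MCU the result would be split across the two ports, e.g.
 *   PORTA = (uint8_t)(mask & 0xFF);
 *   PORTC = (uint8_t)(mask >> 8);
 * and a chord would be the bitwise OR of several note masks. */
```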



Pitch Detection Mechanism

The hardware aspect of the pitch detection mechanism consists of a simple electret microphone and its powering circuit. The microphone output is fed into an OPA227P precision low-noise op-amp from Texas Instruments, set up to produce a gain of ~500. The amplified microphone signal is routed to a MAX166 8-bit parallel analog-to-digital converter, controlled by the Mega32 microcontroller. See the complete circuit diagram and the MAX166 datasheet for more information.

The main part of the software algorithm consists of the listen() function. When invoked, the system stalls until the volume from the microphone exceeds a threshold value (this would occur if a xylophone key were struck). The algorithm waits a certain amount of time using the delay_us() function until the waveform stabilizes, and then acquires 1500 8-bit samples at a frequency of roughly 125kHz. A zero crossing algorithm is performed, consisting of:

  • Finding the minimum and maximum values of the waveform
  • Expanding the waveform to fill the 0-255 range
  • Converting it to a square wave by setting all entries above a threshold to 255 and the rest to 0
  • Averaging the distances between falling edges
  • Converting the distance into an encoded note output
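The steps above can be sketched in C roughly as follows. This is not the listing from the project: the function name estimate_frequency, the mid-scale threshold of 127, and the use of a frequency in Hz as the result are assumptions. The final step, mapping the measured frequency to an encoded note, is left out because the frequency table is not given here.

```c
#include <stdint.h>
#include <stddef.h>

#define SAMPLE_RATE_HZ 125000UL   /* approximate rate quoted in the text */

/* Estimate the frequency of buf[0..n-1] by zero-crossing analysis.
 * The buffer is modified in place (it ends up as a square wave).
 * Returns 0 if fewer than two falling edges are found. */
uint32_t estimate_frequency(uint8_t *buf, size_t n)
{
    uint8_t lo = 255, hi = 0;
    size_t i, first_edge = 0, last_edge = 0, edges = 0;

    /* Step 1: minimum and maximum of the waveform. */
    for (i = 0; i < n; i++) {
        if (buf[i] < lo) lo = buf[i];
        if (buf[i] > hi) hi = buf[i];
    }
    if (hi <= lo) return 0;       /* flat or empty input */

    /* Steps 2 and 3: stretch to 0-255, then threshold at mid-scale. */
    for (i = 0; i < n; i++) {
        uint32_t scaled = ((uint32_t)(buf[i] - lo) * 255u) / (uint32_t)(hi - lo);
        buf[i] = (scaled > 127u) ? 255 : 0;
    }

    /* Step 4: locate falling edges (a 255 followed by a 0). */
    for (i = 1; i < n; i++) {
        if (buf[i - 1] == 255 && buf[i] == 0) {
            if (edges == 0) first_edge = i;
            last_edge = i;
            edges++;
        }
    }
    if (edges < 2) return 0;

    /* The mean spacing is (last_edge - first_edge) / (edges - 1) samples,
     * so the frequency is the sample rate divided by that spacing. */
    return (uint32_t)((SAMPLE_RATE_HZ * (edges - 1)) / (last_edge - first_edge));
}
```

At 125kHz, the 1500-sample buffer covers 12ms of sound, so a waveform with a period of 100 samples would be reported as 1250Hz.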

If one of the defined frequencies is observed, the listen() function returns the encoded note; otherwise it returns zero. Five notes are collected by the higher-level dplay() function, and if a sufficient number of them match the start of one of the stored melodies, that melody is played back (note that if no new input is observed for two seconds after the last input, dplay() times out and returns to the reset state). Otherwise, the five collected notes are used as a basis to generate and play back a random melody. For more details, consult the commented program listing.
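The matching step could look something like the sketch below. The threshold of four matches out of five, the position-by-position comparison, and the names count_matches and find_melody are all hypothetical; the actual criterion is in the program listing.

```c
#include <stdint.h>
#include <stddef.h>

#define HEARD_NOTES 5   /* notes collected, per the text */
#define MIN_MATCHES 4   /* hypothetical threshold, not from the report */

/* Count how many heard notes equal the note at the same position at the
 * start of a zero-terminated melody. */
static uint8_t count_matches(const uint8_t *heard, const uint8_t *melody)
{
    uint8_t i, matches = 0;
    for (i = 0; i < HEARD_NOTES && melody[i] != 0; i++) {
        if (heard[i] == melody[i]) matches++;
    }
    return matches;
}

/* Return the index of the first stored melody whose opening matches well
 * enough, or -1 if none does (the caller would then fall back to
 * generating a random melody). */
int find_melody(const uint8_t *heard,
                const uint8_t *const *melodies, size_t count)
{
    size_t m;
    for (m = 0; m < count; m++) {
        if (count_matches(heard, melodies[m]) >= MIN_MATCHES) return (int)m;
    }
    return -1;
}
```

Tolerating one wrong note out of five is one way to make the lookup robust against an occasional misdetected pitch.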



Random Melody Algorithm

The algorithm relies on Markov chains to define the rules by which random music is generated. As discussed previously, the first few notes recorded by the pitch detection mechanism serve as the starting point for the algorithm.

Since two variables can be controlled (the tone and the duration of a tone), at least two probability matrices must be defined. As discussed previously, the tone must be one of the 13 that the device can play, and the available durations are a full, a half, a fourth, and an eighth note. The probability matrices should therefore be 13x13 and 4x4.

For example, consider the matrix below:

 

Current / Next    Full    Half    Fourth    Eighth
Full              0.8     0.2     0         0
Half              0.1     0.8     0.1       0
Fourth            0       0.1     0.8       0.1
Eighth            0       0       0.2       0.8

The numbers represent transition probabilities. That is, if the current note is a full, the next one has an 80% chance of being a full and a 20% chance of being a half. These probabilities are used at the generation of every new note to decide on the properties of the next. A 13x13 matrix containing transition probabilities between notes helps decide which note is being played.
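One step of such a chain can be sketched in C as follows, using the example duration matrix above. Scaling the probabilities to integer percentages, the function names, and the use of rand() are assumptions made for this illustration; the same approach would apply to the 13x13 tone matrix.

```c
#include <stdint.h>
#include <stdlib.h>

enum { FULL, HALF, FOURTH, EIGHTH, NUM_DURATIONS };

/* Transition probabilities from the example matrix, scaled to percent so
 * that only integer arithmetic is needed. Each row sums to 100. */
static const uint8_t DURATION_PCT[NUM_DURATIONS][NUM_DURATIONS] = {
    { 80, 20,  0,  0 },   /* from Full   */
    { 10, 80, 10,  0 },   /* from Half   */
    {  0, 10, 80, 10 },   /* from Fourth */
    {  0,  0, 20, 80 },   /* from Eighth */
};

/* Pick the next duration given the current one and a roll in 0..99.
 * Walks the row, accumulating probability until the roll is covered. */
uint8_t next_duration(uint8_t current, uint8_t roll)
{
    uint8_t next, cumulative = 0;
    for (next = 0; next < NUM_DURATIONS; next++) {
        cumulative += DURATION_PCT[current][next];
        if (roll < cumulative) return next;
    }
    return current;   /* only reached if roll >= 100 */
}

/* Convenience wrapper that draws the roll from the C library generator. */
uint8_t random_next_duration(uint8_t current)
{
    return next_duration(current, (uint8_t)(rand() % 100));
}
```

Keeping the roll as a separate argument makes the transition rule easy to check by hand: from a full note, rolls 0 to 79 stay on a full note and rolls 80 to 99 move to a half note.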

By manipulating the transition probabilities, it is possible to influence the music to gravitate towards certain tones or a certain rhythm. In other words, it is possible to make the music sound better. Our more musically inclined friends inform us tone-deaf slobs that notes forming certain musical scales sound good together. The algorithm thus uses transition probability matrices that trend towards the scales:

  • A, B, C, D, E, F, G#, A (minor)
  • B, D, E, G, A, B (pentatonic)

Using the initial tones as a starting point, the algorithm uses a set of probability matrices (one for tone, one for duration) to generate a random melody 200 notes in length, which is played back immediately.



Hardware Interface

A mechanism was needed to adjust the tempo, and a pushbutton was recycled from an old PC and connected to I/O port D to that end. At the present time, tempo can only be adjusted when a melody is not playing.

Rather than running the pitch detection mechanism continuously and risking unintended activation resulting from the microphone picking up unrelated noise (an unfortunate consequence of the high gain required to accurately sample waveforms), another pushbutton was added to trigger the pitch detection mechanism. The same pushbutton is used to stop the playback of a melody in progress.