ECE 4760 Final Project

Tian Gao (tg293@cornell.edu)



In this project, I built a video game controlled by the players' voices. The game is about jet fighters, and it can be played alone or with a friend. The system recognizes commands by distinguishing “ahh” from “biu”. A fighter shoots a bullet when its player says biu and rises when the player says ahh. If the player stays silent, the fighter falls. An ATmega1284 serves as the processor for both voice recognition and video output. Two 3.5mm phone jacks connect the microphones, and a third 3.5mm jack carries the NTSC output.

High Level Design


The whole project can be divided into three parts: voice receiving and recognition, game execution, and video output. There are two microphones for voice input, and an external circuit filters and amplifies each signal so that the MCU can recognize it. Each microphone has its own ADC channel, which is how the voice signal gets into the MCU. After that, a method is required to distinguish between ahh, biu and silence. Once the input commands are recognized, the MCU needs to compute the game state: for example, where the fighters are, whether they are hit by bullets or obstacles, and what new objects should be generated. Finally, the MCU draws into a screen buffer and outputs it to an LCD screen as an NTSC signal.

Voice Recognition

In the voice recognition part, I'm interested in the fundamental frequency of the voice. The fundamental frequency of human speech is usually about 50-350Hz, so I can concentrate on this range. I pass the signal through a band-pass filter to remove the DC component and the high-frequency harmonics that I don't need. Also, since the signal from the microphone is too small for the MCU to read accurately, I used an opamp to amplify it. With a large amplitude and low noise, a simple zero-crossing method is enough to calculate the fundamental frequency of the voice.

Game Playing

In the game, the fighters sit near the edge of the screen and can only move in one dimension, rising or falling under control of the voice command ahh. When a player says biu, the fighter shoots a bullet toward the other side of the screen.

There are two game modes: single mode and VS mode. In single mode, the player controls the fighter to avoid or shoot obstacles that arrive randomly from the right edge of the screen. Once an obstacle hits the fighter, the game is over, and the score is the number of obstacles the player has shot. This mode is simple but classic, and it is easy to start with.

In VS mode, the game is more complicated. There are no obstacles, and the two players are placed on opposite sides of the screen. The players shoot at each other and try to avoid the opponent's bullets. To make the game more fun, there are several buffs that can improve or reduce a fighter's abilities. A buff can increase or decrease the size of the bullet, speed up or slow down the bullet, give extra HP, give multiple bullets, etc. The buffs appear in the middle of the screen and move randomly to the left or right. In this mode, every player starts with 16 HP, and each hit reduces the HP by the size of the bullet. When either player's HP reaches 0, the game is over.

Video Output

For the video output part, I used part of the code from Lab 3 of ECE 4760, where the resolution is 160*200. After the game computation, the MCU draws everything into the screen buffer and sends it out to the LCD. The project uses the NTSC system.

Low Level Design

Hardware Circuit Design

To connect the microphones, I designed the circuit in figure 1. C1, R2 and R3 work as a high-pass filter whose cutoff frequency is about 32 Hz. R2 and R3 also provide a 2.5V DC offset at the output to allow a larger waveform amplitude. However, the output amplitude is so small that I need to amplify it for accuracy. Also, the human voice contains many harmonics, so I need a filter. I implemented the band-pass filter in figure 2. Its gain is about 50, and its pass band is between 16Hz and 312Hz. After filtering, the waveform is much cleaner for the MCU to recognize. I used an LM358 as the opamp; it has two amplifiers on one chip, which is perfect because I have two microphones. Figure 3 shows the audio input part of the board. The final outputs are connected to ports A0 and A1 of the MCU for the ADC.

Figure 1

Figure 2

Figure 3

For video output, I used ports D0 and D1 to generate the NTSC signal. In the NTSC system, there are three voltage levels: 0V for sync, 0.3V for black and about 1.0V for white. However, the MCU pins can only output 0V or 5V, so the circuit in figure 4 was designed to generate the voltages required by the LCD.

MCU Settings

I used timer1 to drive the LCD. Timer1 is set in CTC mode and its ISR executes every 64us. For synchronization, I had to introduce a second ISR that puts the CPU to sleep right before the video output ISR, so that the video ISR always starts with the same latency. Ports D0 and D1 are set as outputs for video. To send out the values in the screen buffer, I used the USART in MSPIM mode, which has a double-buffered register and can shift out values quickly and with accurate timing. Additionally, I need the ADC to convert the input values. I set the ADC prescaler to 16, so the ADC clock frequency is 1MHz, which gives relatively good accuracy at an acceptable speed. The ADC normally provides a 10-bit value, but I only keep the most significant 8 bits. The reason is twofold: an 8-bit value fits in a char, which saves time and space, and at this ADC clock rate the conversion is not accurate beyond about 8 bits anyway.
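As a rough check of the timing above, here is the arithmetic under two assumptions that are not stated in the text: a 16 MHz system clock (which is consistent with a 1 MHz ADC clock at a prescaler of 16) and the 13 ADC clock cycles that the ATmega1284 datasheet gives for a normal conversion.

```latex
\begin{aligned}
f_{\text{ADC}} &= \frac{16\,\text{MHz}}{16} = 1\,\text{MHz} \\
t_{\text{conv}} &= \frac{13}{f_{\text{ADC}}} = 13\,\mu\text{s} \\
2\,t_{\text{conv}} &= 26\,\mu\text{s} < 64\,\mu\text{s} \\
f_{\text{sample}} &= \frac{1}{64\,\mu\text{s}} = 15625\,\text{Hz}
\end{aligned}
```

Under these assumptions, two conversions (one per microphone) fit within a single 64us video line period.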

Voice Receiving and Recognition

After the filter and amplifier, the voice signals go into ports A0 and A1, which are two channels of the ADC in the MCU. First I need to sample the signal. I sample once in every ISR, which gives a 15625Hz sample rate. That is good enough for basic processing. However, there is only one ADC in the MCU, so I can't sample two signals in parallel. I sample one signal first, then switch ADMUX to the other channel and sample again. In this way, I get two samples from different channels in one ISR, which makes two microphones possible. Since the signal has been through a band-pass filter, the high-frequency components have been removed. The signal is therefore clean enough that zero-crossing can be used to calculate the frequency. I implemented the zero-crossing test in the ISR. A counter increases by 1 in every ISR, and when a zero-crossing is detected, I record the counter value and clear the counter. The recorded value is the period of the waveform in samples, which gives a reasonably good frequency estimate. I also average the latest four measurements, which makes the result more accurate. In addition, I get the amplitude of the waveform from the difference between the maximum and minimum values within one period.
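The measurement described above could be sketched roughly as follows. This is a host-side illustration, not the project's ISR code: the struct, the function names and the midpoint value of 128 (the 8-bit ADC code assumed for the 2.5V bias) are all hypothetical, and only rising crossings are counted so that each recorded value is one full period.

```c
#include <stdint.h>

#define MIDPOINT 128   /* assumed 8-bit ADC code of the 2.5 V bias */
#define HISTORY  4     /* the text averages the latest four measurements */

typedef struct {
    uint8_t  prev;              /* previous sample */
    uint16_t ticks;             /* samples since the last rising crossing */
    uint16_t period[HISTORY];   /* latest measured periods, in samples */
    uint8_t  head;              /* next slot to overwrite */
    uint8_t  min, max;          /* extremes seen in the current period */
    uint8_t  amplitude;         /* max - min of the last complete period */
} ZeroCross;

void zc_init(ZeroCross *z) {
    z->prev = MIDPOINT;
    z->ticks = 0;
    z->head = 0;
    z->min = 255;
    z->max = 0;
    z->amplitude = 0;
    for (uint8_t i = 0; i < HISTORY; i++) z->period[i] = 0;
}

/* Call once per sample. Returns 1 when a period has just completed. */
uint8_t zc_update(ZeroCross *z, uint8_t sample) {
    uint8_t crossed = 0;
    z->ticks++;
    if (sample < z->min) z->min = sample;
    if (sample > z->max) z->max = sample;
    if (z->prev < MIDPOINT && sample >= MIDPOINT) {   /* rising crossing */
        z->period[z->head] = z->ticks;                /* record the counter */
        z->head = (uint8_t)((z->head + 1) % HISTORY);
        z->amplitude = (uint8_t)(z->max - z->min);
        z->ticks = 0;                                 /* clear the counter */
        z->min = 255;
        z->max = 0;
        crossed = 1;
    }
    z->prev = sample;
    return crossed;
}

/* Mean of the latest four periods, in samples. */
uint16_t zc_mean_period(const ZeroCross *z) {
    uint32_t sum = 0;
    for (uint8_t i = 0; i < HISTORY; i++) sum += z->period[i];
    return (uint16_t)(sum / HISTORY);
}
```

With a 15625Hz sample rate, the frequency in Hz is 15625 divided by the mean period, so a mean period of 100 samples corresponds to about 156Hz.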

Once I have the amplitude and frequency of the voice, I can recognize it. First of all, the amplitude must be larger than a certain value to ensure that the input is not just noise. I set this threshold to 40. Then, when people say biu, the waveform is usually a clean sine wave with a stable frequency. Hence, I recognize the input as biu when four consecutive frequency measurements are similar. Also, when people say biu, the frequency is usually higher, so I set a frequency boundary for the decision.
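A minimal sketch of such a decision rule is shown below. The amplitude threshold of 40 and the run of four measurements come from the text; the tolerance PERIOD_TOL, the boundary BIU_MAX_PERIOD and the rule that any other loud input counts as ahh are assumptions made for illustration only.

```c
#include <stdint.h>

typedef enum { CMD_NONE, CMD_AHH, CMD_BIU } Command;

#define AMP_THRESHOLD   40  /* from the text */
#define STABLE_COUNT    4   /* from the text: four consecutive measurements */
#define PERIOD_TOL      2   /* hypothetical: allowed spread, in samples */
#define BIU_MAX_PERIOD  60  /* hypothetical: about 260 Hz at 15625 samples/s */

/* period[] holds the latest four periods in samples (shorter = higher pitch). */
Command classify(uint8_t amplitude, const uint16_t period[STABLE_COUNT]) {
    if (amplitude <= AMP_THRESHOLD) return CMD_NONE;   /* treat as noise */

    uint16_t lo = period[0], hi = period[0];
    for (uint8_t i = 1; i < STABLE_COUNT; i++) {
        if (period[i] < lo) lo = period[i];
        if (period[i] > hi) hi = period[i];
    }

    /* biu: stable pitch and above the frequency boundary */
    if (hi - lo <= PERIOD_TOL && hi <= BIU_MAX_PERIOD) return CMD_BIU;

    /* assumption: every other sufficiently loud input is treated as ahh */
    return CMD_AHH;
}
```

Working in periods rather than in Hz avoids a division in the decision path, which matters on an 8-bit MCU.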

Game Playing

In the game playing part, the game needs random numbers, which the MCU cannot generate directly. The generator should be efficient, while the statistical quality is not that important. Thus, I used pseudo-random sequences generated in MATLAB and stored in flash. The index into the sequence is incremented in the ISR and again after each random number is read. The function is below:

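#include <avr/pgmspace.h>
// Return a pseudo-random integer in [MIN, MAX] from the table RAND stored in flash.
int GetRand(int MIN, int MAX)
{
    int dis = MAX - MIN + 1;
    int incr = pgm_read_word(RAND + RandomNumIndex);
    return MIN + incr % dis;
}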
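The same idea can be tried on a PC with a plain array in place of the flash table. In the sketch below the table contents, its length and the names RAND_TABLE, rand_tick and get_rand are made up for illustration; the real table came from MATLAB. The entries are kept unsigned so that the remainder can never be negative.

```c
#include <stdint.h>

#define RAND_LEN 8  /* hypothetical table length */

/* Stand-in for the MATLAB-generated sequence; these values are invented. */
static const uint16_t RAND_TABLE[RAND_LEN] = {
    40213, 917, 30561, 12, 22222, 6001, 48999, 3570
};

static volatile uint8_t rand_index = 0;

/* On the MCU this would be called from the timer ISR. */
void rand_tick(void) {
    rand_index = (uint8_t)((rand_index + 1) % RAND_LEN);
}

/* Integer in [min, max]; assumes max >= min. */
int get_rand(int min, int max) {
    unsigned range = (unsigned)(max - min) + 1u;
    return min + (int)(RAND_TABLE[rand_index] % range);
}
```

Because the index advances with the ISR rather than with the game logic, the value that a call returns depends on when the player triggers it, which is what makes a fixed table look random during play.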

The project defines four structs, each representing one kind of object in the game.

struct S_Player {
    unsigned char Num;
    unsigned char Plane_X;
    unsigned char Plane_Y;
    signed char HP;
    signed char b_speed;   // bullet speed
    signed char b_size;    // bullet size
    unsigned char b_num;   // bullet number
};

struct S_Bullet {
    unsigned char x;
    unsigned char y;
    unsigned char size;
    signed char x_speed;
    signed char y_speed;
    char en;               // 1 if the bullet is enabled
};

struct S_Obstacle {
    unsigned char x_left;
    unsigned char x_right;
    unsigned char y_up;
    unsigned char y_down;
    signed char x_speed;
    signed char y_speed;
    unsigned char en;      // 1 if the obstacle is enabled
};

struct S_Buff {
    unsigned char x;
    unsigned char y;
    signed char x_speed;
    signed char y_speed;
    unsigned char Type;
    signed char Value;
    unsigned char en;      // 1 if the buff is enabled
};

The objects' attributes are defined in these structs. For a bullet, obstacle or buff, the variable en indicates whether it is enabled: if en=1 the object is active, otherwise it is invisible. There are six types of buff, defined as

#define BUFF_DOUBLE 5
#define BUFF_LIFE 6

Also, minimum intervals are defined between consecutive bullets, obstacles and buffs.

In the game, on every screen refresh, the program recalculates the locations of the objects and checks whether a bullet, obstacle or buff has hit a fighter. If so, it updates the objects involved.
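One refresh step for a single bullet might look like the sketch below. It uses trimmed-down stand-ins for the structs, handles horizontal motion only, and rests on several assumptions that the text does not confirm: the fighter position is its top-left corner, a bullet of size s is 8 pixels wide and 2s+1 pixels tall (which matches the stated range of 3*8 to 15*8 for sizes 1 to 7), and a bullet is disabled when it leaves the 160-pixel-wide screen.

```c
#include <stdint.h>

#define SCREEN_W 160
#define PLANE_W  16
#define PLANE_H  16
#define BULLET_W 8

/* Simplified stand-ins for S_Bullet and S_Player (hypothetical). */
typedef struct { uint8_t x, y, size; int8_t x_speed; uint8_t en; } Bullet;
typedef struct { uint8_t x, y; int8_t hp; } Fighter;

static uint8_t bullet_height(const Bullet *b) {
    return (uint8_t)(2 * b->size + 1);   /* assumed size-to-height mapping */
}

/* Axis-aligned rectangle overlap between a bullet and a fighter. */
uint8_t bullet_hits(const Bullet *b, const Fighter *f) {
    if (!b->en) return 0;
    return b->x < f->x + PLANE_W && f->x < b->x + BULLET_W &&
           b->y < f->y + PLANE_H && f->y < b->y + bullet_height(b);
}

/* Move the bullet one step, then resolve a hit on the target. */
void bullet_step(Bullet *b, Fighter *target) {
    if (!b->en) return;
    int nx = b->x + b->x_speed;
    if (nx < 0 || nx + BULLET_W > SCREEN_W) {   /* left the screen */
        b->en = 0;
        return;
    }
    b->x = (uint8_t)nx;
    if (bullet_hits(b, target)) {
        target->hp = (int8_t)(target->hp - b->size);  /* damage = bullet size */
        if (target->hp < 0) target->hp = 0;
        b->en = 0;                                    /* bullet is used up */
    }
}
```

The same overlap test would apply to obstacles and buffs, with their own rectangle sizes.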

The fighter rises 2 pixels every step if the player says ahh, and falls 3 pixels if the player says nothing. Each player's attributes are initialized to 16 HP, bullet speed 2, bullet size 1 and bullet number 1. If the fighter is hit by a bullet, its HP decreases by the size of the bullet. The maximum attributes for a fighter are 16 HP, bullet speed 6, bullet size 7 and bullet number 3.
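These rules reduce to a few clamped additions. The sketch below assumes that y grows downward (so rising means a smaller y), that the fighter is 16 pixels tall on a 200-line screen, and that attributes have a lower bound supplied by the caller. The function names and the lower bounds are hypothetical; the step sizes and maximum values come from the text.

```c
#include <stdint.h>

#define SCREEN_H    200
#define PLANE_H     16
#define RISE_STEP   2   /* pixels per step while the player says ahh */
#define FALL_STEP   3   /* pixels per step while the player is silent */

#define MAX_HP      16
#define MAX_B_SPEED 6
#define MAX_B_SIZE  7
#define MAX_B_NUM   3

/* Next vertical position of the fighter, kept on screen. */
uint8_t fighter_next_y(uint8_t y, uint8_t saying_ahh) {
    int ny = saying_ahh ? y - RISE_STEP : y + FALL_STEP;
    if (ny < 0) ny = 0;
    if (ny > SCREEN_H - PLANE_H) ny = SCREEN_H - PLANE_H;
    return (uint8_t)ny;
}

/* Add delta to an attribute and keep the result within [lo, hi]. */
int8_t add_capped(int8_t value, int8_t delta, int8_t lo, int8_t hi) {
    int v = value + delta;
    if (v < lo) v = lo;
    if (v > hi) v = hi;
    return (int8_t)v;
}
```

For example, add_capped(14, 5, 0, MAX_HP) leaves the HP at the 16 HP maximum instead of 19.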

There are three screens in the game: the main menu, single mode and VS mode. In the main menu, players move the fighter up or down by saying ahh at a high or low frequency to choose a mode, and say biu to enter it.

In single mode, the fighter is fixed on the left edge of the screen. The obstacles are generated at the right side of the screen at random heights. The player controls the fighter to avoid or shoot the obstacles. If any obstacle hits the fighter, the game is over, and the score is the number of obstacles the player has shot.

In VS mode, the two fighters are fixed on opposite sides of the screen. The players try to shoot each other and to collect good buffs that improve their fighters. The buffs are generated in the middle of the screen with random height, speed, type, value and direction. If either player runs out of HP, the game is over.

Video Output

Graphics is a major factor in a game. The original code provides a function video_pt() that writes a single point into the screen buffer, but it is too slow for a game: if I draw the screen point by point, the frame rate is very low. Hence, I used bitmaps for the graphics. Instead of drawing points, I write whole 8-bit chars directly into the screen buffer. For example, half of the fighter is painted as

00000001 11100000
00000011 11110000
00000011 00110000
01100111 00110000

11110110 00110011
11011110 00111011
11011111 11111111
11111100 00000111

Writing the buffer directly takes less than one fourth of the time of drawing point by point.
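As an illustration of the byte-wise approach, the sketch below copies the eight rows shown above into a screen buffer. It assumes a buffer layout of 20 bytes per line (160 pixels / 8) with the most significant bit as the leftmost pixel, and a fighter whose x position is a multiple of 8. The names screen, FIGHTER_TOP and draw_fighter_top are hypothetical.

```c
#include <stdint.h>

#define BYTES_PER_LINE 20    /* 160 pixels / 8 pixels per byte */
#define SCREEN_LINES   200

static uint8_t screen[BYTES_PER_LINE * SCREEN_LINES];

/* The eight bitmap rows from the text, two bytes per row. */
static const uint8_t FIGHTER_TOP[8][2] = {
    {0x01, 0xE0}, {0x03, 0xF0}, {0x03, 0x30}, {0x67, 0x30},
    {0xF6, 0x33}, {0xDE, 0x3B}, {0xDF, 0xFF}, {0xFC, 0x07},
};

/* byte_col is x / 8; each row costs two byte writes instead of 16 point writes. */
void draw_fighter_top(uint8_t byte_col, uint8_t y) {
    for (uint8_t row = 0; row < 8; row++) {
        uint16_t i = (uint16_t)((y + row) * BYTES_PER_LINE + byte_col);
        screen[i]     = FIGHTER_TOP[row][0];
        screen[i + 1] = FIGHTER_TOP[row][1];
    }
}
```

Half of the fighter then takes 16 byte writes, compared with 128 calls if every pixel were set individually.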

The fighter icon is 16*16, and the bullet size varies from 3*8 to 15*8. Drawing a bullet is almost the same as drawing a fighter. The only difference is that the bullet moves horizontally, so its 8 bits are usually not aligned with a byte in the buffer. Thus, instead of writing one char for 8 bits, I need to write two. As a simple example, take a 1*8 bullet row like 11110000: after it moves, it may have to be written as the two chars 00000011 11000000. This method is also used for buffs.
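The split into two bytes is a shift of a 16-bit value. A minimal sketch, assuming the most significant bit is the leftmost pixel and using hypothetical function names:

```c
#include <stdint.h>

/* Shift an 8-pixel pattern right by offset (0..7) and split it into two bytes. */
void split_row(uint8_t pattern, uint8_t offset, uint8_t out[2]) {
    uint16_t wide = (uint16_t)((uint16_t)pattern << 8);  /* pattern in the high byte */
    wide = (uint16_t)(wide >> offset);
    out[0] = (uint8_t)(wide >> 8);      /* part that stays in the first byte */
    out[1] = (uint8_t)(wide & 0xFF);    /* part that spills into the next byte */
}

/* OR one pattern row into a line of the screen buffer at pixel column x. */
void draw_row(uint8_t *line, uint8_t x, uint8_t pattern) {
    uint8_t halves[2];
    split_row(pattern, (uint8_t)(x & 7), halves);
    line[x >> 3] |= halves[0];
    if (halves[1]) line[(x >> 3) + 1] |= halves[1];  /* skip when nothing spills */
}
```

With pattern 11110000 and an offset of 6 pixels, split_row produces 00000011 and 11000000, the same pair as in the example above.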

In any case, writing 8 bits at a time directly into the screen buffer is much faster than writing point by point. The final result is shown in figure 4.

Figure 4


In this project, I built a video game controlled by voice. It is a complete, self-contained game with two 3.5mm microphone inputs and a standard NTSC video output, and it needs a 9-12V power supply. Players say ahh or biu to control the fighters on the screen and to hit obstacles or the other player.

For the hardware part, I designed the circuit for audio input, including the band-pass filters, the amplifiers and the microphone drivers.

For the software part, I made a playable game with several interesting features, based on the video code by Bruce Land, Shane Pryor and Morgan D. Jones. I also designed a faster way to draw the video output.

Finally, this is a healthy, safe video game that can relax people of all ages. The original idea came from an Apple app named "Pah". I turned it into a two-player game and added several details. The NTSC video output code is from Lab 3; I mainly used the part that sends out the screen buffer and a small part of the character-drawing function. I hope I can improve the code after the course is over.


Parts List

Item Name          Unit Cost   Quantity   Total Cost
Solder Board       2.5         1          2.5
Power Supply       5           1          5
Mega1284           5           1          5
LCD TV             5           1          5
Header Socket      0.05        30         1.5
DIP socket         0.5         2          1
Microphone         0           2          0
3.5mm Phone Plug   0           3          0
R & C & Wire       0           Several    0
Total                                     20




All kinds of things in ECE 4760 webpage

ATMega 1284 datasheet

ECE 4760 Lab3 code by Bruce Land, Shane Pryor and Morgan D. Jones

The original idea: "Pah", an app in the Apple App Store