We implemented an optical microphone which converts distant vibrations, including sound, to an audio signal. By measuring the reflection of a laser beam from windows or glass, it is possible to hear sounds near the target. The system also includes frequency spectrum analysis with a real-time display and user-configurable ranges, and it has a headphone output for monitoring the signal. This may be useful, for example, to the hearing-impaired for listening to a lecturer from a distance in an auditorium. Apart from listening to speech, this can be used to get an audio signal out of anything that can vibrate and reflect a laser beam. This includes measuring vibrations in buildings for resonance feedback, which would show a peak in the spectrum.
High Level Design
Optical Microphone and Spectrum Analysis
The transmission of the exploratory beam and the reception of the informational beam reflection is handled in analog hardware, which helps to filter the signal, reduce noise, and amplify it. The microcontroller is used in conjunction with the hardware to display a spectral analysis of the acquired signal on an LCD. The acquired signal, if used for an audio application (which is the only case handled in this report), can be connected to speakers or headphones to listen, as well as to the microcontroller to display the spectrum.
Although we considered a diverse array of ideas for final projects, we chose this design because it has several practical uses, includes a large analog component, and is amenable to modular development. Additionally, the project posed a large challenge due to the unknown characteristics of the laser reflection signal and the unexpected limitations which could appear. The concept of a laser-based audio system was known to us previously, although we cannot cite a specific source of inspiration.
Of course, the user must point the beam at the target and align the reflection with the receiver. The transmitter is simply a laser mounted on a steady post or tripod, with an on/off switch. The receiver must also be steady and adjustable, but includes the majority of the components. The phototransistor is the key to the receiver, and must be aligned well to ensure accurate operation. Laser light is used for its low divergence. Whereas normal light spreads out in three dimensions, laser light travels in a tight cone, so the reflected beam is still strong. Also, laser light is coherent, so the energy is concentrated at a single frequency, making it easier to detect.
The main hardware/software trade-off was deciding which functionality would be implemented in hardware and which in software. Although digital filtering was a possibility, we decided that analog filtering should be sufficient. Also, the signal could be recorded to a computer, where post-processing would be more practical than on a microcontroller.
To understand more about FFTs, refer to the section below.
ANSI Z136.1 provides laser safety standards. Relevant lasers are in Class II for less than 1 mW and Class IIIa for less than 5 mW. Based on a blink reflex of 0.25 seconds, Class II lasers are considered safe. Class IIIa lasers are considered safe with restricted beam viewing and careful handling, but if altered by optical instruments they may seriously damage the retina if stared at for two minutes. These are typical for laser sights in firearms or laser pointers. We use a laser rated at 4 mW, making it Class IIIa. The safety implications are discussed later. We do not believe that any other standards are relevant to our project.
We did not discover any relevant patents, trademarks, or copyrights. Similar systems have been implemented before, so the base technology is not new, although we have not found any optical microphones involving a microcontroller.
In order to output the spectral analysis, a Fast Fourier Transform is performed on the incoming audio signal. For peak microcontroller performance, we only perform the forward FFT using real-valued input. Before we get into more detail about this, we will first discuss the Fourier Transform and its implications.
The Fourier Transform is a transition from the time domain to the frequency domain that can be computed in both continuous and discrete time. As we are using a microcontroller to perform a series of analog to digital conversions, we will talk about the Discrete Fourier Transform, or DFT.
A DFT performs computation on, and outputs, a series of discrete complex numbers. In our case, the input is the array of regularly-spaced samples from the ADC. The output has the same number of values as the input, each representing a band of width 2*F/N, where F is the highest frequency and N is the number of samples. The width is 2*F/N rather than F/N because the final output is symmetric, so only half of the values cover the range from 0 to F. Several different FFTs exist; the one we use is decimation-in-time. This algorithm efficiently performs the DFT by taking advantage of several of its properties. The main one is that a signal can be split into two half-length signals, one holding the even-indexed terms and one holding the odd-indexed terms, and the DFT can be performed on each of these shorter signals. Each half can then be split again in the same way, as in the equation below (from Professor Delchamps).
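For reference, the standard radix-2 decimation-in-time split of an N-point DFT can be written as follows. The symbols x[n], X[k], and m are assumptions chosen for this sketch, and the notation may differ from the form used in the course notes.

```latex
X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}
     = \underbrace{\sum_{m=0}^{N/2-1} x[2m]\, e^{-j 2\pi k m / (N/2)}}_{\text{DFT of even-indexed samples}}
     + e^{-j 2\pi k / N} \underbrace{\sum_{m=0}^{N/2-1} x[2m+1]\, e^{-j 2\pi k m / (N/2)}}_{\text{DFT of odd-indexed samples}}
```

Each of the two sums is itself an N/2-point DFT, so the same split can be applied again until the pieces are trivially small.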
If N is a power of 2, this computation is very efficient due to its recursive nature.
Each output value is referred to as a bin, and under the results section we show the peak frequency represented by each bin in our final representation of the code. In order to calculate the magnitude, or power, of this result we take the sum of squares of the real and imaginary parts.
In order to calculate F, we refer to the Nyquist sampling theorem, which states that a signal must be sampled at a rate of at least twice its maximum frequency to be captured without aliasing. Under the Program Design section, we discuss what sampling frequencies we chose for each mode of operation.
The algorithm for a “classic” DFT has rather high time complexity, on the order of N^2 operations for N samples, and so we use a Fast Fourier Transform adapted from Tom Roberts and Malcolm Slaney by Professor Bruce Land of Cornell University. The time complexity of this algorithm is much lower, on the order of N log N, allowing us to take a higher number of samples to improve our accuracy.
This algorithm performs an FFT using 8:8 fixed point arithmetic, again for speed. We expect the values from our ADC to be between -0.5 and 0.5 before performing the FFT, and thus need a certain level of precision after the decimal point. Floating point arithmetic is too time-intensive on the microcontroller, so the fixed point format lets us keep fractional values while avoiding it.
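As an illustration of what 8:8 fixed point arithmetic involves, here is a minimal sketch of a multiply in that format. The names `fix8_8`, `fix_from_int`, and `fix_mul` are hypothetical and are not taken from the FFT library used in the project, which may implement the multiply differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8:8 type: signed 16 bits, upper 8 bits are the integer
 * part and lower 8 bits are the fraction, so 256 represents 1.0. */
typedef int16_t fix8_8;

#define FIX_ONE 256

/* Convert a small integer to 8:8 (range limited by the 8 integer bits). */
static fix8_8 fix_from_int(int v) {
    assert(v >= -128 && v <= 127);
    return (fix8_8)(v * FIX_ONE);
}

/* Multiply two 8:8 values. The product of two 16-bit values needs 32 bits
 * and carries 16 fraction bits, so we divide by 256 to return to 8 fraction
 * bits. Division is used instead of a right shift so that negative products
 * behave the same on every compiler (truncation toward zero). */
static fix8_8 fix_mul(fix8_8 a, fix8_8 b) {
    int32_t wide = (int32_t)a * (int32_t)b;
    return (fix8_8)(wide / FIX_ONE);
}
```

In this format the value 128 stands for 0.5, so multiplying 128 by 128 yields 64, which stands for 0.25.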
Microcontroller and Software
Our program is split into three logical steps: ADC conversion, FFT calculation, and LCD display of the spectral analysis. The results of the ADC are stored in a buffer whose size is the number of samples required to perform the FFT. Our final implementation takes 32 samples. It is important to note that we need to convert the result of the ADC into fixed point form to be compatible with the fixed point FFT algorithm. Knowing that the results of an ADC conversion are between 0 and 255, we subtract 128 from each result to center it around 0 and store it in the lower 8 bits of an integer.
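The conversion described above can be sketched as follows. This assumes an 8-bit ADC reading and a 16-bit signed integer for each FFT input; the function names `adc_to_fixed` and `fill_fft_input` are made up for this example and do not come from the project source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Center an unsigned 8-bit ADC reading (0..255) around zero.
 * The result (-128..127) sits in the lower 8 bits, which in 8:8
 * fixed point corresponds to roughly -0.5 to +0.5. */
static int16_t adc_to_fixed(uint8_t raw) {
    return (int16_t)((int16_t)raw - 128);
}

/* Fill the real input array from raw samples and clear the imaginary
 * array, since the sampled signal is real-valued. */
static void fill_fft_input(const uint8_t *raw, int16_t *real,
                           int16_t *imag, size_t n) {
    assert(raw != NULL && real != NULL && imag != NULL);
    for (size_t i = 0; i < n; i++) {
        real[i] = adc_to_fixed(raw[i]);
        imag[i] = 0;
    }
}
```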
We use the following ADC sample rates for the different modes of operation. For the 500 Hz range, we sample every 0.875 ms (1142.85 Hz); for the 1200 Hz range, every 0.375 ms (2666.67 Hz); and for the 4000 Hz range, every 0.125 ms (8000 Hz). Each rate is at least twice the top of its range, so this stays true to the Nyquist theorem (discussed above). Switching between these frequency ranges is done with a single button driving a three-state state machine. Each button press moves from 500 Hz -> 1200 Hz -> 4000 Hz, then back to 500 Hz. The button press triggers an interrupt which handles this state transition. Additionally, we have three LEDs, one of which is illuminated in each state to show the user which frequency range is selected.
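The range selection can be sketched as a small state machine. This sketch assumes the three sample periods are whole multiples of a 0.125 ms tick (giving 8000/7, 8000/3, and 8000 Hz), which is an inference from the quoted rates rather than something taken from the project source. The names `range_t`, `next_range`, `sample_rate_hz`, and `led_mask` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    RANGE_500HZ = 0,
    RANGE_1200HZ,
    RANGE_4000HZ,
    RANGE_COUNT
} range_t;

/* Assumed sample period per range, in ticks of 0.125 ms. */
static const uint8_t period_ticks[RANGE_COUNT] = { 7, 3, 1 };

/* Advance to the next range on a button press, wrapping to the first. */
static range_t next_range(range_t r) {
    assert(r < RANGE_COUNT);
    return (range_t)((r + 1) % RANGE_COUNT);
}

/* Sample rate in Hz: a 0.125 ms tick is 8000 ticks per second. */
static double sample_rate_hz(range_t r) {
    assert(r < RANGE_COUNT);
    return 8000.0 / period_ticks[r];
}

/* One-hot mask for the indicator LEDs (bit 0, 1, or 2). */
static uint8_t led_mask(range_t r) {
    assert(r < RANGE_COUNT);
    return (uint8_t)(1u << r);
}
```

On the microcontroller, `next_range` would be called from the button interrupt and the mask written to the LED port; those hardware details are omitted here.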
Once we have 32 ADC samples in fixed point format, we’re ready to run the fixed point FFT algorithm*. Running the FFT algorithm performs the math listed above, and returns a real and imaginary array both the size of the number of samples (32). The last step in the FFT process is to calculate the magnitude of the result by summing the square of the real and imaginary parts. This produces the power and allows us to perform a spectrum analysis.
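The magnitude step can be sketched as below. This is an illustration under the assumption that the FFT leaves its results in two 16-bit signed arrays; the function name `fft_power` is hypothetical and the project's own magnitude routine may scale its result differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute power[k] = re[k]^2 + im[k]^2 for each bin.
 * Each square of a 16-bit value fits in 31 bits, so the sum
 * of two squares fits in an unsigned 32-bit result. */
static void fft_power(const int16_t *re, const int16_t *im,
                      uint32_t *power, size_t n) {
    assert(re != NULL && im != NULL && power != NULL);
    for (size_t k = 0; k < n; k++) {
        int32_t r = re[k];
        int32_t i = im[k];
        power[k] = (uint32_t)(r * r) + (uint32_t)(i * i);
    }
}
```

Skipping the square root keeps the computation cheap; the squared magnitude is enough to compare the relative strength of the bins.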
In order to display the spectral analysis, we first needed to make a new library of bars to output to an LCD using lcd_lib**. To do this we defined 9 different bars (ranging from size 0 to size 8). We then wrote a function that takes a series of 16 inputs ranging from 0 to 16 and outputs the correct 16 bars using the 32 characters on our LCD display. It does this by stacking two bars on top of each other to create “two-level bars.” For example, if a bar is intended to be size 13, it is represented by a bar of size 5 on top of a bar of size 8. Once we receive our magnitude from the FFT algorithm, we need to scale the result into 16 uint8_t’s ranging from 0-16. In order to improve the performance of our system, we only update a character if it differs from what was displayed in the previous timestep.
We decided to remove the symmetry from our system by only displaying the first 16 bins of our 32 results from running FFTfix. We made this choice to maximize our usage of the LCD display as it is only 16 characters wide, and outputting the symmetric result limits the precision of each bin.
Our main function performs these 4 routines in an infinite while loop: ADC sample, FFTfix, FFTmag, and draw_fft. As soon as one is done, we immediately move to the next.
In the source code (at the end), we explain in comments each method that we used in our final program.
* The fixed point FFT algorithm is from Professor Bruce Land, who adapted it from code by Tom Roberts (11/8/89) and Malcolm Slaney (12/15/94)
** lcd_lib is from scienceprog.com (2007), made available to us through Professor Bruce Land
This system has very direct relationships between the analog parts and the digital parts. The analog hardware acquires and prepares the signal to be sent to the microcontroller, but the digital side doesn’t control the analog circuitry beyond indicator LEDs and the LCD.
The hardware was designed to perform the overall goal of implementing an optical microphone using several stages. The first stage supplies the correct current to the laser transmitter to produce a sufficiently bright beam. The next and most complex stage acquires the reflected beam, amplifies it, and filters out noise. Next, the cleaner signal is sent to two different stages. One amplifies the audio signal for headphones, and the other amplifies the signal to prepare it for analog-to-digital conversion in the microcontroller. This latter signal is analyzed by the MCU for frequency content, as described in the previous section, and the resulting spectrum is displayed on the LCD by sending characters from data output pins. Also, the frequency range can be switched between three presets using an input button, and LED indicators show the current state.
The laser transmitter simply emits a beam which can be pointed at reflective, vibrating surfaces. The desired functionality includes compatibility with available phototransistors, sufficient and adjustable brightness, and long range. The laser module we selected has a wavelength of 650 nm (red) and less than 2 mrad beam divergence, meaning that it is practical for long ranges (so after 100 ft it grows 5 inches). Using the simple circuit shown in the diagram, a potentiometer controls the brightness of the laser, from dim to very bright. The resistance values were found empirically - with the knob turned all the way down, the 120 ohm resistance gives a bright beam, drawing around 30 mW total. With the knob up all the way, the 120+620 ohm resistance gives a dim yet visible red spot and draws around 10 mW. This works with a typical 9-volt battery. The laser is mounted to a small tripod to aid in alignment.
The laser subsystem
Receiver and Amplifier
The signal amplifier circuit was designed to handle a weak and noisy signal. The reflected beam spot must be aligned to hit the phototransistor to generate the current which creates the audio signal. The phototransistor chosen has a peak wavelength sensitivity at 800 nm, but at 650 nm, which the laser emits, the sensitivity is 70%. This is more than adequate sensitivity, and using infrared light near 800 nm would be much more difficult to align. Using green light near 500 nm, the sensitivity would only be 10%, and other phototransistors have very similar sensitivity profiles. Additionally, the rise and fall times are 4 microseconds according to the datasheet, so it is more than fast enough to handle audio signals, which vary on the order of 1 ms. The phototransistor is arranged in a common emitter topology, which was found to work well. The 2 kohm resistor was found to provide acceptable results compared to the other nominal values tested.
After the common emitter stage, a decoupling capacitor separates the DC bias of the signal from that of the next stage. The capacitor is also part of a high-pass filter, together with the 11k resistor, with a cutoff frequency of 145 Hz (1/(2*pi*R*C)). The ideal pass-band of this stage is 100 Hz - 1 kHz, because human voice can range as wide as 60-7000 Hz, but generally up to only 300 Hz (source: http://en.wikipedia.org/wiki/Human_voice). The cutoff is this high to combat 60 Hz and 120 Hz noise, which can be very strong in a room filled with dozens of computers, equipment and fluorescent lights (not to mention sleepy engineers).
The LF353 chip contains two operational amplifiers, each of which is configured with a low-pass and a high-pass part (together band-pass) and gain. This overall topology is based on a common design used in many 4760 projects in the past. The low-pass filter involves the 2.2 nF capacitor and the 100k resistor, with a 724 Hz cutoff. This pass-band of 145 - 724 Hz removes the DC bias, much of the 60 Hz noise and high frequency noise. The gain is set by the ratio of the 100k resistor to the 5.1k resistor, so it is about 20 for each of the two stages. However, the volume potentiometer adjusts this gain, so the audio output can range from silence to extreme loudness.
Because this system uses a 9-V battery, there are no negative rails available for the op-amps. Instead, the rails of 0 V and 9 V are used, and the input terminals are connected, through resistors, to a separate voltage level of approximately 2.83 V. This level is maintained using several diodes in series. The 8.2k resistor ensures that about 1 mA flows through the diodes, stabilizing the voltage. Thus this stage acquires the audio signal from the reflected beam, amplifies it, removes noise, and isolates the phototransistor from the next stage.
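The filter and gain figures quoted for this stage can be checked with a short calculation. The function names below are hypothetical. The decoupling capacitor value is not stated in the text, so any value passed for it (for example 0.1 microfarad, which gives roughly the quoted 145 Hz with the 11k resistor) is an assumption.

```c
#include <assert.h>

#define PI 3.14159265358979323846

/* Cutoff frequency of a first-order RC filter: f = 1 / (2*pi*R*C). */
static double rc_cutoff_hz(double r_ohms, double c_farads) {
    assert(r_ohms > 0.0 && c_farads > 0.0);
    return 1.0 / (2.0 * PI * r_ohms * c_farads);
}

/* Gain magnitude of an op-amp stage set by a resistor ratio. */
static double stage_gain(double r_feedback_ohms, double r_input_ohms) {
    assert(r_input_ohms > 0.0);
    return r_feedback_ohms / r_input_ohms;
}
```

For example, `rc_cutoff_hz(100e3, 2.2e-9)` gives the low-pass cutoff for the 100k resistor and 2.2 nF capacitor, and `stage_gain(100e3, 5.1e3)` gives the per-stage gain of just under 20.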
The amplifier/filter subsystem
The amplified signal is then sent to an audio amplification stage, which is used to drive the headphone load. The LM386 amplifier from TI is ideal for amplifying the signal further and running from a battery due to low distortion and low power. The topology used is suggested by TI and provides a gain of 20. The potentiometer adjusts the volume. Even though there is also an adjustable gain in the previous stage, both knobs may need to be adjusted to obtain ideal volume and minimum distortion. Specifically, the first knob should be adjusted to be high enough for an audible sound, but not higher. Then, the second knob can be adjusted. In this circuit, the 1000 microfarad capacitor stabilizes the 9 V signal relative to ground.
The audio amplifier subsystem
The amplified and filtered signal (not from the stage for the headphones) is sent to a stage to prepare the signal for processing by the microcontroller. First, a decoupling capacitor isolates the DC bias. Then, an inverting amplifier setup is used. The gain is the ratio of the 510k resistor to the resistance from the inverting input to the signal input. The potentiometer allows this gain to be adjusted. This is desirable because some signals will be stronger than others based on alignment or distance from the reflecting surface, and the signal amplitude should be on the order of 5 V, but not higher, for analog-to-digital conversion. Because the signal is composed of sine waves, the inversion of the signal at the op-amp output is not an issue. The LM358 was chosen because it has a large voltage swing. Again, the rails of 0 V and 9 V are used, and the noninverting input is tied to the fixed voltage level of around 2.8 V created by the amplification stage. After this amplification, the signal is biased to 2.5 V by the decoupling 10 microfarad capacitor and the 100k voltage divider. The two diodes help prevent the signal from going above 5 V or below 0 V. The 5 V level for Vcc comes directly from the microcontroller. The output of this stage is connected to pin A0 on the MCU, which uses an ADC to read in the signal.
The ADC subsystem
The LCD used to display the frequency spectrum of the audio signal is the Lumex 1602D/A. It is a 16x2 character display, meaning that it has two rows with 16 characters on each row. To graph the spectrum in real-time, the LCD is constantly updated. The characters are sent on pins of port C as shown in the diagram and explained previously in the software design. The LCD is also connected to ground and Vcc (5 V). A potentiometer controls the contrast of the screen.
The LCD subsystem
A simple interface is included to allow user input. A button toggles between three possible frequency ranges for displaying the spectrum. The switch closes a circuit connecting pin D7 to ground via a series resistor. This change is sensed and handled in software as described previously. The current state is also displayed to the user. Exactly one of pins B0, B1, and B2 is high at any time, and so the corresponding LED, as shown in the diagram, stays on to indicate this. The series resistance was found to provide adequate brightness without wasting power.
The LED subsystem
What didn't work
We were compelled to make several modifications to our earlier designs. First, we investigated displaying the spectrum on an OLED screen. One such display was available, so we first attempted to display something (anything) on it, but we were unsuccessful. The documentation was poor and the part was discontinued, so we decided it would be a drain on time to pursue it. The cost of other OLEDs was relatively high, and there was no guarantee that they would be any easier to use. We then decided to use the portable NTSC TV used in Lab 3 due to our experience using it. Knowing that it is CPU-intensive, we projected that we could use a second microcontroller to handle the NTSC interface since our budget was expected to be low. However, we decided against using the TV due to power constraints. We hoped to make the system portable, and the TV used a prohibitive amount of power, such that 9-V batteries would not be sufficient. Finally, we decided to use an LCD. Even though LCDs are slow, have low resolution, and are meant for displaying characters, not graphs, we found ours to be ideal in many ways. It is possible to define new characters, and the only thing we would be displaying would be bars. The 16 characters in a row would give a decent resolution of tens of hertz depending on the range setting. Also, it was possible to refresh several times per second, meaning that a real-time display was possible.
Additionally, in the early stages of designing the software, we set out to write our own FFT algorithm in order to accommodate the data format we would be expecting. This did not turn out to be fruitful, as the many multiplications and other operations were posing a serious time constraint on the rest of our system. After studying the fixed point FFT algorithm, we decided instead to format our data to be compatible with this algorithm, which produced much better results.
The speed of execution is limited by the output to the LCD display. It takes a few milliseconds to draw each character, of which we have 32 (two rows of 16). Thus, the output of the spectral analysis is in real time, but the number of frames displayed per second is lower than desired (about 4-5) when every character is redrawn. However, because we don’t send characters if the slot will be blank, the typical refresh rate is approximately 20 per second.
Because we squeeze whatever results we get into the range 0-16, we lose some precision, although it is still very usable for the purposes we expected. It responds quickly to changes in the audio signal, and when high-pitched sounds appear, the spectrum shifts as expected.
The following graph is representative of the maximum frequency in each bin for each mode of operation. We found these results by only sending a sine wave to our spectral analyzer and noting for what frequency the result was entirely in one bin (at highest magnitude) rather than being half in one bin and half in another.
As seen in the graph above, the relationship between bin number and frequency is linear, confirming that the bins properly display the Fourier spectrum.
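The linear relationship follows from the DFT itself: bin k of an N-point transform sampled at rate fs nominally sits at k*fs/N. A minimal sketch of that calculation is below; the function name `bin_frequency_hz` is hypothetical, and the values it gives are nominal bin frequencies rather than the measured values in the graph.

```c
#include <assert.h>

/* Nominal frequency of bin k for an n-point FFT at sample rate fs_hz.
 * Bins are evenly spaced fs/n apart, so frequency grows linearly with k. */
static double bin_frequency_hz(unsigned k, unsigned n, double fs_hz) {
    assert(n > 0 && k < n);
    return (double)k * fs_hz / (double)n;
}
```

With 32 samples at 8000 Hz, for instance, adjacent bins are 250 Hz apart, and the 16 displayed bins cover 0 Hz up to just under 4000 Hz.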
sample1.mp3 - Bruce explains why some entire towns have people with six fingers
sample2.mp3 - Lift the chorus, speed it onward, loud her praises tell
There are several safety concerns in this project. First, the electrical signals are too weak to cause bodily harm. A 9-V battery is considered safe. However, the optical signal can potentially be an issue. As stated previously, the transmitter uses a Class IIIa laser. These lasers are generally considered safe without prolonged eye contact. It is unlikely, if not impossible, that the beam could hit someone’s eye in a way to cause permanent damage. First, the laser is only to be switched on while in use. A safety warning on the device warns the operator to clear the area before turning on the laser. The opposite side of windows being targeted must also be checked for potential issues. Finally, the reflection off of the surface or other surfaces must be considered. The power of the laser should be adjusted to low during alignment and only turned high while measuring. Users should avoid using the beam at eye level. In the event that someone does walk into the beam’s path, the blink reflex is quick enough that no damage should be done. We have declined to test this, though. Also, the reflected beam has diverged, and so the larger spot size means that the intensity is much less.
We do not use any RF devices and do not expect to interfere with other projects. With the laser beam under control, the other portions of the project do not create any hazards (unless swallowed).
Testing and Development
While developing the laser system, we tested it by reflecting off of a thin piece of glass placed in front of a speaker. We generally had an endless loop of Videonote playing so that we could tune our system to the signal we’d receive if someone was talking behind a window. As soon as we were able to get an audio signal, by listening to it with headphones and analyzing the signal on the scope, we could make the necessary adjustments until our received signal sounded its best. Several of the nominal values of resistors and other circuit elements were found experimentally (although based on theory first).
While developing the spectral analysis system, we broke the software into three modules: outputting bars to the LCD display, ADC, and FFT. We developed them in this order allowing us to slowly add complexity over time. First, a “fake”, hard-coded signal was stored and displayed on the screen to demonstrate that the graph would display correctly and quickly. Then, the ADC values were displayed, almost like an oscilloscope, to ensure that the conversion was occurring. Finally, the FFT was tested using a function generator as an input. Since a sine wave has very sharp frequency content, we could see a single peak move across the screen as frequencies were scanned. To test the system as a whole, we sent it signals from the function generator to allow us to verify that frequencies were showing up in the correct bins. Finally, the digital system was integrated with the optical microphone system, and there were no issues.
We find the end result to be very usable considering the limitations. The final design is soldered to a board for long-term sturdiness. The knobs and button allow for simple user input. The indicators from the LEDs and the LCD display are useful feedback on the signal being monitored.
The drawbacks in usability include alignment and volume sensitivity. Over long distances, small changes in the laser transmitter’s direction will change the reflected beam’s position greatly. For example, simply tapping the laser may move the reflected spot several centimeters. There is no simple way to fix this, but we alleviate it by mounting the transmitter to a tripod to make adjusting the angle and position easier. The volume is controlled by two separate knobs, which may be inconvenient to some, but in this case it provides better control of the signal. The first is necessary to provide sufficient gain to the signal, but too much gain can cause distortion. The second knob controls the volume to the headphones. It may be necessary to go back and forth between the knobs to optimize for a particular signal. This is not especially difficult, but has room for improvement.
The potential use case of someone hard-of-hearing listening to someone from far away demonstrates that the system is designed to help overcome traditional difficulties in hearing. Whereas sounds attenuate with distance at a fairly high rate, optical signals with low divergence can go long distances. Taking advantage of this, there is a certain distance at which the signal from the optical microphone is stronger than the sound reaching the observer. For some, this threshold may be closer.
We believe our results met our expectations, even though initially we had some concerns about the feasibility of our task. It was a large accomplishment to reach the point of getting a noticeable audio signal out of the phototransistor, and finally getting a comprehensible signal was very encouraging. We did not expect to get an especially clean signal, but the result, when aligned correctly, was much clearer than we had believed we could achieve. The signal tends to be clearer when listening through two headphones than in the computer recordings listed above.
Additionally, the spectrum analysis performs exactly as we had hoped. Because such a large portion of this project relied on the analog portion, it was necessary to get a proof of concept (an acceptable audio signal) before we could commit ourselves to developing the software. This limited the amount of time available to us, yet we were still able to meet our expectations, and even surpass them by allowing different modes of operation.
If we were to revisit this project, there are several modifications to the design and approach that would be more efficient. First, the design has not been optimized for power overall. In many parts some attention has been devoted to avoid wasting power, but many parts of both the analog circuit and the microcontroller software could be altered to accomplish the same functionality with less power.
We have not used any intellectual property for this project other than the fixed point FFT algorithm and lcd_lib, which are adapted from Tom Roberts and Malcolm Slaney, and from scienceprog.com (2007), respectively. The fixed point FFT algorithm was modified by Professor Bruce Land, and both libraries were made available to us via the ECE4760 course website (http://people.ece.cornell.edu/land/courses/ece4760/). Even though the project as demonstrated does not seem to be patentable, alterations to the design for specific functionalities may make it a pursuable option. The project may be publishable in a magazine for electronics enthusiasts.
We do not anticipate any legal issues being involved with the design or operation of the device for its intended use. It has the potential to be misused to invade privacy rights by recording conversations in closed, private rooms. However, the amount of difficulty in doing so would discourage all but the hopelessly immoral. Also, the laser spot is visible, so it can be detected by potential victims.
Our decisions and actions in this project have been consistent with the IEEE Code of Ethics. First, as we mentioned previously, the safety hazards are minimal, and those which exist have been disclosed to the user through a warning label. We have avoided conflicts of interest by not involving external parties in the project. To the best of our knowledge, all reported results and data are accurate. We have not fabricated any results or deceived anyone reading our report. We have neither accepted nor given bribes. The technology used in this project is not new, and so its appropriate application and potential consequences can be projected to have minimal issues. As mentioned, there is potential for misuse, but we acknowledge this and cite limitations minimizing its occurrence. We have not claimed expertise in any fields beyond our knowledge. We have credited the contributions of others to our work, and we plan to accept criticism thereof and correct any errors discovered. The system does not discriminate based on prejudicial factors. We have not harmed others during this project, and we do not expect the system to do so. During the completion of this project, we have supported our colleagues when necessary. At this time, we would like to acknowledge the help of our Mandarin-speaking colleagues who helped us change the oscilloscope back to English from Chinese one time.
A. Source Code
- fft.c – Contains all methods necessary for a fixed-point Fourier transform calculation
- fft.h – header file
- laserFFT.c – Contains main method responsible for overall program structure
- lcd_lib.c – Contains methods for displaying characters on LCD
- lcd_lib.h – header file
Custom PC Board
*used for demo, but not needed for final design
The receiver box
The laser transmitter
The target board which we built and did not fully complete by the demo
D. Division of labor
Both team members worked on all parts of the project, although in order to increase productivity by working on separate parts in parallel, some sections of the design were handled in greater part by a single team member. The digital spectrum analysis, ADC, and LCD display implementations were guided by Jon. The optical signal, analog amplification and filtering, and assembly of the circuit onto a permanent board were handled by Brian. Both worked on building the target board and assembling the final product into a package.
E. Datasheets and references
We would like to acknowledge Bruce Land and the 4760 TAs for all their help and support.