By Tiankan Li (tl245)
Ji Zheng (jz232)
Wednesday 4:30pm
Braille A to Z
BlindAid is
a portable tool that reads Braille and signals close objects. It is ideal for those unfortunate people who just
turned blind and have not mastered Braille reading and blind cane usage. It can
also be used as a learning instrument that helps the user decipher Braille
without constantly going to the Braille dictionary.
Rationale
After
browsing through the website on previous year’s projects, we believed that it
would be a noble act of applying the knowledge we learned from class to help
those who are in need, instead of just developing a toy that only satisfies our
technology savvy.
Since most
Braille we see on the street has standard size.
Instead of image sensing, we decided to use 6 push buttons in a 2x3
matrix. When the buttons are pressed
against the Braille, the buttons corresponding to the bumps on the Braille will
be pushed.
The push
button design not only makes our project more simple and elegant, it also makes
it more affordable to the blind people.
Since our
product is targeted to blind people who didn’t master the Braille reading, we
assumed they are new to the blind walking stick as well. Therefore, we attached
an IR sensor to detect whether there is any object close to the user, in hope
to reduce the chance of any unfortunate collisions.
the
horizontal and vertical spacing between dot centers within a Braille cell is
approximately 0.1 inches (2.5 mm); the blank space between dots on adjacent
cells is approximately 0.15 inches (3.75 mm) horizontally and 0.2 inches (5.0
mm) vertically.
Since it is
hard to recognize word by just hear its spelling. We implemented our BlindAid
such that when a full word is inputted, it will pronounce the word when “speak”
Button is pressed.
In order to
increase the range of words the reader is able to pronounce, we decided to use
allophones for speech generation instead of pre-recorded voices.
Instructions on Use
Hardware Tradeoff
When
expanding the memory size for the larger dictionary, the time it takes to load
entries from the dictionary onto the chip also increases proportionally.
Fortunately, once the data is written on to eeprom, it remains there so the
data from the dictionary only has to be written there once.
Software Tradeoff
One of the
issues was speed vs. coverage. We wanted
our Braille reader to be able to handle as many words as possible but the
search time increases as the dictionary size increases, because of our linear
search method. We felt coverage was the
most important aspect of our project, which was one reason we decided to expand
from the 32K flash memory to the 2MB eeprom memory.
Relationship of design to IEEE, ISO,
ANSI, DIN, and other standard
Our project
conforms to all IEEE standards to the extent of our knowledge.
Existing patents, copyright, and
trademarks
Our project
does not violate any existing patents, copyright and trademarks. The designs are all our original designs and
any code or data used were open to the public for research purposes and are
referenced in the appendix. The
“BlindAid” image above was made by us using Photoshop.
Main Components
Protoboard –
This is the skeleton of the BlindAid. It holds the Mega32 and supports links to
other components.
SpeakJet
–This is the messenger between the microcontroller and the user. Whenever a button is pushed or a Braille is
read, the SpeakJet will generate a robotic voice and deliver a message to the
user to make sure the user is updated with his/her surrounding.
Mega32 – The
heart of BlindAid. This microcontroller receives data from Braille sensor and
generates the sound output at amazing speed. (16Mhz) It also controllers
different parameters of the BlindAid such as Volume.
Headphone –
In ear design, so clear sound can be transferred to the user without too much
lost from noise.
IR Sensor – The user’s guard dog. It generates a warning message to the user when the user gets too close to a wall in front of him/her.
Braille Sensor – A combination of 6 NKK buttons. This is like user’s eye. It converts Braille to Binary data. (B2B) So the MegaL32 can process the data.
Memory – We
are using a 24AA1025. Each of them is a 1MB eeprom that is used to store words
and their corresponding allophones.
Design
We picked SpeakJet for our project because initially, we attempted to avoid the usage of external memory. Since SpeakJet contains all the allophones required for pronunciation, we don’t have to stack the onboard eeprom with coefficients for allophone generation. We planned to use onboard memory for storing data required for speech generation.
The SpeakJet chip can be setup up in two ways. One for Demo/Test Mode and one for serial control.
In demo
mode, the chip outputs random phonemes, both biological and robotic. We initially setup the chip to run Demo Mode
in order to test it out. Afterwards, we
connected the chip to the serial interface.
We found that the lowpass filter indicated were unnecessary because the
signal was clean and the lowpass filter only reduced the volume. Serial Data is the main method of
communicating with the SpeakJet. The
serial configuration to communicate with it is 8 bits, No-Parity and the
default factory baud rate is 9600.
These are the same settings used to communicate with the STK500
board. The Serial Data is used to send
commands to the MSA unit
As the project progressed, we
realized that in order to make a quality speech generation system, a vast
library of words and their pronunciation is needed. The onboard flash memory is
only 32kB. Assuming each word takes 8
bytes and its corresponding allophone takes 16 bytes, we can fit only about
1330 word. This is definitely not enough for speech generation. So we decided
to use a 2MB eeprom. With that, we can store about 22182 words, which should
cover most of the word we use in daily speech and some uncommon words.
By the time
we bought the 1mbit eeprom, we already had SpeakJet setup and functioning. For
the sake of time and simplicity, we decided to keep the usage of SpeakJet
instead of storing allophone in the external eeprom and use that for speech
generation.
Each eeprom
is 1mbit. It is made of two blocks of
2^16 bytes. The 24AA1025
eeprom uses Twin Wire Interface(TWI) for data transmission. As its name implies, it uses only two
bi-directional bus lines, one for clock (SCL) and one for data (SDA)
We used a
2000K Ohm resistor on each of those wires for pull-up purposes.
Read
Write
TWI requires
frequent MCU/EEPROM communication. Every
time a signal is send from MCU to EEPROM, an acknowledge signal from EEPROM
back to MCU is required.
Writing to EEPROM
takes a long time for writing. Luckily, we will never have to write to EEPROM
once the dictionary is established. Therefore the long writing time doesn’t
affect our project.
We borrowed
most of our TWI code from AVR library. AVR came with a neat TWI library called
“i2c.h”, which we used for byte write and byte read.
We decided
to use 2 separate power sources because the IR sensor drains a lot of currents
from the power source. Thus creates a lot of noises for the SpeakJet and
worsens the sound output.
In order to
pronounce the words that are not in our library, our MCU will divide the input
word into existing words. It would break the unknown word down by finding the
longest existing word from our library that matches with the prefix of the
unknown word.
For example:
Input Word: “appletree”
The Braille reader will pronounce: “apple” + “tree”
For data
storage, we put all words in a single character array with “*” between words to
indicate where words start and stop. We picked this design over double
character array because double character array requires same length for each
word. So the shortest word “a” will have
to take the same amount of space as the longest word in our dictionary.
However, the single array made binary search impossible. We have to use linear
search.
In order to
save time with searching, we arranged the words in our dictionary in
alphabetical order and recorded the position for each of the 26 letters in the
alphabet. So instead of traverse through the entire library for searching, it
will only search words that match the first letter.
For timing
purposes, we enabled timer0 of the clock with a pre-scalar of 64 and a
cmp-match of 250 to create 1ms intervals.
To aid us in
developing our speech, we used the Phrase-A-Lator software provided on the
Magnevation website. The software
allowed us to test out the sounds, phrases and different control functions for
the SpeakJet. The software allowed us to
adjust the volume, pitch, speed, and bend.
It was also used in helping us verify that the output control signals
that we were using were correct by comparing the output signals from our
program to that of the Phrase-A-Lator.
The software was easy to use. To
use the software, an audio amplifier must be connected to PortD.1, and the chip
to RXD port of the RS232. After
selecting the correct serial port connected to the STK500, the program was
ready to go.
The site
also provides a small dictionary of approximately 1,200 words with their
corresponding allophones and control signals for the SpeakJet. Initially, we planned to use this dictionary
because it was quite accurate and fit snuggly into our Atmel32 chip, using only
about 48% of the 32K flash memory provided.
We later realized that this dictionary was insufficient because it
lacked many of the common words in the English language.
After some
research, we discovered the CMU dictionary provided by Carnegie Mellon
University for research purposes. The
dictionary contains over 125,000 North American English words with their
transcriptions. The dictionary uses 39
phonemes in its transcriptions.
Phoneme Example Translation |
|
------- ------- ----------- |
|
AA odd AA D
AE at AE T
AH hut HH AH T
AO ought AO T
AW cow K
AW
AY hide HH AY D
B be B IY
CH cheese CH IY Z
D dee D IY
DH thee DH IY
EH Ed EH D
ER hurt HH ER T
EY ate EY T
F fee F IY
G green G R IY N
HH he HH IY
IH it IH
T
IY eat IY T
JH gee JH IY
K key K IY
L lee L IY
M me M IY
N knee N IY
NG ping P IH NG
OW oat OW T
OY toy T OY
P pee P IY
R read R IY D
S sea S IY
SH she SH IY
T tea T IY
TH theta TH EY T AH
UH hood HH UH D
UW two T UW
V vee V IY
W we W IY
Y yield Y IY L D
Z zee Z IY
ZH seizure S IY ZH ER |
The 39 phonemes
used in transcribing the CMU dictionary do not directly correspond directly to
the 72 biological phonemes provided by the SpeakJet so some conversion was
needed.
CMU =>SpeakJet |
||
|
Rather than
manually write all 125,000 words in the CMU dictionary and their phonemes into the
array format that we need. We wrote a
program in JAVA using the I/O functions to read and convert the
dictionary. We used the PrintWriter and
PrintReader class to parse the dictionary line by line. The Scanner class was then used to break up
the line into tokens. The format of the
dictionary was the same throughout the dictionary. The first token of each line was the word in
the dictionary and all other tokens on that line were the transcriptions. The words and sounds were all separated into
different arrays and stored in different files based on their first
letter. This resulted in 52 different
files. These arrays could then be loaded
into the eeprom one by one.
The
dictionary was too large to fit into our 2 eeprom chips so we decided to only
place words less than 6 letters long into it.
This resulted in 22,182 words and it almost completed filled our 2
eeprom chips. The words took up about
93.97% of our first eeprom chip. We
felt that this, with our concatenating algorithm would be sufficient to cover
most cases.
We learned a
lot from this project. If we had more
time, instead of using the SpeakJet chip, we could have implemented the
allophones into the flash memory or eeprom instead of requiring a separate
chip. This would have been
cheaper. One mistake we made was
placing all the devices into a single small container. This resulted in the devices shorting each
other out. We should have insulated each
device from one another better. For easy
of use, we would have liked to image processing to read the Braille. This would have cost more but would have made
the device for flexible in use. This
would also have saved a lot more money than buying a retail Braille reader
which costs $1,400.
1. To accept responsibility in making decisions
consistent with the safety, health and welfare of the public, and to disclose
promptly factors that might endanger the public or the environment;
We added the IR distance
sensor into help prevent potentially serious accidents. We set the alarm to set off at a distance of
40cm so the person would have sufficient warning to move out of the way.
2. To avoid real or perceived conflicts of
interest whenever possible, and to disclose them to affected parties when they
do exist;
There were not any conflicts
of interest in this project. As far as we know, our project was quite unique
from any other groups. We were happy and
willing to discuss any similarities our project may have shared with another
group’s.
3. To be honest and realistic in stating claims
or estimates based on available data;
All our data and estimates
are accurate to our knowledge. Any
errors were completely unintentional.
4. To reject bribery in all its forms;
The subject of bribery
never came up but we would have rejected it if it did.
5. To improve the understanding of
technology, its appropriate application, and potential consequences;
We were able to better
understand the subject of speech synthesis and the SpeakJet chip and its
potential use in aiding the less fortunate.
6. To maintain and improve our technical
competence and to undertake technological tasks for others only if qualified by
training or experience, or after full disclosure of pertinent limitations;
We tried to improve our
understanding of our subject by doing research and reading datasheets. We
did not attempt anything we felt we were unqualified to do.
7. To seek, accept, and offer honest
criticism of technical work, to acknowledge and correct errors, and to credit
properly the contributions of others;
We consulted with Professor
Land many times throughout the course of the final project for help and also
for guidance. He made many suggestions
that improved the quality of our projects.
We credited all of the figures in this report that were taken from other
websites.
8. To treat fairly all persons regardless
of such factors as race, religion, gender, disability, age, or national origin;
We do not discriminate
against any race or group. We were happy
to aid anybody who needed our help or advice.
9. To avoid injuring others, their
property, reputation, or employment by false or malicious action;
We tried our best to avoid
injuring person or group. Our device was
intended to help and not hurt anybody as long as it is properly used.
10. To assist colleagues and co-workers
in their professional development and to support them in following this code of
ethics.
We aided our colleagues
whenever we could and pointed out any potentially dangerous actions that they
pursued.
Costs
Part |
Quantity |
Cost |
NKK pushbuttons |
6 |
Sampled |
Custom protoboard |
1 |
$13 |
Sharp GP2Y0A02 IR sensor |
1 |
$12.50 |
9V Battery |
2 |
$4 |
LED, capacitors, wires, resistors |
|
Lab Supply |
Single ear headphone |
1 |
$1 |
SpeakJet chip |
1 |
$15 |
Large push buttons |
3 |
Lab Supply |
24AA1025 1mbit Eeprom |
2 |
Sampled |
Speaker |
|
$1.50 |
Total Cost |
|
$47.00 |
Code
Our Code
The Java Program used to convert the dictionary
Download Our Source Code
References
CMU dictionary
http://www.speech.cs.cmu.edu/cgi-bin/cmudict
SpeakJet Datasheet
http://www.magnevation.com/pdfs/speakjetusermanual.pdf
Phrase-A-Lator Software and Phonetic Dictionary
http://magnevation.com/software.htm
Mega32 Datasheet
http://instruct1.cit.cornell.edu/courses/ee476/AtmelStuff/full32.pdf
GP2Y0A02YK Datasheet
http://document.sharpsma.com/files/GP2Y0A02YK-DATA-SHEET.PDF
Graph of IR Sensor and
ethical standards codes