We implemented naturally recursive Lindenmayer systems (commonly known as fractals) on an Intel Altera board to graph on a VGA screen. Our system proved to be faster than pure software programs written in both Python and C.
Introduction
There are dozens of L-Systems in existence, and for this project we implemented seven. Each was configurable via the number of iterations the system performed, the axiom to begin with, the size and color of the lines, and the starting coordinates of the system.
- Dragon curve
- Two versions of the Sierpinski arrowhead
- Two versions of the Koch curve
- Cross
- Tessellated triangles
This project was composed of two subsystems: Verilog code for the FPGA to calculate the L-Systems and graph the result, and the C code for the HPS to interface with the user. The FPGA has a faster system clock frequency and, considering the generation of L-Systems is computationally intensive, it made sense to offload that onto the faster, hardware-based part of the Altera board. Graphing on the VGA screen isn’t something that’s limited to the FPGA itself, but bandwidth would’ve been a limiting factor when transmitting large chunks of data from the FPGA to the HPS. With this in mind, graphing was kept to the FPGA, where the results were already being calculated. The HPS was instead utilized for its terminal interface, where the user was able to configure the L-System to be graphed. Any system settings were communicated to the FPGA via PIO ports to use in its computations.
High-Level Design
The formal definition of a Lindenmayer System (or L-System) is “a parallel rewriting system and a type of formal grammar.” In simpler terms, L-Systems generate fractals, which are never-ending complex patterns that are self similar at different scales such that zooming in or out will still show the same fractal pattern.
There are three main components to every L-System: an alphabet, an axiom, and rules. These configure and manipulate the strings of characters that define how the visualization should act, such as moving forward, turning 90 degrees to the right, or branching out from a previous state. The alphabet is the set of valid symbols that can be included in these strings; for example, an alphabet of {A, B} restricts any valid sentence to only contain A and B. The axiom sets the initial value for the system’s string; working within the system with alphabet {A, B}, an axiom could be “A.” Rules change the string recursively, generating new strings repeatedly and providing actions to control the movements of the growing visualization. They’re structured to take in a predecessor and a successor—the former is the input and the latter is the output after having the rule applied to it. A rule to our {A, B} system with axiom A could be “A → AB”; this would take our initial state and create the string “AB” for the next state.
The alphabet for every phase of our design, no matter the coding language, consists of the following symbols, as shown in Table 1. Each symbol corresponds to a specific action, from graphing (which are consistent across L-Systems) to buffer characters for rule generation (specific to each L-System).
Table 1: Examples of actions associated with L-System symbols.
Though there is an infinite number of combinations of axioms and rules within the given alphabet, the seven L-Systems we developed have preset values for each. We followed these so as to ensure the graphed results would match what was expected. Table 2 shows the axioms and rules for each L-System.
Table 2: Parameters for the L-Systems in our implementation.
Baseline Design
Considering L-Systems have multiple moving parts, the most straightforward way to understand their inner mechanisms comes from our baseline designs.
Python
This Python code is an off-the-shelf implementation that we extended to incorporate the axioms and rules for
our specific L-Systems. It uses a dictionary of rules—key-value pairs where the key is the character to apply
to rule to and the value is the output string for that character—and locally declared axiom
,
iterations
,
angle
, and step
variables to iterate over a starting axiom and graph the resulting
L-System using pygame. This
is accomplished with several functions, each of which helped build the foundations for our C code and
eventually our Verilog design.
applyRule
is responsible for taking in a single character and converting it to the output string
as specified
in the rules
dictionary. It uses a for loop through every entry in rules
to see if
any key matches the input
character. If a rule exists for this specific character, the result string is set to that rule’s value,
otherwise the result string is the original character. This exemplifies how not every character will be
expanded through a rule—at the very least it’ll just be copied into the resulting output string.
def applyRule(input): output = "" for rule, result in rules.items( ): # applying the rule by checking the current char against it if (input == rule): output = result # Rule 1 break else: output = input # else ( no rule set ) output = the current char -> no rule was applied return output
The next function is processString
. Rather than take an individual character as input, it
receives an entire
string to process. Every character must be handled in order to see if a rule can apply to it, so a for loop
runs over every character and sends each to applyRule
. The new string is built up with each call
to applyRule
,
creating a string that will have grown in length from the original.
def processString(oldStr): newstr = "" for character in oldStr: newstr = newstr + applyRule(character) # build the new string return newstr
createSystem
puts the previous two functions together to bring iterations into the mix. It takes
in axiom
and
numIters
as inputs, where the former is the starting string for the L-System and the latter is
the specified
number of iterations to process the string by. While processString
does grow the string as it
loops over every
character, createSystem
is responsible for repeating this process multiple times from iteration
number zero to
iteration number numIters
-1. With each iteration, it takes the string from the previous iteration
and puts it
into processString
, storing the result so that the next iteration can use it.
def createSystem(numIters, axiom): startString = axiom endString = "" for i in range(numIters): # iterate with applying the rules print("Iteration: {0}".format(i)) endString = processString(startString) startString = endString return endString
With the L-System string built through createSystem
, drawTree
is able to graph the
L-System using pygame. It
takes in the L-System string and the starting coordinates from which to graph. It then uses a loop to look at
every character in the string to determine what (if any) graphing functionality it needs to provide. If it
sees an ‘A’ or ‘F’ it draws forward by the length determined by step
. If it sees a ‘+’ it
increments the
orientation of the system by angle
, and similarly decrements the orientation of the system by
angle
if it sees
a ‘-’. ‘X’ and ‘Y’ aren’t graphing characters so they, and any other characters not equal to ‘A’, ‘F’, ‘+’, or
‘-’, are skipped over.
def drawTree(input, oldpos): a = 0 # angle i = 0 # counter for process calculation processOld = 0 # old process newpos = oldpos color = (255, 255, 255) linesize = 1 for character in input: # process for drawing the l-system by writing the string to the screen i += 1 # print process in percent process = i * 100 / len(input) if not process == processOld: # print(process, "%") processOld = process if character == 'A': # magic happens here newpos = polar_to_cart(a + angleoffset, step, *oldpos) pygame.draw.line(screen, color, oldpos, newpos, linesize) oldpos = newpos elif character == 'F': newpos = polar_to_cart(a + angleoffset, step, *oldpos) pygame.draw.line(screen, color, oldpos, newpos, linesize) oldpos = newpos elif character == '+': a += angle elif character == '-': a -= angle
C
Since Python is an interpreted, high-level language, our next step was to synthesize and graph L-Systems in C code to be run on the FPGA. This did require a Quartus project to be compiled and loaded onto the board, but it wasn’t something that we changed at all. Specifically, we used the “GPU with FAST display from SRAM” project from the ECE 5760 Avalon Bus Master page.
The program layout of the Python code corresponds almost entirely with that of the C program, though it only
focuses on the dragon curve, not all seven L-Systems. First there is the applyRule_DragonCurve
function, which
is responsible for receiving a single character and outputting a character pointer to a string. This string is
built by putting the input character through a switch-case statement and seeing which condition it matches.
char* applyRule_DragonCurve(char input) { char tmp[1000000]; switch(input) { case 'X': { strcpy(tmp,"X+YF+"); break; } case 'Y': { strcpy(tmp,"-FX-Y"); break; } default: { strcpy(tmp, (char[2]) { (char) input, '\0' } ); break; } } return tmp; }
processString_DragonCurve
follows the same logic as processString
in the Python. It
takes in a
character
pointer to a string and then iterates on every character to put it through applyRule_DragonCurve
.
The result
string from each call to applyRule_DragonCurve
is concatenated together in memory.
char* processString_DragonCurve(char* prev) { int i = 0; char *tmp; char *check; int length = strlen(prev); for (i = 0; i < length; i++) { check = prev; tmp = applyRule_DragonCurve(prev[i]); strcat(prev, tmp); } return prev; }
createSystem_DragonCurve
needs the number of iterations and the starting axiom to function. For
every number
in the range [0, numIters
-1] it calls processString_DragonCurve
on the string from
the previous
iteration.
char* createSystem_DragonCurve(int numIters, char* axiom) { char start[1000000]; strcpy(start, axiom); char end[1000000]; *end = ""; int i = 0; char *check; for (i = 0; i < numIters; i++) { check = processString_DragonCurve(start); *start = end; } return start; }
Finally, the dragon curve is drawn through the aptly named function draw_DragonCurve
. It loops
through each
character in the character pointer of the L-System string and checks to see what graphing functionality that
character possesses. This leans on VGA drawing functions written by Bruce Land.
void draw_DragonCurve(char* input, int old_x, int old_y) { int a = 0; // 0 degrees is straight up vertically int length = 10; int new_x = old_x; int new_y = old_y; int i = 0; char *check = input; printf("GRAPHING STRING: "); while(*check!='\0') printf("%c",*check++); printf("\n"); for (i = 0; i < strlen(input); i++) { if(input[i] == 'X') { continue; } else if (input[i] == 'Y') { continue; } else if (input[i] == 'F') { if (a % 360 == 0) { VGA_line(new_x, new_y, new_x, new_y - length, red); new_x = new_x; new_y = new_y - length; } else if (a % 270 == 0) { VGA_line(new_x, new_y, new_x - length, new_y, yellow); new_x = new_x - length; new_y = new_y; } else if (a % 180 == 0) { VGA_line(new_x, new_y, new_x, new_y + length, red); new_x = new_x; new_y = new_y + length; } else if (a % 90 == 0) { VGA_line(new_x, new_y, new_x + length, new_y, yellow); new_x = new_x + length; new_y = new_y; } } else if (input[i] == '+') { a = a + 90; } else if (input[i] == '-') { a = a - 90; } } }
Full-System Design
Structure
Boasting a faster internal clock and configurable hardware, the FPGA is the core computational power for the
system. This power is directed towards two main functionalities: calculating the L-System and graphing the
L-System. The former relies on string manipulation, while the latter uses enable and address signals to write
to the VGA screen. Since this system serves two distinct purposes, it was useful to split each into a separate
Verilog file: rules.v
for the calculations and DE1_SoC_Computer.v
for the graphing.
This modular approach not
only made it simpler to visualize how each module connects, but it also made the testing process easier to
take step by step.
The C code running on the ARM side establishes the serial input control for the L-System’s settings, such as the L-System in question and the starting coordinates for graphing. There was the opportunity to use this for graphing as well, but the bandwidth and PIO port size limitations would’ve hindered our speedup. There was the additional thought of housing the L-System calculations on the ARM side, but this was also scrapped due to the FPGA’s faster clock frequency and opportunity for large amounts of data to be stored through M10Ks.
Bridging the gap between the ARM and FPGA is the Avalon bus, visualized and configured using QSYS. The ARM
takes in settings from the user and outputs them to the FPGA through output PIO ports, while the FPGA sends
over timing information through an input PIO port. These ports have to be programmed on both ends; the ARM
code memory maps their address spans, and the FPGA instantiates wires to connect the ports of the ARM module
(called Computer_System
in the Verilog) to relevant inputs/outputs in Verilog modules.
The interweaving of these two technologies and the emphasis on each system’s strengths led to the development of responsive, visualizable L-Systems. Figure XX below shows how each connects via input and output signals and wires.
Figure 1: Block Diagram of signal flow between modules.
Computation
The bulk of the code used to create these L-Systems is found in rules.v
across four modules:
dual_clock_ram
,
signed_mult
, rules
, and create_system
.
dual_clock_ram
dual_clock_ram
is the memory unit of our system that allows our system to be recursive. Each
instantiation of
the module is composed of eight M10k blocks in a 8-bit by 8K configuration, with each block holding one of the
eight blocks. The module performs one cycle writes and two cycle reads. Thus, in the cycle that a memory
operation is called, a write would finish the following cycle, and a read would finish after the second
positive edge of the clock. However, it is possible for these memops to still take the same amount of time
because the read and write operations are processed on different clocks, both of which are taken as inputs
(clk1
, clk2
). The other inputs are write enable (we
),
read_address
, write_address
, and write data (d
). The
sole output is read data (q
). When graphing, this module is the limiting factor for the number of
iterations
which can be performed. With a fixed memory size, at some point, we will run out of space to write a new
iteration. We settled on 8K memory locations because we calibrated our system to fit a 10-11 iteration dragon
curve.
signed_mult
This module performs 11.21 fixed point signed multiplication. This means that there are 11 bits of integer and 21 bits of decimal in the inputs and output. Our reasoning behind 11.21 for the fixed point representation was that there were ample integer bits, so as to avoid overflowing during computation. For the purposes of this project, we felt the ranges of integer and floating point provided good resolution. This module is instantiated in our top level code for use in calculating the necessary vertical distances for diagonal lines.
rules
rules
is a clocked module which applies a specific update rule to an input character.
Specifically it takes
in an 8-bit character, and, depending on the specified L-system, will convert the input to an up to ten
character output string. To interpret the character inputs properly, the alphabet of all valid characters for
the available L-systems is defined with local parameters. The module operates by sitting in an initial wait
state until a valid signal is asserted, indicating that the input is valid. The module then transitions to the
appropriate update state depending on the input L-system signal. Within the specified state, only the rules of
that system are available. If there exists a rule for that character, the corresponding conversion will be
written to the result. If not, then the character is written to the result with 72 zeros appended to the end
to fit the ten character buffer size.
Figure 2: Diagram of the Finite State Machine used to apply rules to each character.
This scheme makes it fairly easy to add new L-Systems. We merely have to expand the alphabet and add an extra state to handle this new case. Below are examples of some of the systems we generated. We currently have 7 systems encoded. They are the dragon curve, two versions of the Sierpinski arrowhead, Koch curve, Koch snowflake, cross, and tessellated triangle.
Figure 3: The Dragon Curve with 11 iterations (the largest possible in our implementation), with a length of 4, graphed in the center of the screen (300,300).
Figure 4: Sierpinski Arrowhead (2), with 6 iterations, a length of 7, graphed in the center of the screen (300,300).
create_system
create_system
in particular does the heavy lifting, instantiating two
dual_clock_ram
modules and
one rules
module in order to receive a 32-bit axiom, perform the calculations necessary to convert that starting string
into a set of characters representing a fractal of that requested L-System, and output the entire result
string byte by byte to the top level.
create_system
requires a plethora of input and output values in order to function. The typical
one bit clk
and reset
signals are inputs to the system, dictating when the internal FSM is allowed to restart
and at what
frequency it can run. system_input_string
is the starting axiom, limited to 32 bits due to the
maximum width
of the PIO ports sending over this string from the HPS. A three bit signal lsystem
chooses which
of the seven
L-Systems to calculate and graph, and four bit value iterations
provides the number of recursive
iterations
the L-System will be required to make. As the module hops from state to state,
system_output_string
is set to
a byte of the outputted result in order to somewhat parallelize the math being performed with the graphing
being done. iterations_counter
is another output, this time four bits; it’s updated to be equal
to the
iteration create_system
is on so as to let the top level know when it’s appropriate to graph.
The handshake interface between create_system
and the top level FSM for graphing worked with
four signals:
top_rdy
, top_graphing
, system_val
, and system_done
. Those
outputted by the top level—top_rdy
and
top_graphing
—were received as inputs, indicating that the top level was ready for the next byte
to be graphed
and in what stage of graphing the top level was in, respectively. system_val
and
system_done
are outputs, set
appropriately to show that there’s a new valid output from create_system
and that
create_system
has completed
its L-System entirely.
Two dual_clock_ram
modules are instantiated to hold the L-System as it passes through every
iteration. The
first loop through the L-System string is performed on the axiom, which is a hard-coded value set by the user
through the HPS. The expanded result string, created by running each character through the rules
module, is
then stored in an M10K block connected to one of the dual_clock_ram
modules (let’s call it A).
The next
iteration takes this processed result string from A, runs each character through the rules
module, and stores
the new result in an M10K block created by the other dual_clock_ram
module (designated as B).
From then on,
every iteration of create_system
will read the previous iteration’s result from one
dual_clock_ram
and write
the result to the other, alternating between A being read and B being written to B being read and A being
written. Considering that iterations_counter
—the register used to keep track of the
iterations—starts at zero,
here’s a simple rule of thumb: every even iteration writes to A and reads from B (except in the case of the
zeroth iteration, which reads from axiom), and every odd iteration writes to B and reads from A. The
dual_clock_ram
modules will be referred to as A and B from now on.
Aptly named rule
, an instance of the rules
module is connected to the appropriate
inputs and internal
registers. It needs the lsystem
input to choose the correct L-System rules to apply to each
character of the
L-System string, the character in question (stored in input_char
), and a valid signal
rule_val
to
indicate the
next character is ready to be processed. It outputs a maximum 80 bit value to rule_result
after
applying the
relevant L-System rules to input_char
, and a done signal rule_done
used to
acknowledge that it’s
done
processing.
The mechanism propelling the creation of the requested L-System is composed of 13 states, as seen in the FSM
below. The state logic and transitions are handled within an always block clocked on the positive triggering
edge of the clk
signal.
Figure 5: Diagram of the create_system Finite State Machine.
The reset
signal is active high and is sent across the PIO ports from the HPS once the user has
configured
all of the desired characteristics of the L-System. On reset, all of the internal registers (most notably the
read/write addresses and write enables of both A and B, and the system_done
signal) are cleared
and
system_input_string
is loaded into the axiom
register. The defined
RESET_SYSTEM
does this as well. The
difference between the actions triggered by the reset
signal and the RESET_SYSTEM
state is that
the former
only occurs once at the very beginning of the program, and the latter is what the FSM returns to after
completing an L-System and waiting for the next request from the user.
Both the reset
signal condition and RESET_SYSTEM
transition to
CLEAR_M10KS
(though the latter
waits until
reset is high again, indicating the new L-System characteristics have been sent from the HPS). Since the
L-System strings during and between iterations are stored in M10K blocks, it’s necessary to clear them before
embarking on the next L-System. Until the write address values for A and B have reached the last possible
memory address 0h1FFF (8191 in decimal), both write enable signals are set to one and the data values set to
zero. This will result in every byte of the M10Ks being set to 8’b0. The state stays within itself until this
process is complete, then moves on to GET_CHAR
.
GET_CHAR
is responsible for grabbing the next character in the string to pass into the
rule
module. However,
it must wait until the top level is ready (equivalent to the top level finishing graphing the previous
character sent over from create_system
) so it doesn’t set the value of input_char
for the rules
module until
the top_rdy
signal is set to one. When this condition is met, how the state proceeds depends on
what iteration
the system is on, shown in iterations_counter
. If the system is on the zeroth iteration, the
hardcoded axiom
is being processed so input_char
is the least significant (rightmost) byte of axiom
and the FSM can
immediately transition to COMPUTE_DRAGON
. If the system hasn’t reached the inputted value of
iterations, it
proceeds to READ_M10K
to read a byte from the string written to either A or B’s M10K memory. If
the system has
performed the required number of iterations, the FSM goes to DONE
.
READ_M10K
chooses which set of M10Ks to read from based on whether
iterations_counter
is even or odd. Based
on the setup of the dual_clock_ram
, there’s no read enable signal; instead it takes three cycles
to read the
requested address in the M10K. Originally, we thought this meant our FSM would require some number of buffer
states to ensure three clock cycles passed before the value was ready to be used, but the fact that our FSM is
13 states means that enough time passes between individual characters being processed to absorb those three
clock cycles. We take advantage of this by incrementing the read address for the appropriate
dual_clock_ram
module within READ_M10K
based on whether iterations_counter
is even or odd. Since
this is the
only state to
change the value of the read addresses, A and B will read at their corresponding incremented address and the
value at that address will be ready to grab the next time the FSM goes to READ_M10K
.
Whether from GET_CHAR
directly or READ_M10K
, the next state of the FSM is
COMPUTE_DRAGON
. Don’t
let the name
fool you, as this state is responsible for getting the resulting output from applying any of the seven
L-System rule configurations, not just that of the dragon curve. The rule_val
signal is set to
tell rule
that
another character is on its way, and rule_result
is stored in another register to be shifted in
future states
of the FSM. While the done signal rule_done
is equal to zero, the FSM stays in
COMPUTE_DRAGON
,
waiting for the
rule
module to finish processing the character. When this isn’t the case, it’s time to write the
result to the
M10Ks: A through state WRITE_M10K_A
if iterations_counter
is even, or B through
state
WRITE_M10K_B
if
iterations_counter
is odd.
WRITE_M10K_A
and WRITE_M10K_B
are identical, save for the fact that they’re dealing
with two different
dual_clock_ram
modules. As each memory location in the M10Ks is a byte wide, we take the most
significant
(leftmost) byte of rule_result_reg
(note that this is not rule_result
) and load it
into either A or B’s data
register as well as system_output_string
. Every rule applied to the characters can output a
maximum of ten
bytes from a single byte input, so it takes multiple cycles to write the entirety of
rule_result_reg
to the
M10Ks. The simplest way to keep track of which bytes had been written back and which ones hadn’t is by
shifting out each byte of rule_result_reg
after it’s written to its appropriate location in the
M10Ks. Hence,
we know the entirety of the output from rule is stored in memory once rule_result_reg
equals 80
bit zero; in
this case, the write enable is set low so as to not accidentally write to any extra memory locations,
system_val
is set low to indicate that there isn’t a character to be graphed by the top level,
and the FSM
transitions to NEXT_BYTE
. If there are more bytes in rule_result_reg
to write to the
M10Ks, the
write enable
is set high and the top_rdy
is checked. top_rdy
lets create_system
know
when the top
level has
finished
handling the previous character sent to it and is ready for the next. When top_rdy
is high,
system_val
is set
high in response to say “Hey, we have a new character ready for you!” and the FSM transitions to
INCREMENT_WRITE_A
(or INCREMENT_WRITE_B
when in WRITE_M10K_B
, writing
to the B dual_clock_ram
).
Otherwise,
system_val
is kept low and the FSM stays within WRITE_M10K_A
.
The goal for this val/rdy handshake is to not let one part of the system get ahead of the other. We don’t
want create_system
to continue chugging along with processing and storing the result string
before the top
level is done with the previous character. We also don’t want the top level to graph a character that isn’t
ready yet, lest it mess up the final design of the L-System.
Much like WRITE_M10K_A
and WRITE_M10K_B
, INCREMENT_WRITE_A
and
INCREMENT_WRITE_B
are the same except for what
dual_clock_ram
they’re interfacing with. This state is responsible for incrementing either A or
B’s write
address so that the next time the FSM is in WRITE_M10K_A
or WRITE_M10K_B
it knows to
write to the next
available location in memory. It’s also responsible for shifting rule_result_reg
. Since the
baseline Python
code looped across each string from left to right, so does this Verilog implementation by shifting the
leftmost byte out. The FSM immediately transitions back to WRITE_M10K_A
or
WRITE_M10K_B
.
After the entirety of rule_result_reg
has been stored in the appropriate
dual_clock_ram
module,
NEXT_BYTE
is
responsible for shifting axiom
in a similar fashion to rule_result_reg
. As described
earlier,
axiom
holds the
32 bit hardcoded axiom for the L-System and is the starting point for the zeroth iteration of
create_system
.
Once every character of axiom
has gone through the rules
module, it can be shifted
out because
there is no use
for it anymore—its result is all the system needs to remember about it, and that’s already been written to the
M10Ks. In any iteration besides the zeroth, this acts more like a buffer state than anything else, but it also
serves to hold the system_val
and system_done
signals low, which are easy to lose
track of.
The next state is INCREMENT_ITER
; as the name implies, it’s responsible for incrementing the
value of
iterations_counter
once the entirety of the string has been processed. There are many checks that
must be done
before this simple task can happen, since there are multiple cycles within the FSM and the iterations
themselves are cycles. First, system_output_string
must be zero, axiom
must be zero,
and
top_graphing
must be
two. system_output_string
being zero means that the last value of rule_result_reg
is
zero and the
entirety of
the string has been processed and shifted out, with system_output_string
lagging and grabbing an
extra byte of
rule_result_reg
in WRITE_M10K_A/B
while waiting to go to NEXT_BYTE
.
axiom
is zero once the zeroth
iteration
has reached its end, so this check makes sure that the system has at least gotten through that.
top_graphing
is an input from the top level showing what state of graphing the top level is in: 0 before a character is
graphed, 1 during graphing, and 2 post-graphing. A value of 2 means that the last character sent over by
create_system
has been fully graphed. The next check depends on iterations_counter
:
if it’s zero,
no
additional logic is needed; if it’s even, B needs to be reading a value of zero; and if it’s odd, A needs to
be reading a value of zero. This is a check to see if there are more bytes to be read and processed from the
M10Ks (hence why on the zeroth iteration this doesn’t require any logic) since all bytes in the M10Ks
following the last byte of the processed string will be zero (as promised by CLEAR_M10KS
). If
those hurdles
are cleared, it’s time for iterations_counter
to be incremented and the write addresses for both
dual_clock_ram
modules to be zeroed out. This ensures that the next iteration will write the
result from the
top of the M10Ks, overwriting the previous iteration and even going past that (since every iteration grows the
string in length). The next state will be ZERO_READ
if all of the conditionals are true,
otherwise no logic is
performed at all and the FSM heads back to GET_CHAR
to start the processing and storing of the
next character
in the string.
ZERO_READ
is also self-explanatory: it zeroes the read addresses for the
dual_clock_ram
modules.
The read
address for B is only set to zero if the system is on an even-numbered iteration, otherwise the read address
for A is set to zero. This state transitions to GET_CHAR
.
The last state of the FSM is DONE
. Its primary function is to set the system_done
signal to one
in order to
let the top level know not to expect any more characters from create_system
as it has completed
the requested
number of iterations over the L-System string. The read/write addresses for A and B are set to zero to prepare
to handle the next L-System after heading back to RESET_SYSTEM
.
Graphing
Our top level, DE1_SoC_Computer.v
, serves as the bridge between the user inputs and the
create_system
module.
In addition, it holds the graphing finite state machine.
Writing to the VGA screen utilizes three signals: vga_sram_write
, vga_sram_address
,
and vga_sram_write
is a one bit write enable signal—if it’s 1, the VGA will be written to. vga_sram_address
is a
32 bit value attributed to an address in the space of the screen, where the base address is 32’b0.
vga_sram_writedata
is an eight bit value corresponding to the data to be written at
vga_sram_address
,
specifically the color of the pixel located at that address.
The output PIO ports (from the ARM to the FPGA) allow for user interaction through the terminal to change the
lsystem
, axiom
, number of iterations, length and color of lines, as well as the
initial coordinates, start_x
and start_y
. There is also an output PIO port that sends the reset
signal, driven by
the HPS. In
addition, an
input PIO port sends timer_counter
from the FPGA to the ARM in order to calculate the timing
information of
the system.
We instantiate a module of create_system
to send over required values. Particularly, the
top_rdy
and
system_val
wires allow for a handshake between the modules, so that the top level FSM does not
start graphing
too early.
Figure 6: Annotated diagram of an equilateral triangle, to demonstrate our calculations for 60° diagonal lines.
There is also an instantiation of signed_mult
, called triangle_mult
, which
calculates the target pixel
required for the 60° based diagonal lines. Remembering trigonometry, we know that the horizontal step size is
length
/2, which can be calculated by shifting length
by 1. However, the vertical
step size is
sqrt(3)length
/2,
which cannot be so easily calculated with bitwise operations. Thus, we use 11.21 fixed point multiplication.
The inputs to this module are root_three
(sqrt(3) represented in 11.21 fixed point), and
x_triangle_length
,
the 11.21 fixed point version of length
>> 1. The output is y_triangle_length
. For
use in
graphing to pixels,
the integer bits [30:21], are used in calculations. These bits omit the signed bit in the fixed point integer,
which causes a limitation in calculation values off off the screen in the negative direction.
Figure 7: Diagram of the top level (graphing) Finite State Machine.
The 10 state top level FSM is triggered on the positive rising edge of the 50MHz clock. We consider this FSM
to be modular—as in, some states are chosen based on which L-System is being graphed.
timer_counter_reg
is
incremented in every state except for TOP_RESET
and TOP_DONE
, to provide timing
information for the whole
system.
The first state is TOP_RESET
, which is entered at the reset
signal from the HPS. In
this state,
angle_increment
is set based on what is required for the lsystem
entered. If the
reset
signal is
sent again,
the FSM will stay in this state. Otherwise, it will move onto TOP_WAIT
.
In TOP_WAIT
, the system_done
wire from create_system
is checked. If it
is high, the
FSM moves
into the
TOP_DONE
state. If not, there are two conditions. If the system_val
wire is low, the
top_rdy_reg
is set high,
and the FSM stays in TOP_WAIT
. If system_val
is high, top_rdy_reg
is
set low,
system_output_string_reg
is set
based on the system_output_string
wire from create_system
, and the FSM moves into
TOP_SETUP
.
The TOP_SETUP
state is used to make sure all relevant registers have the most updated values. In
particular,
top_char_reg
is set to system_output_string_reg
, so that graphing rules can be
applied to this character in
the following state. The FSM then moves on to TOP_TARGET
.
In TOP_TARGET
, either the target pixels (targetx_reg
and targety_reg
)
or the
direction of drawing (angle_reg
)
are set based on what top_char_reg
is. For the drawing characters (F and A),
targetx_reg
and
targety_reg
are
set based on angle_reg
; If angle_reg
is 0 or 180 (vertical) or 90 or 270
(horizontal), either targetx_reg
or
targety_reg
are incremented by length
. If angle_reg
is 60, 120, 240, or
300, a
diagonal line is needed, and
the target coordinates are instead incremented using the step sizes calculated using the
signed_mult
module
(x_triangle_length
and y_triangle_length
). If top_char_reg
is either +
or -, angle_reg
is incremented or
decremented by the angle_increment
value set in TOP_RESET
. In cases where the
resulting angle would be set to
360, it is reset to 0 for ease in future calculations. Next, the FSM goes into TOP_BOUND_CHECK
.
TOP_BOUND_CHECK
chooses which graphing state to go to based on what lsystem
is. It
also chooses
to skip the
graphing states if the target coordinates are outside of the screen, and go directly to
TOP_SHIFT
.
All three graphing states do an additional check on the bounds, and set the write enable,
vga_sram_write
low
in the case of an offscreen pixel. This allows for a line with target coordinates outside the graphing region
to draw up until the edges of the screen.
TOP_GRAPH_DRAGON
is named after our first successfully implemented lsystem
, the
dragon curve,
and is the
graphing state for purely horizontal and vertical lines. In this state, either x_reg
or
y_reg
is
incremented
in each cycle until the graphed coordinates match the target coordinates. To graph to the screen,
vga_sram_address
is set to the desired coordinates, vga_sram_writedata
is set to the
color chosen
by the user,
and vga_sram_write
is set high. Once the coordinates match, the FSM will move on to
TOP_SHIFT
.
However, if
either the x_reg
or y_reg
value exceeds (2^32)-1 (the maximum value that can be held
in the 32 bit addresses),
the FSM will immediately enter TOP_DONE
.
TOP_GRAPH_TRIANGLE_X
and TOP_GRAPH_TRIANGLE_Y
are the graphing states used for
L-Systems that require
diagonal lines. These states cycle between each other to graph 2 pixels horizontally and 5 pixels vertically,
and move on to TOP_SHIFT
once the graphed coordinates has passed or is equal to the target
coordinates. This
is a limiting way to graph diagonal lines, as it requires that the minimum side length is at least 7 to result
in a line that is reasonably diagonal. However, constrained by the pixels on the screen, we found this to be a
fairly accurate way to draw lines with our desired slope.
The TOP_SHIFT
state sets the top_graphing_reg
to 2. This is a wire checked by
create_system
before
incrementing iterations_counter_reg
. After this state, the FSM returns to TOP_WAIT
.
The final state in the top level graphing FSM is TOP_DONE
. At this point, all graphing for the
inputted
L-System has been completed, and the final value of timer_counter_reg
is used for the timing
calculations. The
FSM will stay in this state until the reset
signal is sent by the HPS to bring the FSM back to
TOP_RESET
.
User Interface
As previously mentioned, the HPS handles user inputs as well as printing valuable information to the terminal
window. This code can be found in graphics.c
. Here is an overview of the PIO ports and the
associated actions
done by the HPS.
lsystem
: before asking for a user input for this 3-bit output PIO port, the HPS prints out the
available
L-Systems and the numbers associated with them. If the user enters an invalid character, the prompt is
replayed.
axiom
: Once a valid L-System has been chosen, the HPS asks for a user input for the 32-bit
output PIO port,
axiom. Before asking for it, however, the HPS prints some key information to enhance usability, including: the
default axiom for the chosen L-System, the rule-making characters in that L-System, which characters enable
drawing for that L-System, and finally, all available characters for use in the axiom. This information allows
the user to either input the default axiom (without having to look it up), or create their own unique axiom.
iterations
: This 4-bit output PIO port holds the user inputted value for the number of
iterations. Table 3
shows our results for the largest number of iterations for each curve that can be supported by our design.
Table 3: Largest number of iterations for each curve that can be drawn in our implementation.
length
: This 5-bit output PIO port holds the user inputted value for the length of each line.
This value’s
bounds are checked before moving on to the next user input. The maximum value of length is limited by the size
of the PIO port, so it must be less than 31. While that is the maximum allowed, the user must be cognizant
that lower values are often needed to project the curve within the graphable region of the screen. The minimum
length varies based on the chosen lsystem
. For L-Systems that rely on purely horizontal and
vertical lines,
the minimum length is 3, in order to properly visualize squares. For L-Systems that contain diagonal lines,
the minimum length is 7, in order to properly visualize the slope of the diagonals.
color
: This 8-bit output PIO port holds the user inputted value for the color of the L-System.
The HPS prints
out the available 8-colors that have been hardcoded. If the user input does not match any of the options, the
system defaults to white (0xff).
start_x
and start_y
: These are 10-bit output PIO ports that hold the initial
coordinates for graphing.
start_x
is the horizontal coordinate, and is bounded by the left and right sides of the screen (1
to 638).
start_y
is the vertical coordinate, and is bounded by the top and bottom of the screen (1 to
478).
reset
: this 1-bit output PIO port allows the reset
signal to be sent to the ARM.
Once the screen
is cleared,
the user can press any key to toggle this signal and begin the graphing process.
timer_counter
: This 32-bit input PIO Port receives this value, which was incremented during
every cycle of
the top level FSM (except during the states TOP_WAIT
and TOP_DONE
). This value is
then scaled from the number
of cycles of the 50MHz clock to a recognizable time in seconds, which is printed to the terminal screen.
Once the L-System has been graphed, the user is given the option to either zoom in, zoom out, pan to new
coordinates, or begin a new L-System. The first three options modify either the length
or the
starting
coordinates on the C-side of the program, and send the new values to the ARM, where the modified curve is
re-calculated and graphed again.
Testing
In the past we’ve suffered from biting off more than what we can chew, aka building a large chunk of the project before testing. This wasn’t a trap we’d be able to fall into this time, especially considering that this project is our own ideation—there’s no reference code to rely on when debugging. With this in mind, we worked through every module carefully and tested thoroughly along the way. We also initially focused our efforts on solely the dragon curve, as the 90 degree increments made for more straightforward graphing.
The first hurdle was the rules
module. There were multiple ways to structure it: one module per
L-System to house
all of its specific rules, one module per character that would select between rules for each L-System, one
all-encompassing module with every single rule for every single L-System. Ultimately we chose the last design,
as we realized that it only required a single input wire to be able to toggle between rule implementations for
different L-Systems. Once that had been decided, we implemented the module as a case statement outputting to a
statically-sized register. Though we knew we wanted to utilize M10Ks for their storage capabilities, we’ve had
bad experiences with them in the past and therefore wanted to hammer down our core functionality before
complicating it with reading/writing to memory.
We used ModelSim to verify the outputs and intermediate values. The output register was simple to verify as
it simply needed to be compared to the expected output string of the L-System rule. This result wasn’t being
sent byte-by-byte either, so there were no timing issues to consider. We did run into the issue of the
characters not being accepted by the case conditions and instead falling into the default case, a behavior we
noted through displaying the rule_done
signal (which, rather than being a single bit value to
toggle between
zero and one, was a two bit signal that was set to a different value in every case). This was solved by
defining the ASCII codes for each character as strict eight-bit localparams and constraining the module to
only accept a single character byte at a time.
create_system
was originally broken into two modules: create_system
and
process_string
. Much
like the
baseline code, these separate modules iterated over a string and performed multiple iterations on a string,
respectively. We decided to combine them into a single module once we began to tackle the issue of graphing
via the top level. It wouldn’t be feasible to send over the entire result string from
process_string
to the
top level because we wouldn’t know how large it was. Sending the result string byte by byte, with the top
level graphing at the same time as the computations, was a viable option, but it would be redundant for both
create_system
and process_string
to be handling strings byte by byte. Even more
critically, we needed to be
reading and writing to the M10Ks with every iteration, but we weren’t sure about how to have multiple modules
safely interact with the same M10Ks in parallel. Therefore, we combined the two, resulting in a rat’s nest of
FSM states and signal wires that we had to debug in ModelSim.
Before the M10Ks were in use, the result string of the overall system was limited to two bytes, hardly enough
to graph a robust L-System. While our code was operational with those limitations, there was no telling what
could happen once we allowed it to perform multiple iterations on a larger axiom. Writing to the M10Ks takes
multiple cycles per character since we have to write each byte to its own individual address, and reading from
the M10Ks requires at least three clock cycles as a buffer. ModelSim was extremely useful when debugging the
WRITE_M10K_A/B
, INCREMENT_A/B
, and READ_M10K
states because we could
not only see what data was being
read/written but also the address at which this was occurring. Being able to see these signals informed us
that we didn’t need a separate buffer state for reading, as well as what logic should go into the transitions
out of WRITE_M10K_A/B
.
Initially the axiom was hardcoded in the Verilog to not have to worry about how the terminal serial input was
being sent over the PIO ports to the FPGA. After we began to connect the C code with the Verilog, a few
confusing bugs arose. First was the issue of the order of characters in the axiom: when hardcoded, we treated
the axiom as a normal English string and processed it from left to right, but when looking at the output
signals in ModelSim we realized that they were flipped with respect to our expected values. Somehow the 32
bits of the axiom are reversed before the FPGA starts its calculations (we’re still unsure as to why this
happens, but attribute it potentially to the way memory mapping happens in the C code). In the interest of
time, we didn’t dig into the root of this problem but instead patched it by grabbing the least significant
byte of the axiom and shifting each byte out to the right, effectively reading the character string from right
to left. Another fun behavior to debug was the appearance of unexpected ones in our axiom
register as we
shifted out bytes. At first we thought it could be the way we were shifting, so we concatenated a zero byte to
a substring of the register, but that didn’t fix it. Upon closer inspection of the binary values, we realized
that the ENTER presses needed to send the serial input was being included in any input string shorter than
four bytes (the maximum length, as dictated by the PIO ports), and these enter presses didn’t have an ASCII
code of zero. Our solution was to initialize the character pointer for the axiom to four bytes of ASCII code
zero (equivalent to four spaces) so that if a byte wasn’t overwritten by the serial input, it wouldn’t affect
the register’s overall ability to be equal to zero (key in most of our transitions).
At the start of development, the modular nature of our system made it conducive to testing since all of the
signals were separated from each other and provided a good deal of information gain on their own. The
introduction of the val/rdy interface between the top level and rules.v
complicated our ability
to use
ModelSim for debugging purposes since the wire values were held for too long. In the context of our full
system, ModelSim was useful for debugging the small issues that arose in individual modules but wasn’t a
viable tool when the system required more than one iteration on the string.
This is when we pivoted to SignalTap as our primary debugging tool. Though it was disadvantaged in the sense that a full compile of the Verilog was needed before SIgnalTap would read the corresponding signals, it was able to show the waveform behavior of the real system, including the Verilog and the C code. A caveat with this is the space required to store all of the hardware signals. Our design uses a good deal of memory through the M10Ks, so there’s a limit to how many data samples SignalTap can acquire. This was a detriment when testing multiple iterations, so we had to use shorter axioms and a smaller number of iterations. The discrepancies in string generation ended up stemming from smaller issues within the modules (such as the ASCII code problem) but not being able to see all of the clock cycles for more than four or five iterations made debugging more difficult.
This was especially relevant for M10K debugging, as there was a time when it seemed as though the string
being written to memory was much longer than the expected string, a problem that was exacerbated in higher
iterations. We weren't sure if it was a problem with our M10K states or M10K memory being cleared. After
noticing that this appeared on L-Systems after the first one (so not immediately after the entire board was
reset), we moved our CLEAR_M10KS
state from the end of the FSM (after the DONE
state) to the beginning.
Additionally, we realized that as we grew the M10Ks we didn’t adjust the last address to which the M10K needed
to be cleared, so CLEAR_M10KS
was only clearing part of memory (a pretty big problem since the
number of
iterations is incremented when an address holding a zero is reached in the M10Ks). This is a problem we had
throughout our usage of the M10Ks, as originally they were coded to have 256 memory address (meaning each
address needed eight bits) and when we expanded them to 512, 1024, and eventually 8192 we didn’t always adjust
the bit widths of the read/write memory addresses.
In regard to the top level, we were able to use the printouts to the VGA screen to debug our graphing;
particularly, this was helpful as we began to implement diagonal lines. We had decided that we would use
fixed-point signed multiplication to calculate our target pixel, and use incremented horizontal and vertical
pixels in an approximate ratio to draw lines that would appear diagonal. Our setup included separate states
for horizontal and vertical drawing, and they would alternate between each other based on a counter variable
(xcounter
and ycounter
). Other than a brief stint where we were accidently assigning
50-bits to one of our
32-bit fixed point variables (which caused diagonal lines to wrap all across the screen), this was mostly
smooth sailing. Eventually, we realized that our lines were not exactly the right slope for a 60° line, and
determined that we had swapped the bound check for xcounter
and ycounter
.
Overall, the focus on developing modular functionality made debugging a more approachable task. ModelSim and SignalTap also made looking closely at every single input, output, and intermediate value only slightly frustrating.
Future Work
There are several things that could be improved or expanded upon in future implementations of this project. First and foremost, we would refactor the top level coordinate calculations to include signed rather than unsigned variables. While graphing to the screen only requires positive pixel values, we found that calculations outside of the range of the screen would increase graphing robustness, as it would allow for the fractal to return to the screen if it curves back into the graphable region. Currently, our implementation supports calculations involving positive pixels outside the graphable region (i.e. the bottom and right side of the screen), but we encounter overflow for pixels in the negative direction (i.e. the left side and top of the screen).
Furthermore, in order to graph more robust diagonal lines, Bresenham’s line algorithm could be implemented. This algorithm would create smooth diagonal lines, and allow us to explore more L-systems based on different angle turns (such as 45° or 36°).
Figure 8: (A, top) Levy Curve, based on 45° angle turns. (B, bottom) Pentaplexity, based on 36°angle turns.
We would also be interested in exploring further user-customization of L-Systems. While the current system allows for customizable axioms, this could be expanded to allow users to create unique L-Systems with custom rules and customizable angle turns. This would provide ease for researchers (or other people interested in fractals) to develop new L-Systems.
Results
The following are some of the performance metrics of our system. Across the board, the FPGA exhibits a faster computation time than the baseline Python script. For the maximum number of iterations on each of our 6 L-systems, the worst speedup was 5.1x, and the best was 77.8x with an average of about 20.04x. This is due to a number of factors including the line length and growth rate.
Figure 9: Execution time per character vs. Number of Iterations. Comparing the performance between the Python and FPGA when graphing the Tessellated Triangle.
Let’s take a look at this graph above. This chart shows the execution time per character for the Python and FPGA calculating the tessellated triangle. It appears that for the FPGA, there is some initial overhead with the first few iterations. This overhead is nullified by the law of large numbers at higher iterations, allowing the FPGA to settle to around 200 us per character for both the length 7 and length 10 versions. However, this does not appear to be the case for the Python code. The length 7 code takes about 3.6 ms per character, while the length 10 code takes 17ms. Additionally, although this is merely an observation, generally, the less iterations it takes to fill up the M10K blocks, i.e. a higher growth rate, the larger the disparity between the Python and FPGA.
Figure 10: The Dragon Curve with 11 iterations (the largest possible in our implementation), with a length of 4, graphed in the center of the screen (300,300).
Figure 11: Execution time per character vs. Number of Iterations. Comparing the performance between the Python and FPGA when graphing the Dragon Curve
Figure 12: Sierpinski Arrowhead (1) with 5 iterations with a length of 10, graphed starting at (300,450).
Figure 13: Execution time per character vs. Number of Iterations. Comparing the performance between the Python and FPGA when graphing Sierpinski Arrowhead (1).
Figure 14: Cross with 4 iterations with a length of 10, graphed starting at (200,100).
Figure 15: Execution time per character vs. Number of Iterations. Comparing the performance between the Python and FPGA when graphing Cross.
Figure 16: Koch Snowflake with 6 iterations with a length of 7, graphed starting at (600,400).
Figure 17: Execution time per character vs. Number of Iterations. Comparing the performance between the Python and FPGA when graphing the Koch Snowflake.
Figure 18: Koch Curve with 6 iterations with a length of 7, graphed starting at (100,450).
Figure 19: Execution time per character vs. Number of Iterations. Comparing the performance between the Python and FPGA when graphing the Koch Curve.
Figure 20: Tessellated Triangle with 6 iterations (the largest possible in our implementation), with a length of 30, graphed in the center of the screen (300,300).
Figure 21: Execution time per character vs. Number of Iterations. Comparing the performance between the Python and FPGA when graphing the Tessellated Triangle.
The M10K blocks are a limiting factor in our generation of L-Systems. As shown in these graphs, not all L-Systems are equal in terms of how fast they grow. The maximum number of characters is 8192. However, due to exponential growth, some of the largest systems we can generate are significantly less than this threshold. Additionally, we are sometimes inhibited by the negative value computation difficulty mentioned earlier. Below is a chart of the L-Systems and their maximum size on our system. Generally, there are few differences between the FPGA for edges of length 7 and length 10. For the python, it takes longer to generate the length 10 versus length 7 images. Within the results, there is only one instance of the negative values causing the FPGA to have a lower number of iterations for the length 10 graph than the length 7 graph. In all other cases, the max number of iterations is set by the M10K blocks.
A full working demo of our entire system can be found below:
Conclusion
Our original idea was to be able to animate the drawing of a fractal to the VGA screen. The user would be able to input the number of iterations and the length of the step size. They would then choose between generating a dragon curve and a fractal plant. We also envisioned adding randomness to the L-system generation. This would have been done by selecting a random rule for the given character out of a set of applicable rules, or choosing no rule and passing the character. Finally, we wanted to be able to generate multiple L-Systems simultaneously. We thought it would be interesting to have a group of trees on the screen with their own unique attributes.
Our final system is a more user driven experience. The user is allowed to select from a variety of different
L-Systems, input a desired axiom, the number of iterations, the line length, line color, and the starting
position on the screen. These values are then passed to the FPGA. The rules
module is called on
the axiom,
transforming it one character at a time based on the desired L-System. This is then fed into one of the
dual_clock_ram
modules. The create_system
module will then ping-pong between the
dual_clock_ram
modules until
the iteration threshold is reached. The values in the M10K are then passed to the top level for graphing.
Based on the L-System, the top-level will determine the angle and coordinates of every line segment, write the
pixels to the VGA. Once this is complete, the system will be ready to accept another user input.
Unfortunately, the system is not perfect. We encountered limits on graphing in terms of the number of characters we could hold and the appearance of lines on the screen. Specifically, the M10K blocks have a finite size we cannot exceed. If we were to do this project again, we would more thoroughly investigate the size of the M10K blocks we could use in relation to access time. On the graphing end, we would implement signed registers rather than unsigned registers to mitigate the negative value issue. We would also experiment with different diagonal line drawing algorithms, at high iterations, some lines do not perfectly connect with one another. Additionally, as a feature we wish we had implemented, we would want to attempt user defined rules. This would allow a user the creative freedom to explore any L-system they desired within the character limit for rules.
In terms of safety, this project was done entirely remotely. The boards were stored in a safe, grounded environment. Students connected to them through a remote desktop to assigned computers. This was done for the health and safety of everyone involved, but also reduced the likelihood of physical damage to the boards and computers. Regarding intellectual property, the initial python code is an open source program we found. We translated this into C for our proof of concept running on the HPS. Furthermore, the base computer system module we wrote our L-system accelerator around is not ours. The base system was taken from Bruce Land’s ECE 5760 website. We modified the base as necessary for the accelerator.
Appendices
Team
Priya Kattappurath
ECE '20, MEng '21
Michael Rivera
ECE '20, MEng '20
Caitlin Stanton
ECE '20, MEng '21
Appendix A: Permissions
The group approves this report for inclusion on the course website. The group approves the video for inclusion on the course YouTube channel.
Appendix B: Work Distribution
All members contributed to the development, debugging, and demoing of the baseline and full-system designs.
- Introduction - Caitlin
- High-Level Design - Caitlin
- Baseline Design - Caitlin
- Full System Design - Priya, Caitlin, Michael
- Testing - Caitlin, Priya
- Future Work - Priya
- Results - Michael
- Conclusion - Michael
- Commented Code - Caitlin
- Diagrams - Priya
- Graphs - Caitlin, Michael
Appendix C: References
- Wikipedia: L-System
- Wolfram Alpha: Lindenmayer Systems
- What are Fractals?
- Coding a Dragon Curve
- Coding a Fractal Plant
- Wikipedia: Dragon curve
- Drawing nice L-Systems
- Drawing nice stochastic plants
- ECE 5760: VGA examples
- Recursive Modules
- Avalon Bus Master
Appendix D: Code
Here are the major files referenced in this lab report. The entirety of our code repository can be found in this Github repository.
rules.v
//////////////////////////////////////////////////////////////////////// // RULES.V // Responsible for computing and storing the L-System // (Priya Kattappurath, Michael Rivera, Caitlin Stanton) //////////////////////////////////////////////////////////////////////// // Dual Clock RAM // Equivalent to instantiating ~8 M10Ks module dual_clock_ram(q, d, write_address, read_address, we, clk1, clk2); output reg [7:0] q; input [7:0] d; input [12:0] write_address, read_address; input we, clk1, clk2; reg [12:0] read_address_reg; reg [7:0] mem [8191:0]; //8192 memory locations always @ (posedge clk1) begin if (we) mem[write_address] <= d; end always @ (posedge clk2) begin q <= mem[read_address_reg]; read_address_reg <= read_address; end endmodule // Fixed point signed multiplier // Works in 11.21 fixed point module signed_mult (out, a, b); //11.21 fixed point output signed [31:0] out; input signed [31:0] a; input signed [31:0] b; // intermediate full bit length wire signed [63:0] mult_out; assign mult_out = a * b; // select bits for 11.21 fixed point assign out = {mult_out[63], mult_out[51:21]}; //11.21 fixed point endmodule // Rule module // Declares rules for 7 L-Systems: Dragon curve, 2 Sierpinski arrowheads, 2 Koch curves, cross, tessellated triangle // (Equivalent to applyRule in lsystem.py) module rules(clk, reset, lsystem, rule_val, rule_prev, rule_result, rule_done); input clk, reset, rule_val; input [2:0] lsystem; input [7:0] rule_prev; //8 bit input, defined by ASCII code output [79:0] rule_result; //up to 80 bit output output [1:0] rule_done; reg [7:0] rule_prev_reg; reg [1:0] rule_done_reg; reg [79:0] rule_result_reg; reg [2:0] rule_state_reg; assign rule_result = rule_result_reg; assign rule_done = rule_done_reg; //ASCII definitions of alphabet localparam [7:0] X = 8'b01011000; localparam [7:0] Y = 8'b01011001; localparam [7:0] plus = 8'd43; //+ localparam [7:0] minus = 8'd45; //- localparam [7:0] F = 8'd70; localparam [7:0] A = 8'd65; //FSM states localparam RULE_UPDATE = 3'b0; localparam DRAGON_TRANSLATE = 3'b1; localparam TRIANGLE_TRANSLATE = 3'd2; localparam ARROW_TRANSLATE = 3'd3; localparam KOCH_TRANSLATE = 3'd4; localparam SNOWFLAKE_TRANSLATE = 3'd5; localparam CROSS_TRANSLATE = 3'd6; localparam TESSELLATE_TRANSLATE = 3'd7; always @ (posedge clk) begin //reset state if (reset) begin rule_result_reg <= 80'b0; rule_done_reg <= 2'b0; rule_prev_reg <= rule_prev; rule_state_reg <= RULE_UPDATE; end else begin case (rule_state_reg) //waits for new character to apply rule to RULE_UPDATE: begin rule_prev_reg <= rule_prev; rule_result_reg <= rule_result_reg; if (rule_val) begin //if new character has arrived rule_done_reg <= 2'b0; //choose which state to transition to based on inputted lsystem if (lsystem == 3'b0) begin rule_state_reg <= DRAGON_TRANSLATE; end else if (lsystem == 3'b1) begin rule_state_reg <= TRIANGLE_TRANSLATE; end else if (lsystem == 3'd2) begin rule_state_reg <= KOCH_TRANSLATE; end else if (lsystem == 3'd3) begin rule_state_reg <= ARROW_TRANSLATE; end else if (lsystem == 3'd4) begin rule_state_reg <= SNOWFLAKE_TRANSLATE; end else if (lsystem == 3'd5) begin rule_state_reg <= CROSS_TRANSLATE; end else if (lsystem == 3'd6) begin rule_state_reg <= TESSELLATE_TRANSLATE; end else begin rule_state_reg <= RULE_UPDATE; end end else begin //stays in this state until a valid character is received rule_done_reg <= rule_done_reg; rule_state_reg <= RULE_UPDATE; end end //RULE APPLICATION STATES //If the inputted character matches a rule, output that string //Otherwise output the character //Buffered by zeroes to be 10 bytes (80 bits) //Dragon curve DRAGON_TRANSLATE: begin rule_prev_reg <= rule_prev_reg; if (rule_prev_reg == X) begin rule_done_reg <= 2'b1; rule_result_reg <= {X, plus, Y, F, plus, 40'b0}; //"X+YF+" end else if (rule_prev_reg == Y) begin rule_done_reg <= 2'd2; rule_result_reg <= {minus, F, X, minus, Y, 40'b0}; //"-FX-Y" end else begin rule_done_reg <= 2'd3; rule_result_reg <= {rule_prev_reg, 72'b0}; end rule_state_reg <= RULE_UPDATE; end //Sierpsinki arrowhead (1) TRIANGLE_TRANSLATE: begin rule_prev_reg <= rule_prev_reg; if (rule_prev_reg == X) begin rule_done_reg <= 2'b1; rule_result_reg <= {Y, F, plus, X, F, plus, Y, 24'b0}; //"YF+XF+Y" end else if (rule_prev_reg == Y) begin rule_done_reg <= 2'd2; rule_result_reg <= {X, F, minus, Y, F, minus, X, 24'b0}; //"XF-YF-X" end else begin rule_done_reg <= 2'd3; rule_result_reg <= {rule_prev_reg, 72'b0}; end rule_state_reg <= RULE_UPDATE; end //Sierpinski arrowhead (2) ARROW_TRANSLATE: begin rule_prev_reg <= rule_prev_reg; if (rule_prev_reg == A) begin rule_done_reg <= 2'b1; rule_result_reg <= {plus, F, minus, A, minus, F, plus, 24'b0}; //"+F-A-F+" end else if (rule_prev_reg == F) begin rule_done_reg <= 2'd2; rule_result_reg <= {minus, A, plus, F, plus, A, minus, 24'b0}; //"-A+F+A-" end else begin rule_done_reg <= 2'd3; rule_result_reg <= {rule_prev_reg, 72'b0}; end rule_state_reg <= RULE_UPDATE; end //Koch curve KOCH_TRANSLATE: begin rule_prev_reg <= rule_prev_reg; if (rule_prev_reg == F) begin rule_done_reg <= 2'b1; rule_result_reg <= {F, plus, F, minus, minus, F, plus, F, 16'b0}; //"F+F--F+F" end else begin rule_done_reg <= 2'd3; rule_result_reg <= {rule_prev_reg, 72'b0}; end rule_state_reg <= RULE_UPDATE; end //Koch snowflake SNOWFLAKE_TRANSLATE: begin rule_prev_reg <= rule_prev_reg; if (rule_prev_reg == F) begin rule_done_reg <= 2'b1; rule_result_reg <= {F, minus, F, plus, plus, F, minus, F, 16'b0}; //"F-F++F-F" end else begin rule_done_reg <= 2'd3; rule_result_reg <= {rule_prev_reg, 72'b0}; end rule_state_reg <= RULE_UPDATE; end //Cross CROSS_TRANSLATE: begin rule_prev_reg <= rule_prev_reg; if (rule_prev_reg == F) begin rule_done_reg <= 2'b1; rule_result_reg <= {F, plus, F, F, plus, plus, F, plus, F, 8'b0}; //"F+FF++F+F" end else begin rule_done_reg <= 2'd3; rule_result_reg <= {rule_prev_reg, 72'b0}; end rule_state_reg <= RULE_UPDATE; end //Tessellated triangle TESSELLATE_TRANSLATE: begin rule_prev_reg <= rule_prev_reg; if (rule_prev_reg == F) begin rule_done_reg <= 2'b1; rule_result_reg <= {F, minus, F, plus, F, 40'b0}; //"F-F+F" end else begin rule_done_reg <= 2'd3; rule_result_reg <= {rule_prev_reg, 72'b0}; end rule_state_reg <= RULE_UPDATE; end endcase end end endmodule // Create System module // Performs given number of iterations on the L-System string // Stores strings in memory // Sends result of each iteration byte by byte to the top-level // (Equivalent to createSystem and processString in lsystem.py) module create_system(clk, reset, top_rdy, top_graphing, lsystem, iterations, system_input_string, system_output_string, system_val, system_done, iterations_counter); //equivalent to createSystem() in translated_python.c input clk, reset, top_rdy; input [1:0] top_graphing; input [31:0] system_input_string; input [2:0] lsystem; input [3:0] iterations; output [7:0] system_output_string; output system_val; output system_done; output [3:0] iterations_counter; //ASCII definitions of alphabet localparam [7:0] X = 8'b01011000; localparam [7:0] Y = 8'b01011001; localparam [7:0] plus = 8'd43; //+ localparam [7:0] minus = 8'd45; //- localparam [7:0] F = 8'd70; localparam [7:0] A = 8'd65; localparam [7:0] open_bracket = 8'd91; //[ localparam [7:0] closing_bracket = 8'd93; //] //dual_clock_ram instantiations (M10K memory) wire [7:0] a_q; wire [7:0] a_d; reg[7:0] a_d_reg; wire [12:0] a_write_address; reg [12:0] a_write_address_reg; wire [12:0] a_read_address; reg [12:0] a_read_address_reg; wire a_we; reg a_we_reg; assign a_d = a_d_reg; assign a_write_address = a_write_address_reg; assign a_read_address = a_read_address_reg; assign a_we = a_we_reg; dual_clock_ram a( .q (a_q), .d (a_d), .write_address (a_write_address), .read_address (a_read_address), .we (a_we), .clk1 (clk), .clk2 (clk) ); wire [7:0] b_q; wire [7:0] b_d; reg[7:0] b_d_reg; wire [12:0] b_write_address; reg [12:0] b_write_address_reg; wire [12:0] b_read_address; reg [12:0] b_read_address_reg; wire b_we; reg b_we_reg; assign b_d = b_d_reg; assign b_write_address = b_write_address_reg; assign b_read_address = b_read_address_reg; assign b_we = b_we_reg; dual_clock_ram b( .q (b_q), .d (b_d), .write_address (b_write_address), .read_address (b_read_address), .we (b_we), .clk1 (clk), .clk2 (clk) ); //rules module reg rule_val_reg; wire [79:0] rule_result; reg [79:0] rule_result_reg; wire [1:0] rule_done; wire rule_val; reg [7:0] input_char_reg; wire [7:0] input_char; assign rule_val = rule_val_reg; assign input_char = input_char_reg; rules rule( .clk (clk), .reset (reset), .lsystem (lsystem), .rule_prev (input_char), .rule_result (rule_result), .rule_done (rule_done), .rule_val (rule_val) ); //FSM states localparam RESET_SYSTEM = 4'd0; localparam CLEAR_M10KS = 4'd1; localparam GET_CHAR = 4'd2; localparam READ_M10K = 4'd3; localparam COMPUTE_DRAGON = 4'd4; localparam WRITE_M10K_A = 4'd5; localparam INCREMENT_WRITE_A = 4'd6; localparam WRITE_M10K_B = 4'd7; localparam INCREMENT_WRITE_B = 4'd8; localparam NEXT_BYTE = 4'd9; localparam INCREMENT_ITER = 4'd10; localparam ZERO_READ = 4'd11; localparam DONE = 4'd12; reg [3:0] state_reg; reg [7:0] system_output_string_reg; reg [7:0] output_counter; reg [3:0] iterations_counter_reg; reg system_val_reg; reg system_done_reg; reg [31:0] axiom; reg [1:0] read_counter; assign system_output_string = system_output_string_reg; assign system_val = system_val_reg; assign system_done = system_done_reg; assign iterations_counter = iterations_counter_reg; always @ (posedge clk) begin //reset state, board startup if (reset) begin iterations_counter_reg <= 4'b0; input_char_reg <= 8'b0; a_d_reg <= 8'b0; a_write_address_reg <= 13'b0; a_read_address_reg <= 13'b0; a_we_reg <= 1'b0; b_d_reg <= 8'b0; b_write_address_reg <= 13'b0; b_read_address_reg <= 13'b0; b_we_reg <= 1'b0; axiom <= system_input_string; rule_result_reg <= rule_result; read_counter <= 1'b0; system_val_reg <= 1'b0; system_output_string_reg <= 8'b0; system_done_reg <= 1'b0; state_reg <= CLEAR_M10KS; end else begin case (state_reg) //reset state, new L-System RESET_SYSTEM: begin if (reset) begin iterations_counter_reg <= 4'b0; input_char_reg <= 8'b0; a_d_reg <= 8'b0; a_write_address_reg <= 13'b0; a_read_address_reg <= 13'b0; a_we_reg <= 1'b0; b_d_reg <= 8'b0; b_write_address_reg <= 13'b0; b_read_address_reg <= 13'b0; b_we_reg <= 1'b0; axiom <= system_input_string; rule_result_reg <= rule_result; read_counter <= 1'b0; system_val_reg <= 1'b0; system_output_string_reg <= 8'b0; system_done_reg <= 1'b0; state_reg <= CLEAR_M10KS; end else begin state_reg <= RESET_SYSTEM; end end //sets all M10K data to zero CLEAR_M10KS: begin a_d_reg <= 8'b0; b_d_reg <= 8'b0; if (a_write_address_reg < 13'h1FFF) begin a_we_reg <= 1'b1; a_write_address_reg <= a_write_address_reg + 13'b1; end if (b_write_address_reg < 13'h1FFF) begin b_we_reg <= 1'b1; b_write_address_reg <= b_write_address_reg + 13'b1; end if (a_write_address_reg == 13'h1FFF && b_write_address_reg == 13'h1FFF) begin a_we_reg <= 1'b0; b_we_reg <= 1'b0; a_write_address_reg <= 13'b0; b_write_address_reg <= 13'b0; state_reg <= GET_CHAR; end else begin state_reg <= CLEAR_M10KS; end end //grabs the next character in the string to process, either from the axiom or M10K memory GET_CHAR: begin a_d_reg <= 8'b0; a_write_address_reg <= a_write_address_reg; a_read_address_reg <= a_read_address_reg; a_we_reg <= 1'b0; b_d_reg <= 8'b0; b_write_address_reg <= b_write_address_reg; b_read_address_reg <= b_read_address_reg; b_we_reg <= 1'b0; rule_result_reg <= rule_result; read_counter <= 1'b0; system_val_reg <= 1'b0; system_output_string_reg <= system_output_string_reg; system_done_reg <= 1'b0; if (top_rdy == 1'b0) begin state_reg <= GET_CHAR; end else begin if (iterations_counter_reg == 4'b0) begin input_char_reg <= axiom[7:0]; state_reg <= COMPUTE_DRAGON; end else if (iterations_counter_reg < iterations) begin state_reg <= READ_M10K; end else begin state_reg <= DONE; end end end //Reads from either of the M10Ks depending on the iteration count //Requires 3 clock cycles with dual_port_ram READ_M10K: begin a_d_reg <= 8'b0; a_write_address_reg <= a_write_address_reg; a_read_address_reg <= a_read_address_reg; a_we_reg <= 1'b0; b_d_reg <= 8'b0; b_write_address_reg <= b_write_address_reg; b_read_address_reg <= b_read_address_reg; b_we_reg <= 1'b0; rule_result_reg <= rule_result; read_counter <= read_counter; system_val_reg <= 1'b0; system_done_reg <= 1'b0; if (iterations_counter_reg[0] == 1'b1) begin input_char_reg <= a_q; end else begin input_char_reg <= b_q; end state_reg <= COMPUTE_DRAGON; if (iterations_counter_reg[0] == 1'b0) begin b_read_address_reg <= b_read_address_reg + 13'b1; end else begin a_read_address_reg <= a_read_address_reg + 13'b1; end end //Sends character to the rules module to be processed COMPUTE_DRAGON: begin a_d_reg <= 8'b0; a_write_address_reg <= a_write_address_reg; a_read_address_reg <= a_read_address_reg; a_we_reg <= 1'b0; b_d_reg <= 8'b0; b_write_address_reg <= b_write_address_reg; b_read_address_reg <= b_read_address_reg; b_we_reg <= 1'b0; rule_result_reg <= rule_result; rule_val_reg <= 1'b1; read_counter <= 1'b0; system_val_reg <= 1'b0; system_output_string_reg <= system_output_string_reg; system_done_reg <= 1'b0; if (rule_done > 2'b0) begin if (iterations_counter_reg[0] == 1'b0) begin state_reg <= WRITE_M10K_A; end else begin state_reg <= WRITE_M10K_B; end end else begin state_reg <= COMPUTE_DRAGON; end end //Writes to M10K A memory on even iterations //Determines if entire result from rules has been written to memory //Sends result from rules byte by byte to top level WRITE_M10K_A: begin a_d_reg <= rule_result_reg[79:72]; a_write_address_reg <= a_write_address_reg; a_read_address_reg <= a_read_address_reg; rule_result_reg <= rule_result_reg; read_counter <= 1'b0; system_done_reg <= 1'b0; system_output_string_reg <= rule_result_reg[79:72]; if (rule_result_reg == 80'b0) begin a_we_reg <= 1'b0; system_val_reg <= 1'b0; state_reg <= NEXT_BYTE; end else begin a_we_reg <= 1'b1; if (top_rdy == 1'b1) begin system_val_reg <= 1'b1; state_reg <= INCREMENT_WRITE_A; end else begin system_val_reg <= 1'b0; state_reg <= WRITE_M10K_A; end end end //Increments write address for M10K A to write entire result to memory INCREMENT_WRITE_A: begin system_output_string_reg <= system_output_string_reg; a_write_address_reg <= a_write_address_reg + 13'b1; a_we_reg <= 1'b0; rule_result_reg <= rule_result_reg << 8; read_counter <= 1'b0; system_val_reg <= 1'b0; system_done_reg <= 1'b0; state_reg <= WRITE_M10K_A; end //Writes to M10K B memory on odd iterations //Determines if entire result from rules has been written to memory //Sends result from rules byte by byte to top level WRITE_M10K_B: begin b_d_reg <= rule_result_reg[79:72]; b_write_address_reg <= b_write_address_reg; b_read_address_reg <= b_read_address_reg; rule_result_reg <= rule_result_reg; read_counter <= 1'b0; system_done_reg <= 1'b0; system_output_string_reg <= rule_result_reg[79:72]; if (rule_result_reg == 80'b0) begin b_we_reg <= 1'b0; system_val_reg <= 1'b0; state_reg <= NEXT_BYTE; end else begin b_we_reg <= 1'b1; if (top_rdy == 1'b1) begin system_val_reg <= 1'b1; state_reg <= INCREMENT_WRITE_B; end else begin system_val_reg <= 1'b0; state_reg <= WRITE_M10K_B; end end end //Increments write address for M10K B to write entire result to memory INCREMENT_WRITE_B: begin system_output_string_reg <= system_output_string_reg; b_write_address_reg <= b_write_address_reg + 13'b1; b_we_reg <= 1'b0; rule_result_reg <= rule_result_reg << 8; read_counter <= 1'b0; system_val_reg <= 1'b0; system_done_reg <= 1'b0; state_reg <= WRITE_M10K_B; end //Shifts the axiom to indicate a character has been fully processed and written to memory NEXT_BYTE: begin axiom <= axiom >> 8; read_counter <= 1'b0; system_val_reg <= 1'b0; system_done_reg <= 1'b0; system_output_string_reg <= system_output_string_reg; state_reg <= INCREMENT_ITER; end //Increments the iterations_counter for the L-System //This is done: // - when there's nothing left of axiom and we're on the zeroth iteration // - a zero byte is read from M10K A on an odd iteration // - a zero byte is read from M10K B on an even iteration INCREMENT_ITER: begin read_counter <= 1'b0; system_val_reg <= 1'b0; system_done_reg <= 1'b0; if (system_output_string_reg == 8'b0 && axiom == 32'b0 && top_graphing == 2'd2) begin if ( iterations_counter_reg == 4'b0 || (iterations_counter_reg[0] == 1'b1 && a_q == 8'b0) || (iterations_counter_reg[0] == 1'b0 && b_q == 8'b0)) begin a_write_address_reg <= 13'b0; b_write_address_reg <= 13'b0; iterations_counter_reg <= iterations_counter_reg + 4'b1; state_reg <= ZERO_READ; end else begin state_reg <= GET_CHAR; end end else begin state_reg <= GET_CHAR; end end //Zeroes out the read addresses so that they'll start at the top of the M10K in the next iteration ZERO_READ: begin if (iterations_counter[0] == 1'b0) begin a_read_address_reg <= a_read_address_reg; b_read_address_reg <= 13'b0; end else begin a_read_address_reg <= 13'b0; b_read_address_reg <= b_read_address_reg; end state_reg <= GET_CHAR; end //Designated number of iterations has been performed on the L-System DONE: begin system_done_reg <= 1'b1; a_write_address_reg <= 13'b0; b_write_address_reg <= 13'b0; a_read_address_reg <= 13'b0; b_read_address_reg <= 13'b0; state_reg <= RESET_SYSTEM; end endcase end end endmodule
DE1_SoC_Computer.v
//////////////////////////////////////////////////////////////////////// // DE1_SOC_COMPUTER.V // Top level program // Responsible for receiving user input from the HPS and graphing // (Priya Kattappurath, Michael Rivera, Caitlin Stanton) //////////////////////////////////////////////////////////////////////// module DE1_SoC_Computer ( //////////////////////////////////// // FPGA Pins //////////////////////////////////// // Clock pins CLOCK_50, CLOCK2_50, CLOCK3_50, CLOCK4_50, // ADC ADC_CS_N, ADC_DIN, ADC_DOUT, ADC_SCLK, // Audio AUD_ADCDAT, AUD_ADCLRCK, AUD_BCLK, AUD_DACDAT, AUD_DACLRCK, AUD_XCK, // SDRAM DRAM_ADDR, DRAM_BA, DRAM_CAS_N, DRAM_CKE, DRAM_CLK, DRAM_CS_N, DRAM_DQ, DRAM_LDQM, DRAM_RAS_N, DRAM_UDQM, DRAM_WE_N, // I2C Bus for Configuration of the Audio and Video-In Chips FPGA_I2C_SCLK, FPGA_I2C_SDAT, // 40-Pin Headers GPIO_0, GPIO_1, // Seven Segment Displays HEX0, HEX1, HEX2, HEX3, HEX4, HEX5, // IR IRDA_RXD, IRDA_TXD, // Pushbuttons KEY, // LEDs LEDR, // PS2 Ports PS2_CLK, PS2_DAT, PS2_CLK2, PS2_DAT2, // Slider Switches SW, // Video-In TD_CLK27, TD_DATA, TD_HS, TD_RESET_N, TD_VS, // VGA VGA_B, VGA_BLANK_N, VGA_CLK, VGA_G, VGA_HS, VGA_R, VGA_SYNC_N, VGA_VS, //////////////////////////////////// // HPS Pins //////////////////////////////////// // DDR3 SDRAM HPS_DDR3_ADDR, HPS_DDR3_BA, HPS_DDR3_CAS_N, HPS_DDR3_CKE, HPS_DDR3_CK_N, HPS_DDR3_CK_P, HPS_DDR3_CS_N, HPS_DDR3_DM, HPS_DDR3_DQ, HPS_DDR3_DQS_N, HPS_DDR3_DQS_P, HPS_DDR3_ODT, HPS_DDR3_RAS_N, HPS_DDR3_RESET_N, HPS_DDR3_RZQ, HPS_DDR3_WE_N, // Ethernet HPS_ENET_GTX_CLK, HPS_ENET_INT_N, HPS_ENET_MDC, HPS_ENET_MDIO, HPS_ENET_RX_CLK, HPS_ENET_RX_DATA, HPS_ENET_RX_DV, HPS_ENET_TX_DATA, HPS_ENET_TX_EN, // Flash HPS_FLASH_DATA, HPS_FLASH_DCLK, HPS_FLASH_NCSO, // Accelerometer HPS_GSENSOR_INT, // General Purpose I/O HPS_GPIO, // I2C HPS_I2C_CONTROL, HPS_I2C1_SCLK, HPS_I2C1_SDAT, HPS_I2C2_SCLK, HPS_I2C2_SDAT, // Pushbutton HPS_KEY, // LED HPS_LED, // SD Card HPS_SD_CLK, HPS_SD_CMD, HPS_SD_DATA, // SPI HPS_SPIM_CLK, HPS_SPIM_MISO, HPS_SPIM_MOSI, HPS_SPIM_SS, // UART HPS_UART_RX, HPS_UART_TX, // USB HPS_CONV_USB_N, HPS_USB_CLKOUT, HPS_USB_DATA, HPS_USB_DIR, HPS_USB_NXT, HPS_USB_STP ); //======================================================= // PARAMETER declarations //======================================================= //======================================================= // PORT declarations //======================================================= //////////////////////////////////// // FPGA Pins //////////////////////////////////// // Clock pins input CLOCK_50; input CLOCK2_50; input CLOCK3_50; input CLOCK4_50; // ADC inout ADC_CS_N; output ADC_DIN; input ADC_DOUT; output ADC_SCLK; // Audio input AUD_ADCDAT; inout AUD_ADCLRCK; inout AUD_BCLK; output AUD_DACDAT; inout AUD_DACLRCK; output AUD_XCK; // SDRAM output [12: 0] DRAM_ADDR; output [ 1: 0] DRAM_BA; output DRAM_CAS_N; output DRAM_CKE; output DRAM_CLK; output DRAM_CS_N; inout [15: 0] DRAM_DQ; output DRAM_LDQM; output DRAM_RAS_N; output DRAM_UDQM; output DRAM_WE_N; // I2C Bus for Configuration of the Audio and Video-In Chips output FPGA_I2C_SCLK; inout FPGA_I2C_SDAT; // 40-pin headers inout [35: 0] GPIO_0; inout [35: 0] GPIO_1; // Seven Segment Displays output [ 6: 0] HEX0; output [ 6: 0] HEX1; output [ 6: 0] HEX2; output [ 6: 0] HEX3; output [ 6: 0] HEX4; output [ 6: 0] HEX5; // IR input IRDA_RXD; output IRDA_TXD; // Pushbuttons input [ 3: 0] KEY; // LEDs output [ 9: 0] LEDR; // PS2 Ports inout PS2_CLK; inout PS2_DAT; inout PS2_CLK2; inout PS2_DAT2; // Slider Switches input [ 9: 0] SW; // Video-In input TD_CLK27; input [ 7: 0] TD_DATA; input TD_HS; output TD_RESET_N; input TD_VS; // VGA output [ 7: 0] VGA_B; output VGA_BLANK_N; output VGA_CLK; output [ 7: 0] VGA_G; output VGA_HS; output [ 7: 0] VGA_R; output VGA_SYNC_N; output VGA_VS; //////////////////////////////////// // HPS Pins //////////////////////////////////// // DDR3 SDRAM output [14: 0] HPS_DDR3_ADDR; output [ 2: 0] HPS_DDR3_BA; output HPS_DDR3_CAS_N; output HPS_DDR3_CKE; output HPS_DDR3_CK_N; output HPS_DDR3_CK_P; output HPS_DDR3_CS_N; output [ 3: 0] HPS_DDR3_DM; inout [31: 0] HPS_DDR3_DQ; inout [ 3: 0] HPS_DDR3_DQS_N; inout [ 3: 0] HPS_DDR3_DQS_P; output HPS_DDR3_ODT; output HPS_DDR3_RAS_N; output HPS_DDR3_RESET_N; input HPS_DDR3_RZQ; output HPS_DDR3_WE_N; // Ethernet output HPS_ENET_GTX_CLK; inout HPS_ENET_INT_N; output HPS_ENET_MDC; inout HPS_ENET_MDIO; input HPS_ENET_RX_CLK; input [ 3: 0] HPS_ENET_RX_DATA; input HPS_ENET_RX_DV; output [ 3: 0] HPS_ENET_TX_DATA; output HPS_ENET_TX_EN; // Flash inout [ 3: 0] HPS_FLASH_DATA; output HPS_FLASH_DCLK; output HPS_FLASH_NCSO; // Accelerometer inout HPS_GSENSOR_INT; // General Purpose I/O inout [ 1: 0] HPS_GPIO; // I2C inout HPS_I2C_CONTROL; inout HPS_I2C1_SCLK; inout HPS_I2C1_SDAT; inout HPS_I2C2_SCLK; inout HPS_I2C2_SDAT; // Pushbutton inout HPS_KEY; // LED inout HPS_LED; // SD Card output HPS_SD_CLK; inout HPS_SD_CMD; inout [ 3: 0] HPS_SD_DATA; // SPI output HPS_SPIM_CLK; input HPS_SPIM_MISO; output HPS_SPIM_MOSI; inout HPS_SPIM_SS; // UART input HPS_UART_RX; output HPS_UART_TX; // USB inout HPS_CONV_USB_N; input HPS_USB_CLKOUT; inout [ 7: 0] HPS_USB_DATA; input HPS_USB_DIR; input HPS_USB_NXT; output HPS_USB_STP; //======================================================= // REG/WIRE declarations //======================================================= wire [15: 0] hex3_hex0; //wire [15: 0] hex5_hex4; //assign HEX0 = ~hex3_hex0[ 6: 0]; // hex3_hex0[ 6: 0]; //assign HEX1 = ~hex3_hex0[14: 8]; //assign HEX2 = ~hex3_hex0[22:16]; //assign HEX3 = ~hex3_hex0[30:24]; assign HEX4 = 7'b1111111; assign HEX5 = 7'b1111111; HexDigit Digit0(HEX0, hex3_hex0[3:0]); HexDigit Digit1(HEX1, hex3_hex0[7:4]); HexDigit Digit2(HEX2, hex3_hex0[11:8]); HexDigit Digit3(HEX3, hex3_hex0[15:12]); //======================================================= // SRAM/VGA state machine //======================================================= // --Check for sram address=0 nonzero, which means that // HPS wrote some new data. // // --Read sram address 1 and 2 to get x1, y1 // left-most x, upper-most y // --Read sram address 3 and 4 to get x2, y2 // right-most x, lower-most y // --Read sram address 5 to get color // --write a rectangle to VGA // // --clear sram address=0 to signal HPS //======================================================= // Controls for Qsys sram slave exported in system module //======================================================= wire [31:0] sram_readdata ; reg [31:0] data_buffer, sram_writedata ; reg [7:0] sram_address; reg sram_write ; wire sram_clken = 1'b1; wire sram_chipselect = 1'b1; reg [7:0] state ; // rectangle corners reg [9:0] x1, y1, x2, y2 ; reg [31:0] timer ; // may need to throttle write-rate //======================================================= // Controls for VGA memory //======================================================= wire [31:0] vga_out_base_address = 32'h0000_0000 ; // vga base addr reg [7:0] vga_sram_writedata ; reg [31:0] vga_sram_address; reg vga_sram_write ; wire vga_sram_clken = 1'b1; wire vga_sram_chipselect = 1'b1; //======================================================= // pixel address is reg [9:0] vga_x_cood, vga_y_cood ; reg [7:0] pixel_color ; //======================================================= // L SYSTEM MODULE //======================================================= //ASCII alphabet localparam [7:0] X = 8'b01011000; localparam [7:0] Y = 8'b01011001; localparam [7:0] plus = 8'd43; //+ localparam [7:0] minus = 8'd45; //- localparam [7:0] F = 8'd70; localparam [7:0] A = 8'd65; localparam [7:0] open_bracket = 8'd91; //[ localparam [7:0] closing_bracket = 8'd93; //] wire [7:0] system_output_string; wire system_done; wire reset; reg [7:0] system_output_string_reg; reg [7:0] top_char_reg; reg top_done_reg; wire top_rdy; reg top_rdy_reg; assign top_rdy = top_rdy_reg; wire system_val; wire [1:0] top_graphing; reg [1:0] top_graphing_reg; assign top_graphing = top_graphing_reg; reg [31:0] timer_counter_reg; wire [3:0] iterations_counter; // PIO port connections wire [7:0] top_char; assign top_char = top_char_reg; wire [31:0] axiom; wire [3:0] iterations; wire [4:0] length; wire [2:0] lsystem; wire [9:0] start_x, start_y; wire [31:0] timer_counter; assign timer_counter = timer_counter_reg; wire [7:0] color; //create_system module from rules.v // Calculates the L-System string to be graphed create_system system( .clk (CLOCK_50), .reset (reset), .top_rdy (top_rdy), .top_graphing (top_graphing), .lsystem (lsystem), .iterations (iterations), .system_input_string (axiom), .system_output_string (system_output_string), .system_val (system_val), .system_done (system_done), .iterations_counter (iterations_counter) ); //Math for graphing diagonally wire [31:0] root_three = {1'b0,10'b1,21'b1011_1011_0110_0111_1011_0}; wire [31:0] y_triangle_length; //length*(sqrt3)/2 wire [31:0] x_triangle_length = {1'b0, length >> 1, 21'b0}; //length/2 signed_mult triangle_mult( .out (y_triangle_length), .a (root_three), .b (x_triangle_length) ); //FSM states localparam TOP_RESET = 4'd0; localparam TOP_WAIT = 4'd1; localparam TOP_SETUP = 4'd2; localparam TOP_TARGET = 4'd3; localparam TOP_BOUND_CHECK = 4'd4; localparam TOP_GRAPH_DRAGON = 4'd5; localparam TOP_GRAPH_TRIANGLE_X = 4'd6; localparam TOP_GRAPH_TRIANGLE_Y = 4'd7; localparam TOP_SHIFT = 4'd8; localparam TOP_DONE = 4'd9; //Internal registers reg [3:0] top_state_reg; reg [31:0] x_reg; reg [31:0] y_reg; reg [31:0] targetx_reg; //11.21 fixed point (signed, but won't use sign bit) reg [31:0] targety_reg; //11.21 fixed point (signed, but won't use sign bit) reg [9:0] angle_reg; reg [9:0] angle_increment; //90 or 60, depending on L-System reg [4:0] length_reg; reg [3:0] triangle_x_counter; reg [3:0] triangle_y_counter; always @ (posedge CLOCK_50) begin //Reset state, board startup //Grabs user input from HPS if (reset) begin timer_counter_reg <= 32'b0; system_output_string_reg <= 8'b0; top_done_reg <= 1'b0; top_graphing_reg <= 2'b0; top_rdy_reg <= 1'b0; vga_sram_write <= 1'b0; vga_sram_address <= vga_out_base_address; vga_sram_writedata <= 8'h00; x_reg <= {22'b0, start_x}; y_reg <= {22'b0, start_y}; targetx_reg <= 32'd0; targety_reg <= 32'd0; triangle_x_counter <= 3'b0; triangle_y_counter <= 3'b0; angle_reg <= 10'b0; if (lsystem == 3'b0 || lsystem == 3'd5) begin angle_increment <= 10'd90; end else if (lsystem == 3'b1 || lsystem == 3'd2 || lsystem == 3'd3 || lsystem == 3'd4) begin angle_increment <= 10'd60; end else if (lsystem == 3'd6) begin angle_increment <= 10'd120; end else begin angle_increment <= 10'd0; end length_reg <= length; top_state_reg <= TOP_WAIT; end else begin case (top_state_reg) //Reset state, new L-System //Grabs user input from HPS TOP_RESET: begin timer_counter_reg <= 32'b0; system_output_string_reg <= 8'b0; top_graphing_reg <= 2'b0; top_rdy_reg <= 1'b0; top_done_reg <= 1'b0; vga_sram_write <= 1'b0; vga_sram_address <= vga_out_base_address; vga_sram_writedata <= 8'h00; x_reg <= {22'b0, start_x}; y_reg <= {22'b0, start_y}; targetx_reg <= 32'd0; targety_reg <= 32'd0; angle_reg <= 10'b0; if (lsystem == 3'b0 || lsystem == 3'd5) begin angle_increment <= 10'd90; end else if (lsystem == 3'b1 || lsystem == 3'd2 || lsystem == 3'd3 || lsystem == 3'd4) begin angle_increment <= 10'd60; end else if (lsystem == 3'd6) begin angle_increment <= 10'd120; end else begin angle_increment <= 10'd0; end length_reg <= length; if (reset) begin top_state_reg <= TOP_RESET; end else begin top_state_reg <= TOP_WAIT; end end //Waiting for next valid character byte from create_system TOP_WAIT: begin timer_counter_reg <= timer_counter_reg + 32'b1; top_graphing_reg <= top_graphing_reg; top_done_reg <= 1'b0; vga_sram_write <= 1'b0; vga_sram_address <= vga_out_base_address; vga_sram_writedata <= 8'h00; x_reg <= x_reg; y_reg <= y_reg; targetx_reg <= x_reg; targety_reg <= y_reg; angle_reg <= angle_reg; if (system_done) begin top_state_reg <= TOP_DONE; top_rdy_reg <= 1'b0; end else begin if (system_val == 1'b0) begin top_rdy_reg <= 1'b1; top_state_reg <= TOP_WAIT; end else begin top_rdy_reg <= 1'b0; system_output_string_reg <= system_output_string; top_state_reg <= TOP_SETUP; end end end //Grabbing the character byte from create_system //Setting the parameters for graphing later on TOP_SETUP: begin timer_counter_reg <= timer_counter_reg + 32'b1; top_graphing_reg <= 2'b0; top_rdy_reg <= 1'b0; top_done_reg <= 1'b0; vga_sram_write <= 1'b0; vga_sram_address <= vga_out_base_address; vga_sram_writedata <= 8'h00; x_reg <= x_reg; y_reg <= y_reg; targetx_reg <= targetx_reg; targety_reg <= targety_reg; angle_reg <= angle_reg; top_char_reg <= system_output_string_reg; system_output_string_reg <= system_output_string_reg; top_state_reg <= TOP_TARGET; end //Calculating the coordinates for graphing based on the character byte from create_system TOP_TARGET: begin timer_counter_reg <= timer_counter_reg + 32'b1; top_graphing_reg <= 2'b0; top_rdy_reg <= 1'b0; vga_sram_write <= 1'b0; vga_sram_address <= vga_out_base_address; vga_sram_writedata <= 8'h00; x_reg <= x_reg; y_reg <= y_reg; top_state_reg <= TOP_BOUND_CHECK; //Graphing only occurs in last iteration, so does target coordinate calculation if (iterations_counter == iterations - 4'b1) begin case(top_char_reg) //move forward by length based on orientation (0 -> up) F: begin if (angle_reg == 10'd0) begin targetx_reg <= x_reg; targety_reg <= y_reg - length_reg; angle_reg <= angle_reg; end else if (angle_reg == 10'd60) begin targetx_reg <= x_reg + x_triangle_length[30:21]; targety_reg <= y_reg - y_triangle_length[30:21]; angle_reg <= angle_reg; end else if (angle_reg == 10'd90) begin targetx_reg <= x_reg + length_reg; targety_reg <= y_reg; angle_reg <= angle_reg; end else if (angle_reg == 10'd120) begin targetx_reg <= x_reg + x_triangle_length[30:21]; targety_reg <= y_reg + y_triangle_length[30:21]; angle_reg <= angle_reg; end else if (angle_reg == 10'd180) begin targetx_reg <= x_reg; targety_reg <= y_reg + length_reg; angle_reg <= angle_reg; end else if (angle_reg == 10'd240) begin targetx_reg <= x_reg - x_triangle_length[30:21]; targety_reg <= y_reg + y_triangle_length[30:21]; angle_reg <= angle_reg; end else if (angle_reg == 10'd270) begin targetx_reg <= x_reg - length_reg; targety_reg <= y_reg; angle_reg <= angle_reg; end else if (angle_reg == 10'd300) begin targetx_reg <= x_reg - x_triangle_length[30:21]; targety_reg <= y_reg - y_triangle_length[30:21]; angle_reg <= angle_reg; end end //move forward by length based on orientation (0 -> up) A: begin if (angle_reg == 10'd0) begin targetx_reg <= x_reg; targety_reg <= y_reg - length_reg; angle_reg <= angle_reg; end else if (angle_reg == 10'd60) begin targetx_reg <= x_reg + x_triangle_length[30:21]; targety_reg <= y_reg - y_triangle_length[30:21]; angle_reg <= angle_reg; end else if (angle_reg == 10'd90) begin targetx_reg <= x_reg + length_reg; targety_reg <= y_reg; angle_reg <= angle_reg; end else if (angle_reg == 10'd120) begin targetx_reg <= x_reg + x_triangle_length[30:21]; targety_reg <= y_reg + y_triangle_length[30:21]; angle_reg <= angle_reg; end else if (angle_reg == 10'd180) begin targetx_reg <= x_reg; targety_reg <= y_reg + length_reg; angle_reg <= angle_reg; end else if (angle_reg == 10'd240) begin targetx_reg <= x_reg - x_triangle_length[30:21]; targety_reg <= y_reg + y_triangle_length[30:21]; angle_reg <= angle_reg; end else if (angle_reg == 10'd270) begin targetx_reg <= x_reg - length_reg; targety_reg <= y_reg; angle_reg <= angle_reg; end else if (angle_reg == 10'd300) begin targetx_reg <= x_reg - x_triangle_length[30:21]; targety_reg <= y_reg - y_triangle_length[30:21]; angle_reg <= angle_reg; end end //Increments the angle representing orientation plus: begin if ( (angle_reg == 10'd270 && lsystem == 3'b0) || (angle_reg == 10'd300 && lsystem == 3'b1) || (angle_reg == 10'd300 && lsystem == 3'd2) || (angle_reg == 10'd300 && lsystem == 3'd3) || (angle_reg == 10'd300 && lsystem == 3'd4) || (angle_reg == 10'd270 && lsystem == 3'd5) || (angle_reg == 10'd240 && lsystem == 3'd6)) begin targetx_reg <= targetx_reg; targety_reg <= targety_reg; angle_reg <= 10'd0; end else begin targetx_reg <= targetx_reg; targety_reg <= targety_reg; angle_reg <= angle_reg + angle_increment; end end //Decrements the angle representing orientation minus: begin if (angle_reg == 10'd0) begin targetx_reg <= targetx_reg; targety_reg <= targety_reg; angle_reg <= 10'd360 - angle_increment; end else begin targetx_reg <= targetx_reg; targety_reg <= targety_reg; angle_reg <= angle_reg - angle_increment; end end endcase end else begin targetx_reg <= targetx_reg; targety_reg <= targety_reg; angle_reg <= angle_reg; end end //Checks the target coordinates to see if they're within the 640x480 confines of the VGA screen //If they're not, don't set the write enable //Otherwise go to the appropriate graphing state for the L-System TOP_BOUND_CHECK: begin timer_counter_reg <= timer_counter_reg + 32'b1; top_rdy_reg <= 1'b0; targetx_reg <= targetx_reg; targety_reg <= targety_reg; angle_reg <= angle_reg; if (targetx_reg < 32'b0 ||targetx_reg > 32'd639 || targety_reg < 32'b0 || targety_reg > 32'd479) begin top_state_reg <= TOP_SHIFT; x_reg <= targetx_reg; y_reg <= targety_reg; end else begin if (lsystem == 3'b0 || lsystem == 3'd5) begin top_state_reg <= TOP_GRAPH_DRAGON; end else if (lsystem == 3'b1 || lsystem == 3'd2 || lsystem == 3'd3 || lsystem == 3'd4 || lsystem == 3'd6) begin top_state_reg <= TOP_GRAPH_TRIANGLE_X; end else begin top_state_reg <= TOP_TARGET; end end end //Responsible for graphing when the L-System's angle_increment is 90 degrees (only straight lines) TOP_GRAPH_DRAGON: begin timer_counter_reg <= timer_counter_reg + 32'b1; top_graphing_reg <= 2'b1; top_rdy_reg <= 1'b0; targetx_reg <= targetx_reg; targety_reg <= targety_reg; angle_reg <= angle_reg; //Only graph on last iteration if (iterations_counter == iterations - 4'b1) begin if (x_reg[31] == 1'b1 ||x_reg > 32'd639 || y_reg[31] == 1'b1 || y_reg > 32'd479) begin //bound check vga_sram_write <= 1'b0; //don't graph if out of bounds vga_sram_address <= vga_out_base_address; // compute address vga_sram_writedata <= 8'h00; // data end else begin vga_sram_write <= 1'b1; //graph if in bounds vga_sram_address <= vga_out_base_address + x_reg + (y_reg*640) ; // compute address vga_sram_writedata <= color; // data end end else begin vga_sram_write <= 1'b0; //don't graph before last iteration vga_sram_address <= vga_out_base_address; // compute address vga_sram_writedata <= 8'h00; // data end // iterate through all x,y until target is reached if (x_reg < targetx_reg) begin x_reg <= x_reg + 32'd1 ; end else if (x_reg > targetx_reg) begin x_reg <= x_reg - 32'd1 ; end else begin x_reg <= x_reg; end if (y_reg < targety_reg) begin y_reg <= y_reg + 32'd1 ; end else if (y_reg > targety_reg) begin y_reg <= y_reg - 32'd1 ; end else begin y_reg <= y_reg; end //has target been reached? if (x_reg == targetx_reg && y_reg == targety_reg) begin top_state_reg <= TOP_SHIFT; end else if (x_reg > 32'd4294967295 || y_reg > 32'd4294967295) begin //outside of bounds of (2^32)-1 (maximum value stored in 32 bits) top_state_reg <= TOP_DONE; end else begin top_state_reg <= TOP_GRAPH_DRAGON; end end //Responsible for x graphing when the L-System's angle_increment is 60 degrees (can have diagonal and straight lines) TOP_GRAPH_TRIANGLE_X: begin timer_counter_reg <= timer_counter_reg + 32'b1; top_graphing_reg <= 2'b1; top_rdy_reg <= 1'b0; targetx_reg <= targetx_reg; targety_reg <= targety_reg; angle_reg <= angle_reg; if (iterations_counter == iterations - 4'b1) begin //last iteration (for graphing) if (x_reg[31] == 1'b1 ||x_reg > 32'd639 || y_reg[31] == 1'b1 || y_reg > 32'd479) begin //bound check vga_sram_write <= 1'b0; //don't graph if out of bounds vga_sram_address <= vga_out_base_address; // compute address vga_sram_writedata <= 8'h00; // data end else begin vga_sram_write <= 1'b1; //graph if in bounds vga_sram_address <= vga_out_base_address + x_reg + (y_reg*640) ; // compute address vga_sram_writedata <= color; // data end end else begin vga_sram_write <= 1'b0; //don't graph before last iteration vga_sram_address <= vga_out_base_address; // compute address vga_sram_writedata <= 8'h00; // data end // iterate through all x,y until target is reached if (x_reg < targetx_reg) begin x_reg <= x_reg + 32'd1 ; triangle_x_counter <= triangle_x_counter + 3'b1; end else if (x_reg > targetx_reg) begin x_reg <= x_reg - 32'd1 ; triangle_x_counter <= triangle_x_counter + 3'b1; end else begin x_reg <= x_reg; triangle_x_counter <= 3'b0; end //has target been reached? if ( (angle_reg == 10'd60 && x_reg >= targetx_reg && y_reg <= targety_reg) || (angle_reg == 10'd120 && x_reg >= targetx_reg && y_reg >= targety_reg) || (angle_reg == 10'd180 && x_reg == targetx_reg && y_reg == targety_reg) || (angle_reg == 10'd240 && x_reg <= targetx_reg && y_reg >= targety_reg) || (angle_reg == 10'd300 && x_reg <= targetx_reg && y_reg <= targety_reg) || (angle_reg == 10'd0 && x_reg == targetx_reg && y_reg == targety_reg)) begin top_state_reg <= TOP_SHIFT; end //continue graphing in the x-direction else if (triangle_x_counter < 3'd2 && ( (angle_reg == 10'd60 && x_reg < targetx_reg) || (angle_reg == 10'd120 && x_reg < targetx_reg) || (angle_reg == 10'd180 && x_reg != targetx_reg) || (angle_reg == 10'd240 && x_reg > targetx_reg) || (angle_reg == 10'd300 && x_reg > targetx_reg) || (angle_reg == 10'd0 && x_reg != targetx_reg) ) ) begin top_state_reg <= TOP_GRAPH_TRIANGLE_X; end else if (x_reg > 32'd4294967295 || y_reg > 32'd4294967295) begin //outside of bounds of (2^32)-1 (maximum value stored in 32 bits) top_state_reg <= TOP_DONE; end //continue graphing in the y-direction else begin triangle_x_counter <= 3'b0; top_state_reg <= TOP_GRAPH_TRIANGLE_Y; end end //Responsible for y graphing when the L-System's angle_increment is 60 degrees (can have diagonal and straight lines) TOP_GRAPH_TRIANGLE_Y: begin timer_counter_reg <= timer_counter_reg + 32'b1; top_graphing_reg <= 2'b1; top_rdy_reg <= 1'b0; targetx_reg <= targetx_reg; targety_reg <= targety_reg; angle_reg <= angle_reg; if (iterations_counter == iterations - 4'b1) begin //last iteration (for graphing) if (x_reg[31] == 1'b1 ||x_reg > 32'd639 || y_reg[31] == 1'b1 || y_reg > 32'd479) begin //bound check vga_sram_write <= 1'b0; //don't graph if out of bounds vga_sram_address <= vga_out_base_address; // compute address vga_sram_writedata <= 8'h00; // data end else begin vga_sram_write <= 1'b1; //graph if in bounds vga_sram_address <= vga_out_base_address + x_reg + (y_reg*640) ; // compute address vga_sram_writedata <= color; // data end end else begin vga_sram_write <= 1'b0; //don't graph before last iteration vga_sram_address <= vga_out_base_address; // compute address vga_sram_writedata <= 8'h00; // data end // iterate through all x,y until target is reached if (y_reg < targety_reg) begin y_reg <= y_reg + 32'd1 ; triangle_y_counter <= triangle_y_counter + 3'b1; end else if (y_reg > targety_reg) begin y_reg <= y_reg - 32'd1 ; triangle_y_counter <= triangle_y_counter + 3'b1; end else begin y_reg <= y_reg; triangle_y_counter <= 3'b0; end //has target been reached? if ( (angle_reg == 10'd60 && x_reg >= targetx_reg && y_reg <= targety_reg) || (angle_reg == 10'd120 && x_reg >= targetx_reg && y_reg >= targety_reg) || (angle_reg == 10'd180 && x_reg == targetx_reg && y_reg == targety_reg) || (angle_reg == 10'd240 && x_reg <= targetx_reg && y_reg >= targety_reg) || (angle_reg == 10'd300 && x_reg <= targetx_reg && y_reg <= targety_reg) || (angle_reg == 10'd0 && x_reg == targetx_reg && y_reg == targety_reg)) begin top_state_reg <= TOP_SHIFT; end //continue graphing in the y-direction else if (triangle_y_counter < 3'd5 && ( (angle_reg == 10'd60 && y_reg > targety_reg) || (angle_reg == 10'd120 && y_reg < targety_reg) || (angle_reg == 10'd180 && y_reg != targety_reg) || (angle_reg == 10'd240 && y_reg < targety_reg) || (angle_reg == 10'd300 && y_reg > targety_reg) || (angle_reg == 10'd0 && y_reg != targety_reg) ) ) begin top_state_reg <= TOP_GRAPH_TRIANGLE_Y; end else if (x_reg > 32'd4294967295 || y_reg > 32'd4294967295) begin //outside of bounds of (2^32)-1 (maximum value stored in 32 bits) top_state_reg <= TOP_DONE; end //continue graphing in the x-direction else begin triangle_y_counter <= 3'b0; top_state_reg <= TOP_GRAPH_TRIANGLE_X; end end //Graphing complete, time to wait for next character TOP_SHIFT: begin timer_counter_reg <= timer_counter_reg + 32'b1; top_graphing_reg <= 2'd2; top_rdy_reg <= 1'b0; top_done_reg <= 1'b0; vga_sram_write <= 1'b0; vga_sram_address <= vga_out_base_address; vga_sram_writedata <= 8'h00; x_reg <= x_reg; y_reg <= y_reg; targetx_reg <= targetx_reg; targety_reg <= targety_reg; angle_reg <= angle_reg; system_output_string_reg <= system_output_string_reg; top_state_reg <= TOP_WAIT; end //system_done signal received from create_system, so there are no more characters to graph TOP_DONE: begin timer_counter_reg <= timer_counter_reg; top_graphing_reg <= 2'b0; top_rdy_reg <= 1'b0; top_done_reg <= 1'b1; vga_sram_write <= 1'b0; vga_sram_address <= vga_out_base_address; vga_sram_writedata <= 8'h00; x_reg <= x_reg; y_reg <= y_reg; targetx_reg <= targetx_reg; targety_reg <= targety_reg; angle_reg <= angle_reg; if (reset) begin top_state_reg <= TOP_RESET; end else begin top_state_reg <= TOP_DONE; end end default: begin timer_counter_reg <= 32'b0; system_output_string_reg <= system_output_string; top_state_reg <= TOP_WAIT; top_graphing_reg <= 2'b0; top_rdy_reg <= 1'b0; top_done_reg <= 1'b0; vga_sram_write <= 1'b0; vga_sram_address <= vga_out_base_address; vga_sram_writedata <= 8'h00; x_reg <= x_reg; y_reg <= y_reg; targetx_reg <= targetx_reg; targety_reg <= targety_reg; angle_reg <= angle_reg; if (lsystem == 3'b0 || lsystem == 3'd5) begin angle_increment <= 10'd90; end else if (lsystem == 3'b1 || lsystem == 3'd2 || lsystem == 3'd3 || lsystem == 3'd4) begin angle_increment <= 10'd60; end else if (lsystem == 3'd6) begin angle_increment <= 10'd120; end else begin angle_increment <= 10'd0; end end endcase end end //======================================================= // Structural coding //======================================================= // From Qsys Computer_System The_System ( //////////////////////////////////// // FPGA Side //////////////////////////////////// //PIO port connections .reset_external_connection_export (reset), .lsystem_char_external_connection_export (top_char), .axiom_external_connection_export (axiom), .iterations_external_connection_export (iterations), .length_external_connection_export (length), .lsystem_external_connection_export (lsystem), .start_x_external_connection_export (start_x), .start_y_external_connection_export (start_y), .timing_external_connection_export (timer_counter), .color_external_connection_export (color), // Global signals .system_pll_ref_clk_clk (CLOCK_50), .system_pll_ref_reset_reset (1'b0), // SRAM shared block with HPS .onchip_sram_s1_address (sram_address), .onchip_sram_s1_clken (sram_clken), .onchip_sram_s1_chipselect (sram_chipselect), .onchip_sram_s1_write (sram_write), .onchip_sram_s1_readdata (sram_readdata), .onchip_sram_s1_writedata (sram_writedata), .onchip_sram_s1_byteenable (4'b1111), // sram to video .onchip_vga_buffer_s1_address (vga_sram_address), .onchip_vga_buffer_s1_clken (vga_sram_clken), .onchip_vga_buffer_s1_chipselect (vga_sram_chipselect), .onchip_vga_buffer_s1_write (vga_sram_write), .onchip_vga_buffer_s1_readdata (), // never read from vga here .onchip_vga_buffer_s1_writedata (vga_sram_writedata), // AV Config .av_config_SCLK (FPGA_I2C_SCLK), .av_config_SDAT (FPGA_I2C_SDAT), // 50 MHz clock bridge .clock_bridge_0_in_clk_clk (CLOCK_50), //(CLOCK_50), // VGA Subsystem .vga_pll_ref_clk_clk (CLOCK2_50), .vga_pll_ref_reset_reset (1'b0), .vga_CLK (VGA_CLK), .vga_BLANK (VGA_BLANK_N), .vga_SYNC (VGA_SYNC_N), .vga_HS (VGA_HS), .vga_VS (VGA_VS), .vga_R (VGA_R), .vga_G (VGA_G), .vga_B (VGA_B), // SDRAM .sdram_clk_clk (DRAM_CLK), .sdram_addr (DRAM_ADDR), .sdram_ba (DRAM_BA), .sdram_cas_n (DRAM_CAS_N), .sdram_cke (DRAM_CKE), .sdram_cs_n (DRAM_CS_N), .sdram_dq (DRAM_DQ), .sdram_dqm ({DRAM_UDQM,DRAM_LDQM}), .sdram_ras_n (DRAM_RAS_N), .sdram_we_n (DRAM_WE_N), //////////////////////////////////// // HPS Side //////////////////////////////////// // DDR3 SDRAM .memory_mem_a (HPS_DDR3_ADDR), .memory_mem_ba (HPS_DDR3_BA), .memory_mem_ck (HPS_DDR3_CK_P), .memory_mem_ck_n (HPS_DDR3_CK_N), .memory_mem_cke (HPS_DDR3_CKE), .memory_mem_cs_n (HPS_DDR3_CS_N), .memory_mem_ras_n (HPS_DDR3_RAS_N), .memory_mem_cas_n (HPS_DDR3_CAS_N), .memory_mem_we_n (HPS_DDR3_WE_N), .memory_mem_reset_n (HPS_DDR3_RESET_N), .memory_mem_dq (HPS_DDR3_DQ), .memory_mem_dqs (HPS_DDR3_DQS_P), .memory_mem_dqs_n (HPS_DDR3_DQS_N), .memory_mem_odt (HPS_DDR3_ODT), .memory_mem_dm (HPS_DDR3_DM), .memory_oct_rzqin (HPS_DDR3_RZQ), // Ethernet .hps_io_hps_io_gpio_inst_GPIO35 (HPS_ENET_INT_N), .hps_io_hps_io_emac1_inst_TX_CLK (HPS_ENET_GTX_CLK), .hps_io_hps_io_emac1_inst_TXD0 (HPS_ENET_TX_DATA[0]), .hps_io_hps_io_emac1_inst_TXD1 (HPS_ENET_TX_DATA[1]), .hps_io_hps_io_emac1_inst_TXD2 (HPS_ENET_TX_DATA[2]), .hps_io_hps_io_emac1_inst_TXD3 (HPS_ENET_TX_DATA[3]), .hps_io_hps_io_emac1_inst_RXD0 (HPS_ENET_RX_DATA[0]), .hps_io_hps_io_emac1_inst_MDIO (HPS_ENET_MDIO), .hps_io_hps_io_emac1_inst_MDC (HPS_ENET_MDC), .hps_io_hps_io_emac1_inst_RX_CTL (HPS_ENET_RX_DV), .hps_io_hps_io_emac1_inst_TX_CTL (HPS_ENET_TX_EN), .hps_io_hps_io_emac1_inst_RX_CLK (HPS_ENET_RX_CLK), .hps_io_hps_io_emac1_inst_RXD1 (HPS_ENET_RX_DATA[1]), .hps_io_hps_io_emac1_inst_RXD2 (HPS_ENET_RX_DATA[2]), .hps_io_hps_io_emac1_inst_RXD3 (HPS_ENET_RX_DATA[3]), // Flash .hps_io_hps_io_qspi_inst_IO0 (HPS_FLASH_DATA[0]), .hps_io_hps_io_qspi_inst_IO1 (HPS_FLASH_DATA[1]), .hps_io_hps_io_qspi_inst_IO2 (HPS_FLASH_DATA[2]), .hps_io_hps_io_qspi_inst_IO3 (HPS_FLASH_DATA[3]), .hps_io_hps_io_qspi_inst_SS0 (HPS_FLASH_NCSO), .hps_io_hps_io_qspi_inst_CLK (HPS_FLASH_DCLK), // Accelerometer .hps_io_hps_io_gpio_inst_GPIO61 (HPS_GSENSOR_INT), //.adc_sclk (ADC_SCLK), //.adc_cs_n (ADC_CS_N), //.adc_dout (ADC_DOUT), //.adc_din (ADC_DIN), // General Purpose I/O .hps_io_hps_io_gpio_inst_GPIO40 (HPS_GPIO[0]), .hps_io_hps_io_gpio_inst_GPIO41 (HPS_GPIO[1]), // I2C .hps_io_hps_io_gpio_inst_GPIO48 (HPS_I2C_CONTROL), .hps_io_hps_io_i2c0_inst_SDA (HPS_I2C1_SDAT), .hps_io_hps_io_i2c0_inst_SCL (HPS_I2C1_SCLK), .hps_io_hps_io_i2c1_inst_SDA (HPS_I2C2_SDAT), .hps_io_hps_io_i2c1_inst_SCL (HPS_I2C2_SCLK), // Pushbutton .hps_io_hps_io_gpio_inst_GPIO54 (HPS_KEY), // LED .hps_io_hps_io_gpio_inst_GPIO53 (HPS_LED), // SD Card .hps_io_hps_io_sdio_inst_CMD (HPS_SD_CMD), .hps_io_hps_io_sdio_inst_D0 (HPS_SD_DATA[0]), .hps_io_hps_io_sdio_inst_D1 (HPS_SD_DATA[1]), .hps_io_hps_io_sdio_inst_CLK (HPS_SD_CLK), .hps_io_hps_io_sdio_inst_D2 (HPS_SD_DATA[2]), .hps_io_hps_io_sdio_inst_D3 (HPS_SD_DATA[3]), // SPI .hps_io_hps_io_spim1_inst_CLK (HPS_SPIM_CLK), .hps_io_hps_io_spim1_inst_MOSI (HPS_SPIM_MOSI), .hps_io_hps_io_spim1_inst_MISO (HPS_SPIM_MISO), .hps_io_hps_io_spim1_inst_SS0 (HPS_SPIM_SS), // UART .hps_io_hps_io_uart0_inst_RX (HPS_UART_RX), .hps_io_hps_io_uart0_inst_TX (HPS_UART_TX), // USB .hps_io_hps_io_gpio_inst_GPIO09 (HPS_CONV_USB_N), .hps_io_hps_io_usb1_inst_D0 (HPS_USB_DATA[0]), .hps_io_hps_io_usb1_inst_D1 (HPS_USB_DATA[1]), .hps_io_hps_io_usb1_inst_D2 (HPS_USB_DATA[2]), .hps_io_hps_io_usb1_inst_D3 (HPS_USB_DATA[3]), .hps_io_hps_io_usb1_inst_D4 (HPS_USB_DATA[4]), .hps_io_hps_io_usb1_inst_D5 (HPS_USB_DATA[5]), .hps_io_hps_io_usb1_inst_D6 (HPS_USB_DATA[6]), .hps_io_hps_io_usb1_inst_D7 (HPS_USB_DATA[7]), .hps_io_hps_io_usb1_inst_CLK (HPS_USB_CLKOUT), .hps_io_hps_io_usb1_inst_STP (HPS_USB_STP), .hps_io_hps_io_usb1_inst_DIR (HPS_USB_DIR), .hps_io_hps_io_usb1_inst_NXT (HPS_USB_NXT) ); endmodule // end top level //============================================================ // M10K module for testing //============================================================ // See example 12-16 in // http://people.ece.cornell.edu/land/courses/ece5760/DE1_SOC/HDL_style_qts_qii51007.pdf //============================================================ module M10K_256_32( output reg [31:0] q, input [31:0] d, input [7:0] write_address, read_address, input we, clk ); // force M10K ram style // 256 words of 32 bits reg [31:0] mem [255:0] /* synthesis ramstyle = "no_rw_check, M10K" */; always @ (posedge clk) begin if (we) begin mem[write_address] <= d; end q <= mem[read_address]; // q doesn't get d in this clock cycle end endmodule //============================================================ // MLAB module for testing //============================================================ // See example 12-16 in // http://people.ece.cornell.edu/land/courses/ece5760/DE1_SOC/HDL_style_qts_qii51007.pdf //============================================================ module MLAB_20_32( output reg signed [31:0] q, input [31:0] data, input [7:0] readaddr, writeaddr, input wren, clock ); // force MLAB ram style // 20 words of 32 bits reg signed [31:0] mem [19:0] /* synthesis ramstyle = "no_rw_check, MLAB" */; always @ (posedge clock) begin if (wren) begin mem[writeaddr] <= data; end q <= mem[readaddr]; end endmodule /************************************************************************** * Following floating point modules written by Bruce Land * March 2017 *************************************************************************/ /************************************************************************** * Floating Point to 16-bit integer * * Combinational * Numbers with mag > than +/-32768 get clipped to 32768 or -32768 *************************************************************************/ module Int2Fp( input signed [15:0] iInteger, output[26:0] oA ); // output fields wire A_s; wire [7:0] A_e; wire [17:0] A_f; wire [15:0] abs_input ; // get output sign bit assign A_s = (iInteger < 0); // remove sign from input assign abs_input = (iInteger < 0)? -iInteger : iInteger ; // find the most significant (nonzero) bit wire [7:0] shft_amt; assign shft_amt = abs_input[15] ? 8'd3 : abs_input[14] ? 8'd4 : abs_input[13] ? 8'd5 : abs_input[12] ? 8'd6 : abs_input[11] ? 8'd7 : abs_input[10] ? 8'd8 : abs_input[9] ? 8'd9 : abs_input[8] ? 8'd10 : abs_input[7] ? 8'd11 : abs_input[6] ? 8'd12 : abs_input[5] ? 8'd13 : abs_input[4] ? 8'd14 : abs_input[3] ? 8'd15 : abs_input[2] ? 8'd16 : abs_input[1] ? 8'd17 : abs_input[0] ? 8'd18 : 8'd19; // exponent 127 + (18-shift_amt) // 127 is 2^0 // 18 is amount '1' is shifted assign A_e = 127 + 18 - shft_amt ; // where the intermediate value is formed wire [33:0] shift_buffer ; // remember that the high-order '1' is not stored, // but is shifted to bit 18 assign shift_buffer = {16'b0, abs_input} << shft_amt ; assign A_f = shift_buffer[17:0]; assign oA = (iInteger==0)? 27'b0 : {A_s, A_e, A_f}; endmodule //Int2Fp /************************************************************************** * Floating Point to 16-bit integer * * Combinational * Numbers with mag > than +/-32768 get clipped to 32768 or -32768 *************************************************************************/ module Fp2Int( input [26:0] iA, output reg [15:0] oInteger ); // Extract fields of A and B. wire A_s; wire [7:0] A_e; wire [17:0] A_f; assign A_s = iA[26]; assign A_e = iA[25:18]; assign A_f = iA[17:0]; wire [15:0] max_int = 16'h7fff ; //32768 wire [33:0] shift_buffer ; // form (1.A_f) and shift it to postiion assign shift_buffer = {15'b0, 1'b1, A_f}<<(A_e-127) ; // If exponent less than 127, oInteger=0 // If exponent greater than 127+14 oInteger=max value // Between these two values: // set up input mantissa with 1.mantissa // and the "1." in the lowest bit of an extended word. // shift-left by A_e-127 // If the sign bit is set, negate oInteger always @(*) begin if (A_e < 127) oInteger = 16'b0; else if (A_e > 141) begin if (A_s) oInteger = -max_int; else oInteger = max_int; end else begin if (A_s) oInteger = -shift_buffer[33:18]; else oInteger = shift_buffer[33:18]; end end endmodule //Fp2Int /************************************************************************** * Floating Point shift * * Combinational * Negative shift input is right shift *************************************************************************/ module FpShift( input [26:0] iA, input [7:0] iShift, output [26:0] oShifted ); // Extract fields of A and B. wire A_s; wire [7:0] A_e; wire [17:0] A_f; assign A_s = iA[26]; assign A_e = iA[25:18]; assign A_f = iA[17:0]; // Flip bit 26 // zero the output if underflow/overflow // assign oShifted = (A_e+iShift<8'd254 && A_e+iShift>8'd2)? // {A_s, A_e+iShift, A_f} assign oShifted = {A_s, A_e+iShift, A_f} ; endmodule //FpShift /************************************************************************** * Floating Point sign negation * * Combinational * *************************************************************************/ module FpNegate( input [26:0] iA, output [26:0] oNegative ); // Extract fields of A and B. wire A_s; wire [7:0] A_e; wire [17:0] A_f; assign A_s = iA[26]; assign A_e = iA[25:18]; assign A_f = iA[17:0]; // Flip bit 26 assign oNegative = {~A_s, A_e, A_f}; endmodule //FpNegate /************************************************************************** * Floating Point absolute * * Combinational * *************************************************************************/ module FpAbs( input [26:0] iA, output [26:0] oAbs ); // Extract fields of A and B. wire A_s; wire [7:0] A_e; wire [17:0] A_f; assign A_s = iA[26]; assign A_e = iA[25:18]; assign A_f = iA[17:0]; // zero bit 26 assign oAbs = {1'b0, A_e, A_f}; endmodule //Fp absolute /************************************************************************** * Floating Point compare * * Combinational * output=1 if A>=B *************************************************************************/ module FpCompare( input [26:0] iA, input [26:0] iB, output reg oA_larger ); // Extract fields of A and B. wire A_s; wire [7:0] A_e; wire [17:0] A_f; wire B_s; wire [7:0] B_e; wire [17:0] B_f; assign A_s = iA[26]; assign A_e = iA[25:18]; assign A_f = iA[17:0]; assign B_s = iB[26]; assign B_e = iB[25:18]; assign B_f = iB[17:0]; // Determine which of A, B is larger wire A_mag_larger ; assign A_mag_larger =(A_e > B_e) ? 1'b1 : ((A_e == B_e) && (A_f >= B_f)) ? 1'b1 : 1'b0; // now do the sign checks always @(*) begin if (A_s==0 && B_s==1) begin // A positive, B negative oA_larger = 1'b1 ; end else if (A_s==1 && B_s==0) begin // A negative, B positive oA_larger = 1'b0 ; end else if (A_s==0 && B_s==0) begin // A positive, B positive oA_larger = A_mag_larger ; end else if (A_s==1 && B_s==1) begin // A negative, B negative oA_larger = ~A_mag_larger ; end else oA_larger = 0; // make sure no inferred latch end endmodule //FpCompare /************************************************************************** * Following floating point written by Mark Eiding mje56 * * ECE 5760 * * Modified IEEE single precision FP * * bit 26: Sign (0: pos, 1: neg) * * bits[25:18]: Exponent (unsigned) * * bits[17:0]: Fraction (unsigned) * * (-1)^SIGN * 2^(EXP-127) * (1+.FRAC) * * (http://en.wikipedia.org/wiki/Single-precision_floating-point_format) * * Adapted from Skyler Schneider ss868 * *************************************************************************/ /************************************************************************** * Floating Point Fast Inverse Square Root * * 5-stage pipeline * * http://en.wikipedia.org/wiki/Fast_inverse_square_root * * Magic number 27'd49920718 * * 1.5 = 27'd33423360 * *************************************************************************/ module FpInvSqrt ( input iCLK, input [26:0] iA, output [26:0] oInvSqrt ); // Extract fields of A and B. wire A_s; wire [7:0] A_e; wire [17:0] A_f; assign A_s = iA[26]; assign A_e = iA[25:18]; assign A_f = iA[17:0]; //Stage 1 wire [26:0] y_1, y_1_out, half_iA_1; assign y_1 = 27'd49920718 - (iA>>1); assign half_iA_1 = {A_s, A_e-8'd1,A_f}; FpMul s1_mult ( .iA(y_1), .iB(y_1), .oProd(y_1_out) ); //Stage 2 reg [26:0] y_2, mult_2_in, half_iA_2; wire [26:0] y_2_out; FpMul s2_mult ( .iA(half_iA_2), .iB(mult_2_in), .oProd(y_2_out) ); //Stage 3 reg [26:0] y_3, add_3_in; wire [26:0] y_3_out; FpAdd s3_add ( .iCLK(iCLK), .iA({~add_3_in[26],add_3_in[25:0]}), .iB(27'd33423360), .oSum(y_3_out) ); //Stage 4 reg [26:0] y_4; //Stage 5 reg [26:0] y_5, mult_5_in; FpMul s5_mult ( .iA(y_5), .iB(mult_5_in), .oProd(oInvSqrt) ); always @(posedge iCLK) begin //Stage 1 to 2 y_2 <= y_1; mult_2_in <= y_1_out; half_iA_2 <= half_iA_1; //Stage 2 to 3 y_3 <= y_2; add_3_in <= y_2_out; //Stage 3 to 4 y_4 <= y_3; //Stage 4 to 5 y_5 <= y_4; mult_5_in <= y_3_out; end endmodule /************************************************************************** * Floating Point Multiplier * * Combinational * *************************************************************************/ module FpMul ( input [26:0] iA, // First input input [26:0] iB, // Second input output [26:0] oProd // Product ); // Extract fields of A and B. wire A_s; wire [7:0] A_e; wire [17:0] A_f; wire B_s; wire [7:0] B_e; wire [17:0] B_f; assign A_s = iA[26]; assign A_e = iA[25:18]; assign A_f = {1'b1, iA[17:1]}; assign B_s = iB[26]; assign B_e = iB[25:18]; assign B_f = {1'b1, iB[17:1]}; // XOR sign bits to determine product sign. wire oProd_s; assign oProd_s = A_s ^ B_s; // Multiply the fractions of A and B wire [35:0] pre_prod_frac; assign pre_prod_frac = A_f * B_f; // Add exponents of A and B wire [8:0] pre_prod_exp; assign pre_prod_exp = A_e + B_e; // If top bit of product frac is 0, shift left one wire [7:0] oProd_e; wire [17:0] oProd_f; assign oProd_e = pre_prod_frac[35] ? (pre_prod_exp-9'd126) : (pre_prod_exp - 9'd127); assign oProd_f = pre_prod_frac[35] ? pre_prod_frac[34:17] : pre_prod_frac[33:16]; // Detect underflow wire underflow; assign underflow = pre_prod_exp < 9'h80; // Detect zero conditions (either product frac doesn't start with 1, or underflow) assign oProd = underflow ? 27'b0 : (B_e == 8'd0) ? 27'b0 : (A_e == 8'd0) ? 27'b0 : {oProd_s, oProd_e, oProd_f}; endmodule /************************************************************************** * Floating Point Adder * * 2-stage pipeline * *************************************************************************/ module FpAdd ( input iCLK, input [26:0] iA, input [26:0] iB, output reg [26:0] oSum ); // Extract fields of A and B. wire A_s; wire [7:0] A_e; wire [17:0] A_f; wire B_s; wire [7:0] B_e; wire [17:0] B_f; assign A_s = iA[26]; assign A_e = iA[25:18]; assign A_f = {1'b1, iA[17:1]}; assign B_s = iB[26]; assign B_e = iB[25:18]; assign B_f = {1'b1, iB[17:1]}; wire A_larger; // Shift fractions of A and B so that they align. wire [7:0] exp_diff_A; wire [7:0] exp_diff_B; wire [7:0] larger_exp; wire [36:0] A_f_shifted; wire [36:0] B_f_shifted; assign exp_diff_A = B_e - A_e; // if B bigger assign exp_diff_B = A_e - B_e; // if A bigger assign larger_exp = (B_e > A_e) ? B_e : A_e; assign A_f_shifted = A_larger ? {1'b0, A_f, 18'b0} : (exp_diff_A > 9'd35) ? 37'b0 : ({1'b0, A_f, 18'b0} >> exp_diff_A); assign B_f_shifted = ~A_larger ? {1'b0, B_f, 18'b0} : (exp_diff_B > 9'd35) ? 37'b0 : ({1'b0, B_f, 18'b0} >> exp_diff_B); // Determine which of A, B is larger assign A_larger = (A_e > B_e) ? 1'b1 : ((A_e == B_e) && (A_f > B_f)) ? 1'b1 : 1'b0; // Calculate sum or difference of shifted fractions. wire [36:0] pre_sum; assign pre_sum = ((A_s^B_s) & A_larger) ? A_f_shifted - B_f_shifted : ((A_s^B_s) & ~A_larger) ? B_f_shifted - A_f_shifted : A_f_shifted + B_f_shifted; // buffer midway results reg [36:0] buf_pre_sum; reg [7:0] buf_larger_exp; reg buf_A_e_zero; reg buf_B_e_zero; reg [26:0] buf_A; reg [26:0] buf_B; reg buf_oSum_s; always @(posedge iCLK) begin buf_pre_sum <= pre_sum; buf_larger_exp <= larger_exp; buf_A_e_zero <= (A_e == 8'b0); buf_B_e_zero <= (B_e == 8'b0); buf_A <= iA; buf_B <= iB; buf_oSum_s <= A_larger ? A_s : B_s; end // Convert to positive fraction and a sign bit. wire [36:0] pre_frac; assign pre_frac = buf_pre_sum; // Determine output fraction and exponent change with position of first 1. wire [17:0] oSum_f; wire [7:0] shft_amt; assign shft_amt = pre_frac[36] ? 8'd0 : pre_frac[35] ? 8'd1 : pre_frac[34] ? 8'd2 : pre_frac[33] ? 8'd3 : pre_frac[32] ? 8'd4 : pre_frac[31] ? 8'd5 : pre_frac[30] ? 8'd6 : pre_frac[29] ? 8'd7 : pre_frac[28] ? 8'd8 : pre_frac[27] ? 8'd9 : pre_frac[26] ? 8'd10 : pre_frac[25] ? 8'd11 : pre_frac[24] ? 8'd12 : pre_frac[23] ? 8'd13 : pre_frac[22] ? 8'd14 : pre_frac[21] ? 8'd15 : pre_frac[20] ? 8'd16 : pre_frac[19] ? 8'd17 : pre_frac[18] ? 8'd18 : pre_frac[17] ? 8'd19 : pre_frac[16] ? 8'd20 : pre_frac[15] ? 8'd21 : pre_frac[14] ? 8'd22 : pre_frac[13] ? 8'd23 : pre_frac[12] ? 8'd24 : pre_frac[11] ? 8'd25 : pre_frac[10] ? 8'd26 : pre_frac[9] ? 8'd27 : pre_frac[8] ? 8'd28 : pre_frac[7] ? 8'd29 : pre_frac[6] ? 8'd30 : pre_frac[5] ? 8'd31 : pre_frac[4] ? 8'd32 : pre_frac[3] ? 8'd33 : pre_frac[2] ? 8'd34 : pre_frac[1] ? 8'd35 : pre_frac[0] ? 8'd36 : 8'd37; wire [53:0] pre_frac_shft, uflow_shift; // the shift +1 is because high order bit is not stored, but implied assign pre_frac_shft = {pre_frac, 17'b0} << (shft_amt+1); //? shft_amt+1 assign uflow_shift = {pre_frac, 17'b0} << (shft_amt); //? shft_amt for overflow assign oSum_f = pre_frac_shft[53:36]; wire [7:0] oSum_e; assign oSum_e = buf_larger_exp - shft_amt + 8'b1; // Detect underflow wire underflow; // this incorrectly sets uflow for 10-10.1 //assign underflow = ~oSum_e[7] && buf_larger_exp[7] && (shft_amt != 8'b0); // if top bit of matissa is not set, then denorm assign underflow = ~uflow_shift[53]; always @(posedge iCLK) begin oSum <= (buf_A_e_zero && buf_B_e_zero) ? 27'b0 : buf_A_e_zero ? buf_B : buf_B_e_zero ? buf_A : underflow ? 27'b0 : (pre_frac == 0) ? 27'b0 : {buf_oSum_s, oSum_e, oSum_f}; end //output update endmodule /// end /////////////////////////////////////////////////////////////////////
graphics.c
//////////////////////////////////////////////////////////////////////// // GRAPHICS.C // Responsible for sending user input from serial input commands // Compile with gcc graphics.c -o graphics // (Priya Kattappurath, Michael Rivera, Caitlin Stanton) //////////////////////////////////////////////////////////////////////// #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <fcntl.h> #include <math.h> #include <sys/types.h> #include <string.h> // interprocess comm #include <sys/ipc.h> #include <sys/shm.h> #include <sys/mman.h> #include <time.h> // network stuff #include <sys/socket.h> #include <netinet/in.h> #include <netdb.h> #include "address_map_arm_brl4.h" // fixed point #define float2fix30(a) ((int)((a)*1073741824)) // 2^30 #define SWAP(X, Y) \ do \ { \ int temp = X; \ X = Y; \ Y = temp; \ } while (0) // shift fraction to 32-bit sound #define fix2audio28(a) (a << 4) // shift fraction to 16-bit sound #define fix2audio16(a) (a >> 12) /* function prototypes */ void VGA_text(int, int, char *); void VGA_text_clear(); void VGA_box(int, int, int, int, short); void VGA_line(int, int, int, int, short); void VGA_disc(int, int, int, short); int VGA_read_pixel(int, int); int video_in_read_pixel(int, int); void draw_delay(void); // 8-bit line color choices #define pink (0xe6) #define red (0xe1) #define orange (0xf1) #define green (0x59) #define blue (0x32) #define purple (0xaa) #define white (0xff) #define black (0x00) int colors[] = {pink, red, orange, green, blue, purple, white, black}; // pixel macro // !!!PACKED VGA MEMORY!!! #define VGA_PIXEL(x, y, color) \ do \ { \ char *pixel_ptr; \ pixel_ptr = (char *)vga_pixel_ptr + ((y)*640) + (x); \ *(char *)pixel_ptr = (color); \ } while (0) // virtual to real address pointers volatile unsigned int *reset_pio; volatile unsigned int *lsystem_char; volatile char *axiom; volatile unsigned int *iterations; volatile unsigned int *length; volatile unsigned int *lsystem; volatile unsigned int *start_x; volatile unsigned int *start_y; volatile unsigned int *timer_counter; volatile char *color; // phase accumulator // fixed pt macros suitable for 32-bit sound typedef signed int fix28; // drum-specific multiply macros simulated by shifts #define times0pt5(a) ((a) >> 1) #define times0pt25(a) ((a) >> 2) #define times2pt0(a) ((a) << 1) #define times4pt0(a) ((a) << 2) #define times0pt9998(a) ((a) - ((a) >> 12)) //>>10 #define times0pt9999(a) ((a) - ((a) >> 13)) //>>10 #define times0pt999(a) ((a) - ((a) >> 10)) //>>10 #define MIN(X, Y) (((X) < (Y)) ? (X) : (Y)) //multiply two fixed 4:28 #define multfix28(a, b) ((fix28)((((signed long long)(a)) * ((signed long long)(b))) >> 28)) //#define multfix28(a,b) ((fix28)((( ((short)((a)>>17)) * ((short)((b)>>17)) )))) #define float2fix28(a) ((fix28)((a)*268435456.0f)) // 2^28 #define fix2float28(a) ((float)(a) / 268435456.0f) #define int2fix28(a) ((a) << 28) #define fix2int28(a) ((a) >> 28) #define times_rho(a, b) (multfix28(a, float2fix28(MIN(0.49, 0.06 + fix2float28(multfix28((b >> 1), (b >> 1))))))) //>>2 // shift fraction to 32-bit sound #define fix2audio28(a) (a << 4) // reset pio #define RESET_START 0x00000000 #define RESET_END 0x0000000f #define RESET_SPAN 0x00000010 // lsystem char pio #define LSYSTEM_CHAR_START 0x00000010 #define LSYSTEM_CHAR_END 0x0000001f #define LSYSTEM_CHAR_SPAN 0x00000010 // axiom pio #define AXIOM_START 0x00000030 #define AXIOM_END 0x0000003f #define AXIOM_SPAN 0x00000010 // iterations pio #define ITERATIONS_START 0x00000040 #define ITERATIONS_END 0x0000004f #define ITERATIONS_SPAN 0x00000010 // length pio #define LENGTH_START 0x00000050 #define LENGTH_END 0x0000005f #define LENGTH_SPAN 0x00000010 // lsystem pio #define LSYSTEM_START 0x00000060 #define LSYSTEM_END 0x0000006f #define LSYSTEM_SPAN 0x00000010 // start_x pio #define START_X_START 0x00000070 #define START_X_END 0x0000007f #define START_X_SPAN 0x00000010 // start_y pio #define START_Y_START 0x00000080 #define START_Y_END 0x0000008f #define START_Y_SPAN 0x00000010 // timer counter pio #define TIMER_START 0x00000090 #define TIMER_END 0x0000009f #define TIMER_SPAN 0x00000010 // color pio #define COLOR_START 0x00000100 #define COLOR_END 0x0000010f #define COLOR_SPAN 0x00000010 // the light weight buss base void *h2p_lw_virtual_base; // pixel buffer volatile unsigned int *vga_pixel_ptr = NULL; void *vga_pixel_virtual_base; // character buffer volatile unsigned int *vga_char_ptr = NULL; void *vga_char_virtual_base; // /dev/mem file descriptor int fd; // shared memory key_t mem_key = 0xf0; int shared_mem_id; int *shared_ptr; /// int string_length(volatile char *s) { int c = 0; while (*s != '\0') { c++; *s++; } return c; } int main(void) { // Declare volatile pointers to I/O registers (volatile // means that IO load and store instructions will be used // to access these pointer locations, // instead of regular memory loads and stores) uint64_t diff; struct timespec start, end; //int i; // === shared memory ======================= // with video process shared_mem_id = shmget(mem_key, 100, IPC_CREAT | 0666); shared_ptr = shmat(shared_mem_id, NULL, 0); // === need to mmap: ======================= // FPGA_CHAR_BASE // FPGA_ONCHIP_BASE // HW_REGS_BASE // === get FPGA addresses ================== // Open /dev/mem if ((fd = open("/dev/mem", (O_RDWR | O_SYNC))) == -1) { printf("ERROR: could not open \"/dev/mem\"...\n"); return (1); } // get virtual addr that maps to physical h2p_lw_virtual_base = mmap(NULL, HW_REGS_SPAN, (PROT_READ | PROT_WRITE), MAP_SHARED, fd, HW_REGS_BASE); if (h2p_lw_virtual_base == MAP_FAILED) { printf("ERROR: mmap1() failed...\n"); close(fd); return (1); } //map reset PIO port reset_pio = (volatile unsigned int *)(h2p_lw_virtual_base + RESET_START); lsystem_char = (volatile unsigned int *)(h2p_lw_virtual_base + LSYSTEM_CHAR_START); axiom = (volatile char *)(h2p_lw_virtual_base + AXIOM_START); iterations = (volatile unsigned int *)(h2p_lw_virtual_base + ITERATIONS_START); length = (volatile unsigned int *)(h2p_lw_virtual_base + LENGTH_START); lsystem = (volatile unsigned int *)(h2p_lw_virtual_base + LSYSTEM_START); start_x = (volatile unsigned int *)(h2p_lw_virtual_base + START_X_START); start_y = (volatile unsigned int *)(h2p_lw_virtual_base + START_Y_START); timer_counter = (volatile unsigned int *)(h2p_lw_virtual_base + TIMER_START); color = (volatile char *)(h2p_lw_virtual_base + COLOR_START); // address to resolution register //res_reg_ptr =(unsigned int *)(h2p_lw_virtual_base + // resOffset); //addr to vga status //stat_reg_ptr = (unsigned int *)(h2p_lw_virtual_base + // statusOffset); // === get VGA char addr ===================== // get virtual addr that maps to physical vga_char_virtual_base = mmap(NULL, FPGA_CHAR_SPAN, (PROT_READ | PROT_WRITE), MAP_SHARED, fd, FPGA_CHAR_BASE); if (vga_char_virtual_base == MAP_FAILED) { printf("ERROR: mmap2() failed...\n"); close(fd); return (1); } // Get the address that maps to the FPGA LED control vga_char_ptr = (unsigned int *)(vga_char_virtual_base); // === get VGA pixel addr ==================== // get virtual addr that maps to physical vga_pixel_virtual_base = mmap(NULL, FPGA_ONCHIP_SPAN, (PROT_READ | PROT_WRITE), MAP_SHARED, fd, FPGA_ONCHIP_BASE); if (vga_pixel_virtual_base == MAP_FAILED) { printf("ERROR: mmap3() failed...\n"); close(fd); return (1); } // Get the address that maps to the FPGA pixel buffer vga_pixel_ptr = (unsigned int *)(vga_pixel_virtual_base); char buffer[256]; //add a signal to help clear between new l-systems //high in TOP_RESET *length = 0; memset(axiom, 0, 4); while (1) { printf("What L-System would you like?\n"); //recommend default axiom, say what characters are available printf("0: Dragon Curve; 1: Sierpinski Arrowhead (1); 2: Koch Curve; 3: Sierpinski Arrowhead (2); 4: Koch Snowflake; 5: Cross; 6: Tessellated Triangle \n"); printf("L-System: "); scanf("%i", lsystem); //print default axiom and available characters based on lsystem choice if (*lsystem == 0) { printf("The default axiom for the Dragon Curve is FX\n"); printf("Rule-making characters: X,Y\n"); printf("Drawing characters: F\n"); printf("Available characters are: F,A,X,Y,+,-\n"); } else if (*lsystem == 1) { printf("The default axiom for the Sierpinski Arrowhead (1) is YF\n"); printf("Rule-making characters: X,Y\n"); printf("Drawing characters: F\n"); printf("Available characters are: F,A,X,Y,+,-\n"); } else if (*lsystem == 2) { printf("The default axiom for the Koch Curve is F\n"); printf("Rule-making characters: F\n"); printf("Drawing characters: F\n"); printf("Available characters are: F,A,X,Y,+,-\n"); } else if (*lsystem == 3) { printf("The default axiom for the Sierpinski Arrowhead (2) is A\n"); printf("Rule-making characters: A,F\n"); printf("Drawing characters: A,F\n"); printf("Available characters are: F,A,X,Y,+,-\n"); } else if (*lsystem == 4) { printf("The default axiom for the Koch Snowflake is F++F++F\n"); printf("Rule-making characters: F\n"); printf("Drawing characters: F\n"); printf("Available characters are: F,A,X,Y,+,-\n"); } else if (*lsystem == 5) { printf("The default axiom for the Cross is F+F+F+F\n"); printf("Rule-making characters: F\n"); printf("Drawing characters: F\n"); printf("Available characters are: F,A,X,Y,+,-\n"); } else if (*lsystem == 6) { printf("The default axiom for the Tessellated Triangle is F+F+F\n"); printf("Rule-making characters: F\n"); printf("Drawing characters: F\n"); printf("Available characters are: F,A,X,Y,+,-\n"); } else { printf("Please input a valid L-System\n\n"); } while (*lsystem > 6 || *length < 0) { printf("What L-System would you like?\n"); //recommend default axiom, say what characters are available printf("0: Dragon curve; 1: Sierpinski Arrowhead (1); 2: Koch Curve; 3: Sierpinski Arrowhead (2); 4: Koch Snowflake; 5: Cross; 6: Tessellated Triangle \n"); printf("L-System: "); scanf("%i", lsystem); //print default axiom and available characters based on lsystem choice if (*lsystem == 0) { printf("The default axiom for the Dragon Curve is FX\n"); printf("Rule-making characters: X,Y\n"); printf("Drawing characters: F\n"); printf("Available characters are: F,A,X,Y,+,-\n"); } else if (*lsystem == 1) { printf("The default axiom for the Sierpinski Arrowhead (1) is YF\n"); printf("Rule-making characters: X,Y\n"); printf("Drawing characters: F\n"); printf("Available characters are: F,A,X,Y,+,-\n"); } else if (*lsystem == 2) { printf("The default axiom for the Koch Curve is F\n"); printf("Rule-making characters: F\n"); printf("Drawing characters: F\n"); printf("Available characters are: F,A,X,Y,+,-\n"); } else if (*lsystem == 3) { printf("The default axiom for the Sierpinski Arrowhead (2) is A\n"); printf("Rule-making characters: A,F\n"); printf("Drawing characters: A,F\n"); printf("Available characters are: F,A,X,Y,+,-\n"); } else if (*lsystem == 4) { printf("The default axiom for the Koch Snowflake is F++F++F\n"); printf("Rule-making characters: F\n"); printf("Drawing characters: F\n"); printf("Available characters are: F,A,X,Y,+,-\n"); } else if (*lsystem == 5) { printf("The default axiom for the Cross is F+F+F+F\n"); printf("Rule-making characters: F\n"); printf("Drawing characters: F\n"); printf("Available characters are: F,A,X,Y,+,-\n"); } else if (*lsystem == 6) { printf("The default axiom for the Tessellated Triangle is F+F+F\n"); printf("Rule-making characters: F\n"); printf("Drawing characters: F\n"); printf("Available characters are: F,A,X,Y,+,-\n"); } else { printf("Please input a valid L-System\n\n"); } } printf("Axiom: "); scanf("%s", &axiom[0]); printf("Number of iterations: "); scanf("%i", iterations); printf("Length of line: "); scanf("%i", length); if (*lsystem == 0) { while (*length < 3 || *length > 31) { printf("Length of line: "); scanf("%i", length); } } else if (*lsystem == 1 || *lsystem == 2 || *lsystem == 3) { while (*length < 7 || *length > 31) { printf("Length of line: "); scanf("%i", length); } } printf("What color would you like?\n"); printf("Choices are: pink, red, orange, green, blue, purple, and white\n"); printf("Line color: "); scanf("%s", &buffer[0]); if (strcmp(buffer, "pink") == 0) { *color = pink; } else if (strcmp(buffer, "red") == 0) { *color = red; } else if (strcmp(buffer, "orange") == 0) { *color = orange; } else if (strcmp(buffer, "green") == 0) { *color = green; } else if (strcmp(buffer, "blue") == 0) { *color = blue; } else if (strcmp(buffer, "purple") == 0) { *color = purple; } else if (strcmp(buffer, "white") == 0) { *color = white; } else { printf("Invalid color choice, default white chosen\n"); *color = white; } printf("Starting x coordinate (between 1 and 638): "); scanf("%i", start_x); while (*start_x < 1 || *start_x > 638) { printf("Starting x coordinate (between 1 and 638): "); scanf("%i", start_x); } printf("Starting y coordinate (between 1 and 478): "); scanf("%i", start_y); while (*start_y < 1 || *start_y > 478) { printf("Starting y coordinate (between 1 and 478): "); scanf("%i", start_y); } VGA_box(0, 0, 639, 479, black); *reset_pio = 1; scanf("%s", &buffer[0]); *reset_pio = 0; printf("reset sent\n"); int wait = 0; while (wait != *timer_counter) { wait = *timer_counter; }; printf("Time to visualize: %f\n\n", (float)(*timer_counter * (pow(10, -9)) * 20)); *buffer = 0; while (*buffer != 4) { printf("1: Zoom in; 2: Zoom out; 3: Pan to new coordinates; 4: Quit zoom/pan functionality\n"); scanf("%i", buffer); if (*buffer == 4) { printf("\n"); break; } else { if (*buffer == 1) { //zoom in *length = *length * 2; if (*length <= 2) { *length == 3; } VGA_box(0, 0, 639, 479, black); } if (*buffer == 2) { //zoom in *length = *length / 2; if (*length >= 32) { *length == 31; } VGA_box(0, 0, 639, 479, black); } if (*buffer == 3) { printf("Starting x coordinate (between 1 and 638): "); scanf("%i", start_x); while (*start_x < 1 || *start_x > 638) { printf("Starting x coordinate (between 1 and 638): "); scanf("%i", start_x); } printf("Starting y coordinate (between 1 and 478): "); scanf("%i", start_y); while (*start_y < 1 || *start_y > 478) { printf("Starting y coordinate (between 1 and 478): "); scanf("%i", start_y); } VGA_box(0, 0, 639, 479, black); } *reset_pio = 1; scanf("%s", &buffer[0]); *reset_pio = 0; printf("reset sent\n"); wait = 0; while (wait != *timer_counter) { wait = *timer_counter; }; printf("Time to visualize: %f\n\n", (float)(*timer_counter * (pow(10, -9)) * 20)); *buffer = 0; } } } // end while(1) } // end main /**************************************************************************************** * Subroutine to read a pixel from the VGA monitor ****************************************************************************************/ int VGA_read_pixel(int x, int y) { char *pixel_ptr; pixel_ptr = (char *)vga_pixel_ptr + ((y)*640) + (x); return *pixel_ptr; } /**************************************************************************************** * Subroutine to send a string of text to the VGA monitor ****************************************************************************************/ void VGA_text(int x, int y, char *text_ptr) { volatile char *character_buffer = (char *)vga_char_ptr; // VGA character buffer int offset; /* assume that the text string fits on one line */ offset = (y << 7) + x; while (*(text_ptr)) { // write to the character buffer *(character_buffer + offset) = *(text_ptr); ++text_ptr; ++offset; } } /**************************************************************************************** * Subroutine to clear text to the VGA monitor ****************************************************************************************/ void VGA_text_clear() { volatile char *character_buffer = (char *)vga_char_ptr; // VGA character buffer int offset, x, y; for (x = 0; x < 79; x++) { for (y = 0; y < 59; y++) { /* assume that the text string fits on one line */ offset = (y << 7) + x; // write to the character buffer *(character_buffer + offset) = ' '; } } } /**************************************************************************************** * Draw a filled rectangle on the VGA monitor ****************************************************************************************/ #define SWAP(X, Y) \ do \ { \ int temp = X; \ X = Y; \ Y = temp; \ } while (0) void VGA_box(int x1, int y1, int x2, int y2, short pixel_color) { char *pixel_ptr; int row, col; /* check and fix box coordinates to be valid */ if (x1 > 639) x1 = 639; if (y1 > 479) y1 = 479; if (x2 > 639) x2 = 639; if (y2 > 479) y2 = 479; if (x1 < 0) x1 = 0; if (y1 < 0) y1 = 0; if (x2 < 0) x2 = 0; if (y2 < 0) y2 = 0; if (x1 > x2) SWAP(x1, x2); if (y1 > y2) SWAP(y1, y2); for (row = y1; row <= y2; row++) for (col = x1; col <= x2; ++col) { //640x480 VGA_PIXEL(col, row, pixel_color); //pixel_ptr = (char *)vga_pixel_ptr + (row<<10) + col ; // set pixel color //*(char *)pixel_ptr = pixel_color; } } /**************************************************************************************** * Draw a filled circle on the VGA monitor ****************************************************************************************/ void VGA_disc(int x, int y, int r, short pixel_color) { char *pixel_ptr; int row, col, rsqr, xc, yc; rsqr = r * r; for (yc = -r; yc <= r; yc++) for (xc = -r; xc <= r; xc++) { col = xc; row = yc; // add the r to make the edge smoother if (col * col + row * row <= rsqr + r) { col += x; // add the center point row += y; // add the center point //check for valid 640x480 if (col > 639) col = 639; if (row > 479) row = 479; if (col < 0) col = 0; if (row < 0) row = 0; VGA_PIXEL(col, row, pixel_color); //pixel_ptr = (char *)vga_pixel_ptr + (row<<10) + col ; // set pixel color //nanosleep(&delay_time, NULL); //draw_delay(); //*(char *)pixel_ptr = pixel_color; } } } // ============================================= // === Draw a line // ============================================= //plot a line //at x1,y1 to x2,y2 with color //Code is from David Rodgers, //"Procedural Elements of Computer Graphics",1985 void VGA_line(int x1, int y1, int x2, int y2, short c) { int e; signed int dx, dy, j, temp; signed int s1, s2, xchange; signed int x, y; char *pixel_ptr; /* check and fix line coordinates to be valid */ if (x1 > 639) x1 = 639; if (y1 > 479) y1 = 479; if (x2 > 639) x2 = 639; if (y2 > 479) y2 = 479; if (x1 < 0) x1 = 0; if (y1 < 0) y1 = 0; if (x2 < 0) x2 = 0; if (y2 < 0) y2 = 0; x = x1; y = y1; //take absolute value if (x2 < x1) { dx = x1 - x2; s1 = -1; } else if (x2 == x1) { dx = 0; s1 = 0; } else { dx = x2 - x1; s1 = 1; } if (y2 < y1) { dy = y1 - y2; s2 = -1; } else if (y2 == y1) { dy = 0; s2 = 0; } else { dy = y2 - y1; s2 = 1; } xchange = 0; if (dy > dx) { temp = dx; dx = dy; dy = temp; xchange = 1; } e = ((int)dy << 1) - dx; for (j = 0; j <= dx; j++) { //video_pt(x,y,c); //640x480 VGA_PIXEL(x, y, c); //pixel_ptr = (char *)vga_pixel_ptr + (y<<10)+ x; // set pixel color //*(char *)pixel_ptr = c; if (e >= 0) { if (xchange == 1) x = x + s1; else y = y + s2; e = e - ((int)dx << 1); } if (xchange == 1) y = y + s2; else x = x + s1; e = e + ((int)dy << 1); } } /////////////////////////////////////////////
lsystem.py
# //////////////////////////////////////////////////////////////////////// # LSYSTEM.PY # Baseline Python code # Computes and graphs entire L-Systems # (Priya Kattappurath, Michael Rivera, Caitlin Stanton) # //////////////////////////////////////////////////////////////////////// import pygame import math import random import time # other = variables # F,A = move n forward # - = turn left by angle # + = turn right by angle rules = {} # Have to comment out rules, axioms, and angles for all L-Systems but one # The following are the seven L-Systems for our project # rules['X'] = 'X+YF+' # Dragon curve # rules['Y'] = '-FX-Y' # axiom = 'FX' # angle = 90 # rules['X'] = 'YF+XF+Y' # Sierpinski arrowhead (1) # rules['Y'] = 'XF-YF-X' # axiom = 'YF' # angle = 60 # rules['A'] = '+F-A-F+' # Sierpinski arrowhead (2) # rules['F'] = '-A+F+A-' # axiom = 'A' # angle = 60 # rules['F'] = 'F+F--F+F' # Koch curve # axiom = 'F' # angle = 60 # rules['F'] = 'F-F++F-F' # Koch snowflake # axiom = 'F++F++F' # angle = 60 # rules['F'] = 'F+FF++F+F' # Cross # axiom = 'F+F+F+F' # angle = 90 # rules['F'] = 'F-F+F' # Tessellated triangle # axiom = 'F+F+F' # angle = 120 iterations = 12 # number of iterations step = 7 # step size / line length angleoffset = 90 size = width, height = 1920, 1080 # display with/height pygame.init() # init display screen = pygame.display.set_mode(size) # open screen # startpos = 100, height - 225 # startpos = 50, height / 2 - 50 # startpos = width / 2, height / 2 startpos = width / 2 - 200, height / 2 # startpos = 100, height / 2 # startpos = 10, 10 def applyRule(input): output = "" for rule, result in rules.items( ): # applying the rule by checking the current char against it if (input == rule): output = result # Rule 1 break else: output = input # else ( no rule set ) output = the current char -> no rule was applied return output def processString(oldStr): newstr = "" for character in oldStr: newstr = newstr + applyRule(character) # build the new string return newstr def createSystem(numIters, axiom): startString = axiom endString = "" for i in range(numIters): # iterate with appling the rules print("Iteration: {0}".format(i)) endString = processString(startString) startString = endString return endString def polar_to_cart(theta, r, offx, offy): x = r * math.cos(math.radians(theta)) y = r * math.sin(math.radians(theta)) return tuple([x + y for x, y in zip((int(x), int(y)), (offx, offy))]) def cart_to_polar(x, y): return (math.degrees(math.atan(y / x)), math.sqrt(math.pow(x, 2) + math.pow(y, 2))) def drawTree(input, oldpos): a = 0 # angle i = 0 # counter for processcalculation processOld = 0 # old process newpos = oldpos color = (255, 255, 255) linesize = 1 for character in input: # process for drawing the l-system by writing the string to the screen i += 1 # print process in percent process = i * 100 / len(input) if not process == processOld: # print(process, "%") processOld = process if character == 'A': # magic happens here newpos = polar_to_cart(a + angleoffset, step, *oldpos) pygame.draw.line(screen, color, oldpos, newpos, linesize) oldpos = newpos elif character == 'F': newpos = polar_to_cart(a + angleoffset, step, *oldpos) pygame.draw.line(screen, color, oldpos, newpos, linesize) oldpos = newpos elif character == '+': a += angle elif character == '-': a -= angle if __name__ == '__main__': start = time.time() tree = (createSystem(iterations, axiom)) # print(len(tree)) drawTree(tree, startpos) end = time.time() pygame.display.flip() pygame.image.save(screen, "screenshot.png") print("Finished in " + str(end - start) + " seconds") while (1): pass exit() # uncommand
translated_python.c
//////////////////////////////////////////////////////////////////////// // TRANSLATED_PYTHON.C // Baseline C code from lsystem.py // Computes and graphs entire L-Systems // Compile with gcc translated_python.c -o fp1 -lm // Must have ECE 5760 GPU with FAST display from SRAM project loaded onto board // (Priya Kattappurath, Michael Rivera, Caitlin Stanton) //////////////////////////////////////////////////////////////////////// #include <stdio.h> #include <string.h> #include <stdlib.h> #include <unistd.h> #include <fcntl.h> #include <sys/types.h> #include <sys/ipc.h> #include <sys/shm.h> #include <sys/mman.h> #include <sys/time.h> #include <math.h> #include "address_map_arm_brl4.h" /* function prototypes */ void VGA_text(int, int, char *); void VGA_text_clear(); void VGA_box(int, int, int, int, short); void VGA_line(int, int, int, int, short); void VGA_disc(int, int, int, short); int VGA_read_pixel(int, int); int video_in_read_pixel(int, int); void draw_delay(void); // 16-bit primary colors #define red (0 + (0 << 5) + (31 << 11)) #define dark_red (0 + (0 << 5) + (15 << 11)) #define green (0 + (63 << 5) + (0 << 11)) #define dark_green (0 + (31 << 5) + (0 << 11)) #define blue (31 + (0 << 5) + (0 << 11)) #define dark_blue (15 + (0 << 5) + (0 << 11)) #define yellow (0 + (63 << 5) + (31 << 11)) #define cyan (31 + (63 << 5) + (0 << 11)) #define magenta (31 + (0 << 5) + (31 << 11)) #define black (0x0000) #define gray (15 + (31 << 5) + (51 << 11)) #define white (0xffff) int colors[] = {red, dark_red, green, dark_green, blue, dark_blue, yellow, cyan, magenta, gray, black, white}; // the light weight buss base void *h2p_lw_virtual_base; // RAM FPGA command buffer volatile unsigned int *sram_ptr = NULL; void *sram_virtual_base; // pixel buffer volatile unsigned int *vga_pixel_ptr = NULL; void *vga_pixel_virtual_base; // character buffer volatile unsigned int *vga_char_ptr = NULL; void *vga_char_virtual_base; // /dev/mem file id int fd; // pixel macro // !!!PACKED VGA MEMORY!!! #define VGA_PIXEL(x, y, color) \ do \ { \ char *pixel_ptr; \ pixel_ptr = (char *)vga_pixel_ptr + ((y)*640) + (x); \ *(char *)pixel_ptr = (color); \ } while (0) char *applyRule_DragonCurve(char input) { char tmp[1000000]; switch (input) { case 'X': { strcpy(tmp, "X+YF+"); break; } case 'Y': { strcpy(tmp, "-FX-Y"); break; } default: { strcpy(tmp, (char[2]){(char)input, '\0'}); break; } } return tmp; } char *processString_DragonCurve(char *prev) { int i = 0; char *tmp; char *check; int length = strlen(prev); for (i = 0; i < length; i++) { check = prev; tmp = applyRule_DragonCurve(prev[i]); strcat(prev, tmp); } return prev; } char *createSystem_DragonCurve(int numIters, char *axiom) { char start[1000000]; strcpy(start, axiom); char end[1000000]; *end = ""; int i = 0; char *check; for (i = 0; i < numIters; i++) { check = processString_DragonCurve(start); *start = end; } return start; } void draw_DragonCurve(char *input, int old_x, int old_y) { int a = 0; // 0 degrees is straight up vertically int length = 10; int new_x = old_x; int new_y = old_y; int i = 0; char *check = input; printf("GRAPHING STRING: "); while (*check != '\0') printf("%c", *check++); printf("\n"); for (i = 0; i < strlen(input); i++) { if (input[i] == 'X') { continue; } else if (input[i] == 'Y') { continue; } else if (input[i] == 'F') { if (a % 360 == 0) { VGA_line(new_x, new_y, new_x, new_y - length, red); new_x = new_x; new_y = new_y - length; } else if (a % 270 == 0) { VGA_line(new_x, new_y, new_x - length, new_y, yellow); new_x = new_x - length; new_y = new_y; } else if (a % 180 == 0) { VGA_line(new_x, new_y, new_x, new_y + length, red); new_x = new_x; new_y = new_y + length; } else if (a % 90 == 0) { VGA_line(new_x, new_y, new_x + length, new_y, yellow); new_x = new_x + length; new_y = new_y; } } else if (input[i] == '+') { a = a + 90; } else if (input[i] == '-') { a = a - 90; } } } // measure time struct timeval t1, t2; double elapsedTime; struct timespec delay_time; int main(void) { delay_time.tv_nsec = 10; delay_time.tv_sec = 0; // Declare volatile pointers to I/O registers (volatile // means that IO load and store instructions will be used // to access these pointer locations, // instead of regular memory loads and stores) // === need to mmap: ======================= // FPGA_CHAR_BASE // FPGA_ONCHIP_BASE // HW_REGS_BASE // === get FPGA addresses ================== // Open /dev/mem if ((fd = open("/dev/mem", (O_RDWR | O_SYNC))) == -1) { printf("ERROR: could not open \"/dev/mem\"...\n"); return (1); } // get virtual addr that maps to physical // for light weight bus h2p_lw_virtual_base = mmap(NULL, HW_REGS_SPAN, (PROT_READ | PROT_WRITE), MAP_SHARED, fd, HW_REGS_BASE); if (h2p_lw_virtual_base == MAP_FAILED) { printf("ERROR: mmap1() failed...\n"); close(fd); return (1); } // === get VGA char addr ===================== // get virtual addr that maps to physical vga_char_virtual_base = mmap(NULL, FPGA_CHAR_SPAN, (PROT_READ | PROT_WRITE), MAP_SHARED, fd, FPGA_CHAR_BASE); if (vga_char_virtual_base == MAP_FAILED) { printf("ERROR: mmap2() failed...\n"); close(fd); return (1); } // Get the address that maps to the character vga_char_ptr = (unsigned int *)(vga_char_virtual_base); // === get VGA pixel addr ==================== // get virtual addr that maps to physical // SDRAM vga_pixel_virtual_base = mmap(NULL, FPGA_ONCHIP_SPAN, (PROT_READ | PROT_WRITE), MAP_SHARED, fd, SDRAM_BASE); //SDRAM_BASE if (vga_pixel_virtual_base == MAP_FAILED) { printf("ERROR: mmap3() failed...\n"); close(fd); return (1); } // Get the address that maps to the FPGA pixel buffer vga_pixel_ptr = (unsigned int *)(vga_pixel_virtual_base); // === get RAM FPGA parameter addr ========= sram_virtual_base = mmap(NULL, FPGA_ONCHIP_SPAN, (PROT_READ | PROT_WRITE), MAP_SHARED, fd, FPGA_ONCHIP_BASE); //fp if (sram_virtual_base == MAP_FAILED) { printf("ERROR: mmap3() failed...\n"); close(fd); return (1); } // Get the address that maps to the RAM buffer sram_ptr = (unsigned int *)(sram_virtual_base); // =========================================== // clear the screen VGA_box(0, 0, 639, 479, 0x03); printf("start\n"); char input[100] = "FX"; char *tree = createSystem_DragonCurve(9, input); draw_DragonCurve(tree, 100, 500); } // end main /**************************************************************************************** * Subroutine to read a pixel from the video input ****************************************************************************************/ // int video_in_read_pixel(int x, int y){ // char *pixel_ptr ; // pixel_ptr = (char *)video_in_ptr + ((y)<<9) + (x) ; // return *pixel_ptr ; // } /**************************************************************************************** * Subroutine to read a pixel from the VGA monitor ****************************************************************************************/ int VGA_read_pixel(int x, int y) { char *pixel_ptr; pixel_ptr = (char *)vga_pixel_ptr + ((y)*640) + (x); return *pixel_ptr; } /**************************************************************************************** * Subroutine to send a string of text to the VGA monitor ****************************************************************************************/ void VGA_text(int x, int y, char *text_ptr) { volatile char *character_buffer = (char *)vga_char_ptr; // VGA character buffer int offset; /* assume that the text string fits on one line */ offset = (y << 7) + x; while (*(text_ptr)) { // write to the character buffer *(character_buffer + offset) = *(text_ptr); ++text_ptr; ++offset; } } /**************************************************************************************** * Subroutine to clear text to the VGA monitor ****************************************************************************************/ void VGA_text_clear() { volatile char *character_buffer = (char *)vga_char_ptr; // VGA character buffer int offset, x, y; for (x = 0; x < 79; x++) { for (y = 0; y < 59; y++) { /* assume that the text string fits on one line */ offset = (y << 7) + x; // write to the character buffer *(character_buffer + offset) = ' '; } } } /**************************************************************************************** * Draw a filled rectangle on the VGA monitor ****************************************************************************************/ #define SWAP(X, Y) \ do \ { \ int temp = X; \ X = Y; \ Y = temp; \ } while (0) void VGA_box(int x1, int y1, int x2, int y2, short pixel_color) { char *pixel_ptr; int row, col; /* check and fix box coordinates to be valid */ if (x1 > 639) x1 = 639; if (y1 > 479) y1 = 479; if (x2 > 639) x2 = 639; if (y2 > 479) y2 = 479; if (x1 < 0) x1 = 0; if (y1 < 0) y1 = 0; if (x2 < 0) x2 = 0; if (y2 < 0) y2 = 0; if (x1 > x2) SWAP(x1, x2); if (y1 > y2) SWAP(y1, y2); for (row = y1; row <= y2; row++) for (col = x1; col <= x2; ++col) { //640x480 VGA_PIXEL(col, row, pixel_color); //pixel_ptr = (char *)vga_pixel_ptr + (row<<10) + col ; // set pixel color //*(char *)pixel_ptr = pixel_color; } } /**************************************************************************************** * Draw a filled circle on the VGA monitor ****************************************************************************************/ void VGA_disc(int x, int y, int r, short pixel_color) { char *pixel_ptr; int row, col, rsqr, xc, yc; rsqr = r * r; for (yc = -r; yc <= r; yc++) for (xc = -r; xc <= r; xc++) { col = xc; row = yc; // add the r to make the edge smoother if (col * col + row * row <= rsqr + r) { col += x; // add the center point row += y; // add the center point //check for valid 640x480 if (col > 639) col = 639; if (row > 479) row = 479; if (col < 0) col = 0; if (row < 0) row = 0; VGA_PIXEL(col, row, pixel_color); //pixel_ptr = (char *)vga_pixel_ptr + (row<<10) + col ; // set pixel color //nanosleep(&delay_time, NULL); //draw_delay(); //*(char *)pixel_ptr = pixel_color; } } } // ============================================= // === Draw a line // ============================================= //plot a line //at x1,y1 to x2,y2 with color //Code is from David Rodgers, //"Procedural Elements of Computer Graphics",1985 void VGA_line(int x1, int y1, int x2, int y2, short c) { int e; signed int dx, dy, j, temp; signed int s1, s2, xchange; signed int x, y; char *pixel_ptr; /* check and fix line coordinates to be valid */ if (x1 > 639) x1 = 639; if (y1 > 479) y1 = 479; if (x2 > 639) x2 = 639; if (y2 > 479) y2 = 479; if (x1 < 0) x1 = 0; if (y1 < 0) y1 = 0; if (x2 < 0) x2 = 0; if (y2 < 0) y2 = 0; x = x1; y = y1; //take absolute value if (x2 < x1) { dx = x1 - x2; s1 = -1; } else if (x2 == x1) { dx = 0; s1 = 0; } else { dx = x2 - x1; s1 = 1; } if (y2 < y1) { dy = y1 - y2; s2 = -1; } else if (y2 == y1) { dy = 0; s2 = 0; } else { dy = y2 - y1; s2 = 1; } xchange = 0; if (dy > dx) { temp = dx; dx = dy; dy = temp; xchange = 1; } e = ((int)dy << 1) - dx; for (j = 0; j <= dx; j++) { //video_pt(x,y,c); //640x480 VGA_PIXEL(x, y, c); //pixel_ptr = (char *)vga_pixel_ptr + (y<<10)+ x; // set pixel color //*(char *)pixel_ptr = c; if (e >= 0) { if (xchange == 1) x = x + s1; else y = y + s2; e = e - ((int)dx << 1); } if (xchange == 1) y = y + s2; else x = x + s1; e = e + ((int)dy << 1); } } ///////////////////////////////////////////// #define NOP10() asm("nop;nop;nop;nop;nop;nop;nop;nop;nop;nop") void draw_delay(void) { NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); //16 NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); //32 NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); //48 NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); //64 NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); //68 // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); //80 // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); // NOP10(); NOP10(); NOP10(); NOP10(); //96 } /// /// ///////////////////////////////////// /// end /////////////////////////////////////