ECE 5760: Wireworld and Conway's Game of Life

By Aidan McNay and Thomas Figura

High-Level Hardware/RTL Design

Note that the modules discussed here may not be exact representations of our RTL; rather, they are high-level overviews of how our RTL is structured and how its parts interact.

Node

The most basic module in our structure is a node. This module implements the automata logic; it takes in the current state of a cell, as well as the current state of the cell's neighbors, and computes what the next state of the cell should be:

High-level diagram of a node

We initially implemented this to operate solely on states; however, with the late addition of Game of Life functionality, we included another input, wireworld_or_conway, to determine which automaton our update logic should follow. This was connected to a switch on the FPGA, allowing the user to switch between automata with ease.
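The node's behaviour can be sketched in Python as a pure function. This is a minimal model, not our RTL: the 2-bit state encoding, the polarity of wireworld_or_conway, and the choice to store live Game of Life cells as CONDUCTOR are all assumptions made for illustration.

```python
from enum import IntEnum


class State(IntEnum):
    # Assumed 2-bit encoding; the real RTL encoding is not stated in the text.
    EMPTY = 0
    HEAD = 1
    TAIL = 2
    CONDUCTOR = 3


def node(cell, neighbors, wireworld_or_conway):
    """Return the next state of `cell` given its 8 neighbors.

    wireworld_or_conway=True selects Wireworld, False selects Conway
    (assumed polarity).
    """
    if wireworld_or_conway:
        if cell == State.EMPTY:
            return State.EMPTY
        if cell == State.HEAD:
            return State.TAIL
        if cell == State.TAIL:
            return State.CONDUCTOR
        # A conductor becomes a head if exactly 1 or 2 neighbors are heads.
        heads = sum(1 for n in neighbors if n == State.HEAD)
        return State.HEAD if heads in (1, 2) else State.CONDUCTOR

    # Conway: assume any non-Empty cell counts as alive, and that live
    # cells are written back as CONDUCTOR.
    alive = sum(1 for n in neighbors if n != State.EMPTY)
    if cell != State.EMPTY:
        return State.CONDUCTOR if alive in (2, 3) else State.EMPTY
    return State.CONDUCTOR if alive == 3 else State.EMPTY
```

For example, under these assumptions a conductor with a single head next to it becomes a head in Wireworld mode, while an empty cell with exactly three live neighbors becomes alive in Conway mode.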

Iterator

Once we have our node, we can instantiate it inside an iterator. An iterator is responsible for iterating across an entire row of state, providing the new states for cells as it iterates along:

High-level diagram of an iterator

Here, we can see that the iterator pipelines the state it reads in order to feed the current state of a cell and its neighbors to the node; the iterator will always be reading two steps ahead of the cell that it's computing the next state for. This has the side benefit that by the time we write new state back to memory, we have already read it into the iterator, guaranteeing that we won't be reading back the new state we just calculated.
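The read-ahead pipeline can be modelled in Python as a three-column shift register. This is only a sketch: the state encoding is assumed, and the one-cycle read latency used to explain the "two steps ahead" offset is an assumption, not something taken from our RTL.

```python
from collections import deque

EMPTY, HEAD, TAIL, CONDUCTOR = range(4)  # assumed encoding


def wireworld_node(cell, neighbors):
    """Wireworld rule for one cell."""
    if cell == EMPTY:
        return EMPTY
    if cell == HEAD:
        return TAIL
    if cell == TAIL:
        return CONDUCTOR
    return HEAD if neighbors.count(HEAD) in (1, 2) else CONDUCTOR


def iterate_row(above, row, below):
    """Return a list of (raddr, col, next_state) for every cell in `row`.

    `above` and `below` are the adjacent rows, read at the same address.
    """
    n = len(row)
    empty_col = (EMPTY, EMPTY, EMPTY)
    window = deque([empty_col] * 3, maxlen=3)  # oldest column first
    results = []
    for raddr in range(n + 2):
        # Data requested in the previous cycle arrives now (assumed latency).
        arriving = raddr - 1
        if 0 <= arriving < n:
            window.append((above[arriving], row[arriving], below[arriving]))
        else:
            window.append(empty_col)  # beyond the row: feed Empty
        col = raddr - 2  # the cell in the middle of the window
        if 0 <= col < n:
            left, mid, right = window
            neighbors = [left[0], left[1], left[2],
                         mid[0], mid[2],
                         right[0], right[1], right[2]]
            results.append((raddr, col, wireworld_node(mid[1], neighbors)))
    return results
```

In this model the read address always leads the computed cell by two, so a value written back at `col` has necessarily already been read into the window.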

Column

Each iterator is intended to iterate across a single row of state. To update the entire state, we need a column of these iterators, each responsible for a designated row. This arrangement is shown below; the iterators are all fed data (rvals) from the same address (raddr), ensuring that they update state in lockstep and allowing the new state (wvals) to be written to the same address (waddr) using a write-enable signal (we). Here, addresses correspond to a particular column, with the value for each row in the column being read in parallel.

High-level diagram of the column datapath

Here, control logic is necessary to determine the correct addresses to read from and write to, especially noting that we need to feed in extra Empty values whenever we are at the beginning or the end of the grid to avoid wrapping around. We implement this control logic using an FSM, which has INIT states to feed in these extra Empty values, and which updates a counter to keep track of where we should be reading from and writing to.

High-level diagram of the column Finite-State Machine

Notably, we allocate one bit more than is necessary to address the entire row, to allow for iterators to take in Empty when we go beyond the row length. Whenever raddr = ROW_LEN, we inject an Empty value into our iterator pipeline, to make sure that the end of a row (and the beginning of the next row) is interpreted as Empty for the automata.
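The effect of the address counter and the Empty injection can be sketched in Python as an in-place update over a column-addressed memory. This is a simplified model under stated assumptions: the state encoding is assumed, memory read latency is ignored, rows beyond the top and bottom of the grid are treated as Empty, and the INIT states are reduced to pre-filling the pipeline.

```python
EMPTY, HEAD, TAIL, CONDUCTOR = range(4)  # assumed encoding


def wireworld_node(cell, neighbors):
    if cell == EMPTY:
        return EMPTY
    if cell == HEAD:
        return TAIL
    if cell == TAIL:
        return CONDUCTOR
    return HEAD if neighbors.count(HEAD) in (1, 2) else CONDUCTOR


def update_in_place(mem, row_len):
    """mem[addr][r] is the state of row r in column addr; updated in place."""
    num_rows = len(mem[0])
    empty_col = [EMPTY] * num_rows
    # INIT: the pipeline starts out holding Empty columns.
    prev, cur = empty_col, empty_col
    # The counter runs one address past the last column, hence the extra bit.
    for raddr in range(row_len + 1):
        # raddr == row_len is not a real column: inject Empty instead.
        nxt = list(mem[raddr]) if raddr < row_len else empty_col
        waddr = raddr - 1
        if waddr >= 0:  # write-enable is asserted only for real columns
            new_col = []
            for r in range(num_rows):
                nbrs = []
                for ci, col in enumerate((prev, cur, nxt)):
                    for rr in (r - 1, r, r + 1):
                        if ci == 1 and rr == r:
                            continue  # skip the cell itself
                        nbrs.append(col[rr] if 0 <= rr < num_rows else EMPTY)
                new_col.append(wireworld_node(cur[r], nbrs))
            mem[waddr] = new_col
        prev, cur = cur, nxt
```

Because waddr always trails raddr, every column is copied into the pipeline before it is overwritten, so the in-place result matches an update computed from an untouched copy of the grid.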

Not shown: The go signal comes from the HPS. When the HPS wants the solvers to update, it sets a register to the number of updates to perform. The solvers check this register in INIT0, progressing only when it is high (nonzero), and the register is decremented by 1 on the transition to INIT1. Lastly, the go signal sent from the register to the solvers is high only at the beginning of the horizontal blanking state, discussed below, so an update can only start at that point. Since this state takes 45 rows * 800 pixels/row = 36000 cycles, the entire state can update before horizontal blanking is over (and we lose the ability to write).
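The go mechanism can be sketched as a small Python model. The class and method names here are hypothetical; only the behaviour (a count written by the HPS, a go pulse gated on the start of the 45-row blanking state, and a decrement per started update) follows the description above.

```python
# Cycle budget of the blanking state in which updates run, from the text.
BLANKING_CYCLES = 45 * 800  # 36000


class GoRegister:
    """Hypothetical model of the HPS-written update counter."""

    def __init__(self):
        self.remaining = 0

    def hps_write(self, num_updates):
        # The HPS requests this many updates.
        self.remaining = num_updates

    def tick(self, blanking_start):
        """Return True if the solvers start an update on this cycle."""
        go = blanking_start and self.remaining > 0
        if go:
            self.remaining -= 1  # consumed on the INIT0 -> INIT1 transition
        return go
```

With two updates requested, the model starts exactly two updates, one per blanking start, and then stays idle until the HPS writes the register again.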

Mouse Interface

To interact with our design, we use a PS2 Mouse, allowing us to control it directly through the FPGA. To do this, we use an external piece of IP to interface with the PS2 protocol, giving us the current location of the mouse, as well as when the mouse is clicked:

Interface for the PS2 Mouse

The main interaction that we perform with the mouse is using it to update memory. For this, we use a write buffer, which translates the current location into a memory write when the mouse is clicked (complete with an address, a value, and a write-enable based on which row we're writing in the addressed column). In this way, the buffer is responsible for translating a given coordinate into the cell that we want to write to.

Write Buffer for the PS2 Mouse

We call it a "buffer" because there are some other underlying signals that cause mouse writes to buffer; specifically, when we receive a click, we wait until the corresponding location is read, so that we know what the next state should be (with mouse clicks cycling through all possible cell states). However, the mouse doesn't control these signals itself, but rather sniffs what is already being read, so they are omitted for brevity.
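The translation done by the write buffer can be sketched in Python. The cell size in pixels, the function name, and the dictionary layout are assumptions for illustration; the sketch also takes the cell's current state as an argument rather than modelling the wait for it to be read.

```python
NUM_STATES = 4  # 2 bits per cell
CELL_PX = 8     # hypothetical on-screen size of one cell, in pixels


def mouse_write(x, y, current_state, num_rows, num_cols):
    """Translate a click at pixel (x, y) into a memory write.

    Returns None if the click falls outside the grid.
    """
    col, row = x // CELL_PX, y // CELL_PX
    if not (0 <= row < num_rows and 0 <= col < num_cols):
        return None
    return {
        "addr": col,                                   # addresses select a column
        "value": (current_state + 1) % NUM_STATES,     # cycle to the next state
        "we": [r == row for r in range(num_rows)],     # one write-enable per row
    }
```

For example, with 8-pixel cells a click at (17, 9) on a cell in state 3 produces a write of state 0 to address 2, with only row 1's write-enable set.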

VGA Interface

In addition to the PS2 Mouse, the other external hardware that our design interacts with is the VGA screen. To perform this interaction, we were able to re-use Professor Adams' VGA Driver that was used in Lab 2; since we already understood how this piece of IP worked, we were able to integrate it into our project smoothly (albeit modified for full 24-bit color).

The provided VGA Driver

To fully integrate this driver into our design, we needed a VGA Mapper. This module was responsible for mapping the coordinates that the VGA requested to cells in our memory, similar to the mouse write buffer. In addition, it detected whether we were near the "edge" of a cell, and always gave the color grey if so. Similarly, it always gave the color purple if the requested cell wasn't valid, as determined by the parametrized size of our cell grid.

A VGA Mapper to translate coordinates to cells
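A Python sketch of the mapper's color selection is given below. The cell size, the one-pixel edge width, the per-state palette, and the choice to check validity before the edge test are all assumptions; only the grey edge and the purple out-of-grid color come from the description above.

```python
GREY = (128, 128, 128)
PURPLE = (128, 0, 128)
CELL_PX = 8  # hypothetical on-screen size of one cell, in pixels

# Hypothetical palette, indexed by an assumed 2-bit state encoding.
STATE_COLORS = {
    0: (0, 0, 0),       # Empty
    1: (0, 0, 255),     # Electron head
    2: (255, 0, 0),     # Electron tail
    3: (255, 255, 0),   # Conductor
}


def vga_color(x, y, grid):
    """Return the 24-bit (r, g, b) color for pixel (x, y); grid[row][col]."""
    col, row = x // CELL_PX, y // CELL_PX
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        return PURPLE  # requested cell is outside the grid
    on_edge = (x % CELL_PX in (0, CELL_PX - 1)
               or y % CELL_PX in (0, CELL_PX - 1))
    if on_edge:
        return GREY    # grid lines between cells
    return STATE_COLORS[grid[row][col]]
```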

Coordinate Mapping

Normally, one could hook up the mouse interface directly to the write buffer; similarly, one could connect the VGA driver to the mapper. However, in our design, we wanted to implement zooming and panning. This requires an in-between module to translate the initial input coordinates into "mapped" coordinates that reflect any zooming or panning that has occurred.

For our design, zooming is done through the use of two FPGA buttons. Panning is done by holding the right mouse button while moving the mouse, so that users can drag to move across the grid of cells. Our coordinate mapper module takes these as inputs to keep track of the current zoom and pan amounts. From here, it can translate the input VGA and Mouse coordinates into their mapped versions to reflect the zooming and panning that the user has done. These will be fed into the mouse write buffer and VGA mapper instead, so that they operate on the desired, mapped coordinates.

The coordinate mapper, to map coordinates according to current zoom and pan levels
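One way to picture the coordinate mapper is the Python sketch below. The power-of-two zoom, the zoom limit, the sign convention for dragging, and all names are assumptions made for illustration rather than a description of our module.

```python
MAX_ZOOM = 4  # hypothetical limit on the zoom level


class CoordMapper:
    def __init__(self):
        self.zoom = 0    # each level doubles the on-screen size of a cell
        self.pan_x = 0   # pan offsets, in mapped coordinates
        self.pan_y = 0

    def zoom_in(self):   # first FPGA button
        self.zoom = min(self.zoom + 1, MAX_ZOOM)

    def zoom_out(self):  # second FPGA button
        self.zoom = max(self.zoom - 1, 0)

    def drag(self, dx, dy, right_held):
        # Panning only happens while the right mouse button is held.
        if right_held:
            self.pan_x -= dx
            self.pan_y -= dy

    def map(self, x, y):
        """Translate a screen (VGA or mouse) coordinate to a mapped one."""
        return (x >> self.zoom) + self.pan_x, (y >> self.zoom) + self.pan_y
```

Because the same `map` is applied to both the VGA and mouse coordinates, a click always lands on the cell that is drawn under the cursor, whatever the current zoom and pan.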

HPS Interface

The last interface that our automata state has is to the Hard Processor System (HPS). Here, we want to allow user code to also be able to read and write state. This is done using PIOs (the protocol for which is described with the user program). The HPS Interface is responsible for turning the signals from these PIOs into reads and writes to memory, using an address-value combination similar to the one used with other modules. Notably, the HPS operates on absolute cell values, not screen coordinates, so no translation is necessary.

HPS Interface for User read/writes.

At a high level, when the HPS interface receives a request on the request PIOs, it performs the action, and updates resp_row and resp_col to match req_row and req_col when the action is done, indicating that the HPS can move on. More detail is given on the Program page.
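The handshake can be sketched in Python as follows. Only req_row, req_col, resp_row, and resp_col are names from the description above; req_val, req_we, resp_val, the reset values, and the helper functions are hypothetical, and the actual protocol is the one described on the Program page.

```python
class HpsInterface:
    """Hypothetical model of the FPGA side of the PIO handshake."""

    def __init__(self, grid):
        self.grid = grid                   # grid[row][col]
        self.req_row = self.req_col = -1   # -1: nothing requested yet
        self.req_val = 0
        self.req_we = False
        self.resp_row = self.resp_col = -1
        self.resp_val = 0

    def step(self):
        """One opportunity for the FPGA to service a pending request."""
        req = (self.req_row, self.req_col)
        if req == (self.resp_row, self.resp_col):
            return  # nothing new to do
        if self.req_we:
            self.grid[self.req_row][self.req_col] = self.req_val
        self.resp_val = self.grid[self.req_row][self.req_col]
        # Matching the response to the request signals completion.
        self.resp_row, self.resp_col = req


def hps_access(iface, row, col, value=None, max_polls=10):
    """HPS side: issue a read (value=None) or a write, then poll."""
    iface.req_we = value is not None
    iface.req_val = 0 if value is None else value
    iface.req_row, iface.req_col = row, col
    for _ in range(max_polls):
        iface.step()  # stands in for the FPGA running concurrently
        if (iface.resp_row, iface.resp_col) == (row, col):
            return iface.resp_val
    raise TimeoutError("no response from the FPGA model")
```

Note that in this sketch two back-to-back requests to the same cell are indistinguishable, so a real protocol would need an extra mechanism for that case; it is not modelled here.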

Memory Organization

A large component of our design is how we're storing the automata's state; we do this with M10K blocks. Each cell needs 2 bits to represent 1 of 4 states, meaning that with the M10K capacity, we can store the state of up to 390 * 4K = ~1.5M cells (although we limited ourselves to 1 million cells for our final implementation, on a 1000x1000 grid).

Initially, we assigned each iterator to an individual M10K block; however, this limited our aspect ratio, as we could only have 390 iterators, and therefore 390 rows in our grid. To achieve our desired 1000x1000 grid, we mapped 8 iterators to 1 M10K block. To do this, we had a write buffer before the M10K block:

This is implemented as a memory wrapper, surrounding each M10K block. Note how the values being read have the option to be registered as the next values to be written. Additionally, we keep track of the write address, and have the option to use it as the read address, in the case where we need the remaining data.

A memory wrapper for varying write widths
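One reading of the wrapper is a read-modify-write on a word shared by 8 iterators, sketched in Python below. The word layout (eight 2-bit lanes), the per-lane write-enable, and all names are assumptions; the sketch only illustrates how reading at the write address recovers the data that a narrower write does not supply.

```python
LANES = 8          # iterators sharing one memory block
BITS = 2           # bits per cell
MASK = (1 << BITS) - 1


def pack(values):
    """Pack LANES cell values into one memory word."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & MASK) << (BITS * i)
    return word


def unpack(word):
    """Unpack one memory word into LANES cell values."""
    return [(word >> (BITS * i)) & MASK for i in range(LANES)]


class MemoryWrapper:
    def __init__(self, depth):
        self.words = [0] * depth

    def read(self, addr):
        return unpack(self.words[addr])

    def write(self, addr, values, lane_we):
        # Use the write address as the read address to fetch the lanes
        # that this write does not cover, then write the merged word.
        old = self.read(addr)
        merged = [v if we else o for v, o, we in zip(values, old, lane_we)]
        self.words[addr] = pack(merged)
```

A full-width write (all lanes enabled) replaces the whole word, while a single-lane write, such as one coming from the mouse, leaves the other seven cells at that address unchanged.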

System Composition

The primary difficulty in composing the system is organizing our memory access pattern. We have many elements that want to be able to read and write state:

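To achieve this, we utilize the different VGA states. As the VGA iterates across the screen, the protocol requires a "blanking" region at the end of every row (which we call vertical blanking), as well as for a few rows after the end of the screen (which we call horizontal blanking), during which no data is sent to the VGA. Note that VGA documentation conventionally uses these two names the other way around; we keep our naming throughout this page: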

A diagram of the different VGA states

During these states, we define which modules are able to read and write (with the primary constraint being to have the VGA able to read data during the active regime). Modules must be aware of which state the VGA Driver is in, and not assume that their reads/writes will be correct if they're not in a state when they can read/write.
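The VGA states can be sketched in Python as a classifier over the driver's counters. The 800 pixels per row and the 45 blanking rows come from the text above; the 640x480 active area and 525 total rows are assumed from standard 640x480 timing, and the state names are neutral placeholders rather than the names used in our RTL.

```python
H_ACTIVE, H_TOTAL = 640, 800   # visible pixels per row, total cycles per row
V_ACTIVE, V_TOTAL = 480, 525   # visible rows, total rows per frame


def vga_state(h_count, v_count):
    """Classify the current position of the VGA driver's counters."""
    if v_count >= V_ACTIVE:
        return "FRAME_END_BLANK"  # the rows after the end of the screen
    if h_count >= H_ACTIVE:
        return "ROW_END_BLANK"    # the blanking at the end of each row
    return "ACTIVE"               # the VGA must be able to read here


def frame_end_blank_cycles():
    """Cycles available in the blanking rows after the end of the screen."""
    return (V_TOTAL - V_ACTIVE) * H_TOTAL
```

Under these timing assumptions the blanking rows after the end of the screen last 45 * 800 = 36000 cycles, which is the budget the solvers have to finish an update.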

With this in mind, we can compose the entire system, with the VGA, Solvers, Mouse, and HPS all able to access state in memory:

The integrated Wireworld system

This includes some top-level logic that hasn't been previously discussed: