ADIOS - A Deployable Internal OScilloscope¶

ECE 5760 - Kofi Efah (kae87), Bryce Roth (ber72), Lily Yu (gly6)¶

Introduction
System Design
Creating A Custom Qsys Module
RTL Design and Testing
Software Design
Results

Introduction¶

Logic analyzers are extremely useful tools that help FPGA engineers debug their hardware designs. Some designs, however, are very difficult or nearly impossible to debug in simulation tools such as ModelSim. In these cases, the RTL must be synthesized and debugged in hardware. Debugging directly on hardware is much more taxing: you cannot use debugging statements as you would in a programming language, and it takes much longer to synthesize the hardware onto the FPGA. Thankfully, Quartus comes with the SignalTap logic analyzer, which is very useful but difficult for students to use.

The HOLA Logic Analyzer is much more lightweight than SignalTap, but requires a VGA screen to visualize waveforms. This is difficult to set up, and the user cannot easily interact with the waveforms. We extended the HOLA project to make it easy to deploy and interact with. First, we wrapped it in a Qsys module that can be imported and used in any project. Second, we use the HPS to post-process the data words and triggers so that the logic analyzer output can be viewed in a waveform viewer such as GTKWave.

System Design¶

The figure below shows how we integrate the Qsys module with the main DE1-SoC.v file and the HPS. The current implementation of HOLA instantiates the logic analyzer module in the Computer System Verilog file. With a Qsys implementation, the logic analyzer module is broken out into its own Verilog file and only needs to take in the data input and trigger mask from the Computer System module. All of the control signals and M10K read/write protocols are handled in the Qsys module. Specifically, the M10K blocks are instantiated inside the logic analyzer module and shared with the HPS, so that a .vcd file can be created from the samples the Qsys module writes to them. The HPS runs a C program we wrote to dump all the samples into a text file, which is then post-processed by a Python script into a .vcd file.

Creating a Custom Qsys Module¶

Simple Counter Example¶

This example shows how to create a simple Qsys module with ports visible to the Computer System Verilog module. We will create a simple counter module that has two external breakouts: a one bit reset input, and a 32 bit counter output. We will start with a bare Qsys project provided here.

Our basic counter module is contained in the Verilog file here. The ports of the module are what become exposed to the Qsys GUI interface.

Creating a new component in Qsys opens the Component Editor window. In the first screen, set the name and display name for your module. This menu is also where you can write documentation for future reference.

In the “Files” menu, add the counter.v file into the Synthesis files.

Click the “Analyze Synthesis Files” button. This connects the behavioral model in the Verilog file to the component.

This module will have three ports visible in Qsys, corresponding to our two inputs and one output: a clock, a reset, and a 32 bit counter. We build these in the “Signals and Interfaces” menu. First, delete all the signals generated by the synthesis file analysis. Next, we create the clock input. We do not want the clock input to be visible to the Computer System, since the clock should be attached in Qsys. As a result, we add a “Clock Input” to the module instead of a “Conduit”, which we use for external signals.

We then add the “clk” signal to the Clock Input.

Next, we create our reset signal. We initialize this as a “Reset Input”.

Notice that by default, the assigned clock is set to our clock input.

We then add the reset signal to the Reset Input.

Now that both our clock and reset signals are initialized, we can create our counter output. We want the counter to be visible to the Verilog in the Computer System file, so we create a “Conduit”, which is analogous to a wire in Verilog.

The conduit's associated reset is not set by default, so we set it to the reset input we created in the previous step.

Next we create the signal for our output counter. We add a 32 bit signal.

At first, the name and width of our signal are not set properly, so we configure the width to 32 bits and the name to match the signal name in our Verilog, which is “count”. The signal type is a tag that the module designer chooses and is used for record-keeping. It has no impact on the actual design. Here we call it “output”. The direction is also improperly configured as “input”, so we change it to “output”.

Now that our component is fully set up, we can see the block diagram in the “Block Diagram” tab to confirm that our inputs and outputs are what we expect.

Click the “Finish” button at the bottom of the Component Editor window to complete the module. In the IP Catalog in the main Qsys window, the counter module should now be visible.

Double click the module to add it to the design.

Connect the clock and reset ports to the sys_clock and reset ports on the HPS respectively. Export the conduit_out port so that it is visible to the Computer System Verilog.

Click “Generate HDL” in the bottom right corner to finish the module. Once generation is complete, we will be able to see our exposed port in the Computer_System.v file inside the Computer_System/synthesis/ folder.

This is the signal we want to hook up in our Computer System. In our project we attach the counter to the onboard hex display.

Compile the final project. This produces an FPGA design that increments the hex display every clock cycle.

Creating modules to communicate with the HPS using the Avalon Slave Bus¶

Although the previous example is simple, it does not connect to the HPS. As a result, we cannot access the module via the host code. In order to communicate with the module through the HPS, we need to connect it to the HPS using Avalon Slave buses. In this example, we create an M10K block which can be read and written via the host code.

The module we use in this example instantiates a dual-read, dual-write M10K memory block and the ports necessary to create an Avalon Slave Bus. The M10K block stores 32 bit data in an 8 bit address space. The ports on the module include a 32 bit read_data and write_data, an 8 bit address, and a one bit write_enable. We include a read_enable line for debugging since the Avalon Slave Bus uses one, but it does not need to be connected to anything. We also provide the standard clock and reset ports, as well as a 32 bit “debug” port for transferring signals to the FPGA via a conduit. The Verilog can be found here.

We enter the Component Editor in the same way as we do in the counter example, and follow all the same steps until we finish creating the “Reset” signal. Next, instead of creating a Conduit, we create an “Avalon Memory Mapped Slave”.

We create an Avalon Memory Mapped Slave to interface with the HPS, which exposes two memory-mapped master buses: h2f_axi_master and h2f_lw_axi_master. Once the interface has been created, we add the five signals that we previously discussed: address, read, readdata, write, and writedata.

We now have to configure the ports to match our ports declared in Verilog. We start with the address. In our Verilog, the address input is named “address_hps” and is 8 bits. We change the name and bitwidth to match this.

We adjust the next four ports in this manner.

Once we configure the ports, we add the 32 bit debug conduit, finish the module, and add the module to Platform Designer.

We connect the ports as shown and generate the RTL. We test the implementation by running the associated C code found here.

Our C code writes an incrementing value to each address from 1 to 10. Address 0 is written with whatever the 10th output is. Writing to the M10K block from the HPS code implicitly asserts the write enable signal (write_enable_hps) for one cycle so that the write operation completes. Similarly, the write enable is set low for one cycle when the HPS initiates a read from the M10K memory.

Note in particular that the Avalon Memory Mapped Slave occupies the address range 0x0000_4000 to 0x0000_43FF. For our example to work properly, the pointers in the C code must be set to match this base address.
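The size of this range follows from the memory geometry described earlier: an 8 bit address space of 32 bit words is 256 words of 4 bytes each, or 0x400 bytes. The following is a minimal sketch of that arithmetic. The helper names `slave_span` and `byte_offset` are hypothetical and are not part of the project's C code.

```python
def slave_span(base, addr_bits, data_bits):
    """Return (first_byte, last_byte) occupied by a memory-mapped slave."""
    bytes_per_word = data_bits // 8
    size = (1 << addr_bits) * bytes_per_word
    return base, base + size - 1


def byte_offset(base, word_index, data_bits=32):
    """Byte address the host would use to reach a given word in the slave."""
    return base + word_index * (data_bits // 8)


# Geometry from the text: base 0x4000, 8 bit address, 32 bit data.
first, last = slave_span(0x4000, addr_bits=8, data_bits=32)
print(hex(first), hex(last))
print(hex(byte_offset(0x4000, 10)))
```

If the base address is changed in Qsys, the same calculation shows which byte range the host pointers must cover.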

To move the module from one project to another, copy the .v, .v.bak, _hw.tcl, and _hw.tcl~ files to the desired top-level directory.

Importing ADIOS to a new project¶

To integrate the ADIOS project, follow an approach similar to the one above. First, copy the .v, .v.bak, _hw.tcl, and _hw.tcl~ files to the top-level directory, as in the image below.

Next, connect the signals in Qsys as shown in the image below. Take note of the base address of the two slave ports, as they must be changed in the HPS code if they are changed in Qsys. After all the connections have been made, click "Generate HDL".

Once the module has been connected in Qsys and the hardware generated, we connect the conduits to the proper ports in the DE1-SoC_Computer.v file. The two ports that must be connected are data_input and ext_trigger. The data_input port is the 32 bit word that you want to capture in the logic analyzer. It can be one 32 bit signal or multiple signals concatenated together. The ext_trigger port is the signal that the oscilloscope triggers on. This port is masked in the C code, so not all 32 bits need to be used. Connect this port to the trigger signal in your Verilog. Because the signal is masked, you can connect multiple potential triggers to this port, then use masks to select the one you would like to use on any given run. These ports are pictured in the image below.
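When several signals are concatenated into data_input, the captured 32 bit words have to be split back into fields during post-processing. The sketch below shows one way to do that in Python. The field layout (a 27 bit counter, a 1 bit valid flag, and a 4 bit state) is purely hypothetical and would need to match whatever concatenation is used in your own Verilog.

```python
# Hypothetical packing, matching a Verilog concatenation such as
#   assign data_input = {state[3:0], valid, counter[26:0]};
# Fields are listed from the least significant bit upward.
FIELDS = [("counter", 27), ("valid", 1), ("state", 4)]


def unpack(word, fields=FIELDS):
    """Split a captured 32 bit word into named fields."""
    out = {}
    shift = 0
    for name, width in fields:
        out[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return out


sample = (0x5 << 28) | (1 << 27) | 0x1000
print(unpack(sample))
```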

Compile the logic_analyzer_new.c file into la_new and run the program. Enter the trigger mask and trigger value when prompted. The trigger mask controls which bits are examined by the logic analyzer when detecting the trigger. Bits assigned “1” are checked and bits assigned “0” are ignored. This input is given in hex. The trigger value is the value that, when read from the trigger word after masking, triggers the oscilloscope. In the example below, we set the trigger mask to 0xffff, meaning that we check the lowest 16 bits of the trigger word when determining when to trigger. Our trigger value is 0x1000, meaning that we trigger on a value of 0xXXXX1000. The most significant 16 bits do not matter because of our trigger mask.
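The masking rule can be written as a single comparison: the trigger word ANDed with the mask must equal the trigger value ANDed with the mask. The following Python sketch illustrates the rule using the mask and value from the example. The function name `trigger_hit` is hypothetical, and the actual comparison is performed in hardware.

```python
def trigger_hit(trigger_word, mask, value):
    """True when the masked trigger word equals the masked trigger value."""
    return (trigger_word & mask) == (value & mask)


# Values from the example: mask 0xffff, value 0x1000.
for word in (0x00001000, 0xABCD1000, 0xABCD1001):
    print(f"0x{word:08X} -> {trigger_hit(word, 0xFFFF, 0x1000)}")
```

The first two words match because only the lowest 16 bits are compared. The third differs in a checked bit and does not trigger.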

We run the counter example to demonstrate the triggering. The oscilloscope captures 500 samples before and after the counter reads 0xXXXX1000.

After the oscilloscope triggers, a dump_text_X.txt file is created. This file contains the samples captured by the logic analyzer. We can examine this file by downloading it, uploading it to ecelinux, and creating a .vcd file. We do this by sourcing the ECE 5745 class script (available at /classes/ece5745/setup-ece5745.sh), then running the Python code available here with the dump_text_X.txt file as an argument. This generates a .vcd file which we can then view with GTKWave.

We can see that our newly generated .vcd file matches our logic analyzer output, as they both trigger in the same place and have the same data_input value.

RTL Design and Testing¶

The logic analyzer is composed of three main state machines: the analyzer state machine, the control state machine, and the trigger state machine. The analyzer state machine logs one 32-bit data word (the data you are trying to scope) per cycle by writing it to successive locations in the analyzer buffer, wrapping back around to address 0 when the number of data words written exceeds 1024. If a trigger is detected, the logic analyzer grabs more samples (by setting a counter to 0) and then waits for the HPS to read all of the data. After the HPS reads all of the data from the FPGA, the analyzer state machine restarts logging samples. The trigger signal is set by the trigger state machine when the external trigger (asserted if the trigger source ANDed with the trigger mask is equal to the trigger value ANDed with trigger mask) and
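The capture behavior described above can be modeled in software as a circular buffer that keeps overwriting itself until a trigger is seen, then records a fixed number of further samples and stops. The Python sketch below is a behavioral illustration only, not a translation of the RTL. The function names, the countdown-style post-trigger counter, and the default of 499 post-trigger samples (taken from the trigger count mentioned in the Software Design section) are assumptions made for the example.

```python
DEPTH = 1024  # analyzer buffer depth in 32 bit words, from the text


def capture(samples, mask, value, post_count=499, depth=DEPTH):
    """Behavioral model of the analyzer buffer.

    samples: iterable of (data_word, trigger_word) pairs, one per cycle.
    Returns (buffer, trigger_addr), or None if the capture never completes.
    """
    buf = [0] * depth
    addr = 0
    trigger_addr = None
    remaining = None
    for data, trig in samples:
        buf[addr] = data  # log one word per cycle
        if trigger_addr is None:
            if (trig & mask) == (value & mask):
                trigger_addr = addr  # remember where the trigger landed
                remaining = post_count
        else:
            remaining -= 1
        if trigger_addr is not None and remaining == 0:
            return buf, trigger_addr  # capture complete
        addr = (addr + 1) % depth  # wrap back to address 0
    return None


def window(buf, trigger_addr, pre=500, post=499):
    """Return samples in time order, from `pre` before to `post` after the trigger."""
    depth = len(buf)
    start = (trigger_addr - pre) % depth
    return [buf[(start + i) % depth] for i in range(pre + 1 + post)]


# Counter example: data and trigger are both a free-running counter.
stream = ((n, n) for n in range(0x2000))
buf, taddr = capture(stream, mask=0xFFFF, value=0x1000)
trace = window(buf, taddr)
print(hex(trace[0]), hex(trace[500]), hex(trace[-1]))
```

Because the buffer wraps, the host needs the trigger address to put the samples back in time order, which is what `window` does here.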

The control state machine receives parameters including, but not limited to, the trigger count, trigger mask, and trigger value. It starts by waiting for a signal from the HPS indicating that there is new data. Once it has determined that there is new data to be read, it reads the trigger, trigger count, trigger mask, and trigger value. Next, it reads the capture_arm command from the HPS, then goes back to the initial state, where it checks the complete flag, which indicates whether a full set of samples has been acquired. This flag is set exclusively by the analyzer state machine. If the complete flag is set, the FPGA sends the address of the trigger and a done signal to the HPS, and finally waits for an acknowledgement from the HPS in the form of another done signal. All data read from the HPS is stored in a dedicated control buffer M10K block, similar to the analyzer buffer.

Testing our modularized logic analyzer was very arduous, as we could not test it in simulation. We had to make extensive use of SignalTap II to debug the logic analyzer. With this we were able to fix all bugs, including but not limited to bitwidth mismatches, incorrectly set addresses and incorrectly instantiated M10K blocks. It was helpful to have a working baseline to compare waveforms with.

Software Design¶

The HPS provides an interface for the user to customize the trigger of our logic analyzer using bitmask and value inputs. The HPS and FPGA both access the control and analyzer M10K blocks, providing shared memory between the two processing elements. The HPS accesses these M10K blocks with pointers defined in the host code.

Before prompting the user for a trigger, the program creates a text file and opens it for writing. Next, the program waits for the user to input a trigger value and a trigger mask. The HPS writes the trigger mask to address 254, the trigger count to address 2, and the trigger value to address 255 in the control buffer. The trigger count is hard set at 499 for now due to how the M10K blocks work. Then the program sets the logic arm to 1 at address 1 and sets the data ready flag to 1 at address 0. Once the program sees that the data ready flag is set to 0, it reads all the data in the analyzer SRAM and saves it, in order, into an array called logic data. The program then reads through this array from the beginning and prints it to the terminal along with the trigger sample. The data is also saved into the text file for later waveform generation. Lastly, the program clears the done flag at address 250. The dump text file contains only the input data and the trigger flag, which is all that is needed for the vcd file.

We generate the vcd file using a Python program that takes the text file from the HPS as input. The vcd file starts with an initial setup that sets the version, date, timescale, and the module with its signals. For our purposes, we name the module top and define the signals clock, data, and trigger within it, with clock and trigger being 1 bit and data being 32 bits. We also set the timescale for the waveform to 10ps. A vcd file is formatted as a series of timestamps, each followed by the values of the signals that changed at that time. Each signal is identified by a symbol defined in the initial setup. When a signal changes, its new value and its symbol are placed in the vcd. The figure below shows what the vcd file format actually looks like. The red text explains what that particular line does in the waveform.
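As an illustration of this format, the sketch below builds vcd text for a top module with clock, data, and trigger signals and a 10ps timescale. It is not the project's Python script. The function name `to_vcd`, the identifier symbols `!`, `d`, and `t`, the two time steps per clock cycle, and the in-memory list of (data, trigger) pairs are all assumptions made for the example, since the layout of the dump text file is not described here.

```python
from datetime import date


def to_vcd(samples, timescale="10ps"):
    """Build vcd text from (data, trigger) pairs, one pair per clock cycle."""
    lines = [
        f"$date {date.today().isoformat()} $end",
        "$version ADIOS vcd sketch $end",
        f"$timescale {timescale} $end",
        "$scope module top $end",
        "$var wire 1 ! clock $end",    # symbol '!' identifies clock
        "$var wire 32 d data $end",    # symbol 'd' identifies data
        "$var wire 1 t trigger $end",  # symbol 't' identifies trigger
        "$upscope $end",
        "$enddefinitions $end",
    ]
    prev_data = prev_trig = None
    t = 0
    for data, trig in samples:
        data &= 0xFFFFFFFF
        trig &= 1
        lines.append(f"#{t}")
        lines.append("1!")  # rising clock edge
        if data != prev_data:  # only emit signals that changed
            lines.append(f"b{data:032b} d")
            prev_data = data
        if trig != prev_trig:
            lines.append(f"{trig}t")
            prev_trig = trig
        lines.append(f"#{t + 1}")
        lines.append("0!")  # falling clock edge
        t += 2
    return "\n".join(lines) + "\n"


print(to_vcd([(0x1000, 0), (0x1000, 1), (0x1001, 1)]))
```

Writing the returned string to a file with a .vcd extension gives something a waveform viewer can open. Emitting a value only when it changes keeps the file small for slowly changing signals.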

A student can bring up the waveform using GTKWave. GTKWave is available on the ecelinux server at Cornell University after sourcing either the ECE 4750 or ECE 5745 setup script by typing “source setup-ece4750.sh” or “source setup-ece5745.sh”. Afterward, open the waveform by typing gtkwave filename.vcd. Alternatively, install the Linux package for GTKWave using "sudo apt install gtkwave". The figure below shows what the waveform viewer actually looks like. To view the signals, highlight the top module, which brings up the clock, data, and trigger signals below. Then hit the append button to make them show up on the right side. Hit the box-shaped button next to the zoom in (plus) button to fit the whole waveform. You can then zoom in and out using the plus and minus buttons. To expand a signal such as the input data, highlight it and go to Edit -> Expand. To combine signals, highlight the signals you want to combine and go to Edit -> Combine Up.

Results¶

In conclusion, though initially faced with a multitude of difficulties, we were able to modularize the Homebrew Logic Analyzer into a standalone Qsys module that can be imported into any Quartus project. Moreover, we implemented a workaround to visualizing the traces on a VGA screen by post-processing the data samples sent from the FPGA into a .vcd file, a standard file format for waveforms. As a result, we were able to visualize this data in a waveform viewer such as GTKWave.

Appendix¶

The ADIOS project is provided here

The Avalon Slave project is provided here

The project video is provided here