ECE 5760: Final Project

Pancake CPU Hardware Extensions

Hardware RAS and MACC Support

Cameron Glass (cig23@cornell.edu)

Introduction

"Additions to existing Pancake CPU of hardware return address stack and basic vector support"

For my ECE 5760 final project, I developed hardware extensions for the Pancake CPU, the CPU used on the Altera DE2 boards in ECE 5760. The Pancake CPU was originally developed at the University of Hiroshima by K. Nakano et al. and adapted for ECE 5760 by Bruce Land. It is an alternative to the NIOS II, the Altera IP processor provided through the Altera software. An FPGA developer can write C code, instantiate a NIOS II on the FPGA using the Altera toolchain, and run the C code natively on the NIOS II. Writing directly in C can be advantageous, but the NIOS II has several disadvantages. It takes up many logic elements on the FPGA and generates a significant amount of memory bus traffic. Additionally, the toolchain needed to use the NIOS II properly can be confusing. The Pancake CPU, on the other hand, requires only around 1100 logic elements on the DE2, which is less than 3.5% of the available logic elements. It operates out of a single M4K RAM block, so the CPU generates no bus traffic. There is also no toolchain to learn: a developer just needs to instantiate a Pancake CPU using the provided source code and can then start developing software.

One disadvantage of using the Pancake is that code has to be written in Syrup, a language developed by Bruce Land. The Pancake is a stack-based CPU, so Syrup takes a little getting used to, but it is relatively simple to learn. While the Pancake CPU achieves its goal of being a small and easy-to-use CPU for the DE2 board, it is lacking in performance. It supports only a small instruction set, as the instructions are not very wide, and the bus-based architecture can be somewhat limiting. For my ECE 5760 project, I improved the performance of the Pancake by adding a hardware return address stack and a vector multiply and accumulate instruction to the ISA. Because of the ISA changes, previously compiled Syrup code will not run on the new Pancake, but the syntax of the language did not change. Recompiling existing Syrup source for the new Pancake speeds up function calls, and all other code still works exactly as originally programmed.

It might be slightly frustrating to recompile Syrup code to run on the new Pancake, but the performance benefits are worth the extra compilation. A function call originally required 8 instructions to manage a software return address stack, and a return required 9 instructions. The new implementation requires only a single instruction for call and a single instruction for return. Similarly, a multiply and accumulate operation on a pair of vectors, one of the most basic and useful signal processing operations, originally had to be performed entirely in software. The software implementation of this operation on the Pancake takes 9 + 18N cycles for a vector of length N, whereas my hardware implementation completes in 2 + 2N cycles, which approaches a ninefold reduction as N grows. This is an exciting development for the Pancake, because it is now possible to perform basic signal processing fast enough for useful real-time signals, such as CD audio.
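To make the cycle counts above concrete, the following minimal Python sketch evaluates the two formulas and the resulting speedup for a given vector length N. The function names are hypothetical and chosen only for illustration; the cycle formulas and instruction counts are the ones quoted in the text, and `macc` is simply a reference definition of the multiply and accumulate operation, not the Syrup or Verilog implementation.

```python
def macc(a, b):
    """Reference multiply and accumulate: sum of a[i] * b[i]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    acc = 0
    for x, y in zip(a, b):
        acc += x * y
    return acc


def software_macc_cycles(n):
    """Cycle count quoted in the text for the original software version."""
    return 9 + 18 * n


def hardware_macc_cycles(n):
    """Cycle count quoted in the text for the hardware MACC instruction."""
    return 2 + 2 * n


def macc_speedup(n):
    """Ratio of software cycles to hardware cycles for vector length n."""
    return software_macc_cycles(n) / hardware_macc_cycles(n)


# Instruction counts quoted in the text for one call/return pair.
SOFTWARE_CALL_RETURN_INSTRUCTIONS = 8 + 9   # software return address stack
HARDWARE_CALL_RETURN_INSTRUCTIONS = 1 + 1   # hardware return address stack

if __name__ == "__main__":
    for n in (1, 8, 64, 512):
        print(n, software_macc_cycles(n), hardware_macc_cycles(n),
              round(macc_speedup(n), 2))
```

The ratio starts at 6.75 for N = 1 and approaches 9 for long vectors, since the fixed setup costs (9 and 2 cycles) become negligible next to the per-element costs (18 and 2 cycles).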

Updated Pancake CPU Architecture
