For the final project of ECE-5760, we evaluated the performance of the ARM Cortex-A9 against the FPGA on the DE1-SoC board. The project uses Altera’s OpenCL SDK on the DE1-SoC, which carries a dual-core embedded ARM processor. Achieving real-time performance from complex algorithms has made it necessary to pair general-purpose processors with massively parallel GPUs or FPGAs. This heterogeneous computing model is already used in the PC world for graphics, gaming, rendering, and servers, and is now reaching the handheld and embedded world. To program such systems, the Open Computing Language (OpenCL), an open, royalty-free standard for parallel programming of modern processors, was developed. It can greatly improve speed and responsiveness for a wide spectrum of applications in numerous market categories, from gaming and entertainment to scientific and medical software.

This project is written mainly in OpenCL and C++: all kernels are written in OpenCL, and the host code running on the ARM is written in C++. Our project aims to evaluate the speedup achieved on the FPGA for parallel operations and the overhead involved in achieving it. We target two types of filters for this evaluation. The Altera OpenCL SDK allows a programmer to use high-level code to generate an FPGA design with low power consumption and good performance. Altera’s AOCL compiler compiles the kernel code to an FPGA design and automatically generates SystemVerilog code for the developer. We use filters as the OpenCL benchmark because filtering is one of the most important components of image processing applications and computer vision algorithms. Our evaluation compares the performance of a Gaussian filter, which is linear, and a bilateral filter, which is nonlinear, on the ARM and the FPGA under the same algorithm and memory restrictions.
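
To make the difference between the two benchmark filters concrete, the following is a minimal C++ sketch of how one output pixel could be computed for each. It is not the project's kernel or host code: the `Image` struct, the function names `gaussianPixel` and `bilateralPixel`, the clamped border handling, and the parameters `sigmaSpace` and `sigmaRange` are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical grayscale image, stored row-major (illustrative layout only).
struct Image {
    int width;
    int height;
    std::vector<float> pixels;

    // Read a pixel, clamping coordinates to the border (an assumed edge policy).
    float at(int x, int y) const {
        x = x < 0 ? 0 : (x >= width ? width - 1 : x);
        y = y < 0 ? 0 : (y >= height ? height - 1 : y);
        return pixels[y * width + x];
    }
};

// Gaussian filter: each weight depends only on the spatial distance to the
// center pixel, so the output is a fixed linear combination of the inputs.
float gaussianPixel(const Image& img, int x, int y, int radius, float sigmaSpace) {
    assert(radius >= 0 && sigmaSpace > 0.0f);
    float sum = 0.0f;
    float norm = 0.0f;
    for (int dy = -radius; dy <= radius; ++dy) {
        for (int dx = -radius; dx <= radius; ++dx) {
            float w = std::exp(-(dx * dx + dy * dy) / (2.0f * sigmaSpace * sigmaSpace));
            sum += w * img.at(x + dx, y + dy);
            norm += w;
        }
    }
    return sum / norm;
}

// Bilateral filter: each weight also depends on the intensity difference to
// the center pixel, which makes the filter nonlinear and edge-preserving.
float bilateralPixel(const Image& img, int x, int y, int radius,
                     float sigmaSpace, float sigmaRange) {
    assert(radius >= 0 && sigmaSpace > 0.0f && sigmaRange > 0.0f);
    const float center = img.at(x, y);
    float sum = 0.0f;
    float norm = 0.0f;
    for (int dy = -radius; dy <= radius; ++dy) {
        for (int dx = -radius; dx <= radius; ++dx) {
            const float value = img.at(x + dx, y + dy);
            const float diff = value - center;
            float w = std::exp(-(dx * dx + dy * dy) / (2.0f * sigmaSpace * sigmaSpace))
                    * std::exp(-(diff * diff) / (2.0f * sigmaRange * sigmaRange));
            sum += w * value;
            norm += w;
        }
    }
    return sum / norm;
}
```

Every output pixel depends only on a small input neighborhood, so the pixels can be computed independently; this is the kind of data parallelism an OpenCL kernel can exploit. The bilateral filter evaluates an extra exponential per neighbor, so it does more arithmetic per pixel than the Gaussian filter for the same window size.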