## DE1-SoC: Lattice Boltzmann Cornell ece5760

### Lattice Boltzmann

Lattice Boltzmann methods can be used to simulate fluid flow on a grid of cells. The goal is to parallelize the LB calculation onto the FPGA, but there are a few steps to do first. First figure out the algorithm and tune it in matlab. Then 'devectorize' into C. Then build FPGA hardware and parallelize. Nowicki and Claesen give one approach to implementing the FPGA hardware.

The fluid flow is abstracted into continuous-valued densities, flowing in eight discrete directions from a given cell, with zero velocity flow as a ninth flow. Constrains are placed on the densities to conserve momentum and mass. At any time step, eight densities from neighboring cells stream into a cell where they 'collide', are mixed, then on the next time step, stream to new cells. The images below were taken from O'Brien 2008 to show the propagation and collision steps.

Streaming: The distribution for a cell is copied to the eight cells around it. (from O'Brien 2008 ) Collision: Velocity distribution modeled after He and Luo (1997). Boundaries are handled by assuming 'no slip' conditions so that streams hitting a boundary cell are bounced back in the direction they came from. The image below is again from O'Brien 2008. Discrete streams that go into boudary cells are returned to the originating cell in the oppposite directions. This slightly more detailed image is from Nicolas Delbosc labeling the density vectors. Matlab implementation

The first code is a mash-up of the three sources below. The overall framework is from Haslam, but modified to use the simpler (no division) update function of Nowicki and Claesen. The Palabos code was used to understand the scaling of various constants, partcularly the relaxation constant, omega. A frame from the solution(mp4) is shown below with a von Karman vortex street in the wake of the flat plate at x=100. Laminar flow past the flat plate at lower omega (higher dynamic viscosity). The second code implements a jet which pushes fliud to the right in a zero velocity medium.
The region has walls top/bottom and is toroidally connected left/right.
There is also a particle (white disc) advected by the flow. (mp4) A variant injects particles continuously into the jet and surpresses the color coding of velocity. (mp4)(youtube)
One particle per time step is injected into the jet inlet, so the number maxes out at 40000 particles.
With bigger particles (mp4). A moving velocity source is a model for an interactive simulation or game. A velocity source is
moved along the x-direction with a source speed protional to the movment speed.
The code needs work for stability, but produces nice animation. I suspect that the distribution functions
for fluid speed from the source need work. A test code implements a square region with a driven edge. A main vortex forms with two weak
vortices in the upper corners. Comparision with plots in Perumala and Dassb suggests a
Reynolds number of about 1000-2000. (mp4). The velocity pulse at the very beginning of the animation
is a weak sound wave resulting from initial conditions.

C implementation
Daniel V. Schroeder at Weber State University has written a good explaination of LB, as well a very nice interactive javascript simulation. We ported the javascript to C then optimized it for the HPS. The alogrithm is a memory hog, requiring about twelve arrays of dimension equal to the total number of grid cells.
Optomization:

1. The first step is to convert to fixed point for speed. We chose s1x25 format (sign bit, one bit to the left of the radix point, 25 fractional bits) because it ports well to the FPGA DSP multiplier we are using.
2. The next optimization step was to eliminate two divides per grid cell per time step by Taylor-expanding the reciprocal of the denominator and multiplying. Three terms of the expansion were kept because the fluid density (denominator) was within a few percent of unity. The approximation is shown below, with mFix being the fixed point multiply. The divide is thus converted into four adds and a multiply.
Fix rho_inv = 1 - (rho - 1) + mFix(rho - 1),(rho - 1))
3. Drawing the fluid flow required calculating the speed of the flow from the x and y velocity components. To avoid a square-root calculation we used the simplest possible alpha-max beta-min algorithm to compute the magnitude, with alpha=1 and beta=0.5.
speed = max(abs(ux[i]), abs(uy[i])) + (min(abs(ux[i]), abs(uy[i])) >>1)
This speed approximation is good to about 8%.
4. Speed was mapped to the green channel (6 bits) of the 640x480 VGA and those colors were assigned equal speed intervals for plotting. Tracer particles are advected through the flow field.

A single core version of a s1x25 fixed point LB solver running on the HPS is shown below.
C-code
Verilog sof file is in this project ZIP for Quartus 18.1
Video. A fixed point s1x18 version was written to better fit into Cyclone5 M10k block memory. The FPGA will be memory limited, so a 20-bit format allows a maximum number of grid cells, of around 14,000 or so. The next step was to parallelize across the two ARM cores using pthreads primitives for syncing the calculations. The heavy computational step is the equlibrium distribution calculation which conveniently uses only information already in each individual cell. This was easy to split into left/right halfs for the two cores. The streaming step in which the fluid velocities are propagated separates into seven distinct memory copies, also easy to split. With the best split I could find, the two processors were running at 98% and 80%. The speed up was about 1.7. With both processors running the HPS flooded the bus and sometimes intefered with VGA production.
C code for 2 core

References

Fluid Dynamics Simulation-- interactive LB solver
Background math for the solver
Daniel V. Schroeder at Weber State University

SoC Architecture for Real-Time Interactive Painting based on Lattice-Boltzmann
Domien Nowicki, Luc Claesen
2010 17th IEEE International Conference on Electronics, Circuits and Systems
Year: 2010 Pages: 235 - 238, DOI: 10.1109/ICECS.2010.5724497

Lattice Boltzmann Matlab Scripts
Iain Haslam, March 2006.

A FRAMEWORK FOR DIGITAL WATERCOLOR
A Thesis by PATRICK O’BRIEN 2008, Texas A&M University

Lattice boltzmann model for the incompressible navier-stokes equation
X. He and L.-S. Luo,
Journal of Statistical Physics, vol. 88, no. 3, pp. 927-944, Aug 1997.

Real-Time fluid Simulation
Nicolas Delbosc; School of Mechanical Engineering University of Leeds, England

Application of lattice Boltzmann method for incompressible viscous flows
D. Arumuga Perumala, Anoop K. Dassb.
Applied Mathematical Modelling Volume 37, Issue 6, 15 March 2013, Pages 4075–4092

Simulation of miscible binary mixtures based on lattice Boltzmann method
Hongbin Zhu , Xuehui Liu, Youquan Liu and Enhua Wu
COMPUTER ANIMATION AND VIRTUAL WORLDS Comp. Anim. Virtual Worlds 2006; 17 : 403–410

Copyright Cornell University May 12, 2022