Many phenomena in science and engineering can be explored by viewing a physical system as a collection of particles that obey certain laws. In this project, the particular example is that of astronomical bodies modeled as a collection of point masses which obey Newton's laws of gravitation.

The physical laws in the above example can be written as equations of a type commonly referred to as Ordinary Differential Equations or ODE's. These equations have well-known numerical solution techniques which can be easily expressed as a set of parallel operations. In the case of particle systems, solving the equations also involves calculating a function involving interactions between all pairs of particles. In this document, I will discuss ways to approach such all pairs calculations.

Defining the Problem

As described above, the problem that I will address is the behavior of astronomical bodies under Newton's laws of motion and gravitation, commonly referred to as the N-body problem. Suppose that our system consists of N particles, each with a mass, moving with some velocity through 3-dimensional space. Part of the Newtonian view is the contention that the masses and velocities of the particles are what determine the behavior of the system; in particular, we can ignore the diameters and shapes of the particles and treat them as point masses. Here we define velocity and acceleration in the classical sense: velocity, *dx/dt*, is the change in position over time, and acceleration, *dv/dt*, is the change in velocity over time.

Newton's equations of motion describe how the physical quantities position, velocity, and acceleration are related for astronomical bodies. Newton's Law of Universal Gravitation gives the force on one particle in the system due to the other particles as:

For any particle Pi, with mass Mi, the force exerted on Pi by Pj is given by the expression:

Fi,j = G*Mi*Mj/r^2

where G is the universal gravitational constant, and r is the Euclidean distance between the particles. Note that in 3-space, F is actually a vector quantity, directed from the position of particle i toward particle j. Also, the force exerted by particle j on particle i is equal and opposite to the force exerted by particle i on particle j.
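As a concrete illustration, the force law above can be sketched in a few lines of Python. This is my own sketch, not part of the original project code; SI units and tuples for positions are assumptions:

```python
import math

G = 6.674e-11  # universal gravitational constant (SI units)

def force_on(p_i, m_i, p_j, m_j):
    """Force vector on particle i due to particle j: G*Mi*Mj/r^2 toward j."""
    dx = [p_j[k] - p_i[k] for k in range(3)]    # points from i toward j
    r = math.sqrt(sum(d * d for d in dx))       # Euclidean distance
    f = G * m_i * m_j / r**2                    # magnitude of the force
    return [f * d / r for d in dx]              # scale the unit direction vector
```

By construction, swapping the two particles negates the result, matching the equal-and-opposite property noted above.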

Newton's Second Law relates this force to motion:

F = M*A

where F is the force, M is the mass, and A the acceleration.

Assuming that the masses are fixed, from these laws we can derive equations that express how position and velocity change over time.

For N particles, let X = [X1,...,XN], V = [V1,...,VN], A = [A1,...,AN]

dX / dt = V,

dV / dt = A, where (for all 1 <= i <= N) Ai = G*SUMMATION j=1 to N, j != i ( Mj/r^2 )

Note that the equation for acceleration is expressed in terms of position, so that we can represent the current state of a system of particles by the positions and velocities of the particles. From these equations, the positions and velocities of the particles may be evolved over time from an initial state.

Numerical Methods for Solving ODE's

The equations of motion described in the preceding section are an example of a system of Ordinary Differential Equations or ODE's. Systems of such equations can be very difficult or impossible to solve analytically. For example, the equations of motion above involve integrating over a sum. Because of this, a numerical approximation is typically made by using the equation for the derivative to approximate the change over time. The equations given earlier were also all first order, involving only first derivatives. However, if we had written down a second-order equation, such as expressing acceleration as the second derivative of position,

A = d^2X / dt^2

we could always rewrite it as two first order equations by introducing the first derivative quantity, in this case, velocity. Thus, we only need to consider systems of first order equations. So in general we have an equation

dX / dt = f(t,X)

where f is used for a function expressing the right hand side of the equation.

The kinds of problems that we are discussing are also initial value problems. Contrast these with boundary value problems, in which values for *X* are given at two end points and which require slightly more complicated techniques than those discussed here.

We can solve an initial value problem by numerical approximation by iterating through a series of time steps. Starting from the initial value, we can calculate positions one step at a time, using the equation for the derivative, or slope, of the function to calculate the change in the function over a time step.

A simple example of a method for finding these values of *X* is given by Euler's Method. This method is too inaccurate for practical use, but serves as a basic example of these kinds of techniques.

Euler's Method

Let h be a time step > 0.

An approximate solution is given by

X0 = initial value

X(i+1) = Xi + h*f(ti,Xi)

This method uses the value of the function *Xi* and the derivative at that point in time, denoted *f(ti,Xi)*, to make a simple linear approximation *X(i+1)* of the value of the function at the next point in time. It can be shown that whenever *f* satisfies certain smoothness conditions, there is always a sufficiently small step size *h* such that the difference between the true solution at *t(i+1)* and the approximation *X(i+1)* is less than some given error magnitude *e*.
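A minimal sketch of Euler's method in Python (an illustration of mine, not the project's code; the test equation dX/dt = X is an assumed example with exact solution e^t):

```python
import math

def euler(f, x0, t0, t1, n):
    """Advance dX/dt = f(t, X) from t0 to t1 in n steps of size h."""
    h = (t1 - t0) / n
    t, x = t0, x0
    for _ in range(n):
        x = x + h * f(t, x)   # X(i+1) = Xi + h*f(ti, Xi)
        t = t + h
    return x

# dX/dt = X with X(0) = 1 has exact solution e^t, so X(1) should approach e
approx = euler(lambda t, x: x, 1.0, 0.0, 1.0, 1000)
```

With 1000 steps the result lands within about 0.0014 of e, illustrating both the method and its slow convergence.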

While Euler's Method can produce good approximations in theory, it is not usually practical because the amount of computation needed to achieve a given error *e* is large in comparison to other methods. Notice that in Euler's method, *f* is computed once for each step. Thus, smaller step sizes require more computations of *f* over the entire interval.

To compare Euler's method with other more practical ones, we can analyze the order of the error in terms of the step size. Let *Yi* be the exact solution at time *ti*. Then the approximation over one step size can be expressed as:

Using Taylor's theorem to represent the exact solution:

Y(ti + h) = Y(ti) + h * dY(ti)/dt + O(h^2)

Assume the approximation is exact at ti, so Xi = Y(ti):

Y(ti + h) = Xi + h * dXi/dt + O(h^2)

The derivative is given by the original equation to be solved:

Y(ti + h) = Xi + h * f(ti, Xi) + O(h^2)

Thus,

Y(ti + h) = X(i+1) + O(h^2)

The local error, the difference between the exact solution and the approximate solution after one time step, is *O(h^2)*. Over the whole interval, this accumulates to a global error of *O(h)*. This method is called a first order method, corresponding to the exponent of *h* in the global error.
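The first-order behavior can be checked numerically: halving the step size should roughly halve the global error. A sketch of this check (assuming the test equation dX/dt = X with exact solution e^t):

```python
import math

def euler(f, x0, t0, t1, n):
    """Euler's method: n steps of size h = (t1 - t0)/n."""
    h = (t1 - t0) / n
    t, x = t0, x0
    for _ in range(n):
        x, t = x + h * f(t, x), t + h
    return x

exact = math.e  # dX/dt = X, X(0) = 1, evaluated at t = 1
err_h  = abs(euler(lambda t, x: x, 1.0, 0.0, 1.0, 100) - exact)
err_h2 = abs(euler(lambda t, x: x, 1.0, 0.0, 1.0, 200) - exact)
ratio = err_h / err_h2   # should be close to 2 for a first-order method
```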

An improvement of this method is known as the modified Euler method or the midpoint method. With this method, the derivative at the beginning of a time step is used to estimate the value of the function at the mid-point of the time step. That mid-point value is not used directly; instead, the derivative there is calculated and used to obtain a better estimate of the function value at the next time step. Such a method must evaluate the derivative function twice for each time step; however, it achieves second-order accuracy.

The classical Runge-Kutta method extends these ideas further, obtaining derivatives not only at midpoints but also at a trial endpoint. Its main advantage is that we can then use a more accurate integration technique to obtain the endpoint.

The derivative at the first point of the time step is used to obtain a trial midpoint. The derivative at this point is used, again from the beginning of the step, to obtain a second trial midpoint, and its derivative is used to obtain a trial endpoint of the time step. Finally, the values of the derivatives at the initial point, the two midpoints, and the endpoint are integrated using Simpson's rule to obtain the endpoint that will be used as the function value for the next time step.

The Classical 4^{th} Order Runge Kutta Method

X0 = initial value

delta1 = f(ti, Xi)

delta2 = f(ti + h/2, Xi + h/2 * delta1)

delta3 = f(ti + h/2, Xi + h/2 * delta2)

delta4 = f(ti + h, Xi + h * delta3)

X(i+1) = Xi + h/6 * (delta1 + 2*delta2 + 2*delta3 + delta4)

The derivation via the trial midpoints and Simpson's rule leaves a local error of *O(h^5)* at each time step, and the global error of 4^{th} order Runge-Kutta is *O(h^4)*. This accuracy is achieved by computing the derivative function *f* four times at each time step. Thus, the 4^{th} order Runge-Kutta method performs four times as much calculation at each time step as Euler's method, but its higher-order accuracy permits far fewer, larger time steps for the same error.
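The method above can be written as a reusable step function. This is my own sketch; the scalar test equation dX/dt = X is an assumed example:

```python
import math

def rk4_step(f, t, x, h):
    """One classical 4th-order Runge-Kutta step for dX/dt = f(t, X)."""
    d1 = f(t, x)
    d2 = f(t + h/2, x + h/2 * d1)
    d3 = f(t + h/2, x + h/2 * d2)
    d4 = f(t + h, x + h * d3)
    return x + h/6 * (d1 + 2*d2 + 2*d3 + d4)

# dX/dt = X, X(0) = 1: ten steps of h = 0.1 reach t = 1, where the exact value is e
x, t, h = 1.0, 0.0, 0.1
for _ in range(10):
    x = rk4_step(lambda t, x: x, t, x, h)
    t += h
```

With only ten evaluations of the step (forty of f), the error is on the order of 1e-6, far better than Euler's method achieved with a thousand steps.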

Computing a Solution

We can now solve the equations of motion for an astronomical particle system in the N-body problem. We can start the particles in an initial state and find their positions and velocities over some period of time. From the formulas given above we have:

For N particles, let X = [X1,...,XN], V = [V1,...,VN], A = [A1,...,AN]

dX / dt = V,

dV / dt = Accel(X), where (for all 1 <= i <= N) Ai = G*SUMMATION j=1 to N, j != i ( Mj/r^2 )

In the second equation, I introduce the name *Accel* for the derivative function. Beginning with the initial values for X(0) and V(0), the iterative solution for the equations of motion using Euler's method becomes:

X(i+1) = Xi + h*Vi

V(i+1) = Vi + h*Accel(Xi)

As mentioned earlier, this is not a practical method, but its form is representative of other methods like 4^{th} order Runge-Kutta, and its simplicity allows us to easily devise a parallel algorithm for the computation. The solution form using Runge-Kutta would be a nearly identical adaptation of the RK equations presented above.
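Putting the pieces together, one Euler step for the N-body system might be sketched as follows (my own illustration, not the project code; simulation units with G = 1 and plain-list data structures are assumptions):

```python
import math

G = 1.0  # gravitational constant in simulation units (assumption)

def accel(X, M):
    """Accel(X): net acceleration on every particle due to all the others."""
    N = len(X)
    A = [[0.0, 0.0, 0.0] for _ in range(N)]
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            dx = [X[j][k] - X[i][k] for k in range(3)]
            r = math.sqrt(sum(d * d for d in dx))
            s = G * M[j] / r**3            # G*Mj/r^2 along the unit vector toward j
            for k in range(3):
                A[i][k] += s * dx[k]
    return A

def euler_step(X, V, M, h):
    """X(i+1) = Xi + h*Vi ; V(i+1) = Vi + h*Accel(Xi)"""
    A = accel(X, M)
    Xn = [[X[i][k] + h * V[i][k] for k in range(3)] for i in range(len(X))]
    Vn = [[V[i][k] + h * A[i][k] for k in range(3)] for i in range(len(X))]
    return Xn, Vn
```

Because the pairwise forces are equal and opposite, a step leaves the total linear momentum unchanged (up to rounding), which makes a convenient sanity check.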

Any computation involving the numerical method is inherently iterative. At each time step, the solution depends on the immediately preceding one. Thus, if *n* is the number of time steps used to solve the equations over some interval, this part of the computation is *O(n)*. Also, the function *Accel(X)* must be computed at each time step. For each particle *i*, the total force is calculated, which involves summing over the forces on *i* due to all other particles *j*. So if there are *N* particles, each sequential computation of *Accel(X)* takes *O(N^2)* time, and a very significant portion of the computational time is spent evaluating the acceleration function.

Note, however, that within each time step the computation of the net acceleration for each particle is independent of the same computation for any other particle. This implies that the N-body problem is naturally parallelizable. The software which I have written takes advantage of this by using a multithreaded architecture. The basic idea is that an array of *N* bodies may be divided up among *M* processors, such that each processor calculates the net force and performs the required integration for each of its assigned particles. The trick, however, is to ensure that the processors update their particle positions and velocities in a synchronized manner. Failure to ensure this condition would result in errors in the summation portion of the acceleration calculation, as different particles would simultaneously be in positions corresponding to different time steps.
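One way to sketch this synchronization uses Python threads and a barrier (an illustrative assumption of mine, not the actual project architecture): each worker first computes accelerations for its slice while the positions are frozen, waits at the barrier, and only then updates its own particles.

```python
import math
import threading

G = 1.0  # simulation units (assumption)

def accel_one(i, X, M):
    """Net acceleration on particle i due to all other particles."""
    a = [0.0, 0.0, 0.0]
    for j in range(len(X)):
        if j == i:
            continue
        dx = [X[j][k] - X[i][k] for k in range(3)]
        r = math.sqrt(sum(d * d for d in dx))
        s = G * M[j] / r**3
        for k in range(3):
            a[k] += s * dx[k]
    return a

def parallel_step(X, V, M, h, workers=4):
    """One Euler step with the bodies divided among `workers` threads."""
    N = len(X)
    A = [None] * N
    barrier = threading.Barrier(workers)

    def work(lo, hi):
        for i in range(lo, hi):          # phase 1: read-only pass over X
            A[i] = accel_one(i, X, M)
        barrier.wait()                   # nobody writes until everyone has read
        for i in range(lo, hi):          # phase 2: update assigned particles only
            for k in range(3):
                X[i][k] += h * V[i][k]
                V[i][k] += h * A[i][k]

    bounds = [(w * N // workers, (w + 1) * N // workers) for w in range(workers)]
    threads = [threading.Thread(target=work, args=b) for b in bounds]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

The barrier is exactly the synchronization point discussed above: without it, a fast worker could move its particles while a slow worker is still summing forces against the old positions.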

Going Further

Because there are no frictional or dissipative forces to convert kinetic energy to thermal energy in N-body problems, the motions of the bodies, if perfectly calculated, would conserve energy (kinetic and potential), linear momentum, and angular momentum. The center of mass of the system would also have a constant velocity. Unfortunately, the numerical methods described here do not automatically obey these conservation laws. In practical situations, one way to improve the accuracy of results is to compute the various energies and momenta at regular intervals and use the conservation laws to either check or help correct the calculation.
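Monitoring these invariants might look like the following sketch (G = 1 simulation units and the list-based layout are assumptions of mine):

```python
import math

G = 1.0  # simulation units (assumption)

def total_energy(X, V, M):
    """Kinetic plus gravitational potential energy of the system."""
    kinetic = sum(0.5 * M[i] * sum(v * v for v in V[i]) for i in range(len(X)))
    potential = 0.0
    for i in range(len(X)):
        for j in range(i + 1, len(X)):   # count each pair once
            r = math.sqrt(sum((X[i][k] - X[j][k])**2 for k in range(3)))
            potential -= G * M[i] * M[j] / r
    return kinetic + potential

def total_momentum(X, V, M):
    """Total linear momentum vector of the system."""
    return [sum(M[i] * V[i][k] for i in range(len(X))) for k in range(3)]
```

Evaluating these at regular intervals and watching the drift from their initial values gives a direct measure of the integration error.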

Although the data parallel techniques discussed here are successful on astrophysical problems of a few thousand particles, many problems involve much larger numbers of particles, where the accelerations between all pairs of individual particles cannot be practically computed. Astrophysical problems on the galactic scale fall into this category. In these problems, the system is no longer represented as an unordered set of particles. Instead, the particles are organized by their position in space in a data structure called a Barnes-Hut tree. This is an oct-tree representation of 3-D space, in which each internal node of the tree represents a region of space and each of its children represents one-eighth of that region, subdivided recursively until each leaf node contains a single particle. For each internal node, an approximation of the combined gravitational effect of all the particles in that region is calculated. Then for each particle, instead of calculating all N-1 forces due to the other particles, the approximations are used to calculate the force due to all the particles that are farther away than some distance.
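A minimal sketch of the Barnes-Hut idea follows (my own illustration, not from [Salmon]; the opening criterion s/d < theta and G = 1 units are assumptions): a cell whose size-to-distance ratio is small enough is replaced by its aggregate mass and center of mass; otherwise its children are examined recursively.

```python
import math
import random

G = 1.0  # simulation units (assumption)

class Cell:
    """One cubic region of space in a Barnes-Hut oct-tree."""
    def __init__(self, center, half):
        self.center, self.half = center, half   # cube center and half side length
        self.children = None                    # 8 sub-cells once subdivided
        self.body = None                        # (position, mass) while a leaf
        self.mass, self.com = 0.0, (0.0, 0.0, 0.0)

    def octant(self, p):
        return sum(1 << k for k in range(3) if p[k] >= self.center[k])

    def insert(self, p, m):
        if self.mass == 0.0 and self.children is None:
            self.body = (p, m)                  # empty leaf: store the particle
        else:
            if self.children is None:           # occupied leaf: subdivide
                h = self.half / 2
                self.children = [Cell(tuple(self.center[k] + (h if (o >> k) & 1 else -h)
                                            for k in range(3)), h) for o in range(8)]
                old, self.body = self.body, None
                self.children[self.octant(old[0])].insert(*old)
            self.children[self.octant(p)].insert(p, m)
        t = self.mass + m                       # update aggregate mass / center of mass
        self.com = tuple((self.com[k] * self.mass + p[k] * m) / t for k in range(3))
        self.mass = t

def accel(cell, p, theta):
    """Acceleration at p; sufficiently distant cells use their aggregates."""
    if cell.mass == 0.0 or (cell.body is not None and cell.body[0] == p):
        return (0.0, 0.0, 0.0)                  # empty cell, or the particle itself
    d = math.sqrt(sum((cell.com[k] - p[k])**2 for k in range(3)))
    if cell.children is None or (2 * cell.half) / d < theta:
        s = G * cell.mass / d**3                # treat the whole cell as one point mass
        return tuple(s * (cell.com[k] - p[k]) for k in range(3))
    parts = [accel(c, p, theta) for c in cell.children]
    return tuple(sum(q[k] for q in parts) for k in range(3))
```

With theta = 0 the recursion always descends to the leaves and reproduces the direct O(N^2) sum exactly; larger values (commonly around 0.5 to 1.0) trade accuracy for fewer force evaluations.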

Of course, as the system evolves over time, the motion of particles requires that the Barnes-Hut tree be dynamically changed as particles move from one part of space to another. Such structures are not naturally represented in the static arrays that are currently used for synchronous data parallel computing and are the subject of ongoing research. More details on solving N-body problems using Barnes-Hut trees can be found in [Salmon].

References

N-body Problems

Greenspan, D. Arithmetic Applied Mathematics. Pergamon Press (International Series in Nonlinear Mathematics), Oxford, 1980.

LaBudde, R.A. and Greenspan, D. Energy and momentum conserving methods of arbitrary order for the numerical integration of equations of motion.

Salmon, John K., Parallel Hierarchical N-Body Methods, Dissertation from Caltech. Technical Report SCCS-52, CRPC-90-14, 1990.

Ordinary Differential Equations

Press, William H., Flannery, Brian P., Teukolsky, Saul A., and Vetterling, William T., Numerical Recipes: The Art of Scientific Computing. Second edition, Cambridge University Press, 1992.

Burden, Richard L. and Faires, J. Douglas, Numerical Analysis. Fourth edition, PWS-Kent Publishing Company, 1989.