Theory
What follows is some of the theory behind hierarchical modeling in
computer graphics and some of the issues related to the camera and
perspective transforms used in the viewing system.
Hierarchical Modeling
Hierarchical modeling is a way of representing complex objects as
hierarchical combinations of simple objects. A face, for example, could
be represented as an oblong sphere with some simple objects "pasted" on
to represent eyes, ears, nose and mouth. A wheel could be represented
as a thin torus with a series of cylinders inside to represent the spokes.
A more complicated object could then hierarchically include
several of these wheels with a box to represent a cart.
The basic premise of hierarchical modeling is thus to create a series of
simple objects and allow them to be manipulated and duplicated.
In this way, one creates complicated objects that take advantage not only
of the features of their sub-objects, but also of their hierarchical
properties.
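The cart example above can be sketched as a small tree of parts. This is
only an illustration under simplifying assumptions: the Node class, the
part names and the offsets are hypothetical, and each part carries just a
translation relative to its parent rather than a full transformation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A named part with an offset relative to its parent and optional sub-parts."""
    name: str
    offset: tuple = (0.0, 0.0, 0.0)
    children: list = field(default_factory=list)


def make_wheel(name, offset):
    """Build one wheel: a rim plus two spokes (hypothetical primitives)."""
    return Node(name, offset, [Node("rim"), Node("spoke_a"), Node("spoke_b")])


def flatten(node, parent_origin=(0.0, 0.0, 0.0), path=""):
    """Return (path, world_position) for every leaf primitive under node."""
    # A part's world position is its parent's position plus its own offset.
    origin = tuple(p + o for p, o in zip(parent_origin, node.offset))
    full = f"{path}/{node.name}" if path else node.name
    if not node.children:
        return [(full, origin)]
    leaves = []
    for child in node.children:
        leaves.extend(flatten(child, origin, full))
    return leaves


# The wheel is defined once and duplicated; moving the cart moves every part.
cart = Node("cart", (10.0, 0.0, 0.0), [
    Node("box", (0.0, 0.0, 1.0)),
    make_wheel("left_wheel", (0.0, -1.0, 0.0)),
    make_wheel("right_wheel", (0.0, 1.0, 0.0)),
])

for path, position in flatten(cart):
    print(path, position)
```

Changing only the cart's offset repositions the box and both wheels, which
is the hierarchical property the text refers to.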
Coordinate Spaces
In general, a 3-dimensional coordinate space has an origin and 3 linearly
independent (non-coplanar) axes. Typically, however, these axes are taken to
be orthogonal. When modeling objects, it is convenient to place them in a
world-coordinate system. In order to view these objects, the notion of a
camera is used. The camera has a location in world coordinates (the camera
position, C), a direction in which it points (the N vector), an up vector (U),
and a vector denoting increasing "x" coordinates (V). To view objects with the
camera, the Camera Transform is used to put the objects into View
Coordinate Space.
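As a minimal sketch of what it means to express a point in View Coordinate
Space: if V, U and N are assumed to be orthonormal, the view coordinates of
a world point are its offsets from C measured along each of those vectors.
The function names and the sample camera are assumptions made for
illustration.

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def world_to_view(p, c, v, u, n):
    """Express world point p in the camera frame (C; V, U, N).

    Assumes v, u and n are unit length and mutually orthogonal.
    """
    offset = tuple(pi - ci for pi, ci in zip(p, c))
    return (dot(offset, v), dot(offset, u), dot(offset, n))


# Hypothetical camera: 5 units up the world Z axis, looking back at the origin.
C = (0.0, 0.0, 5.0)
V = (1.0, 0.0, 0.0)   # increasing "x" in the view
U = (0.0, 1.0, 0.0)   # up
N = (0.0, 0.0, -1.0)  # viewing direction

print(world_to_view((1.0, 2.0, 0.0), C, V, U, N))
```

The third component is the distance of the point in front of the camera
along N, which is the depth used later by the perspective transform.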
Camera Transform
Another way of representing the camera position
is with two angles and a camera distance. Theta is the angle the
camera sits at in the XY plane, measured from the X axis, and Phi is the
angle the camera is "away" from the Z axis. We shall call the distance from
the camera to the world-coordinate origin U (a scalar here, not to be
confused with the up vector above). The transformation from world coordinate
to view coordinate space is carried out using the TView matrix.
The TView matrix can be derived through a sequential application of the
following transformations (the first three of which are essentially one step):
- A Translation of -U*Cos(Theta)*Sin(Phi) in X
- A Translation of -U*Sin(Theta)*Sin(Phi) in Y
- A Translation of -U*Cos(Phi) in Z
- A Rotation about the Z axis of 90-Theta degrees
- A Rotation about the X axis of Phi-180 degrees
- An optional flip of the x coordinates, depending on
whether you want a left-handed or right-handed view coordinate system.
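The sequence above can be sketched by composing 4x4 homogeneous matrices.
The sketch assumes column vectors, counterclockwise rotation matrices and
angles in degrees; the text does not fix these conventions, so treat them
as assumptions. The names t_view, dist and flip are hypothetical, with
dist standing for the camera distance.

```python
import math


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translate(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


def rotate_z(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def rotate_x(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]


def flip_x():
    return [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def t_view(theta_deg, phi_deg, dist, flip=False):
    """Compose the listed steps; later steps multiply on the left."""
    th, ph = math.radians(theta_deg), math.radians(phi_deg)
    # Steps 1-3: move the camera position to the origin.
    m = translate(-dist * math.cos(th) * math.sin(ph),
                  -dist * math.sin(th) * math.sin(ph),
                  -dist * math.cos(ph))
    m = matmul(rotate_z(90 - theta_deg), m)   # step 4
    m = matmul(rotate_x(phi_deg - 180), m)    # step 5
    if flip:
        m = matmul(flip_x(), m)               # optional handedness flip
    return m


def apply(m, p):
    """Transform the 3D point p by the 4x4 matrix m."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(3))


print(apply(t_view(30, 60, 5), (0, 0, 0)))
```

Under these conventions the world origin lands on the view Z axis at the
camera distance (up to floating-point rounding), and the camera position
itself maps to the view-space origin.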
Perspective Transform
The perspective transform takes points from View coordinates to Screen
coordinates, scaling them according to their Z coordinate. In essence,
objects that are far away from the camera appear smaller on the screen than
objects that are closer. In its most basic form, the perspective
transform divides the last column of the transformation matrix by d, the
distance from the camera to the viewing screen. This sets the homogeneous
coordinate factor to 1/d, so a point at depth z gets the homogeneous
coordinate z/d. Dividing the x and y coordinates by this value scales them
by d/z, which gives the desired scaling effect.
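A minimal sketch of this basic perspective step follows. It is written for
column vectors, so the 1/d entry sits in the last row of the matrix; with
row vectors it would sit in the last column instead. The function names
are hypothetical, and points are assumed to lie in front of the camera
(z > 0).

```python
def perspective_matrix(d):
    """4x4 matrix whose bottom row produces the homogeneous coordinate z/d."""
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 1.0 / d, 0]]


def project(point, d):
    """Project a view-space point onto a screen at distance d from the camera."""
    x, y, z = point
    v = (x, y, z, 1.0)
    m = perspective_matrix(d)
    xh, yh, zh, w = (sum(m[i][k] * v[k] for k in range(4)) for i in range(4))
    if w <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    # The homogeneous divide: x and y are scaled by d/z.
    return (xh / w, yh / w)


near = project((2.0, 4.0, 8.0), 4.0)
far = project((2.0, 4.0, 16.0), 4.0)
print(near, far)
```

The two sample points share the same x and y but differ in depth; the
farther one projects closer to the screen center, which is the
foreshortening described above.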