Modeling Human Facial Expressions

Daniel D. Hung and Szu-Wen (Steven) T. Huang
CS 718 Topics in Computer Graphics

Introduction

Human faces are among the most difficult objects to model in computer graphics and have been the subject of numerous attempts, some almost as old as computer graphics itself. Facial expressions result from the movement of skin layered atop muscle and bone, and may occur in thousands of combinations. As part of our coursework, we propose to focus on this subject. Our work is divided into three parts: a survey of various techniques developed over the years, an implementation of one such technique, and the presentation of our results.

Limitations

Due to time constraints, a detailed implementation of any of the models is unlikely to be possible. The goal of the project is thus to produce a technology demonstration of one technique. The availability of suitable input devices or a pre-defined face mesh will limit the accuracy and aesthetic quality of the finished product. At a minimum, we wish to produce a wireframe animation of a face mesh. The results from the survey and the implemented model will be presented in class.

Prior Art

General Structure

Among the various approaches taken over the years, a common structure can be identified. There is generally a low-level muscle motion simulator, whose primitives are variously called action units, abstract muscle action procedures, or minimum perceptible actions. This layer enables the generation of expressions that are not necessarily humanly possible, such as asymmetric movement of the two sides of the face.

On top of the "muscle" layer, we find an abstraction for humanly significant expressions. This layer might include smile, frown, horror, surprise, and other expressions.

As the objective of many projects was to emulate the human face during speech, there usually exists another layer above expressions that includes phonemes (speech primitives). A complete data set of phonemes allows the synthesized face to look as if it is coordinated with speech that is played back separately.
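The layering described above can be pictured as a set of lookup tables. The following is only a minimal sketch: every muscle name, expression, phoneme, and weight is invented for illustration, and for simplicity both upper layers are expanded directly into muscle activations, which is an assumption rather than something the surveyed systems are known to do.

```python
# Hypothetical three-layer description of a face model.
# All names and weights here are invented for illustration only.

# Layer 1: low-level muscle actions, each driven by an activation value.
MUSCLES = ("left_lip_corner_up", "right_lip_corner_up", "jaw_open", "lip_pucker")

# Layer 2: humanly significant expressions, as named sets of activations.
EXPRESSIONS = {
    "smile": {"left_lip_corner_up": 1.0, "right_lip_corner_up": 1.0},
    "smirk": {"left_lip_corner_up": 1.0},  # asymmetric: one side only
    "surprise": {"jaw_open": 0.8},
}

# Layer 3: phonemes (speech primitives), as mouth shapes.
PHONEMES = {
    "oo": {"lip_pucker": 0.9, "jaw_open": 0.2},
    "ah": {"jaw_open": 1.0},
}

def resolve(name, intensity=1.0):
    """Expand an expression or phoneme name into one activation per muscle."""
    if name in EXPRESSIONS:
        table = EXPRESSIONS[name]
    elif name in PHONEMES:
        table = PHONEMES[name]
    else:
        raise KeyError(name)
    # Muscles not mentioned by the entry stay at rest (activation 0).
    return {m: intensity * table.get(m, 0.0) for m in MUSCLES}

half_smile = resolve("smile", intensity=0.5)
```

Keeping the muscle layer separate means a new expression is just a new table entry, with no change to the deformation code underneath.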

Keyframing

Keyframing was one of the earliest approaches taken, and involved linear interpolation from one face mesh to another. The amount of computation was extensive and the data set large. The approach was rather inflexible because the range of expressions that can be generated is limited by the keyframes that were previously digitized. It is also difficult to generalize work on one face mesh to another.
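To make the idea concrete, here is a minimal sketch of keyframing as a per-vertex linear blend between two digitized meshes. It assumes both keyframes list the same vertices in the same order; the function name and the two tiny meshes are hypothetical.

```python
def interpolate_keyframes(mesh_a, mesh_b, t):
    """Linearly blend two meshes that share the same vertex ordering.

    t = 0 returns mesh_a, t = 1 returns mesh_b.
    """
    if len(mesh_a) != len(mesh_b):
        raise ValueError("keyframes must have the same number of vertices")
    return [
        tuple(a + t * (b - a) for a, b in zip(va, vb))
        for va, vb in zip(mesh_a, mesh_b)
    ]

# Two toy keyframes with two vertices each (illustrative data only).
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
smile = [(0.0, 1.0, 0.0), (1.0, 2.0, 0.0)]
halfway = interpolate_keyframes(neutral, smile, 0.5)
```

The sketch also shows why the approach is inflexible: every reachable face is a blend of keyframes that were stored in full beforehand.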

Parametric Deformations

A second approach was to model the human face as a parametric surface and record the transformations as movements of the control points to minimize the data storage requirements. These approaches are still difficult to generalize over different face meshes.

One such attempt used B-spline patches that were defined manually on an actual digitized face mesh [Nahas88]. The control points of the B-spline patches were moved to distort the face. While this method is powerful, we know of no automatic way of defining the relevant control points for the B-spline patches.
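The effect of moving a control point can be sketched with a single uniform cubic B-spline curve segment. This is a simplification of the patches described above (a curve rather than a surface), and the control points are invented for illustration.

```python
def bspline_point(ctrl, u):
    """Evaluate one uniform cubic B-spline segment.

    ctrl holds 4 control points; u runs from 0 to 1 along the segment.
    """
    # Standard uniform cubic B-spline basis functions; they sum to 1.
    basis = (
        (1 - u) ** 3 / 6.0,
        (3 * u**3 - 6 * u**2 + 4) / 6.0,
        (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6.0,
        u**3 / 6.0,
    )
    dims = len(ctrl[0])
    return tuple(sum(w * p[k] for w, p in zip(basis, ctrl)) for k in range(dims))

# A flat row of control points, then the same row with one point raised.
rest = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
raised = list(rest)
raised[1] = (1.0, 0.6)

before = bspline_point(rest, 0.0)
after = bspline_point(raised, 0.0)
```

Only the four control points need to be stored and moved, which is where the saving in data comes from compared with storing whole meshes.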

Another such attempt used Rational Free-form Deformations to move points inside a defined volume [Waters87, Platt81], if not for the amazingly little understanding we have of them. The human facial muscles are capable of a large number of expressions, and realistic simulation of muscles requires simulating muscle action wrapping around the skull structure, jaw rotation, and the folding and stretching properties of skin.

Pseudo-muscles

A hybrid approach would be to simulate muscles that are not necessarily anatomically-correct [

After the data set was complete, we decided that the available functions to deform the face mesh were overly complicated, and derived a simple one:

                  1
P' = P + ( -------------- ) (Pd - P) t
            |Po - P| + 1

where t is the time parameter running from 0 to 1, Po is the point where the force is applied, Pd is the point of attractive force, and P is every face vertex in the bounding box. The formula moves every vertex in the bounding box towards the attraction point Pd by a magnitude inversely proportional to its distance from the point of force application Po. An example of this muscle model is shown below:
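In code, formula 1 might look like the following. This is a minimal sketch and not the authors' Data Explorer macro: the function names, the axis-aligned bounding box test, and the three-vertex mesh are assumptions, and the weight uses the distance from P to Po as described in the text.

```python
import math

def inside_box(p, lo, hi):
    """True if point p lies inside the axis-aligned box [lo, hi]."""
    return all(l <= c <= h for c, l, h in zip(p, lo, hi))

def attract(p, po, pd, t):
    """Formula 1: pull p toward pd, weighted by 1 / (|po - p| + 1)."""
    w = 1.0 / (math.dist(po, p) + 1.0)
    return tuple(c + w * (d - c) * t for c, d in zip(p, pd))

def deform(mesh, po, pd, t, lo, hi):
    """Apply formula 1 to vertices inside the bounding box; leave the rest."""
    return [attract(p, po, pd, t) if inside_box(p, lo, hi) else p for p in mesh]

# Illustrative data: the last vertex lies outside the bounding box.
mesh = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
moved = deform(mesh, po=(0.0, 0.0, 0.0), pd=(0.0, 2.0, 0.0), t=1.0,
               lo=(-2.0, -2.0, -2.0), hi=(2.0, 2.0, 2.0))
```

Because the direction term is Pd - P, each vertex moves along its own line toward Pd, so the vertex at (1, 0, 0) is pulled sideways as well as upward.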

This formula had the unacceptable side effect of seeming to "pinch" the points close to Po towards Pd. We abandoned it because we felt we needed a "smoother attraction", which led us to:

                     |Po - P|  
           1 + cos( ---------- pi )
P' = P + (               R          ) (Pd - Po) t
           ------------------------
                      2

where we introduced a new parameter R. R defines the radius of influence and, while closely related to the size of the bounding box, is a separate parameter. In general, however, R should cover roughly the same area as the bounding box. Note that singularities appear if R is much smaller than the bounding box. Note also that the Pd - Po term in this formula moves all the vertices within the bounding box and the radius of influence parallel to the vector from the point of force application to the point of attraction. In effect, the point of attraction actually defines a plane of attraction for the bounding volume. This is the formula we settled on, and the cosine term does provide us with a smooth deformation. A sample of this muscle model is shown below:
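Formula 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' Data Explorer macro: the function name and sample values are hypothetical, and the bounding box test is omitted so that the behaviour outside the radius R is visible.

```python
import math

def linear_muscle(p, po, pd, r, t):
    """Formula 2: shift p along (pd - po), weighted by a raised cosine.

    The weight is 1 at po and falls smoothly to 0 at distance r from po.
    """
    w = (1.0 + math.cos(math.pi * math.dist(po, p) / r)) / 2.0
    return tuple(c + w * (d - o) * t for c, d, o in zip(p, pd, po))

# Illustrative values: force applied at the origin, pulling along +y.
po, pd, r = (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), 2.0
for x in (0.0, 1.0, 2.0, 4.0):
    print(x, linear_muscle((x, 0.0, 0.0), po, pd, r, t=1.0))
```

Every vertex moves parallel to Pd - Po, so the x coordinates are untouched. The formula as written does not clamp the weight, so a vertex at distance 2R receives the full displacement again; this may be the kind of problem the note about R being much smaller than the bounding box refers to, although that reading is an assumption.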

Not all muscles on the face apply force linearly. Sphincter muscles around the lips, for example, do not work well with this formula unmodified. We thus defined a special case of the formula to deal with these muscles:

                       |Po - P|  
             1 + cos( ---------- pi )
P' = P + K (               R          ) (P - Pd) t
             ------------------------
                        2

There are two important differences. The K term was introduced to allow the muscle to contract along one axis and expand along another, instead of contracting or expanding uniformly. The direction of motion is now defined by P - Pd, which lies along the line to the point of attraction (similar to formula 1). We inverted the sign in this formula because we felt that a positive K should denote contraction and a negative K should denote expansion, in order to be consistent with formula 2. The example on the left below shows a contracting sphincter muscle and the one on the right shows an expanding sphincter muscle:
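The sphincter variant can be sketched the same way. This is not the authors' Data Explorer macro. The text does not say how K is represented, so the sketch assumes K is a per-axis triple applied component by component, which is one way to get different behaviour along different axes; the function name and sample values are likewise hypothetical.

```python
import math

def sphincter_muscle(p, po, pd, r, k, t):
    """Formula 3: move p along (p - pd), scaled per axis by k.

    k is assumed to be one factor per axis (an assumption, see lead-in).
    """
    w = (1.0 + math.cos(math.pi * math.dist(po, p) / r)) / 2.0
    return tuple(c + kc * w * (c - d) * t for c, d, kc in zip(p, pd, k))

# Illustrative values: force point and attraction point both at the origin.
po = pd = (0.0, 0.0, 0.0)
k = (-0.5, 0.5, 0.0)  # opposite signs along x and y, no motion in z
moved = sphincter_muscle((1.0, 1.0, 0.0), po, pd, r=4.0, k=k, t=1.0)
```

With the formula exactly as written, a negative component of K moves the vertex toward Pd along that axis and a positive component moves it away. How that maps onto "contraction" and "expansion" depends on where Pd is placed relative to the muscle, which the text does not specify.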

Putting it all together

Armed with the face mesh and the muscles, what remains is the definition of parameters, which is largely a tedious trial-and-error task. We present a few expressions generated by IBM Data Explorer below (for MPEG demos, click on the links below the images):

MPEG demos of our work

Click Here to see the smile animation.

Click Here to see the kiss animation.

Data Explorer files of our work

Click here for the connections list of the mesh.
Click here for the DX macro which models a linear contract/relax muscle on the face.
Click here for the DX macro which models the facial mesh. Note that this requires a vertex list, connections, and lip connections file (similar to the ones above) in order to work.

Ahead

The effects we have achieved are far from realistic. The complexities of facial animation have not yet been fully explored, but we believe we have come up with a reasonable model for manipulating a small data set (fewer than 500 vertices), as opposed to the much larger, good-quality data sets acquired by machines. Other areas may be explored, and those we find interesting are listed below:

Texture-mapped skin [Nahas90]
Coordination of lip muscles with speech [Nahas88]
Motion of other facial elements such as eyeball or jaw
Modelling of the interior of the mouth such as teeth and gums
Modelling of hair, such as head hair, moustache, beard, or eyebrows

References

Arad N., Dyn N., Reisfeld D., and Yeshurun Y., Image Warping by Radial Basis Functions - Application to Facial Expressions, CVGIP-Graphical Models and Image Processing, 1994

Badler N. and Platt S., Simulation of Facial Muscle Actions Based on Rational Free-form Deformations, Computer Graphics, September 1992

Kang C., Chen Y., and Hsu W., Automatic Approach to Mapping a Lifelike 2.5D Human Face, Image and Vision Computing, 1994

Magnenat-Thalmann N., Minh H., de Angelis, M., and Thalmann D., Design, Transformation, and Animation of Human Faces, Visual Computer

Magnenat-Thalmann N., Primeau E., and Thalmann D., Abstract Muscle Action Procedures for Human Face Animation, Visual Computer, 1988

Nahas M., Huitric H., Rioux M., and Domey J., Facial Image Synthesis Using Skin Texture Recording, Visual Computer, December 1990

Nahas M., Huitric H., and Saintourens M., Digital Actors for Interactive Television, Proceedings of the IEEE, 1995

Tost D. and Pueyo X., Human Body Animation: A Survey, Visual Computer, March 1988

Waters K., A Muscle Model for Animating Three-dimensional Facial Expressions, Computer Graphics, July 1987

Maintained by: Steven Huang and Dan Hung
Last Modified: December 11, 1995