Physical simulation for animation and visual effects: parallelization and characterization for chip multiprocessors
ACM SIGARCH Computer Architecture News, 2007•dl.acm.org
We explore the emerging application area of physics-based simulation for computer animation and visual special effects. In particular, we examine its parallelization potential and characterize its behavior on a chip multiprocessor (CMP). Applications in this domain model and simulate natural phenomena, and often direct visual components of motion pictures. We study a set of three workloads that exemplify the span and complexity of physical simulation applications used in a production environment: fluid dynamics, facial animation, and cloth simulation. They are computationally demanding, requiring from a few seconds to several minutes to simulate a single frame; therefore, they can benefit greatly from the acceleration possible with large scale CMPs.
Starting with serial versions of these applications, we parallelize code accounting for at least 96% of the serial execution time, targeting a large number of threads. We then study the most expensive modules using a simulated 64-core CMP.
For the code representing key modules, we achieve parallel scaling of 45x, 50x, and 30x for fluid, face, and cloth simulations, respectively. The modules have a spectrum of parallel task granularity and locking behavior, and all but one are dominated by loop-level parallelism. Many modules operate on streams of data. In some cases, modules iterate over their data, leading to significant temporal locality. This streaming behavior leads to very high on-die and main memory bandwidth requirements. Finally, most modules have little inter-thread communication since they are data-parallel, but a few require heavy communication between data-parallel operations.
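As a rough illustration of why the parallelized fraction matters, the sketch below applies Amdahl's law to the two figures quoted above: 96% of serial execution time parallelized, and 64 cores. It is not taken from the paper. It assumes the remaining 4% stays fully serial and that the parallel part scales perfectly, and the function name `amdahl_speedup` is hypothetical. Since 96% is stated as a lower bound, the result is only an indicative ceiling for that worst case. It is also a whole-application bound, whereas the 45x, 50x, and 30x figures are reported for the code representing key modules.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on whole-application speedup under Amdahl's law.

    Assumes the non-parallel fraction runs serially and the parallel
    fraction scales perfectly across `cores` (an idealization).
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# 0.96 is the "at least 96%" lower bound from the abstract;
# 64 is the simulated core count.
bound = amdahl_speedup(0.96, 64)
print(f"{bound:.1f}x")  # prints "18.2x"
```

Under these assumptions, pushing the parallelized fraction above 96% has a large effect on the achievable whole-application speedup at 64 cores.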