Team for Advanced Flow Simulation and Modeling
AHPCRC Bulletin: Winter/Spring 1995 - Volume 5 Number 1-2
Finite Element Flow Computations on the Cray T3D
Shahrouz Aliabadi (AHPCRC)
This article addresses porting a code developed for the CM-5 to a Cray T3D. A second article addressing optimization on the Cray T3D will appear in the next issue.
Driven by the large speed and memory requirements of 3D computations, numerical formulations are increasingly adapted for use with a variety of massively parallel supercomputers. However, the style of programming varies from one architecture to another, and porting codes across machines while maintaining efficiency becomes a major issue. Recently, the research group of Tayfun Tezduyar (Professor of Aerospace Engineering and Mechanics) at the AHPCRC ported the finite element flow solvers, which were originally developed for the CM-5, to the Cray T3D. Marek Behr, an Assistant Professor at the AHPCRC, ported the incompressible flow code from the CM-5 to the Cray T3D. Subsequently, this author extended the code to a compressible flow solver. Postdoctoral Research Associate Andrew Johnson and graduate student Vinay Kalro are also involved in parallel finite element computations on the Cray T3D.
The Cray T3D, which is the first massively parallel supercomputer from Cray Research, was officially unveiled in late 1993. This scalable machine with 32 to 2048 processors has a peak performance of 4.8 to over 300 GFLOPS, with a memory capacity of 512 Mbytes to 128 Gbytes. The Cray T3D takes advantage of a fast bidirectional 3D torus network, which minimizes the inter-processor communication times and ensures short connection paths and high bisection bandwidth. The technical features of the Cray T3D, together with its remarkable stability in operation, make this machine one of the state-of-the-art systems for parallel computations.
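The quoted ranges can be reduced to per-processor figures with a little arithmetic. The short sketch below does this under two assumptions that are not stated in the article: that peak performance scales linearly with the number of processors, and that 1 Gbyte equals 1024 Mbytes. The variable names are illustrative only.

```python
# Figures quoted in the article for the smallest and largest T3D configurations.
smallest_processors = 32
smallest_peak_gflops = 4.8
smallest_memory_mbytes = 512

largest_processors = 2048
largest_memory_mbytes = 128 * 1024  # assumption: 1 Gbyte = 1024 Mbytes

# Peak rate of a single processor, from the smallest configuration.
peak_per_processor_mflops = smallest_peak_gflops * 1000 / smallest_processors

# Assumption: peak performance scales linearly with processor count.
largest_peak_gflops = peak_per_processor_mflops * largest_processors / 1000

# Memory per processor at each end of the range.
smallest_memory_per_processor = smallest_memory_mbytes / smallest_processors
largest_memory_per_processor = largest_memory_mbytes / largest_processors

print(f"Peak per processor: {peak_per_processor_mflops:.0f} MFLOPS")
print(f"Peak at {largest_processors} processors: {largest_peak_gflops:.1f} GFLOPS")
print(f"Memory per processor: {smallest_memory_per_processor:.0f} "
      f"to {largest_memory_per_processor:.0f} Mbytes")
```

Under these assumptions the largest configuration comes out at roughly 307 GFLOPS, consistent with the "over 300 GFLOPS" quoted above, and the memory range corresponds to 16 to 64 Mbytes per processor.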
As the message-passing programming model was the sole model available to initial T3D users, the existing CM code could only be used as a guide for a new message-passing implementation. As in the CM-5 implementation, this implementation is still based on two distinct main data storage modes: the element-level mode and the equation-level mode. In order to minimize inter-processor communication, these two storage modes have to be suitably aligned with contiguous partitions of elements assigned to individual processors. Each processor must hold the data pertaining to both the element partition and a majority of the equations associated with this partition. The necessary communication between the two storage modes is performed via two-step gather and scatter routines, similar in functionality to those available in CM scientific libraries. The inter-processor communication inside the gather and scatter steps, which is now restricted to partition boundary data, is accomplished using send and receive functions of the Parallel Virtual Machine (PVM) library. To improve performance, Cray-specific PVM extensions such as channels are also employed. After a significant amount of scalar code optimization, mostly related to minimizing out-of-cache memory access, the performance of the most computationally intensive parts of the code reached the order of 20 MFLOPS per processing node, which is comparable to the per-node performance available on the CM-5. The advantages of the T3D include a larger per-processor memory and a smaller communication cost penalty for communication-intensive algorithms, e.g., more complex preconditioners.
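The relationship between the two storage modes can be illustrated with a small serial emulation. The sketch below is not the T3D code and makes no PVM calls; it uses a hypothetical 1D mesh of four two-node elements split between two emulated processors, and it only counts the boundary values that a real implementation would pass with send and receive. All names (`connectivity`, `partitions`, `equation_owner`, `gather`, `scatter`) are assumptions made for illustration.

```python
# Hypothetical 1D mesh: 4 two-node elements, 5 equations (one per node).
# connectivity[e] lists the equation numbers touched by element e.
connectivity = [(0, 1), (1, 2), (2, 3), (3, 4)]

# Contiguous element partitions assigned to two emulated processors.
partitions = {0: [0, 1], 1: [2, 3]}

# Owner of each equation. Equation 2 lies on the partition boundary
# and is assigned (arbitrarily) to processor 0.
equation_owner = [0, 0, 0, 1, 1]


def gather(equation_values):
    """Equation-level to element-level: copy nodal values into each element."""
    element_values = {}
    received = 0
    for proc, elements in partitions.items():
        # Step 1: boundary equations needed here but owned elsewhere
        # would arrive by message; here we only count them.
        needed = {n for e in elements for n in connectivity[e]}
        received += sum(1 for n in needed if equation_owner[n] != proc)
        # Step 2: purely local copy into element-level storage.
        for e in elements:
            element_values[e] = [equation_values[n] for n in connectivity[e]]
    return element_values, received


def scatter(element_values):
    """Element-level to equation-level: sum element contributions per equation."""
    equation_values = [0.0] * len(equation_owner)
    sent = 0
    for proc, elements in partitions.items():
        # Step 1: sum the contributions of this partition's elements locally.
        partial = {}
        for e in elements:
            for n, v in zip(connectivity[e], element_values[e]):
                partial[n] = partial.get(n, 0.0) + v
        # Step 2: partial sums for equations owned elsewhere would be sent
        # to the owner; here we count them and add them in directly.
        for n, v in partial.items():
            if equation_owner[n] != proc:
                sent += 1
            equation_values[n] += v
    return equation_values, sent


nodal = [10.0, 20.0, 30.0, 40.0, 50.0]
per_element, received = gather(nodal)
assembled, sent = scatter(per_element)
print(per_element[2], received)
print(assembled, sent)
```

In this toy case only one value crosses the partition boundary in each direction, which is the point of aligning both storage modes with contiguous element partitions: the bulk of each gather and scatter is a local memory operation.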
The incompressible flow computations on the Cray T3D range from high-speed, high Reynolds number flows to low-speed natural convection flows. The dynamics of ram air parafoils at high Reynolds number, which is one of the major projects at the AHPCRC, is now partially simulated on the Cray T3D. The parafoil computations on the Cray T3D include the steady-state performance during gliding at various angles of attack. The unstructured mesh used for these computations consists of 144,649 nodes and 905,410 tetrahedral elements. Figure 1 shows the pressure distribution on the parafoil surface at zero-degree angle of attack and a Reynolds number of 10 million.
Figure 1. Dynamics of a ram air parafoil. The picture shows the pressure distribution on the parafoil surface.
Another example of incompressible flow computations on the Cray T3D is the aerodynamics of automobiles. Here, airflow past an automobile (modeled after a Saturn) at 55 miles per hour is computed. This computation is carried out under wind tunnel conditions on an unstructured mesh consisting of 227,135 nodes and 1,407,579 tetrahedral elements (for half of the domain). Figure 2 shows the pressure distribution on the automobile surface.
In another project being computed on the Cray T3D, the finite element method is used to simulate the process of transient convection in a volumetrically heated fluid. This computation requires the simultaneous solution of the Navier-Stokes equations coupled with the energy equation. Computations are carried out on a 20 x 20 x 20 structured mesh at Rayleigh number 10⁵ and Prandtl number 6.5. Figure 3 shows the temperature field together with the mesh on three sides of the box and an iso-surface of temperature corresponding to the steady-state solution.
The compressible flow research activities on the Cray T3D include the computations of subsonic, transonic, supersonic and hypersonic flows governed by either the Euler or Navier-Stokes equations. An example of supersonic simulations on the Cray T3D, which were carried out to encourage participation of AHPCRC undergraduate research assistants, can be seen in Figure 4. The picture shows the temperature distribution on a fighter aircraft modeled after the Lockheed YF-22. In this computation, the free-stream Mach number is 2 and the compressible flow is assumed to be inviscid and governed by the Euler equations. The computation is carried out on an unstructured mesh consisting of 185,483 nodes and 1,071,580 tetrahedral elements (for half of the domain). The flow simulations for the aircraft and the car were part of an effort by the AHPCRC researchers, partially funded by the Advanced Research Projects Agency, for the development of scalable libraries for fluid mechanics applications.
The hypersonic flow computations on the Cray T3D involve the air in chemical equilibrium with three independent chemical reactions:
O₂ = 2O,
To test the accuracy of the real gas model with chemical reactions as described, the inviscid air flow at Mach 15 past a circular cylinder was computed. The free-stream temperature and density are 226 K and 0.0187 kg/m³, respectively. The picture on the left of Figure 5 shows the steady-state temperature distribution generated using the ideal gas model. In this case, the maximum temperature is 10,396 K, which is off by 91% from the experimental value. The picture on the right of Figure 5 shows the steady-state temperature distribution generated using the real gas model. In this case, the maximum temperature is 5,447 K, which is in excellent agreement with experiment.
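As a rough check on the ideal gas figure, the sketch below evaluates the adiabatic stagnation temperature relation for a calorically perfect gas, T0 = T (1 + (γ - 1)/2 · M²), at the quoted free-stream conditions. The ratio of specific heats γ = 1.4 is an assumption, since the article does not state it, and the function and variable names are illustrative. The second part compares the result with the real gas maximum, used here only as a stand-in for the experimental value, which the article does not quote.

```python
# Assumption: air treated as a calorically perfect gas with gamma = 1.4.
GAMMA = 1.4

# Free-stream conditions quoted in the article.
mach = 15.0
freestream_temperature_k = 226.0


def stagnation_temperature(static_temperature, mach_number, gamma=GAMMA):
    """Adiabatic stagnation temperature of a calorically perfect gas."""
    return static_temperature * (1.0 + 0.5 * (gamma - 1.0) * mach_number ** 2)


ideal_gas_max_k = stagnation_temperature(freestream_temperature_k, mach)

# Real gas maximum from the article, used as a proxy for the experimental
# value because it is reported to agree with experiment.
real_gas_max_k = 5447.0
relative_excess = (ideal_gas_max_k - real_gas_max_k) / real_gas_max_k

print(f"Ideal gas stagnation temperature: {ideal_gas_max_k:.0f} K")
print(f"Excess over real gas maximum: {100 * relative_excess:.0f}%")
```

With these inputs the relation gives about 10,396 K, the same maximum reported for the ideal gas model, and an excess of about 91% over the real gas value. This suggests that the ideal gas overshoot comes from ignoring the energy absorbed by the chemical reactions, which is what the real gas model accounts for.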