Cost-Effective Parallel Computational Electromagnetic Modeling

Daniel S. Katz*, Tom Cwik

Jet Propulsion Laboratory, California Institute of Technology, Pasadena, California

ABSTRACT

This paper discusses the use of Beowulf-class computers in solving computational electromagnetic problems. Small Beowulf-class computers may be purchased in December 1997 for approximately $1,700 per node, including all computation and communication components. Two codes are examined: a finite-difference time-domain code and a finite element iterative solver. The performance of these codes on 1 to 128 processors of a Beowulf-class computer is compared with the performance of similar codes on the same number of processors of a Cray T3D.

INTRODUCTION

This paper discusses the use of Beowulf-class computers in solving computational electromagnetic (CEM) problems. Beowulf-class computers are defined as piles of PCs running Linux, built entirely from mass-market commercial off-the-shelf (M2COTS) components.

The Beowulf-class systems used in the work described in this paper consist of Pentium Pro processors and fast Ethernet (100Base-T) networking. Small systems can be purchased for approximately $1,700 per node as of December 1997.

Two codes will be examined in this paper: an MPI implementation of the EMCC finite-difference time-domain (FDTD) code [1], and an iterative solver used in JPL's PHOEBUS coupled finite element-integral equation package [2].

The FDTD code is a fairly straightforward domain decomposition code that uses explicit time-stepping, with the majority of the floating-point operations required by stencils based on second-order differences in space. Communication is required to update ghost cells on the edges of each processor's subdomain, and may also be used to update the edges of the entire domain; a simplified sketch of this ghost-cell exchange and update pattern follows.
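As a rough illustration of this pattern (not the EMCC code itself), the following C/MPI sketch advances a one-dimensionally decomposed field with a simple second-order stencil, exchanging ghost cells with the neighboring processors before each time step; the array names, subdomain size, and update coefficient are illustrative assumptions.

    /* Illustrative sketch of a 1-D decomposed explicit stencil update with
       ghost-cell exchange; not the EMCC FDTD code.  Names and sizes are
       hypothetical. */
    #include <mpi.h>

    #define NLOC 1000                   /* interior cells owned by this processor */

    int main(int argc, char **argv)
    {
        int rank, size, left, right, step;
        double u[NLOC + 2], unew[NLOC + 2];   /* u[0] and u[NLOC+1] are ghost cells */

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        left  = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
        right = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;

        for (int i = 0; i < NLOC + 2; i++)    /* arbitrary initial field */
            u[i] = (double)(rank * NLOC + i);

        for (step = 0; step < 100; step++) {
            /* Exchange ghost cells with neighbors on the subdomain edges. */
            MPI_Sendrecv(&u[NLOC], 1, MPI_DOUBLE, right, 0,
                         &u[0],    1, MPI_DOUBLE, left,  0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Sendrecv(&u[1],        1, MPI_DOUBLE, left,  1,
                         &u[NLOC + 1], 1, MPI_DOUBLE, right, 1,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);

            /* Explicit second-order central-difference update of the interior;
               the edges of the entire domain are left fixed for simplicity. */
            for (int i = 1; i <= NLOC; i++)
                unew[i] = u[i] + 0.25 * (u[i - 1] - 2.0 * u[i] + u[i + 1]);
            for (int i = 1; i <= NLOC; i++)
                u[i] = unew[i];
        }

        MPI_Finalize();
        return 0;
    }

In the actual code the decomposition and the Maxwell field updates are more involved, but the communication structure is the same: each processor exchanges only the cells on the edges of its subdomain with its neighbors.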

The iterative solver from PHOEBUS requires a sparse matrix A that has been reordered to minimize and equalize row bandwidth. The iterative solution of AX = Y using a QMR algorithm can then be seen primarily as a series of sparse matrix-dense vector multiplies. Some additional work in the form of vector operations, such as dot products, norms, scales, and sums, is also performed, but the matrix-vector multiply dominates the run time. It is also where the majority of the communication occurs, in copying parts of the vector multiplicand from other processors; a simplified sketch of this kernel follows.
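A minimal serial sketch of the dominant kernel is shown below, assuming a compressed sparse row (CSR) storage format rather than the actual PHOEBUS data layout; in the parallel solver the needed parts of the multiplicand x held on other processors are copied in before this loop.

    /* Minimal serial sketch of the sparse matrix-dense vector product y = A x
       that dominates each QMR iteration.  The CSR structure and names are
       illustrative, not the PHOEBUS data layout. */
    typedef struct {
        int     nrows;     /* number of rows                      */
        int    *row_ptr;   /* row start indices, length nrows + 1 */
        int    *col_idx;   /* column index of each nonzero        */
        double *val;       /* value of each nonzero               */
    } csr_matrix;

    void csr_matvec(const csr_matrix *A, const double *x, double *y)
    {
        for (int i = 0; i < A->nrows; i++) {
            double sum = 0.0;
            for (int k = A->row_ptr[i]; k < A->row_ptr[i + 1]; k++)
                sum += A->val[k] * x[A->col_idx[k]];
            y[i] = sum;
        }
    }

Because the matrix has been reordered to minimize and equalize row bandwidth, each processor needs only a narrow band of the multiplicand from neighboring processors, which keeps this communication small and balanced.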

PREVIOUS RESULTS

Previous work has demonstrated good performance for FDTD codes and iterative solvers on a 16-node Beowulf-class system (Hyglac, located at JPL) [3]. It was found that both codes achieved performance within a factor of two or three of that of the same codes on 16 processors of the T3D, at much lower cost.

NEW WORK

This paper will discuss results of running the FDTD code and the iterative solver on 1 to 128 nodes of a Beowulf-class system located at Caltech (named Naegling). Preliminary results show substantially better communication performance on this system than was observed on Hyglac. In addition, changes in system software provide better computation rates, again relative to Hyglac. Naegling has previously delivered over 10 GFLOPS running an n-body gravitational simulation, and should also provide good performance for these CEM computations. Naegling run times will be compared with those of both a Cray T3D and Hyglac, and reasons for the differences will be discussed.

REFERENCES

[1] A. C. Woo and K. C. Hill, "The EMCC/ARPA Massively Parallel Electromagnetic Scattering Project," NAS Technical Report NAS-96-008, NASA Ames Research Center, 1996.

[2] T. Cwik, D. S. Katz, C. Zuffada, and V. Jamnejad, "The Application of Scalable Distributed Memory Computers to the Finite Element Modeling of Electromagnetic Scattering and Radiation," Int. J. Num. Meth. Eng., in press.

[3] D. S. Katz, "High-Performance Computational Electromagnetic Modeling Using Low-Cost Parallel Computers," IEEE AP-S Symposium/URSI Radio Science Meeting, Montreal, Canada, July 1997.


* D. S. Katz, e-mail d.katz@ieee.org, phone 818-354-7359, fax 818-393-3134.

The research described in this paper was performed at the Jet Propulsion Laboratory, under contract to the National Aeronautics and Space Administration, and at the California Institute of Technology. The Cray T3D was provided with funding from the NASA offices of Mission to Planet Earth, Aeronautics, and Space Science. This work was funded by the NASA High Performance Computing and Communications Earth and Space Sciences Project.