Physics-based numerical modeling of the seismic response of arbitrarily complex earth media has gained major relevance in recent years, owing on the one hand to the ever-increasing progress in computational algorithms and resources, and on the other to the growing interest in deterministic scenarios as input to seismic hazard and risk assessment studies. Over the last twenty years there has been impressive progress worldwide in the development of high-order numerical methods for the simulation of seismic wave propagation under realistic tectonic and geo-morphological conditions.
The increasing need for certified numerical models able to include the coupled effects of the seismic source, the propagation path through complex geological structures, and localized superficial irregularities, such as alluvial basins and/or man-made infrastructures, poses challenging demands on computational methods and resources, owing to the coexistence of very different spatial scales: from a few tens of kilometers, for the seismic fault, down to a few meters, or even less, for some structural elements.
Main features of the SPEED code
SPEED is written in Fortran90 and conforms strictly to the Fortran95 standard. The package uses parallel programming based on the Message Passing Interface (MPI) library, relying on the domain decomposition paradigm. Mesh generation may be accomplished with third-party software, e.g., CUBIT, and the mesh then exported in a compatible format. Load balancing is facilitated by graph partitioning based on the open-source library METIS, which is included in the package. The I/O operations performed by SPEED during its execution do not require external libraries. The output is written in an ASCII format that can be post-processed with an included Matlab package and then visualized with common tools such as ParaView or ArcGIS.
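To illustrate the load-balancing idea, the sketch below shows what graph partitioning aims at: assigning mesh elements to MPI ranks so that each rank receives a balanced share of elements while keeping neighbouring elements on the same rank (fewer cut edges means less MPI communication). SPEED delegates this to METIS; the naive greedy heuristic here is only an assumed, illustrative stand-in, not SPEED's actual code.

```python
# Illustrative sketch (NOT SPEED's implementation): greedy graph partitioning
# as a stand-in for METIS. Each mesh element becomes a graph node; edges link
# adjacent elements. The goal: balanced partitions with few cut edges.

def greedy_partition(adjacency, n_parts):
    """Assign each node to one of n_parts partitions.

    adjacency: dict mapping node -> list of neighbouring nodes.
    Returns a dict mapping node -> partition index.
    """
    target = len(adjacency) / n_parts        # balanced load per partition
    sizes = [0] * n_parts
    part = {}
    for node in adjacency:
        # Prefer the partition already holding most of this node's
        # neighbours (fewer cut edges), but skip partitions that are full.
        scores = []
        for p in range(n_parts):
            if sizes[p] >= target:
                continue
            shared = sum(1 for nb in adjacency[node] if part.get(nb) == p)
            scores.append((shared, -sizes[p], p))
        _, _, best = max(scores)
        part[node] = best
        sizes[best] += 1
    return part

# A toy 2x4 grid of mesh elements, numbered 0..7, split between 2 "ranks":
grid = {
    0: [1, 4], 1: [0, 2, 5], 2: [1, 3, 6], 3: [2, 7],
    4: [0, 5], 5: [4, 1, 6], 6: [5, 2, 7], 7: [6, 3],
}
parts = greedy_partition(grid, 2)
print([sum(1 for p in parts.values() if p == k) for k in range(2)])  # [4, 4]
```

In SPEED the same role is played by METIS, whose multilevel heuristics produce far better cuts than this greedy pass; the balanced element counts are what make each MPI process carry a comparable computational load.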
Installation and usage
SPEED currently runs on the following clusters:
- Fermi (Cineca, Bologna, Italy)
- Idra (MOX, Dip. di Matematica, Politecnico di Milano, Milan, Italy)
- Gigat (MOX, Dip. di Matematica, Politecnico di Milano, Milan, Italy)
- SCoPE Datacenter (Università degli Studi di Napoli Federico II, Naples, Italy)
- Hellasgrid (Scientific Computing Center, Aristotle University of Thessaloniki, Greece)
Optimization on Fermi IBM BlueGene/Q
Fermi, a Tier-0 machine that is at present CINECA's main HPC facility, is an IBM BlueGene/Q system composed of 10,240 PowerA2 sockets running at 1.6 GHz, with 16 cores each, for a total of 163,840 compute cores and a system peak performance of 2.1 PFlop/s. The interconnection network is a fast and efficient 5D torus. Fermi is one of the most powerful machines in the world, ranked #9 in the TOP500 list of supercomputer sites published in November 2012.
SPEED was built with the IBM XL compilers and the BG/Q proprietary MPI implementation. Within the project "PRACE 2IP-WP 9.3: porting and optimization of SPEED for Bluegene/Q architectures", the code was optimized for BlueGene/Q architectures following the strategy described in Dagna (2013).
The optimized version not only improves the performance of SPEED in terms of overall computational time, but also removes a significant memory constraint of the pure MPI version. In the pure MPI version each MPI process can work only on its own chunk of data, whereas in the hybridized version each MPI process can exploit a selected number of OpenMP threads working on the same chunk of data. This benefit can be a key turning point for more effective use of the available memory when simulating real earthquake scenarios.
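The memory argument above can be made concrete with a back-of-the-envelope estimate: data that must be replicated in every MPI process is instead shared by all OpenMP threads within a process, so running fewer, fatter processes per node cuts the number of replicated copies. The numbers below (replicated and per-core memory sizes) are assumptions for illustration only, not SPEED measurements.

```python
# Back-of-the-envelope sketch (assumed numbers, NOT SPEED measurements) of why
# the hybrid MPI+OpenMP version eases the memory constraint: replicated data
# is stored once per MPI process, while per-core working data is unchanged.

def node_memory_gb(cores_per_node, threads_per_process,
                   replicated_gb, private_gb_per_core):
    """Estimated memory per node: one replicated copy per MPI process,
    plus per-core private working data (independent of the hybrid split)."""
    processes = cores_per_node // threads_per_process
    return processes * replicated_gb + cores_per_node * private_gb_per_core

cores = 16  # e.g. one 16-core BlueGene/Q PowerA2 socket
# Pure MPI: 16 processes of 1 thread -> 16 replicated copies per node.
pure_mpi = node_memory_gb(cores, 1, replicated_gb=2, private_gb_per_core=1)
# Hybrid: 2 processes of 8 OpenMP threads -> only 2 replicated copies.
hybrid = node_memory_gb(cores, 8, replicated_gb=2, private_gb_per_core=1)
print(pure_mpi, hybrid)  # 48 20
```

With the hypothetical 2 GB of replicated data per process, the pure MPI layout needs 48 GB per node against 20 GB for the hybrid layout: the same mechanism, at realistic problem sizes, is what lets the hybridized SPEED fit earthquake scenarios that the pure MPI version could not.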