"High
Performance Geometric Multigrid (HPGMG-FV) is a benchmark designed to
proxy the finite volume based geometric multigrid linear solvers found
in adaptive mesh refinement (AMR) based applications like the Low Mach Combustion Code (LMC). HPGMG-FV is being used to conduct computer science (e.g. Top500 benchmarking, programming models, compilers, performance optimization, and auto-tuners), computer architecture, and applied math research.
HPGMG-FV solves variable-coefficient
elliptic problems (-b div beta grad u = f) on isotropic Cartesian grids
using the finite volume method (FV) and Full Multigrid (FMG). The method
is fourth-order accurate in the max norm, as demonstrated by the FMG
convergence. FMG interpolation (prolongation) is quartic, V-cycle
interpolation is quadratic, and restriction is piecewise constant.
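
To make "piecewise constant restriction" concrete: on a cell-centered finite volume grid it amounts to averaging the fine cells that make up each coarse cell (8 of them in 3D). The sketch below is only an illustration, not the HPGMG-FV source. The function name restrict_piecewise_constant and the flat list layout (index = i + n*(j + n*k)) are assumptions made for this example.

```python
def restrict_piecewise_constant(fine, n):
    """Average each 2x2x2 block of fine cells into one coarse cell.

    fine: flat list of n*n*n cell averages, index = i + n*(j + n*k)
          (hypothetical layout chosen for this sketch); n must be even.
    Returns a flat list of (n//2)**3 coarse cell averages.
    """
    nc = n // 2
    coarse = [0.0] * (nc ** 3)
    for ck in range(nc):
        for cj in range(nc):
            for ci in range(nc):
                total = 0.0
                # Sum the 8 fine children of coarse cell (ci, cj, ck).
                for dk in (0, 1):
                    for dj in (0, 1):
                        for di in (0, 1):
                            i, j, k = 2 * ci + di, 2 * cj + dj, 2 * ck + dk
                            total += fine[i + n * (j + n * k)]
                coarse[ci + nc * (cj + nc * ck)] = total / 8.0
    return coarse


fine = [float(v) for v in range(4 ** 3)]
coarse = restrict_piecewise_constant(fine, 4)
print(len(coarse))  # 8
```

Because every fine cell contributes to exactly one coarse cell with equal weight, the mean of the field is the same on both grids, which is the conservation property one expects from a finite volume restriction.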
Recursive decomposition is used to construct a space-filling curve akin
to a Z-Morton (Z-order) curve in order to distribute work among processes.
Out-of-place red-black Gauss-Seidel (GSRB) is used for smoothing,
preconditioned by the diagonal. FMG convergence is observed using a
V(3,3) cycle. Thus convergence is reached in a total of 13 fine-grid
operator applications (3 pre-smooth GSRBs, residual, 3 post-smooth GSRBs;
each GSRB is a red sweep plus a black sweep, so 3x2 + 1 + 3x2 = 13).
This makes HPGMG-FV
extremely fast, accurate, and energy efficient.
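
The fine-grid work of the V(3,3) cycle described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the HPGMG-FV implementation: it uses a 1D constant-coefficient operator with a second-order 3-point stencil and zero Dirichlet boundaries instead of the fourth-order variable-coefficient 3D operator, and the names apply_operator, gsrb_smooth, fine_grid_work, and the coarse_correction hook are hypothetical. With a 3-point stencil the out-of-place and in-place red-black updates give the same result; with wider stencils the distinction can matter.

```python
def apply_operator(u, h):
    """Apply A = -d^2/dx^2 (3-point stencil, zero Dirichlet boundaries) to u."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * u[i] - left - right) / (h * h))
    return out


def gsrb_smooth(u, f, h, counter):
    """One GSRB smooth: a red sweep then a black sweep, each out of place."""
    inv_diag = (h * h) / 2.0  # inverse of the operator diagonal
    for color in (0, 1):
        au = apply_operator(u, h)  # one operator application per color sweep
        counter[0] += 1
        # Build a new vector; only points of the current color are updated.
        u = [u[i] + inv_diag * (f[i] - au[i]) if i % 2 == color else u[i]
             for i in range(len(u))]
    return u


def fine_grid_work(u, f, h, coarse_correction=None):
    """3 pre-smooths, residual, optional coarse correction, 3 post-smooths.

    Returns the updated solution and the number of operator applications.
    """
    counter = [0]
    for _ in range(3):
        u = gsrb_smooth(u, f, h, counter)
    au = apply_operator(u, h)
    counter[0] += 1
    residual = [f[i] - au[i] for i in range(len(u))]
    if coarse_correction is not None:
        # Hypothetical hook standing in for restriction, coarse solve,
        # and interpolation; omitted by default in this sketch.
        correction = coarse_correction(residual)
        u = [ui + ci for ui, ci in zip(u, correction)]
    for _ in range(3):
        u = gsrb_smooth(u, f, h, counter)
    return u, counter[0]


n = 7
h = 1.0 / (n + 1)
u, applications = fine_grid_work([0.0] * n, [1.0] * n, h)
print(applications)  # 13
```

Counting one operator application per color sweep reproduces the figure quoted in the text: 6 for the pre-smooths, 1 for the residual, and 6 for the post-smooths.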
The Publications page includes an introductory presentation to HPGMG-FV
that many users may find useful when attempting to optimize or port the
code. Moreover, the reference HPGMG-FV implementation includes optional
second-order discretizations, constant-coefficient variants, and many
alternate smoothers including Jacobi and Chebyshev. Finally, there is
an HPGMG mailing list with discussions of various releases and performance.
There are two public versions of HPGMG, both of which
implement the fourth-order (v0.3) solver. The first is MPI+OpenMP3, while
the second adds CUDA support for the key kernels in order to leverage
the computing power of NVIDIA's GPUs:
hpgmg on bitbucket (HPGMG-FV is in the ./finite-volume/source
subdirectory)
hpgmg-cuda on bitbucket (v0.3 compliant MPI2+OpenMP3+CUDA7
implementation). Please see NVIDIA's Parallel ForAll blog for more
details."