"Chameleon is dense linear algebra software that relies on
sequential task-based algorithms, in which the sub-tasks of the overall
algorithm are submitted to a runtime system. Such a system is a layer
between the application and the hardware that handles the scheduling
and the actual execution of tasks on the processing units. A runtime
system such as StarPU can automatically manage
data transfers between memory areas that are not shared (CPUs and GPUs,
distributed nodes). This implementation paradigm makes it possible to design
high-performance linear algebra algorithms for very different types of
architecture: laptops, many-core nodes, CPU-GPU nodes, and multiple nodes. For
example, Chameleon can perform a double-precision Cholesky factorization
at 80 TFlop/s on a dense matrix of order 400 000
(i.e. in about 4 minutes).
Chameleon includes the following features:
- Tile algorithms for BLAS 3, LAPACK one-sided factorizations and LAPACK norms
- Support for the QUARK and StarPU runtime systems
- Exploitation of homogeneous and heterogeneous platforms through the use of BLAS/LAPACK CPU kernels and cuBLAS/MAGMA CUDA kernels
- Exploitation of clusters of interconnected nodes with distributed memory (using OpenMPI)
The source code is available as open source."
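
To make the idea of a tile algorithm split into sub-tasks more concrete, here is a minimal sketch of a right-looking tiled Cholesky factorization in plain NumPy. It is not Chameleon's code or API: the function name `tiled_cholesky`, the tile size parameter `nb` and the task log are assumptions made for illustration. The kernel names (POTRF, TRSM, SYRK, GEMM) are the standard LAPACK/BLAS names for the four operations involved. Here each kernel runs immediately; in a task-based library each loop body would presumably be submitted to the runtime system instead, which would then schedule it once the tiles it reads are ready.

```python
import numpy as np


def tiled_cholesky(a, nb):
    """Lower Cholesky factor of a symmetric positive definite matrix.

    Illustrative sketch only: works tile by tile (tiles are nb x nb) and
    records the sequence of kernel calls that a task-based runtime would
    receive as tasks. Returns (factor, tasks).
    """
    n = a.shape[0]
    if a.shape != (n, n) or n % nb != 0:
        raise ValueError("expected a square matrix whose order is a multiple of nb")
    nt = n // nb                      # number of tiles per dimension
    w = np.array(a, dtype=float)      # working copy, overwritten by the factor
    tasks = []                        # (kernel name, tile row, tile column) written

    def tile(i, j):
        # Basic slicing returns a view, so updates modify w in place.
        return w[i * nb:(i + 1) * nb, j * nb:(j + 1) * nb]

    for k in range(nt):
        # POTRF: factor the diagonal tile.
        akk = tile(k, k)
        akk[:] = np.linalg.cholesky(akk)
        tasks.append(("POTRF", k, k))

        for i in range(k + 1, nt):
            # TRSM: solve X @ L_kk.T = A_ik for the tiles below the diagonal.
            # (A general solver is used here for simplicity; a real kernel
            # would exploit the triangular structure of L_kk.)
            aik = tile(i, k)
            aik[:] = np.linalg.solve(akk, aik.T).T
            tasks.append(("TRSM", i, k))

        for i in range(k + 1, nt):
            # SYRK: symmetric update of a trailing diagonal tile.
            aii = tile(i, i)
            aii -= tile(i, k) @ tile(i, k).T
            tasks.append(("SYRK", i, i))
            for j in range(k + 1, i):
                # GEMM: update of a trailing off-diagonal tile.
                aij = tile(i, j)
                aij -= tile(i, k) @ tile(j, k).T
                tasks.append(("GEMM", i, j))

    # Tiles above the diagonal were never updated; discard them.
    return np.tril(w), tasks


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = rng.standard_normal((8, 8))
    spd = m @ m.T + 8 * np.eye(8)     # symmetric positive definite test matrix
    factor, tasks = tiled_cholesky(spd, nb=2)
    print(len(tasks), "tasks, max error",
          np.abs(factor @ factor.T - spd).max())
```

The loops are written sequentially, yet many of the logged tasks touch different tiles and do not depend on each other. That independence is what a runtime system can exploit to run tasks in parallel on several processing units.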
https://project.inria.fr/chameleon/
http://icl.cs.utk.edu/projectsdev/morse/
https://gitlab.inria.fr/solverstack/chameleon
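
The Cholesky figure quoted above can be sanity-checked with a few lines of arithmetic. The sketch below assumes the usual leading-order operation count of n^3/3 floating-point operations for a dense Cholesky factorization and ignores lower-order terms; the function names are made up for this example.

```python
def cholesky_flops(n):
    """Leading-order flop count of a dense Cholesky factorization (n^3 / 3)."""
    return n ** 3 / 3


def factorization_time(n, flops_per_second):
    """Time in seconds to factor an n x n matrix at a sustained flop rate."""
    return cholesky_flops(n) / flops_per_second


n = 400_000        # matrix order quoted in the description
rate = 80e12       # 80 TFlop/s, as quoted in the description
seconds = factorization_time(n, rate)
print(f"{seconds:.0f} s = {seconds / 60:.1f} min")  # prints "267 s = 4.4 min"
```

So the quoted "4 min" is a rounded figure: at 80 TFlop/s the factorization takes a little under four and a half minutes.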