"The implementation of stencil computations on modern, massively parallel
systems with GPUs and other accelerators currently relies on
manually-tuned coding using low-level approaches like OpenCL and CUDA.
This makes development of stencil applications a complex,
time-consuming, and error-prone task. We describe how stencil
computations can be programmed in our SkelCL approach that combines
high-level programming abstractions with competitive performance on
multi-GPU systems. SkelCL extends the OpenCL standard by three
high-level features: 1) pre-implemented parallel patterns (a.k.a.
skeletons); 2) container data types for vectors and matrices; 3)
an automatic data (re)distribution mechanism. We introduce two new SkelCL
skeletons which specifically target stencil computations – MapOverlap
and Stencil – and we describe their use for particular application
examples, discuss their efficient parallel implementation, and report
experimental results on systems with multiple GPUs. Our evaluation of
three real-world applications shows that stencil code written with
SkelCL is considerably shorter and delivers performance competitive with
hand-tuned OpenCL code."
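
To make the term concrete: a stencil computation produces each output element from a fixed neighbourhood of the corresponding input element. The following is a minimal sequential sketch in plain C++, not SkelCL code and not the authors' implementation. The names `stencil1d` and `sum_window` are hypothetical, and filling out-of-range positions with a neutral value is an assumption made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not part of SkelCL): sequential reference for a
// 1D stencil. For every index i, `f` receives a window of 2*radius+1
// values centred on i. Positions that fall outside the input are filled
// with `neutral` (an assumption chosen for this sketch).
std::vector<float> stencil1d(
    const std::vector<float>& in,
    int radius,
    float neutral,
    const std::function<float(const std::vector<float>&)>& f) {
    assert(radius >= 0);
    const long n = static_cast<long>(in.size());
    std::vector<float> out(in.size());
    std::vector<float> window(2 * radius + 1);
    for (long i = 0; i < n; ++i) {
        // Gather the neighbourhood of element i into the window.
        for (long d = -radius; d <= radius; ++d) {
            const long j = i + d;
            window[d + radius] = (j >= 0 && j < n) ? in[j] : neutral;
        }
        out[i] = f(window);
    }
    return out;
}

// Example user function: sum of all values in the neighbourhood.
float sum_window(const std::vector<float>& w) {
    float s = 0.0f;
    for (float v : w) s += v;
    return s;
}
```

Calling `stencil1d(in, 1, 0.0f, sum_window)` adds each element to its left and right neighbours. Every output element depends only on the input, so the outer loop is independent across `i`; that independence is what makes the pattern a natural fit for GPUs, where the hand-written version additionally has to manage the overlapping border regions between work-groups and devices.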
http://skelcl.uni-muenster.de/
http://www.worldscientific.com/doi/abs/10.1142/S0129626414410059