Tuesday, January 16, 2018

TAL_SH

Tensor algebra library routines for shared memory systems.  The TAL_SH library provides an API for performing basic tensor algebra operations on multicore CPUs, NVIDIA GPUs, Intel Xeon Phi, and other accelerators.  Basic tensor algebra operations include tensor contraction, tensor product, tensor addition, tensor transpose, and multiplication by a scalar, all of which operate on locally stored tensors.  On heterogeneous nodes, tensor operations executed on accelerators run asynchronously with respect to the CPU host.  Both Fortran and C/C++ interfaces are provided.  The library has a simplified object-oriented design, although it does not use explicit object-oriented syntax.
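
To make the listed operations concrete, here is a minimal pure-Python sketch of a contraction, an addition, a transpose, and a scaling on small nested-list tensors. This is not the TAL_SH API: the function names (`contract`, `add`, `scale`, `transpose_2d`, `zeros`) and the chosen contraction pattern D(a,b,c) = sum over k of L(a,k,c) * R(k,b) are assumptions made for illustration only, and the code runs synchronously on the CPU.

```python
def zeros(*shape):
    """Build a nested-list tensor of the given shape filled with 0.0."""
    if len(shape) == 1:
        return [0.0] * shape[0]
    return [zeros(*shape[1:]) for _ in range(shape[0])]


def contract(L, R):
    """Tensor contraction: D[a][b][c] = sum_k L[a][k][c] * R[k][b].

    The index pattern is a hypothetical example, not taken from TAL_SH.
    """
    na, nk, nc = len(L), len(L[0]), len(L[0][0])
    nb = len(R[0])
    D = zeros(na, nb, nc)
    for a in range(na):
        for b in range(nb):
            for c in range(nc):
                D[a][b][c] = sum(L[a][k][c] * R[k][b] for k in range(nk))
    return D


def add(A, B):
    """Element-wise tensor addition for nested lists of equal shape."""
    if isinstance(A, list):
        return [add(x, y) for x, y in zip(A, B)]
    return A + B


def scale(T, alpha):
    """Multiply every element of a nested-list tensor by the scalar alpha."""
    if isinstance(T, list):
        return [scale(x, alpha) for x in T]
    return T * alpha


def transpose_2d(T):
    """Swap the two indices of a 2-index tensor (matrix transpose)."""
    return [list(col) for col in zip(*T)]


if __name__ == "__main__":
    L = [[[1.0, 2.0], [3.0, 4.0]]]   # shape (1, 2, 2), indices (a, k, c)
    R = [[1.0, 2.0], [3.0, 4.0]]     # shape (2, 2), indices (k, b)
    print(contract(L, R))
    print(add(R, R))
    print(scale(R, 0.5))
    print(transpose_2d(R))
```

A production library would replace these loops with optimized, possibly accelerator-offloaded kernels; the sketch only shows what each operation computes.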

https://github.com/DmitryLyakh/TAL_SH

cuTT: A High-Performance Tensor Transpose Library for CUDA Compatible GPUs - https://hgpu.org/?p=17219

http://on-demand.gputechconf.com/gtc/2017/presentation/s7255-antti-pekka-hynninen-cutt-a-high-performance-tensor-transpose-library-for-gpus.pdf
