Wednesday, August 17, 2016

Numba

"Numba gives you the power to speed up your applications with high performance functions written directly in Python. With a few annotations, array-oriented and math-heavy Python code can be just-in-time compiled to native machine instructions, similar in performance to C, C++ and Fortran, without having to switch languages or Python interpreters.

Numba works by generating optimized machine code using the LLVM compiler infrastructure at import time, runtime, or statically (using the included pycc tool). Numba supports compilation of Python to run on either CPU or GPU hardware, and is designed to integrate with the Python scientific software stack."

http://numba.pydata.org/

https://github.com/numba/numba

https://eng.climate.com/2015/04/09/numba-vs-cython-how-to-choose/

https://www.continuum.io/blog/developer/accelerating-python-libraries-numba-part-1

GPU Computing with Apache Spark and Python - http://on-demand.gputechconf.com/gtc/2016/presentation/s6413-stanley-seibert-apache-spark-python.pdf

NumbaPro: High-Level GPU Programming in Python for Rapid Development - http://on-demand.gputechconf.com/gtc/2014/presentations/S4413-numbrapro-gpu-programming-python-rapid-dev.pdf

CUDA Programming

Numba supports CUDA GPU programming by directly compiling a restricted subset of Python code into CUDA kernels and device functions following the CUDA execution model. Kernels written in Numba appear to have direct access to NumPy arrays. NumPy arrays are transferred between the CPU and the GPU automatically.

Numba supports CUDA-enabled GPUs with compute capability 2.0 or above with an up-to-date Nvidia driver.

You will need the CUDA toolkit installed. If you are using Conda, just type:

$ conda install cudatoolkit

Numba now contains preliminary support for CUDA programming. Numba will eventually provide multiple entry points for programmers with different levels of CUDA expertise. For now, Numba provides a Python dialect for low-level programming on the CUDA hardware. It provides full control over the hardware for fine-tuning the performance of CUDA kernels.

The CUDA JIT is a low-level entry point to the CUDA features in Numba. It translates Python functions into PTX code, which executes on the CUDA hardware. The jit decorator is applied to Python functions written in our Python dialect for CUDA. Numba interacts with the CUDA Driver API to load the PTX onto the CUDA device and execute it.

Most of the public CUDA features are exposed in the numba.cuda module:

from numba import cuda
