Monday, March 20, 2017

Shifter

"Shifter enables container images for HPC. In a nutshell, Shifter lets an HPC system efficiently and safely run end-users' Docker images. Shifter consists of a few moving parts: 1) a utility, typically run on the compute node, that creates the runtime environment for the application; 2) an image gateway service that pulls images from a registry and repacks them in a format suitable for the HPC system (typically squashfs); and 3) example scripts/plugins to integrate Shifter with various batch scheduler systems.

Shifter is a prototype implementation that NERSC is developing and experimenting with as a scalable way of deploying containers in an HPC environment. It works by converting user- or staff-generated images from Docker, virtual machines, or CHOS (another method for delivering flexible environments) to a common format. This common format then provides a tunable point to allow images to be scalably distributed on the Cray supercomputers at NERSC. The user interface to Shifter enables a user to select an image from their DockerHub account and then submit jobs which run entirely within the container.

Shifter works by enabling users to pull images from DockerHub or a private Docker registry. An image manager at NERSC then automatically converts the image to a common format based on an automated analysis of the image. That image is then copied to the Lustre scratch filesystem (in a system area). The user can then submit jobs specifying which image to use. Private images are only accessible by the user that authenticated and pulled them, not by the larger community. In the job, the user can either run a custom batch script to perform any command supported by the image or, if a Docker entrypoint is defined, simply execute the entrypoint, which in combination with custom volume maps could entirely bypass the need for batch scripts.

The way that Shifter interacts with images, by loop-mounting the image file, has the advantage of moving metadata operations (like file lookup) to the compute node rather than relying on the central metadata servers of the parallel filesystem. Based on our testing with the Pynamic benchmark, shown in figure 2, this greatly improves the performance of shared library applications and, of course, Python. These tests indicate that Shifter essentially matches the performance of a single Docker instance running on a workstation, despite the fact that Shifter images are stored on a parallel filesystem."
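
To make the "submit jobs specifying which image to use" step a bit more concrete, here is a minimal sketch in Python that writes out a Slurm batch script for a Shifter job. It is my own illustration, not something taken from the NERSC page. The helper name `render_shifter_job` and its parameters are hypothetical. The `#SBATCH --image=` directive and the `srun shifter <command>` launch line reflect Shifter's Slurm integration as I understand it, and the `docker:ubuntu:latest` image name is only an example, so check your site's documentation before relying on any of it.

```python
def render_shifter_job(image, command, nodes=1, time_limit="00:10:00"):
    """Return the text of a Slurm batch script that runs `command` inside `image`.

    Hypothetical helper for illustration only. The `--image` directive and the
    `shifter` launcher are assumptions about how Shifter is wired into Slurm
    at a given site.
    """
    if not image or not command:
        raise ValueError("image and command are required")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --image={image}",      # which converted image the job should use
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --time={time_limit}",
        "",
        f"srun shifter {command}",       # run the command inside the container
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example image name and command; both are placeholders.
    print(render_shifter_job("docker:ubuntu:latest", "cat /etc/os-release"))
```

The image would need to be pulled through the image gateway beforehand; that step is site-specific and is left out of the sketch.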

https://www.nersc.gov/research-and-development/user-defined-images/

https://github.com/NERSC/shifter

https://github.com/NERSC/2016-11-14-sc16-Container-Tutorial
