- Provides a mechanism for scalable loading of shared libraries, executables, and Python files from a shared file system without turning the file system into a bottleneck.
- Is a pure user-space approach. Users do not need to configure new file systems, load kernel modules, or build special system components.
- Operates on stock binaries. No application modification or special build flags are required.
- Automatically detects libraries as they’re loaded, so no pre-generated library lists are needed. Spindle can scalably load the targets of dlopen calls, library dependencies, and libraries loaded by forked child processes (see the dlopen sketch after this list).
- Is very scalable. In one benchmark, start-up performance with Spindle at 1280 nodes matched start-up performance without Spindle at 64 nodes, a 20X improvement in scale. Many applications will likely see even larger benefits than this benchmark.
- Operates on Linux/x86_64 systems. Cray and BlueGene/Q ports are underway.
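To make the dlopen point above concrete, here is a minimal, hypothetical program that loads a plugin at run time. The library and symbol names ("libplugin.so", "plugin_init") are made up for illustration; the point is that Spindle intercepts loads like this transparently, with no source changes or special build flags.

```c
/* Hypothetical example of a stock program using dlopen().  Nothing here is
 * Spindle-specific: the dlopen target would be located and delivered
 * scalably by Spindle without modifying this code. */
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    void *handle = dlopen("libplugin.so", RTLD_NOW);   /* made-up plugin name */
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }

    void (*plugin_init)(void) = (void (*)(void))dlsym(handle, "plugin_init");
    if (plugin_init)
        plugin_init();

    dlclose(handle);
    return 0;
}
```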
Spindle plugs into the system’s dynamic linker and intercepts its file operations. Instead of letting every process perform its own file operations and flood the file system, one process (or a designated small number of processes) performs the file operations necessary for locating and loading dynamic libraries, then shares the results with the other processes in the job.
The results of those file operations are things like file or directory contents. File contents are stored in a node-local location, such as a ramdisk or SSD, and Spindle directs the application to load libraries from there rather than from the shared file system.
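As a rough illustration of this kind of dynamic-linker interception (not Spindle's actual implementation), glibc's rtld-audit interface lets a small audit library rewrite library search paths. The sketch below redirects lookups to an assumed node-local cache directory (/tmp/lib-cache, a made-up path) when a cached copy exists there.

```c
/* Illustrative sketch only: a minimal LD_AUDIT (rtld-audit) library that
 * redirects shared-library searches to a node-local cache directory,
 * similar in spirit to the redirection Spindle performs.  The cache path
 * and lookup logic are assumptions for this example, not Spindle's code. */
#define _GNU_SOURCE
#include <link.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define CACHE_DIR "/tmp/lib-cache"   /* hypothetical node-local ramdisk/SSD location */

unsigned int la_version(unsigned int version)
{
    return version;                  /* accept the linker's audit interface version */
}

char *la_objsearch(const char *name, uintptr_t *cookie, unsigned int flag)
{
    (void)cookie;
    (void)flag;

    static char redirected[4096];
    const char *base = strrchr(name, '/');
    base = base ? base + 1 : name;

    /* If a locally cached copy exists, load from there instead of the
     * shared file system; otherwise fall back to the original path. */
    snprintf(redirected, sizeof(redirected), CACHE_DIR "/%s", base);
    if (access(redirected, R_OK) == 0)
        return redirected;
    return (char *)name;
}
```

An audit library like this would be built as a shared object and activated with LD_AUDIT=./libredirect.so ./app. Spindle itself manages the equivalent redirection, plus the scalable distribution of the file contents to each node, automatically.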
https://computation.llnl.gov/projects/spindle
https://github.com/hpc/Spindle
https://computation.llnl.gov/projects/spindle/spindle-paper.pdf