mpiFileUtils is a suite of MPI-based tools for managing large datasets,
which may range from large directory trees to large files.
High-performance computing users often generate large datasets with
parallel applications that run with many processes (millions in some
cases). However, those users are then stuck with single-process tools
like cp and rm to manage their datasets. This suite provides MPI-based
tools to handle typical jobs like copy, remove, and compare for such
datasets, providing speedups of up to 20-30x.
- dbcast - Broadcast files to compute nodes.
- dchmod - Change permissions and group access on files.
- dcmp - Compare files.
- dcp - Copy files.
- dfilemaker - Generate random files.
- drm - Remove files.
- dstripe - Restripe files.
- dwalk - List files.
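To make the single-process bottleneck described above concrete, here is a minimal Python sketch of the general idea behind a parallel copy: list the files once, then hand each file to a pool of workers instead of copying them one after another. This is not how mpiFileUtils is implemented. The function names (list_files, copy_one, parallel_copy) are hypothetical, and a thread pool on a single machine stands in for the many MPI processes the real tools use.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def list_files(src_root):
    """Walk src_root and return every file path relative to it, sorted."""
    rel_paths = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel_paths.append(os.path.relpath(full, src_root))
    return sorted(rel_paths)


def copy_one(src_root, dst_root, rel_path):
    """Copy a single file, creating its parent directory if needed."""
    dst = os.path.join(dst_root, rel_path)
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    shutil.copy2(os.path.join(src_root, rel_path), dst)
    return rel_path


def parallel_copy(src_root, dst_root, workers=4):
    """Copy every file under src_root to dst_root using a pool of workers.

    Assumption for illustration only: threads on one machine play the role
    that MPI ranks spread over many nodes would play in a real tool.
    Empty directories are not recreated by this sketch.
    """
    files = list_files(src_root)
    os.makedirs(dst_root, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        copied = list(pool.map(lambda p: copy_one(src_root, dst_root, p), files))
    return copied
```

Calling parallel_copy("/path/to/src", "/path/to/dst", workers=8) returns the relative paths of the files it copied. Any speedup depends on the file system and the mix of file sizes, so treat the sketch as a picture of the work distribution rather than a benchmark.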
Experimental Utilities
To enable the experimental utilities, run configure with the --enable-experimental option:
./configure --enable-experimental
- dfind - Find files by path name (experimental).
- dgrep - Search contents of files (experimental).
- dparallel - Perform commands in parallel (experimental).
- dtar - Create file tape archives (experimental).
https://github.com/hpc/mpifileutils