The Gfarm filesystem solves performance and reliability problems in NFS and AFS by means of multiple file replicas. It not only prevents performance degradation due to access concentration, but also supports fault tolerance and disaster recovery.
A unique feature of Gfarm is that each filesystem node is also a client of the Gfarm filesystem.
Distributed access by filesystem nodes realizes super-scalable I/O performance.
There are several ways to access the Gfarm filesystem:
- Gfarm commands and Gfarm native file I/O APIs
  This method gives you access to Gfarm-specific features such as file replication and filesystem node management.
- GfarmFS-FUSE (gfarm2fs)
  You can mount the Gfarm filesystem on Linux clients by using FUSE (http://fuse.sourceforge.net/). Unlike the other methods, this one is completely transparent to your application.
- Gfarm Samba plugin
  A plugin that lets a Samba server access the Gfarm filesystem. With it, Windows clients can access the Gfarm filesystem through the Windows file sharing service.
- Gfarm Hadoop plugin
  A plugin that lets Hadoop access the Gfarm filesystem. With it, Hadoop MapReduce applications can access the Gfarm filesystem by Gfarm URL.
- Gfarm GridFTP DSI
  A plugin that lets a Globus GridFTP server access the Gfarm filesystem. With it, GridFTP clients can access the Gfarm filesystem.
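To illustrate what "transparent" means for the FUSE method, here is a minimal sketch. It assumes the Gfarm filesystem has already been mounted somewhere with gfarm2fs; the mount point, the `roundtrip` helper, and the file name are hypothetical and not part of Gfarm. The point is only that the application uses ordinary file I/O and no Gfarm API at all.

```python
import os
import tempfile


def roundtrip(mount_point: str, relative_path: str, payload: bytes) -> bytes:
    """Write payload under mount_point and read it back with plain file I/O.

    Nothing here is Gfarm-specific: if mount_point is a gfarm2fs mount,
    the same calls would operate on the Gfarm filesystem (assumption).
    """
    target = os.path.join(mount_point, relative_path)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "wb") as f:
        f.write(payload)
    with open(target, "rb") as f:
        return f.read()


if __name__ == "__main__":
    # A temporary directory stands in for a real gfarm2fs mount point,
    # so the sketch runs on a machine without Gfarm installed.
    with tempfile.TemporaryDirectory() as mount_point:
        data = roundtrip(mount_point, "demo/hello.txt", b"hello gfarm")
        print(data)
```

On a real installation you would pass the actual mount directory instead of the temporary one. Gfarm-specific operations such as creating replicas are still done with the Gfarm commands or native APIs.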
A Gfarm system consists of the following kinds of nodes:
- Client node
  A terminal node for users.
- Filesystem node
  Filesystem nodes provide data storage and CPUs for the Gfarm system. Each filesystem node runs the Gfarm filesystem daemon, gfsd, which handles remote file operations and access control in the Gfarm filesystem, and also provides user authentication, file replication, and node resource status monitoring and control.
- Metadata server node
  A metadata server node manages Gfarm filesystem metadata. It runs the Gfarm filesystem metadata server (gfmd) and a backend database server such as an LDAP server (slapd) or a PostgreSQL server (postmaster).

These three types of nodes are not necessarily different hosts; you can use the same host for more than one role if the number of available hosts is limited. Physically, each file is replicated and dispersed across the disks of the filesystem nodes, which are accessed in parallel.
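The division of roles described above can be sketched as a toy model. It is an illustration only, not Gfarm's implementation: the `MetadataServer` class, the `choose_replica` function, the host names, and the selection policy (prefer a replica on the local host, otherwise the least loaded host) are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataServer:
    """Toy stand-in for the replica catalog kept by the metadata server."""
    replicas: dict[str, list[str]] = field(default_factory=dict)

    def register(self, path: str, host: str) -> None:
        # Record that `host` (a filesystem node) stores a replica of `path`.
        hosts = self.replicas.setdefault(path, [])
        if host not in hosts:
            hosts.append(host)

    def locate(self, path: str) -> list[str]:
        # Return a copy so callers cannot modify the catalog by accident.
        return list(self.replicas.get(path, []))


def choose_replica(meta: MetadataServer, path: str,
                   local_host: str, load: dict[str, float]) -> str:
    """Pick the filesystem node to read from (assumed policy, see lead-in)."""
    hosts = meta.locate(path)
    if not hosts:
        raise FileNotFoundError(path)
    if local_host in hosts:
        # A filesystem node that is also the client reads its own disk.
        return local_host
    # Otherwise spread the access by picking the least loaded node.
    return min(hosts, key=lambda h: load.get(h, 0.0))


if __name__ == "__main__":
    meta = MetadataServer()
    meta.register("/data/a.dat", "fs1")
    meta.register("/data/a.dat", "fs2")
    print(choose_replica(meta, "/data/a.dat", "fs2", {}))
    print(choose_replica(meta, "/data/a.dat", "client1",
                         {"fs1": 0.9, "fs2": 0.1}))
```

Because the catalog lists several hosts per file, different clients can be directed to different replicas, which is how replication avoids access concentration on a single node.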
The Gfarm filesystem consists of the following software:
- The libgfarm.a library
  A library that implements the Gfarm APIs, including Gfarm file access, file replication, and file-affinity process scheduling.
- gfmd - the Gfarm filesystem metadata server
  A metadata server for the Gfarm filesystem that runs on a metadata server node. It manages the directory structure, file information, the replica catalog, user/group information, and host information. Gfmd keeps the metadata in memory and stores it in a backend database, such as a PostgreSQL server or an OpenLDAP server, in the background.
- gfsd - the Gfarm filesystem daemon
  An I/O daemon for the Gfarm filesystem that runs on every filesystem node. It provides remote file operations with access control, as well as user authentication, file replication, and node resource status monitoring.
- Gfarm command tools
  The Gfarm command tools consist of filesystem commands such as gfls, gfrm, gfwhere and gfrep; a filesystem node management tool, gfhost; file management tools such as gfreg and gfexport; and session key management tools such as gfkey.
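The way gfmd serves metadata from memory while persisting it in the background can be illustrated with a generic write-behind pattern. The sketch below is not gfmd's code: the `WriteBehindMetadata` class, the `inode` table, and the use of SQLite (standing in for PostgreSQL or OpenLDAP so the example is self-contained) are assumptions made for illustration.

```python
import queue
import sqlite3
import threading


class WriteBehindMetadata:
    """Answer lookups from memory; persist updates on a background thread."""

    def __init__(self, db_path: str) -> None:
        self._memory: dict[str, int] = {}        # path -> file size
        self._pending: queue.Queue = queue.Queue()
        self._db_path = db_path
        self._worker = threading.Thread(target=self._flush_loop, daemon=True)
        self._worker.start()

    def put(self, path: str, size: int) -> None:
        # The in-memory update is visible to readers immediately;
        # the database write is only queued here.
        self._memory[path] = size
        self._pending.put((path, size))

    def get(self, path: str) -> int:
        return self._memory[path]

    def close(self) -> None:
        # None is a sentinel that tells the worker to drain and stop.
        self._pending.put(None)
        self._worker.join()

    def _flush_loop(self) -> None:
        # The connection is created in the worker thread, because a
        # sqlite3 connection should be used from the thread that opened it.
        conn = sqlite3.connect(self._db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS inode "
            "(path TEXT PRIMARY KEY, size INTEGER)"
        )
        conn.commit()
        while True:
            item = self._pending.get()
            if item is None:
                break
            conn.execute("INSERT OR REPLACE INTO inode VALUES (?, ?)", item)
            conn.commit()
        conn.close()
```

With this arrangement, lookups never wait on the database, at the cost of a short window in which the backend lags behind memory.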
https://sourceforge.net/p/gfarm/wiki/Home/
https://en.wikipedia.org/wiki/Gfarm_file_system