Thursday, March 2, 2017

ESGF

"The Earth System Grid Federation (ESGF) Peer-to-Peer (P2P) enterprise system is a collaboration that develops, deploys and maintains software infrastructure for the management, dissemination, and analysis of model output and observational data. ESGF's primary goal is to facilitate advancements in Earth System Science.

A common ESGF Software Stack is deployed at each Node in the federation to provide services for data, metadata and user management. The installation can be configured to install all or part of the available services, depending on the site needs, and possibly to replicate some of the services across multiple servers at the same site. Specifically, the following flavors of ESGF Node can be installed:
  • Data Node: includes services for publishing and serving data, namely:
    • The ESGF Node Manager. The ESGF Node Manager is a web application that mediates the peer-to-peer interaction among all the Nodes in the federation. Its main purpose is create and expose the ESGF Registry, a document that contains critical inter-operability information such as the name and type of each Node, its available services and URL endpoints, its CA certificate, etc.
    • The ESGF Publisher, and associated Postgres relational database. The ESGF Publisher is a desktop application that allows to publish data into a Node. The publishing workflow starts with extracting metadata from files on disk, storing it on the Node database, creating THREDDS XML catalogs and finally publishing the catalogs to the Node publishing service. Postgres is a popular freely available relational database that is used in ESGF to store all metadata harvested from the ESGF publisher, as well as user account information.
    • The Thredds Data Server, configured with the ESGF security filters. The Thredds Data Server (TDS), developed by Unidata, represents the standard mechanism through which an ESGF Node delivers its data to the clients. The TDS includes functionality for serving data in a variety of forms and protocols: full files HTTP download, OpenDAP sub-setting, GIS products via WMS and WCS, etc. The ESGF installation procedure configures the TDS with a set of special ESGF filters that intercept any data request, and enforce the access control policies established for that dataset by interacting with the appropriate ESGF Security Services deployed throughout the federation.
    • The ESGF Security Services. The ESGF Security framework includes functionality for distributed access control throughout the federation. It is composed of client-side components (the access filters and Openid Relying Party) that protect access to the data, and server-side components (the Attribute and Authorization services) that can be queried to gather all necessary information to make an access control decision. The framework supports access both by browsers (via OpenID authentication), and desktop clients and libraries (via X509 certificates).
    • The GridFTP server. The GridFTP server, developed by the Globus alliance, is a high performance protocol for reliable data transfer. It includes a server, deployed on an ESGF Node, and a client-side library that the user must deploy on their desktop.
  • IdP Node: includes services for authenticating users:
    • The OpenID Identity Provider web application. The OpenID Identity Provider (IdP) allows users to register and authenticate with the system, including Single-Sign-On functionality for browser-based access throughout the federation.
    • The Globus SimpleCA and My Proxy server. The My Proxy server, developed by NCSA, is used to issue short term certificates that can be used by client libraries and toolkits to authenticate the user during a data product request. The certificates are signed by the locally installed Globus Simple Certificate Authority (CA).
  • Index Node: includes the applications necessary to index and search metadata:
    • The Apache Solr engine. Apache Solr is a high performance, scalable web application for storing and searching metadata.
    • The ESGF Search back-end services. The ESGF Search module includes facilities for harvesting external metadata repositories (such as the THREDDS XML catalogs produced by the ESGF Publisher), and for searching the distributed metadata indexes deployed within the federation.
    • The ESGF Web Portal application. The ESGF Web Portal is a web application that contains the user interface to many of the other ESGF modules. It exposes web pages for registering users, searching for data, downloading data etc.
  • Compute Node: includes services for data analysis and visualization, namely:
    • The Live Access Server. The Live Access Server (LAS), developed by NOAA/PMEL, is an analysis and visualization engine that allows users to request advanced data and imaging products from multiple ESGF Nodes at once. Internally, it relies on the TDS catalogs and OpenDAP services for configuration and remote data access. It can be configured with a pluggable visualization engine such as Ferret (the default) or UV-CDAT.



     http://esgf.llnl.gov/
     











No comments:

Post a Comment