[communities] GGF Proposal Submission
pkovatch at sdsc.edu
Wed Nov 30 19:06:52 CST 2005
proposers_name: Patricia Kovatch
affiliation: San Diego Supercomputer Center
email: pkovatch at sdsc.edu
proposed_title: TeraGrid's Terabyte Moving Machines
session_type: Tutorial
proposed_duration: Half Day
target_audience: Anyone who is interested in learning how to build a robust, high-performance data
num_attendees: 30
abstract: Sharing terabyte-sized data sets across the geographically distributed
resources of the TeraGrid posed certain challenges. Using standards-based
grid tools, tuning and scripting expertise, and a grid-enabled wide area
file system, TeraGrid set up a rich environment for data movement.
This tutorial will explain the data infrastructure, user tools, performance
tuning, policy issues and future plans. It will also present case studies of
applications making use of the environment.
synopsis: Session Goals:
Attendees will learn how to build and tune a grid with a rich data infrastructure using standards-based grid tools.
Outline:
The TeraGrid Project (15 mins)
History
Sites
Compute and Instrument Resources
The TeraGrid Network
Application motivation
Data-oriented resource map of TG
Standards-based TeraGrid Data Resources and User Tools
GridFTP (1 hour)
GridFTP dedicated data transfer nodes
GridFTP performance capabilities (multiple threads, striped)
TeraGrid CoPy (tgcp)
Reliable File Transfer (RFT)
Other data transfer tools
Batch moving of data
Bandwidth delay product
Performance tuning
Interoperability testing
Lessons Learned
Grid-enabled wide area parallel file systems (GPFS, PVFS, Lustre) (1 hour)
Metadata and dedicated hardware considerations
Cluster authentication and authorization
Grid-enabled user identification and UID/GID Mapping
Bandwidth delay product
Performance tuning
Co-scheduling of resources
Interoperability testing
Lessons Learned
Archival storage and data collection management (15 mins)
Archival storage (UniTree, HPSS, SAM-QFS, TSM, DXUL)
Archival storage clients (uberftp, hsi, others)
Performance tuning
Data Collection Management Servers (SRB/RLS)
Data Collection Management (scopy, etc.)
GridFTP interface to SRB
Database offerings (15 mins)
Database client access
User Considerations (15 mins)
Common TeraGrid Software Stack (CTSS) and Environment Variables
When should I use which tool or approach?
Monitoring (15 mins)
System Tools
Diagnosing problems
Monitoring
Inca Test Harness
Policy Issues (15 mins)
Policies per site and how to use the policy command
Grid-enabled wide area file systems - allocations, quotas, purging
Data Allocation Committee
CTSS modification procedures
Case Studies/Usage Scenarios (15 mins)
BIRN
ENZO
NVO
SCEC
The Future (15 mins)
GridFTP Futures
Grid-enabled wide area parallel file system enhancements
Off-site backups for disaster recovery
Batch transfer of data
Grid Data Portals
Metascheduling, co-scheduling and workflow for data
tech_requirements: None.
prereq_participants: Basic understanding of grid technologies.
advertise_suggestion: Email lists, web pages, word of mouth