[ogsa-wg] GT 4.0 GRAM docs for input to BES discussions
Karl Czajkowski
karlcz at univa.com
Thu Mar 10 01:06:29 CST 2005
Here is a compact summary of the GT 4.0 WS-GRAM interface and links to
further documentation. Please start by reading the documentation
unless you are already an expert on GRAM, XSD, WSDL, and WSRF concepts.
---------------------------------------------------------
GT 4.0 WS-GRAM documentation
Note, these documents are in draft form...
1) GRAM Key Concepts
http://www-unix.globus.org/toolkit/docs/development/4.0-drafts/execution/key/
2) WS-GRAM Approach
http://www-unix.globus.org/toolkit/docs/development/4.0-drafts/execution/key/WS_GRAM_Approach.html
3) Semantics and syntax of WSDL
http://www-unix.globus.org/toolkit/docs/development/4.0-drafts/execution/wsgram/WS_GRAM_Public_Interfaces.html#wsdl
4) Job Description Language
http://www-unix.globus.org/toolkit/docs/development/4.0-drafts/execution/wsgram/schemas/mjs_job_description.html
5) Links to more WS-GRAM docs than you can shake a stick at
http://www-unix.globus.org/toolkit/docs/development/4.0-drafts/execution/wsgram/
Because this is already a lot of documentation, the following is a
terse overview of the WSDL rendering of the core WS-GRAM job
discovery, submission, monitoring, and cancellation interface.
---------------------------------------------------------
PORTTYPE ManagedJobFactoryPortType
The Managed Job Factory Resource (MJFR) represents one localized job
scheduler or compute element. (In GT 4.0, there is a separate MJFR
for each deployed local scheduler adapter on a host.)
It is a WSRF style resource with a resource properties document to
represent the overall status and capabilities of the local compute
element:
<managedJobFactoryResourceProperties>
<localResourceManager>xsd:string</localResourceManager>
<globusLocation>xsd:string</globusLocation>
<hostCPUType>xsd:string</hostCPUType>?
<hostManufacturer>xsd:string</hostManufacturer>?
<hostOSName>xsd:string</hostOSName>?
<hostOSVersion>xsd:string</hostOSVersion>?
<scratchBaseDirectory>xsd:string</scratchBaseDirectory>?
<delegationFactoryEndpoint>wsa:EndpointReferenceType</delegationFactoryEndpoint>
<stagingDelegationFactoryEndpoint>wsa:EndpointReferenceType</stagingDelegationFactoryEndpoint>?
<condorArchitecture>xsd:string</condorArchitecture>?
<condorOS>xsd:string</condorOS>?
<gluece:GLUECE>
<gluece:Cluster Name=xsd:string UniqueID=xsd:string InformationServiceURL=xsd:anyURI>
<SubCluster/>*
xsd:any##other*
</gluece:Cluster>*
<ComputingElement>
<Info/>?
<State/>?
<Policy/>?
<Job/>*
<AccessControlBase/>?
xsd:any##other*
</ComputingElement>*
xsd:any##other*
</gluece:GLUECE>?
<gluece:GLUECESummary/>?
<ServiceMetaDataInfo>
<startTime>xsd:dateTime</startTime>
<version>xsd:string</version>
</ServiceMetaDataInfo>
</managedJobFactoryResourceProperties>
These properties allow inspection of the underlying compute
platform using an XSD rendering of the GLUE schema:
gluece:GLUECE, gluece:GLUECESummary
(please see the GLUE schema for more information), or using ad-hoc GRAM
properties:
hostCPUType, hostManufacturer, hostOSName, hostOSVersion
while a few provide introspection on the GRAM deployment itself:
localResourceManager, globusLocation, scratchBaseDirectory,
ServiceMetaDataInfo
The two EPRs are used by a client to discover where to delegate
credentials that will be referenced by future job submissions:
delegationFactoryEndpoint, stagingDelegationFactoryEndpoint.
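As a rough illustration of how a client might read these properties, here
is a minimal Python sketch. It assumes the resource properties document has
already been fetched as text. The sample instance is hypothetical:
namespaces are omitted for brevity, the values (PBS, the paths, the URL) are
made up, and the helper names are not part of any GT 4.0 API. The only
structural assumption beyond the outline above is that each EPR carries the
usual WS-Addressing Address child.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample instance. Element names follow the outline above, but
# namespaces are omitted and all values are invented for illustration.
SAMPLE = """\
<managedJobFactoryResourceProperties>
  <localResourceManager>PBS</localResourceManager>
  <globusLocation>/opt/globus</globusLocation>
  <hostOSName>Linux</hostOSName>
  <delegationFactoryEndpoint>
    <Address>https://gram.example.org/delegation</Address>
  </delegationFactoryEndpoint>
</managedJobFactoryResourceProperties>
"""


def _text(root, path):
    """Return the stripped element text at path, or None if absent or empty."""
    value = root.findtext(path)
    return value.strip() if value and value.strip() else None


def summarize_factory(xml_text):
    """Pick out a few deployment properties and the two delegation EPRs."""
    root = ET.fromstring(xml_text)
    return {
        "localResourceManager": _text(root, "localResourceManager"),
        "hostOSName": _text(root, "hostOSName"),
        "delegationFactoryEndpoint":
            _text(root, "delegationFactoryEndpoint/Address"),
        # Optional in the outline above, so this may legitimately be None.
        "stagingDelegationFactoryEndpoint":
            _text(root, "stagingDelegationFactoryEndpoint/Address"),
    }


if __name__ == "__main__":
    print(summarize_factory(SAMPLE))
```

A real client would need namespace-qualified lookups and would obtain the
document through the WS-ResourceProperties operations listed further down.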
----------
OPERATION job:createManagedJob
Request creation of a Managed Executable Job Resource whose EPR will
be returned in the response.
INPUT
message: createManagedJobInputMessage has one part:
<createManagedJob>
<InitialTerminationTime>xsd:dateTime</InitialTerminationTime>?
<JobID>wsa:AttributedURI</JobID>?
<wsnt:Subscribe></wsnt:Subscribe>?
<desc:job> ... </desc:job>
</createManagedJob>
The optional JobID element is used to request idempotent invocation
semantics in a binding-independent manner. The optional
wsnt:Subscribe element is used to request automatic subscription to
the newly created Managed Job.
This call can also create a Managed Multi-Job Resource, i.e. a
co-allocated job spread across multiple WS-GRAM hosts, because the job
element is actually in an XSD choice with a multijob element.
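To make the shape of the input message concrete, the following Python
sketch assembles a createManagedJob element in the order outlined above.
It is only a sketch: namespaces and the SOAP envelope are left out, the
wsnt:Subscribe element is not modeled, and the function name and the
example JobID value are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_create_managed_job(executable, arguments=(), job_id=None,
                             initial_termination_time=None):
    """Build a namespace-free createManagedJob element (illustrative only)."""
    request = ET.Element("createManagedJob")
    # Optional elements are emitted only when the caller supplies them.
    if initial_termination_time is not None:
        ET.SubElement(request, "InitialTerminationTime").text = (
            initial_termination_time)
    if job_id is not None:
        # Re-sending the same JobID is how a client would ask for
        # idempotent invocation when retrying a request.
        ET.SubElement(request, "JobID").text = job_id
    job = ET.SubElement(request, "job")
    ET.SubElement(job, "executable").text = executable
    for argument in arguments:
        ET.SubElement(job, "argument").text = argument
    return request


if __name__ == "__main__":
    request = build_create_managed_job(
        "/bin/echo", ["hello"], job_id="uuid:example-1234")
    print(ET.tostring(request, encoding="unicode"))
```

Keeping the JobID as a caller-supplied argument means a retry can reuse
exactly the value sent in the first attempt.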
OUTPUT
message: createManagedJobOutputMessage has one part:
<createManagedJobResponse>
<NewTerminationTime>xsd:dateTime</NewTerminationTime>
<CurrentTime>xsd:dateTime</CurrentTime>
<managedJobEndpoint>wsa:EndpointReferenceType</managedJobEndpoint>
<subscriptionEndpoint>wsa:EndpointReferenceType</subscriptionEndpoint>?
</createManagedJobResponse>
The optional subscriptionEndpoint is returned if the request included
the optional wsnt:Subscribe element.
-----------
Other operations, composed from the WSRF service environment, are:
[From WS-ResourceProperties] -- get access to factory status information
GetResourceProperty
QueryResourceProperties
GetMultipleResourceProperties
---------------------------------------------------------
PORTTYPE ManagedExecutableJobPortType
A Managed Executable Job Resource (MEJR) represents one job that has
already been submitted by a client.
It is a WSRF style resource with a resource properties document to
represent status of the job:
<managedExecutableJobResourceProperties>
<stdoutURL>xsd:anyURI</stdoutURL>?
<stderrURL>xsd:anyURI</stderrURL>?
<credentialPath>xsd:string</credentialPath>?
<exitCode>xsd:int</exitCode>?
<serviceLevelAgreement>
<desc:job> ... </desc:job>
</serviceLevelAgreement>
<Capacity>xsd:int</Capacity>
<userSubject>xsd:string</userSubject>
<fault/>
<TopicExpressionDialects>xsd:anyURI</TopicExpressionDialects>
<Topic Dialect=xsd:anyURI>
xsd:any?
</Topic>+
<TerminationTime>xsd:dateTime</TerminationTime>
<localUserId>xsd:string</localUserId>
<CurrentTime>xsd:dateTime</CurrentTime>
<holding>xsd:boolean</holding>
<RegistrantData>xsd:base64Binary</RegistrantData>
<RendezvousCompleted>xsd:boolean</RendezvousCompleted>
<FixedTopicSet>xsd:boolean</FixedTopicSet>
<state>Unsubmitted|StageIn|Pending|Active|Suspended|StageOut|Cleanup|Done|Failed</state>
</managedExecutableJobResourceProperties>
These properties relate to job output file management:
stdoutURL, stderrURL
delegated credential management:
credentialPath
parallel task rendezvous (for MPICH-G2):
Capacity, RegistrantData, RendezvousCompleted
job status:
exitCode, fault, holding, state
job introspection:
credentialPath, serviceLevelAgreement, localUserId
and WSRF introspection:
TopicExpressionDialects, Topic, FixedTopicSet,
TerminationTime, CurrentTime.
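A client polling the state property has to decide when to stop. The Python
sketch below enumerates the state values listed above. Treating Done and
Failed as the only terminal states is an assumption made for illustration,
not something stated in this summary, and the names JobState and
is_finished are hypothetical.

```python
from enum import Enum


class JobState(Enum):
    """The state values from the resource properties outline above."""
    UNSUBMITTED = "Unsubmitted"
    STAGE_IN = "StageIn"
    PENDING = "Pending"
    ACTIVE = "Active"
    SUSPENDED = "Suspended"
    STAGE_OUT = "StageOut"
    CLEANUP = "Cleanup"
    DONE = "Done"
    FAILED = "Failed"


# Assumption: only these two states mean the job will not change again.
TERMINAL_STATES = frozenset({JobState.DONE, JobState.FAILED})


def is_finished(state_text):
    """Return True if the state string names an (assumed) terminal state.

    Raises ValueError for strings that are not in the list above.
    """
    return JobState(state_text.strip()) in TERMINAL_STATES


if __name__ == "__main__":
    for text in ("Pending", "Active", "Done"):
        print(text, is_finished(text))
```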
----------
OPERATION exec:release
Releases job from hold state. The hold state is an optional behavior
selected in the job description to prevent post-execution file
deletions (clean-up) from occurring while a remote client is still
attempting to access the files. The release operation permits the
normal clean-up to occur.
INPUT
message: releaseInputMessage has one (empty) part:
<release/>
OUTPUT
message: releaseOutputMessage has one (empty) part:
<releaseResponse/>
-----------
In addition, the rendezvous mechanism provides one more operation:
[From GT4 rendezvous manager type] -- support for bootstrapping MPICH-G2 etc.
register
Other operations, composed from the WSRF service environment, are:
[From WS-ResourceLifetime] -- schedule termination of a job
SetTerminationTime
Destroy
[From WS-ResourceProperties] -- get access to job status information
GetResourceProperty
QueryResourceProperties
GetMultipleResourceProperties
[From WS-BaseNotification] -- subscribe for job status notifications
Subscribe
GetCurrentMessage
---------------------------------------------------------
Job description document syntax for use in creating a Managed
Executable Job Resource:
<job>
<factoryEndpoint>wsa:EndpointReferenceType</factoryEndpoint>?
<jobCredentialEndpoint>wsa:EndpointReferenceType</jobCredentialEndpoint>?
<stagingCredentialEndpoint>wsa:EndpointReferenceType</stagingCredentialEndpoint>?
<localUserId> ... </localUserId>?
<holdState> ... </holdState>?
<executable>xsd:string</executable>?
<directory>xsd:string</directory>?
<argument>xsd:string</argument>*
<environment>
<name>xsd:string</name>
<value>xsd:string</value>
</environment>*
<stdin>xsd:string</stdin>?
<stdout>xsd:string</stdout>?
<stderr>xsd:string</stderr>?
<count>xsd:positiveInteger</count>?
<libraryPath>xsd:string</libraryPath>*
<hostCount>xsd:positiveInteger</hostCount>?
<project>xsd:string</project>?
<queue>xsd:string</queue>?
<maxTime>xsd:long</maxTime>?
<maxWallTime>xsd:long</maxWallTime>?
<maxCpuTime>xsd:long</maxCpuTime>?
<maxMemory>xsd:nonNegativeInteger</maxMemory>?
<minMemory>xsd:nonNegativeInteger</minMemory>?
<jobType>mpi|single|multiple|condor</jobType>?
<fileStageIn>rft:TransferRequestType</fileStageIn>?
<fileStageOut>rft:TransferRequestType</fileStageOut>?
<fileCleanUp>rft:DeleteRequestType</fileCleanUp>?
<extensions>xsd:any##other</extensions>?
</job>
An extended form of this syntax consists of an array of the above
descriptions to define a "multi-job".
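The occurrence markers in the outline above (? for at most once, * for any
number) can be checked mechanically. The following Python sketch does so
for a namespace-free job element. It is a simplified check under stated
assumptions: it only looks at direct children, does not validate element
content apart from jobType, and the function name check_job is
hypothetical rather than part of WS-GRAM.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Elements marked with * in the outline above; all others are optional
# and may appear at most once.
REPEATABLE = {"argument", "environment", "libraryPath"}
JOB_TYPES = {"mpi", "single", "multiple", "condor"}


def check_job(job):
    """Return a list of human-readable problems found in a job element."""
    problems = []
    counts = Counter(child.tag for child in job)
    for tag, count in sorted(counts.items()):
        if tag not in REPEATABLE and count > 1:
            problems.append(
                "<%s> appears %d times but may appear at most once"
                % (tag, count))
    job_type = job.findtext("jobType")
    if job_type is not None and job_type.strip() not in JOB_TYPES:
        problems.append("unknown jobType: %r" % job_type.strip())
    return problems


if __name__ == "__main__":
    job = ET.fromstring(
        "<job><executable>/bin/date</executable>"
        "<argument>-u</argument><jobType>single</jobType></job>")
    print(check_job(job))
```

A schema-aware validator working from the published XSD would be the
more thorough option; this sketch only mirrors the outline given here.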
--
Karl Czajkowski
karlcz at univa.com