[glue-wg] GLUE 2.0 Specification - draft 41 - Working Group Last Call

Burke, S (Stephen) S.Burke at rl.ac.uk
Thu May 15 09:05:20 CDT 2008


>  OK, that's all the main entities, I'll get to the Computing part
> tomorrow.

or the day after ...

  For the Computing diagram, I'm unsure about the multiplicities. I can
reasonably believe that you must have a Share and an EE, but I'm not
entirely sure if the Endpoint should be mandatory at the schema level,
e.g. would you ever want to publish something where you only had local
submission? And it seems slightly strange that there is no direct
relation from EE to Service, although it's the result of how the main
entities are related - it will be interesting to see what this looks
like in LDAP. Also I don't understand why there is a relation from
Benchmark and ApplicationEnvironment to ComputingManager, I would expect
them to only relate to the EE.

  ComputingService: I think we should explicitly say what the constraints
are on the *Jobs attributes, i.e. that Total is the sum of the others (I
assume?). Also are these numbers supposed to include non-Grid jobs? In
the table the relations to Endpoint and Share have multiplicity * but
that doesn't correspond to the diagram (as above), i.e. it should be
1..*.
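
  As a rough sketch of the kind of constraint meant here, assuming that
Total really is the sum of the other counters (which is only a guess, as
noted above). The attribute names are illustrative and may not match the
draft, and treating an unpublished counter as zero is a further
assumption.

```python
# Hypothetical attribute names; the draft's actual *Jobs attributes may differ.
COMPONENT_JOB_ATTRS = ("RunningJobs", "WaitingJobs", "StagingJobs", "SuspendedJobs")


def total_jobs_consistent(service: dict) -> bool:
    """Return True if TotalJobs equals the sum of the component counters.

    Assumption: a counter that is not published counts as zero.
    """
    expected = sum(service.get(name, 0) for name in COMPONENT_JOB_ATTRS)
    return service.get("TotalJobs", 0) == expected
```

  Whether non-Grid jobs are included would change what an info provider
feeds into these counters, but not the form of the check.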

  ComputingEndpoint: the description says that it may be used for things
like reservation and proxy manipulation, but that seems a bit odd to me,
e.g. the attributes (Staging and JobDescription) probably wouldn't be
relevant for such things. Maybe this raises a more general point: can a
ComputingService only be composed of ComputingEndpoints, or can it have
generic Endpoints too?

  ComputingShare: MaxMemory says that it's the maximum RAM, but is it in
fact the total memory size whether RAM or swap? The Tag says that it's
user-defined but it can presumably also be grid-defined. Anyway can this
not just be OtherInfo? Also the association says ComputingResource and
not ExecutionEnvironment.

  ComputingManager: I think the Homogeneity attribute would be better
named Homogeneous if it's a boolean, i.e. True means it is
homogeneous. Why is there a NetworkInfo attribute here? Surely this is a
property of the EE and not the LRMS? Is there a relation between TmpDir
and the WorkingArea variables - and again, is this a property of the
LRMS or the EE? What is the difference between WorkingArea and Cache?
Are the WorkingAreaTotal and Free supposed to be dynamic attributes? If
so, what do they mean given that WN-local working areas could all have
different sizes, and how could anyone write an info provider? Similarly
for CacheTotal and CacheFree - presumably they are WN-local areas, so
what does an LRMS-level value mean? (Basically I think the descriptions
are not adequate here ...) The WorkingAreaLifetime is presumably a
minimum value, i.e. the files might stay longer. In the associations,
the EE is marked as multiplicity 1..* but the description says "zero or
more".
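
  To illustrate why a single LRMS-level number is ambiguous when the
working areas are WN-local, here is a small sketch. The function name
and the per-node figures are hypothetical, and none of the three
aggregates is claimed to be what the draft intends.

```python
def summarise_working_area_free(free_gb_per_wn):
    """Return three candidate LRMS-level values for WN-local free space (GB)."""
    if not free_gb_per_wn:
        raise ValueError("need at least one worker node")
    return {
        "sum": sum(free_gb_per_wn),  # total across all WNs
        "min": min(free_gb_per_wn),  # what every WN can guarantee
        "max": max(free_gb_per_wn),  # best case on a single WN
    }


# Made-up example: three worker nodes with different free space.
summary = summarise_working_area_free([50, 200, 10])
```

  With these figures the three candidates are 260, 10 and 200 GB, so an
info provider would need the description to say which one is meant.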

  Benchmark: as above, I don't understand why it's related to the
ComputingManager; it's a hardware property. There is no explanatory text
for this object, and I think there should be something.

  EE: Seems to be missing the attributes inherited from Resource. The
marking of a few properties (Platform, MainMemorySize, OSFamily and
ConnectivityIn/Out) as mandatory seems rather arbitrary - I can see an
argument for OSFamily to be important enough to be mandatory, but not
the others. I think it could be made clearer exactly what is meant by an
instance of an EE. The multiplicity for the association to
ComputingManager is 1 which seems wrong given that ComputingManager
itself is optional, at least according to the diagram.

  ApplicationEnvironment: I think the Name attribute should be named
something different, given that everywhere else Name just means a
human-readable description with no semantics (which appears to be the
definition of the Description attribute here). Also as someone pointed
out we may need more information about the application, e.g. compiler
options. The explanation of Repository doesn't really enable me to
understand what it means. Again the ComputingManager association is
marked as mandatory when the object itself is optional, and as before I
don't understand why this relation should exist at all since AE is a
property of the EE and not the LRMS. In general I think this is one area
where some examples would help a lot.

  ApplicationHandle: there is no explanatory text, and I think it needs
some.

  ComputingActivity: is the Owner generally supposed to be a DN? If so I
would suggest typing it as DN_t rather than String, and indicating
anonymity with a Paul-style special DN like /O=Grid/CN=ANONYMOUS.
UserMainMemory - again this says RAM but I suspect it means the total
memory. ProxyExpirationTime: for VOMS proxies this is complex as every
AC has its own expiration time as well as the one for the entire proxy.
Should this be the smallest of all those times? For SubmissionHost, I
don't understand why this would have a port, or indeed what the "e.g."
part in the description means at all. The associations to Share and EE
are marked as optional, shouldn't they be mandatory?
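
  If ProxyExpirationTime were defined as the smallest of those times
(one possible answer to the question above, not something the draft
states), the rule could be sketched as follows. The function and
parameter names are hypothetical, and extracting the times from a real
VOMS proxy is out of scope here.

```python
from datetime import datetime, timezone


def effective_proxy_expiration(proxy_not_after, ac_not_after_times):
    """Return the earliest expiry among the proxy itself and all of its ACs."""
    return min([proxy_not_after, *ac_not_after_times])


# Made-up times: the proxy outlives one of its attribute certificates.
proxy_expiry = datetime(2008, 5, 16, 9, 0, tzinfo=timezone.utc)
ac_expiries = [
    datetime(2008, 5, 15, 21, 0, tzinfo=timezone.utc),
    datetime(2008, 5, 16, 9, 0, tzinfo=timezone.utc),
]
expiry = effective_proxy_expiration(proxy_expiry, ac_expiries)
```

  Here the published value would be the expiry of the first AC, since it
is earlier than that of the proxy as a whole.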

  CS2SS: is missing any explanatory text.

  That's it for computing, storage to follow ...

Stephen


