[ogsa-d-wg] Notes from the London OGSA-Data face-to-face Meeting.
Mario Antonioletti
mario at epcc.ed.ac.uk
Wed Jun 1 08:52:06 CDT 2005
OGSA Data face to face - 26/05/05
=================================
Notes: Mario Antonioletti
Attendees:
Mark Morgan [MM], University of Virginia
Hiro Kishimoto [HK], Fujitsu
Andrew Grimshaw [AG], University of Virginia
Stephen McGough [SMcG], LeSC
Ann Chervenak [AC], USC/ISI
Bill Allcock [BA], ANL
Peter Kunszt [PK], CERN
Dave Berry [DB], NeSC
Mario Antonioletti [MA], EPCC
Neil Chue Hong [NCH], EPCC
Stephen Davey [SD], NeSC
Michael Drescher [MD], Fujitsu
Jay Unger [JU], IBM
Andreas Savva [AS], Fujitsu
Susan Malaika [SM], IBM (on the phone)
Tom Maguire [TM], IBM (on the phone)
Summary of salient points:
* ByteIO needs a truncate operation to avoid the truncAppend with
0 bytes offset.
* In ByteIO truncate needs an offset from which the truncate takes
place.
* In ByteIO consider stating whether data is coming from a stream
(or not).
* Requirement to discuss a streams interface in a future telcon (either
ByteIO or data WG).
* In the data replication section the distinction between replication
and caching needs to be made.
* Talk to the JSDL group to see if the data movement descriptions can be
used to describe data movement without over-complicating JSDL or
requiring it to become a full-blown workflow language.
* Move away from using the volatile/temporary, permanent and durable
space classification in the storage management section, possibly
replacing it with QoS and lifetime.
Actions:
[PK] Produce a new draft of the storage management section.
[SD] Write up some of the use cases (for an appendix?).
[AG] Produce a picture of the OGSA Data Architectural stack.
[BA] Go through the data transfer section.
Presentations given at the meeting are available from:
http://forge.gridforum.org/docman2/ViewProperties.php?group_id=150&category_id=1064&document_content_id=3844
Where these are available at the time of posting I have put explicit
links to the corresponding talks.
+---
MM talks about ByteIO.
Presentation:
http://forge.gridforum.org/projects/ogsa-d-wg/document/ByteIO_status_and_introduction/en/1
- Need for this has been expressed by several groups.
- Want something simple.
- ByteIO is an interface, not a service.
HK: in BES we had a discussion about services/interfaces. If an
interface supports a capability or metadata in support of that
interface then it is a service.
AG: yes, define the interface but also state resource properties -
when taken together this is a service.
MM: I don't view it that way. That is not a service.
AG: does it have portTypes we can agree on and resource properties?
The implementation is something else.
MM: We want it to be composable. Do not want it to stand on its own.
It will have properties.
PK: the client and service API do not have to be the same - you said -
nevertheless to implement an interface the service should allow
the implementation to provide both.
...
HK: Is this WG going to be based on the basic profile?
MM: yes.
MM shows some UML of the interface based on what was presented at GGF.
...
BA: why not have a truncate? It's a hack if you want to truncate and
append 0 bytes to achieve the same thing.
MM: not going to fight that.
PK: want to keep the service side API as simple and as small as
possible and have the richness in the client side API.
BA: have a concern that there is no open and close in this ... realise
that it is intentional.
AG: the assumption here is that you will use RNS to open these files.
Once you have the name then you do not need to open it - that is
basically what gives you the inode in Unix. The client side will
open a file by finding the name - the security aspects are out of
scope.
MM: open for read and open for write is a client side issue ...
BA: it is authorisation ...
AG: but that will be addressed at the client side.
BA: so I can only determine what my level of authorisation is by
trying it out and seeing if it fails?
AG: yes ...
BA: do you have a stat?
MM: yes.
HK: does the truncate need an offset? (truncate to a point).
MM: no.
HK: POSIX does have an offset. How does it map?
MM: that is a good point.
...
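HK's point about the offset maps directly onto POSIX, where ftruncate(2) takes a length argument rather than always cutting at zero. A minimal sketch of what a ByteIO-style truncate with an offset could look like, with truncAppend built on top of it (the class and method names here are illustrative, not taken from the ByteIO spec):

```python
class ByteIOFile:
    """Hypothetical in-memory model of a ByteIO random-access resource."""

    def __init__(self, data: bytes = b""):
        self.data = bytearray(data)

    def truncate(self, offset: int) -> None:
        # Mirror POSIX ftruncate(2): cut the resource at `offset`, or
        # zero-pad it if `offset` is beyond the current end.
        if offset <= len(self.data):
            del self.data[offset:]
        else:
            self.data.extend(b"\0" * (offset - len(self.data)))

    def trunc_append(self, offset: int, data: bytes) -> None:
        # The truncAppend operation discussed above: truncate at
        # `offset`, then append `data`; a plain truncate is the
        # special case of appending 0 bytes.
        self.truncate(offset)
        self.data.extend(data)
```

With a separate truncate operation the "truncAppend with 0 bytes" hack that BA objects to becomes unnecessary, while truncAppend remains expressible in terms of it.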
BA: is what you envision here that I send a SOAP message - and ask
for 1Mb of data - how does it do that?
MM moves on to the Bulk Data Transfer slide.
PK: how much data are you talking about?
MM: whatever the client asks for.
PK: for large amounts of data you also need to concern yourself with
the network and manage the network resource.
AG: trying to define a simple API for block reads. Different people
can choose to implement that in different ways.
BA: you don't have GridFTP in your list of protocols ....
MM: not limiting the data transfer protocols ...
BA: I have a system that can do that today ... want to make sure that
it fits.
AG: this is not an ftp based data transfer ... we are trying to come
out with a basic interface that offers basic POSIX like operations
... you can use GridFTP to implement this ...
...
BA: what I see is 150 ms round-trip, 3 s of deserialisation time, then
it opens a GridFTP session, brings back the data and then it
closes the session as you do not have support for sessions. Your
performance is going to be horrid.
DB: would the end point not be created when you open the GridFTP
session which has that interface ...
BA: so you save on the session start-up but you still have to send a
message and get the block back ... what is the most common usage
scenario?
AG: there will be other pieces or other libraries that it will
interact with that will be able to emulate NFS, CIFS, etc. which
will manage coherence and other such stuff....
BA: you cannot ignore the use case where you take the POSIX interface
and try to do that on a remote file.
AG: there will be all sorts of latencies.
MM: we have this interface that works with various systems ...
BA: just want to make sure that a GridFTP is taken into account ...
as long as we can fit this in ...
NCH: this is going to be discussed in the spec.
PK: If you talk about GridFTP then will someone implement on top of or
below GridFTP?
MM: trying to provide a simple interface to allow people to talk to
files. If you have high performance requirements you may or may
not want to use this. This is a general way of doing things.
BA: this is an optional interface ... if you guys want to come up with
this and if it can be implemented on top of GridFTP then your
performance will suck but we may be able to work round that and
try to optimise it. There is nothing that prevents it from
bypassing this and using another mechanism.
MM: absolutely.
DB: but should take into account if there are things that will make it
performant.
BA: there will probably be a SOAP control channel for GridFTP in the
near future.
AG: that is similar to an issue here - you do not want to use SOAP to
do the bulk control transfer. Avaki did this using SOAP for the
control channel and sockets for the data transfer.
MM: not all of the options offered here are right for everyone.
Any WS-* mechanism can be applied to the data transfer.
Added a QName at the end of the functions that allows the transfer
mechanism to be specified.
DB: the problem with in-line being the default for the transfer is that
it will crash the server when too much data is requested.
PK: but the implementation will protect you from this.
...
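The QName that MM describes can be pictured as a dispatch key attached to each operation: the read semantics stay the same, only the way the block travels changes. A hedged sketch - the mechanism URI, the registry and the function names below are invented for illustration, not from the ByteIO spec:

```python
import base64

# Hypothetical registry of transfer mechanisms, keyed by a
# QName-like URI; "simple" (in-line in the SOAP message) plays
# the role of the default the group discussed.
TRANSFER_MECHANISMS = {}

def mechanism(uri):
    def register(fn):
        TRANSFER_MECHANISMS[uri] = fn
        return fn
    return register

@mechanism("http://example.org/byteio/transfer-mechanisms/simple")
def inline_transfer(data: bytes) -> bytes:
    # In-line: the bytes travel base64-encoded inside the response.
    return base64.b64encode(data)

def read(resource: bytes, offset: int, num_bytes: int,
         transfer_mechanism: str) -> bytes:
    # The extra QName parameter selects how the block is returned,
    # without changing what is read.
    encode = TRANSFER_MECHANISMS[transfer_mechanism]
    return encode(resource[offset:offset + num_bytes])
```

Other mechanisms (e.g. a GridFTP handoff that returns a URL instead of the bytes themselves) would register under their own QNames, which is one way the interface can stay simple while accommodating BA's high-performance case.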
BA: how do you know the encoding of in-lined data?
MM: we will profile what that means ...
BA: Could offer RFT.
...
MM: for properties will start off with POSIX stat and add on to that.
In Linux /proc has to deal with a lot of the same issues.
DB: why would you worry about the file create time?
MM: you may or may not want to worry about this...
PK: does this not come from RNS?
AG: RNS may not have that information. Would rather keep this
metadata, at least logically, with the file as in the naming
scheme you might have different paths to this and you would want
the metadata to only come from one destination.
PK: maybe it should be both. If you have data at the data resource
where can I get it from a replica....
AG: the EPR that is kept in the directory structure - don't recall if
there can be multiples of the same thing ... I don't think that
replica management has been addressed in that ... RNS would have
an EPR pointing to some service doing the replication as opposed
to pointing it to the files themselves.
...
DB: The point I was trying to make earlier is that although all these
properties may be in the EPR, they don't have to be held in the
ByteIO interface.
Suppose I want to know whether the data resource is writable.
AG: that depends on who you are ... some things are fundamentally
unwritable but some depend on who you are.
PK: what are the properties for other data resources? For instance a
BLOB in a data resource.
NCH: you do not have to necessarily know and there are other
interfaces that will hold this.
...
MM goes on to talk about use cases.
File use case.
PK: would it make sense, for streams, to specify in the read that it
is not seekable?
MM: assuming the interfaces presented what would it give you?
PK: it would be much cleaner for a client.
MM: that is the reason why we shy away from streaming in this WG, as
we are trying to satisfy what people need right now.
AG: I agree with what Peter is saying - if I read I just consume it
and the next person will read the same data. This would not be
true for streams.
NCH: it is a simplification of the interface.
AG: If I want to cat a file that is going out to a service ...
BA: there is always the debate as to whether you ignore capabilities
you cannot perform on the stream or fault them ...
PK: does it make sense to explicitly state that you are a stream?
It makes the client's life easier.
AG: extending the interface makes sense but, playing devil's advocate,
if I wanted to cat do I have to be aware that there are streams?
PK: you can write cat in different ways - can cat with a stream or you
can do the cat yourself.
...
Happy with simplified semantics.
AG: The stream interface needs to be discussed in a telcon soon.
...
DB: the sensor use case was basically covered by what MM said about
streaming in his discussion.
+---
NCH takes over to discuss a DAIS use case.
Presentation:
http://forge.gridforum.org/projects/ogsa-d-wg/document/DAIS_Use_Case/en/1
ByteIO in this instance is treating data as a virtual file.
NCH discusses a use case where data to be entered into a data resource
is conveyed to the data service using the ByteIO interface.
PK: in GSM - if you put stuff in storage - you have to say how much
data you are going to put in. Also, what seems to be missing here
is your virtualisation description of how you access the data, the
structure - is there not a need to describe this? In the insert
you would have to state the format ...
AG: ByteIO is working like an HTTP GET.
BA: this should be an execute insert.
NCH: this is not a use case where you treat a database as a file.
Some discussion as to whether one can push data to a data service
which then puts it in the database. Not a model that is currently
supported by DAIS.
For the "complex insert" use case.
PK: The pull model is simpler as a service provider but from the
client side it is more difficult as you do not have a service to
serve the data.
For medical imaging they pull an image from the database, they do
some processing and then want to push the image back to the
database.
BA: if they pulled the data out of the database they should be able to
push the data back into the database.
...
Moving on to the "Simple copy" use case.
BA: why would you do this using ByteIO?
NCH: 'cause I needed a use case.
PK: if you do copying around then you need to do it at the transfer
level - you need to manage the storage and manage the network.
NCH: there are implementation layers that are not explicitly shown
here.
MM: ByteIO telcons are every second Tuesday at 11am Eastern time, 4pm
UK time.
...
+---
DB puts a copy of the agenda for the remainder of the meeting up.
Some agenda bashing ensued and a slightly modified agenda was
agreed for the rest of the meeting.
DB gives a quick synopsis as to what went on at the OGSA face to face
earlier on in the week and the data issues that were presented to the
group (all in the slides).
Presentation:
http://forge.gridforum.org/projects/ogsa-d-wg/document/OGSA-D_status_and_introduction/en/1
A brief summary of what has been put in each of the sections by the
corresponding authors.
+---
MA gives a brief overview of what he has put in relation to data
access in the OGSA Data Architecture straw man document.
Presentation:
http://forge.gridforum.org/projects/ogsa-d-wg/document/Access_section_status/en/1
A discussion follows on what is the scope.
AG: client should be able to manipulate the basic service. May also
want clients that access higher level services, e.g. NFS and
CIFS. We have to be clear as to what layer of the stack we are
talking about. Some folks will want transparency while other
people will want the performance level that talks to "bare"
services.
PK: can use this to structure the document.
DB: the intention has always been to have a document that talks about
scenarios.
AG draws a picture on a white board:
[Powerpoint version of these diagrams subsequently drawn by DB and made
available at
http://forge.gridforum.org/projects/ogsa-d-wg/document/Architecture_drawing/en/1
]
+-------+
| client|
+-------+
| libs |
+-------+
| +----------------+
+--------->|Access Services | (Out of scope but
| |(NFS, CIF) | it is a use case)
| +----------------+
|
| +--------------------+
+--------->|caching/replication | (in scope)
| +--------------------+
|
| +---------+
+--------->|transport| (in scope)
| +---------+
|
| +--------------------------+
+--------->| bare services, files,etc | (in scope)
+--------------------------+
Consumers at the top. Producers at the bottom of the stack. A consumer
at the top can talk to any of the layers below depending on the
performance/abstraction required.
DB draws a picture of his conception of the architectural stack:
+-----------------------+
| Federation |
+-----------------------+
| Replication/Caching |
+-----------------------+
| Access (client access)|
+-----------------------+
| Storage |
+-----------------------+
| Transfer |
+-----------------------+
| Data + Description |
+-----------------------+
DB noted that this is more an axis of increasing capability rather than
an architecture per se.
Discussion followed regarding what federation means and whether this
should include transformation.
DB: maybe at the end of each section we should have a diagram of the
architecture thus far.
JU: have a concern about that ordering - replication and caching can
be viewed as the replication and caching of the individual sources
or of virtualised objects. There is no strict layering, or you have
to assume one. In IBM they have decided that replication and
caching take place above virtualisation. That ordering gives you
more flexibility than the other way round.
Also talk about federation and transformation being infinitely
recursive.
...
Discussion follows as to whether virtualisation is above or below the
replication and the different pros and cons that the different choices
entail.
...
+---
AC gives a summary of the data replication section.
Presentation:
http://forge.gridforum.org/projects/ogsa-d-wg/document/Replication_section_status/en/1
AG: does copying involve some or all of the data - does that include
caches?
AC: we do include subsets.
JU: depending on the source type there are a number of ways to
establish subsets.
AC: yes ... have not made a clear distinction of replication and
caching.
JU: a way I can look at it is that replication is an explicitly
planned and metadata-managed activity - someone has explicitly
created two or more copies and established what the relationship
between them is through metadata - whilst in caching the data
movement is automatic and a temporary instance.
BA: to me, it's the scoping and visibility of it. Caching is not
visible to you.
AG: but if you make caches first-class citizens then you can tie the
two mechanisms together.
JU: so caching would have no explicit RNS name and lifetime.
Caching is an exploitation of replication that has no RNS name or
explicit lifetime.
AG: but there will be users who can/want to see that.
JU: It seems like it will be possible to define caching as a limited
scoping of replication.
DB: I made a distinction between replication and caching ... whether
the primary source knows of the cache.
PK: they both use the storage resource to do the caching.
JU: want to explicitly define the subset relationship between caching
and replica management.
Did you add the peer-to-peer model?
AC: not as well defined, need to flesh this out.
How do we name replicas and partial replicas?
JU: RNS.
AC: the RNS spec is all hierarchical ...
AG: I personally would not do it in RNS. I would create a WS-Name that
says that this is this file and the implementation would know the
replication.
What you have been discussing - is there an interface or is this a
proposal? For consistency you could say Unix consistency which,
when you wrote to a catalogue, would write through.
AC: want to be able to define consistency models that could be
supported or the service states that it does not support that
model.
AG: to fit inside OGSA the names should be RNS, the abstract names
would be WS-Names and the low level addresses would also be
WS-Names with different IDs.
HK: the third one should not be an address?
AG: the third name can be an address ... you could say that it is an
EPR and not a name.
PK: RNS is scoped for humans, the replica discovery would be to get
the replica - what are the semantics? If you make the comparison
to a file system - one is mutable and one is not?
AG: The first abstract name would be similar to an inode. It depends
on whether you want replication to be visible to the client.
PK draws a picture on the white board:
+-- -+
/ |
/ |
Human name ---> "inode" (GUID) ---- +- Replicas
\ |
\ |
+-- -+
User assigned System assigned sys/user assigned
AG adds
EPR
<
To:URL
AbstractName
Resolver:EPR
>
How do you resolve the GUID to a resolver?
Can use an AbstractName which needs to be resolved that
then allows you to pick the right replica.
JU: the back end could be WS-Names. WS-Names are EPRs.
If you want to make the EPRs visible then you map these
to human names.
AG: we did it both ways in Legion. Could make both visible to the
client, which is easier to implement, or only make one version
visible through a GUID-like thing and make that late binding,
which is harder to implement.
...
AG: there is a versioning model so that the replicas can check for
consistency in a lazy way ...
AC: we did not specify it to that level.
AG: used a consistency window in Avaki and that worked well.
AC: you could do it both ways.
PK: mentioned that replicas are created via the replica factory and
then specify the consistency at the creation point - does the
factory then manage that?
JU: it should be unimportant as to who manages it.
PK: as the client I might care.
JU: but you talk to the EPR so you don't care who you talk to. The URL
may not point to the factory - it could talk to another service.
PK: if I create a new replica where do I go?
JU: the factory could have talked to another service managing the
replication and got it to create the replica (as an EPR) and sent
it back to you or the factory could be doing the actual creation
but this is irrelevant to you - you do not need to know.
...
AG: could have a replica manager in the middle or it could all be
separate.
JU: if you view the factory as a management interface could have an EPR
with only the management interface - could then use the human
readable name to actually access the replica.
DB: (to Ann) how do you see this chapter moving to describe specific
interfaces?
AC: Don't have an answer to that. Allen may have a better view on
that.
** Lunch
+---
DB introduces the next session on Data Transfer. No slides for this
part of the agenda.
There may be a link in to infod on this. There are a number of issues
- protocol selection
- descriptions of source/sink interfaces
- transfer controlled by a separate service in every case or only
a subset of cases?
- transferring bytes with no interpretation
BA: you get a strong argument from me - for data transfer you just
transfer bytes, period.
...
DB: when setting up a pipeline you may do some negotiation.
BA: that is a higher level thing. It ultimately all translates to
using a socket.
DB: but the architecture has to translate to the picture that AG
drew.
BA: there needs to be a distinction on what you are building on.
DB: currently have a byte level view and a structural level view ...
Tried to separate the operation level parameters - need to see
what is specified in the source interface and what specifies the
transfer.
BA: at what level are you talking about?
DB: all levels.
BA: so the interface looks the same at all levels?
DB: no ...
BA: when you do this negotiation there should be the inclusion of a
schema that you use. ... Malcolm and I have talked about
negotiating the protocol on locale and if you are in the same
container then all you need to do is send the pointer to your
buffer.
DB: so how does this separate from scheduling?
BA: would hope that a lot of these things are hidden under the covers
as you negotiate with each other.
....
AG: if you pass a pointer then you would have to pass a copy ... it
breaks the web service model.
BA: Malcolm would not be happy with that answer ...
AG: but if you are going to use Web Services ...
BA: maybe not a good choice ... the pointer may not work but the
content exchange model based on locale would be useful
AG: I would agree but I would think that it is out of scope.
DB: you could negotiate .... if memory-to-memory copy is not part of
the allowed protocol ....
BA: the hard part of this is how to figure out locale ...
DB: but locale has already been identified as an interesting property
to have.
Other issue is how to link this to infod. Infod uses a
publish/subscribe model. At GGF13 pushed this to see if a data
transfer ... Looked at a use case...
SD: can't you just use WS-Addressing to do this?
DB: I don't know if other people have used WS-Addressing for this.
PK: in the storage resource manager there is a copy that does data
transfer. It internally manages the data transfer.
Discussing how these additional things are taken into account
in EGEE.
...
DB: you come from the large data/large data movement ... you mentioned
the linking of that into scheduling. OGSA have been asking us what
information from data should go into scheduling and this could be
something.
PK: in EGEE want to associate as much as possible in the computation
and data areas so that you do not have to do the same thing twice.
AG: have the same in BES - we call these things activities; could be a
computation or it could be a data resource. Try to get as much
info on these to manage decisions.
PK: eventually you map into whatever transfer service you have access
to ... then you link into a scheduler.
AG: could also link to a transformation....
PK: these transformations are coupled into the data transfer ....
DB: encryption and compression are explicitly mentioned in the document.
...
PK: you can view any computation that has input and output data as a
transformation. The Griffin project does this - no difference
between this and encryption.
AG: that was where it was in Avaki...
...
Discussion as to whether the role of an operation is expressed in the
operation name or if it is an attribute.
Need to check whether infod have looked at scheduling the transfer of
large amounts of data.
...
DB: Is it the case that you first need a mechanism to negotiate and
define the transfer, and then a separate mechanism to actually
perform the transfer (possibly after some scheduling phase)?
...
Talk about the different levels of encapsulation.
...
PK: needed something that can manage the data transfers and book the
bandwidth.
(to BA) what is a suitable interface for data transfer?
BA: there is no obvious transfer interface out there... that's why we
used the XIO interface to GridFTP ... synchronously ..
[ Note from DB: this misses part of the discussion. PK and BA agreed
that an API for async transfer was needed. They said that there
wasn't an obvious API for synchronous transfer. ]
DB: Do we need an interface for synchronous transfer, provided we have
one for async?
AG: In Legion if you had read from one object to write to another
it would go directly between these objects, no intermediary.
DB: so is there a need for one?
...
PK: submit, cancel, suspend, ... have analogies to data.
AG: a limited JSDL job with no execution.
BA: do not follow JSDL but the XML for the GRAM job contains the
whole RFT description for a job transfer - exactly what would
happen with no execution.
AG: I don't remember the richness of the JSDL data transfer
capabilities....
BA: would it be possible to put the profile in the job description
schema? Could we say that as part of the job you have to describe
the staging? Can you say the mechanism that is going to be used
for movement? So if you use a given movement you include a schema
...
SMcG: to deal with the simple case you can use a URI. Thought people
would want more complex types so we put in an xsd:any to allow
people to put in the information they want.
Unfortunately most tooling does not check what you put in
against any schema on submission.
PK: so it might be worthwhile to go through JSDL and see if it needs
to be extended.
AG: this is one of the things that the BES is looking at to see if
it can be used and made friendlier. ...
Now that data transport - as in staging - is back in scope ...
JU: the current BES container model is a 3 phase execution: stage,
execute, de-stage and the stage/de-stage spec in JSDL identify any
sources and/or sinks.
AG: you could just send in jobs that have just the stage-in, or just
an execute, or just the stage-out, so you could use that for data
transfer.
JU: Don't want to allow BES to get more complicated as a workflow
job. If you want more complicated interactions then you have to
architect something else or use something like BPEL.
BA: but if you want to treat transport as a job itself then, as long
as JSDL is rich enough, then you can pass this to GRAM and invoke
RFT using JSDL.
JU: that is all useful but the problem is that if you want to schedule
a transfer but want to guarantee that that takes place before or
after some event then you can't do this in JSDL .... you can use a
choreography to achieve this or through a third party service but
I do not want to make JSDL more complicated ...
....
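The "limited JSDL job with no execution" that AG and BA discuss could be pictured as a job description carrying only staging. The dictionary layout below is loosely modelled on JSDL's DataStaging element; the function, field names and URIs are illustrative, not taken from the spec:

```python
def transfer_only_job(source_uri: str, target_uri: str) -> dict:
    # A job description with a stage-in/stage-out pair and no
    # Application element: stage and de-stage happen, nothing
    # executes in between.
    return {
        "JobDescription": {
            "Application": None,  # nothing to execute
            "DataStaging": [
                {"Source": {"URI": source_uri}, "FileName": "data"},
                {"Target": {"URI": target_uri}, "FileName": "data"},
            ],
        }
    }
```

Whether such a description is rich enough (protocol parameters, ordering guarantees, scheduling constraints) is exactly the question the group raises for the JSDL people: JU's objection is that anything beyond this simple shape belongs in a choreography layer, not in JSDL itself.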
Discussion as to whether BES can be used to describe the data transfer.
....
AG: there is a need for this.
MM: surprised that the GridFTP parameters cannot be put in the URI?
...
AG: maybe can use the execution part to do what you are looking for.
BA: if you know this is going to be there as a capability you can
either decide to break this into three stages and do something
else to do the managing, but as you have decided to allow me to
move data then you need to make it rich enough to allow me to
do that.
AG: you need to talk to the JSDL people.
...
Need to make data transfer a first class citizen.
...
** Break
+---
PK talks about the data storage management section that he wrote.
He projects a version of his document so no slides.
...
In GSM three different types of spaces are defined (which have semantics):
- volatile or temporary space - like /tmp.
- permanent space - you expect your data to be kept so requires
higher QoS (have backups).
- durable space - something in between. This would go into caching.
It cannot be removed for an agreed length of time.
AG: what about failure?
PK: for volatile and temporary, if it's gone, it's gone. For durable
you try and recover it.
BA: hate this concept.
AC: why not just use lifetime?
PK: the concept is there in the interface. To the user this may be
more useful. It is probably cleaner to use lifetime management and
give them a name. It helps - for instance in temporary space if
you give it a file to keep and ask it to be there for a day you
then get 0 guaranteed time back (i.e. the file could be removed at
any point).
You have 2 lifetimes - the guaranteed lifetime and the requested
lifetime.
Durable would allow pinning on temporary space where you get a
guaranteed lifetime.
AG: lifetimes may not be appropriate if the machine goes down for a
week then it might all get wiped when the system comes back up.
PK: durable is only temporary storage until it moves to another
space.
AC: in durable it sends you an email if the file is to be deleted?
...
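PK's two-lifetime model (a requested lifetime and a possibly shorter guaranteed lifetime, with durable space pinning files that temporary space would not) can be sketched as follows. The space classes come from the discussion above; the numbers and the function itself are purely illustrative:

```python
def grant_lifetime(space_class: str, requested_days: float) -> float:
    # Return the guaranteed lifetime for a file, per PK's description:
    # temporary space guarantees nothing (the file may be removed at
    # any point), durable space pins the file for an agreed window,
    # permanent space honours the request.
    if space_class == "temporary":
        return 0.0
    if space_class == "durable":
        return min(requested_days, 7.0)  # illustrative 7-day pin window
    if space_class == "permanent":
        return requested_days
    raise ValueError("unknown space class: " + space_class)
```

Recasting the three named spaces as points on a (guaranteed lifetime, QoS) scale is essentially the move BA and AC argue for: the names become labels for policies rather than primitive concepts.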
BA: for the cached/temporary space the user should understand - don't
put it there if it's important to you. The permanent storage then
asks for a different policy. It's all just QoS and lifetime
management. It then becomes reasonably straightforward.
PK: personally I agree with you but other people in the community feel
that the semantics are important. I think that this is a mismatch
and they should explain to the client what these things mean.
AG: why not have it call itself permanent storage and just treat it
as temporary space...
BA: they don't trust disk space - permanent space is tapes for them
...
AC: do we want to put that concept of permanent space in this
document?
PK: Have the same issue in SRM.
...
BA: if you ask for 1Tb of permanent space but you only use 100Mb then
you could allow others to use the remainder as temporary space ...
this allows you to do that. If the owner then requires the rest of
the space then the files stored in there just get wiped.
PK: do not have lifetime in the document ... should probably add it.
Also could add advanced allocation.
AG: don't think that you can regard permanent as boolean. You really
have a range which has different QoS. Classics are availability,
probability of failure, lifetime, ....
...
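BA's over-allocation example reduces to simple accounting: reserved-but-unused permanent space is lent out as volatile scratch and wiped when the owner claims it back. A sketch, with the class, units and eviction policy invented for illustration (BA's own figures were 1Tb reserved / 100Mb used):

```python
class Space:
    """Hypothetical permanent-space allocation that lends its slack."""

    def __init__(self, reserved: int):
        self.reserved = reserved
        self.used = 0      # bytes the owner has actually written
        self.scratch = 0   # volatile bytes lent to other users

    def lendable(self) -> int:
        return self.reserved - self.used - self.scratch

    def lend(self, n: int) -> None:
        assert n <= self.lendable()
        self.scratch += n

    def claim(self, n: int) -> None:
        # Owner writes more data; volatile scratch is wiped first if
        # needed - those files "just get wiped", as BA puts it.
        assert self.used + n <= self.reserved
        overflow = (self.used + n + self.scratch) - self.reserved
        if overflow > 0:
            self.scratch -= overflow
        self.used += n
```

The point of the sketch is that the volatile/permanent distinction shows up only in the eviction rule, which supports the "it's all just QoS and lifetime management" position.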
BA: should have a designation as to whether it is volatile or not -
whether I can blow it away without your permission or not.
...
PK: average latency, cost/Gb ... how do you represent that?
AG: saying something is durable depends on your fault model. Does
durability mean survivability or something else?
PK: might it be better to have a flag to see whether it is temporary
or not?
BA: yes, volatile yes or no.
PK: whether there is a back-up?
AG: what is the latency between it going from disk to back-up.
AC: prefer this notation rather than what you have there. Whether it
stays in SRM is another issue.
...
PK: advance reservation in the GSM - people thought it should not be
done. Should it be done in this group? Lots of people do not
believe in this ... as long as there is no payment it is difficult
to enforce.
HK: do you have a use case?
PK: all use cases can be re-phrased in other ways.
AC: it would be a nice capability.
AG: you can do that by calling a system administrator .... it should
be a capability. You would make an agreement about storage.
PK: you would use WS-Agreement?
AG: you could.
PK: availability - also QoS: does it support quotas? lifetime
management? transfer? Trying to convince people in GSM that
things should not only be done in the storage system. In this
document the storage could give hints to another transfer service
to schedule the transfers. Let it know the files that you want to
transfer out will be on disk ... the storage systems have their
own staging pools and they have to optimise that. If you have tape
you need to do all this. Conceptually the capability it should
have is to expose these hints to the transfer layer.
HK: I think these are important characteristics but these do not quite
match the title ...
PK: produced this document very quickly.
HK: you also need input data.
DB: maybe should put in a different section that states information
that is relevant to scheduling.
PK: there is also caching.
It would be nice to have a concept map.
AC: the job scheduler confuses me a little.
PK: BES
AG: something else other than BES will be doing the scheduling.
...
PK: if you have the ability to do reservation then a job scheduling
     manager would just call these capabilities - have a use case where
     the storage metadata can be used by a scheduler to determine
     whether it can do the job.
AC: the kind of thing that a workflow manager can use.
PK: storage reservation ... how do we expose this in terms of a
     capability of the storage?
AG: you can unify these - go to the storage management factory and
     agree on how much space you will have and for how long. The leased
     space allows you to extend the reservation, etc.; you can relate
     all of this to agreement.
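The agreement pattern AG describes (negotiate space and duration with a factory, then renew the lease) might be sketched as follows. All names here (`StorageFactory`, `SpaceLease`, `reserve`, `extend`) are hypothetical illustrations, not interfaces from any OGSA or WS-Agreement specification:

```python
import time

class SpaceLease:
    """Hypothetical space reservation with a lifetime, in the spirit of
    the agreement-based pattern discussed above."""
    def __init__(self, size_gb, duration_s):
        self.size_gb = size_gb
        self.expires = time.time() + duration_s

    def extend(self, extra_s):
        # Extending the reservation is just renegotiating the lifetime.
        self.expires += extra_s

class StorageFactory:
    """Hypothetical factory: agree on how much space, and for how long."""
    def reserve(self, size_gb, duration_s):
        return SpaceLease(size_gb, duration_s)

# Agree on 10 GB for an hour, then renew before it expires.
lease = StorageFactory().reserve(size_gb=10, duration_s=3600)
lease.extend(1800)
```

In a real system the factory call would be the point where the WS-Agreement negotiation happens; the lease object just makes the "extend the reservation" part of the discussion concrete.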
PK: up to now what has been said has tried to support the semantics
     of mass storage systems but could be supported by any object
     store.
Have tried to describe the differences between a file store and a
database store.
AG: a lot of DBs do their own storage management. Trying to impose
yourself between a database and its storage management would be a
challenge.
JU: trying to use a file system for a DBMS would be a problem but
trying to impose yourself between the DBMS and the disk would
     not. How this is done varies from database to database.
BA: it is still writing to disk store.
PK: the pattern I'm talking about is that the client would make a
     request to the database and reserve some space.
JU: it depends how you do this. DBMSs deal with a virtual disk.
In lots of NAS and SANs you can request a virtual disk of
any size (a block server with a disk). This only makes sense
for a small number of databases.
PK: this simplifies my life as I can then restrict this section to
files. How would I specify it now?
AG: trying to do this will complicate your life.
PK: so it might be worth throwing out.
DB: worth talking to a couple of other folks.
JU: SANs deal with a virtualized LUN...
AG: this gets you into virtualized SANs....
JU: for grids, if there is little penalty for creating a virtualized
     disk, why not use that as opposed to being tied down to file space.
AG: You're allocated a block of storage and a handle to it ..
JU: do you want a handle to that or do you want to type it? If it is
     a virtual LUN it only understands block I/O.
PK: it does not matter in that respect as you get the SURL and
translate that to a file handle. If it is a virtual disk then you
get ...
JU: so it is typed.
PK: but you can get back a different type.
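The translation PK mentions (take the SURL and turn it into a handle typed by the access protocol) could look roughly like this. This is a hypothetical sketch, not actual SRM client code; the scheme map and paths are invented for illustration:

```python
def surl_to_handle(surl, protocol):
    """Hypothetically translate a storage URL (SURL) into a
    protocol-specific handle. The returned handle is typed by the
    protocol chosen, not by the SURL itself."""
    scheme_map = {
        "gsiftp": "gsiftp://gridftp.example.org/",  # transfer protocol
        "file": "file:///mnt/storage/",             # local POSIX access
    }
    if protocol not in scheme_map:
        raise ValueError("protocol not supported by this storage")
    # Take the site file name out of the SURL and re-root it.
    path = surl.split("?SFN=")[-1]
    return scheme_map[protocol] + path.lstrip("/")

print(surl_to_handle("srm://se.example.org/?SFN=/data/f1", "file"))
# -> file:///mnt/storage/data/f1
```

This matches the point being made in the exchange: the same stored data can come back as different handle types depending on the protocol you negotiate.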
...
PK: discussion about what type of data you are retrieving
     through the storage - is it a file? My answer is that it is
     specific to a protocol. It could be a stream. It does not matter
     as long as
you can talk to it. If you just want to access it through POSIX
I/O then that is fine.
...
JU: this is talking about access to storage ... you can get a chunk of
     storage and, by getting a handle, you can access it through a
     protocol.
You can also get access to management ops that affect that
storage.
PK: this is about managing storage as a resource -
allocating/deallocating...
JU: so there is an I/O performance that is associated with the size.
PK: one of the parameters may be the latency and bandwidth.
JU: if you have an application that is accessing a detector - the
nature of that application may be that it needs 200 Mb/s to write
the data from the detector but once it is written the other
     applications may only require 20 Mb/s to access it, so slowing it
     down is fine.
AG: that's a QoS issue.
JU: so all I'm saying is not to restrict yourself to
allocation/deallocation... the metadata associated with the
storage are broader.
AG: that is something that could go into the agreement....
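JU's detector example (200 Mb/s needed while ingesting, 20 Mb/s sufficient afterwards) suggests the kind of QoS metadata that would go beyond plain allocate/deallocate. A minimal sketch, with all field and class names (`StorageQoS` and its attributes) being hypothetical:

```python
from dataclasses import dataclass

@dataclass
class StorageQoS:
    """Hypothetical QoS metadata for a storage allocation, illustrating
    JU's point that the metadata are broader than allocate/deallocate."""
    size_gb: float
    write_bandwidth_mbps: float   # needed while the detector writes
    read_bandwidth_mbps: float    # enough for later analysis access
    volatile: bool                # may the provider blow it away?

# JU's example: 200 Mb/s to ingest, 20 Mb/s afterwards is acceptable.
detector_space = StorageQoS(size_gb=500,
                            write_bandwidth_mbps=200,
                            read_bandwidth_mbps=20,
                            volatile=False)
```

As AG notes, terms like these are exactly what could be carried in the agreement negotiated with the storage.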
...
HK: using POSIX examples - you have some storage, the user appends to
it. The space allocation happened when they accessed the storage
....
PK: if you have an app that needs to extend its quota you would have
to do that before you started writing to the system.
HK: checking when the allocation takes place.
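The ordering PK and HK are checking here - extend the quota before you start writing, rather than allocating at access time - might be sketched as follows. All names (`Space`, `extend_quota`, `append`, `QuotaError`) are hypothetical:

```python
class QuotaError(Exception):
    pass

class Space:
    """Hypothetical storage space with an explicit quota, illustrating
    that the allocation must be extended before the write, not during."""
    def __init__(self, quota_bytes):
        self.quota = quota_bytes
        self.used = 0

    def extend_quota(self, extra_bytes):
        # In the discussion above this step would go through the
        # storage manager, before the application starts writing.
        self.quota += extra_bytes

    def append(self, data):
        if self.used + len(data) > self.quota:
            raise QuotaError("extend the quota before writing")
        self.used += len(data)

s = Space(quota_bytes=4)
s.extend_quota(8)       # extend first...
s.append(b"12345678")   # ...then the append succeeds
```

Without the `extend_quota` call the append would fail, which is PK's point about extending the quota before writing to the system.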
...
DB: might be worth moving to a wrap-up stage and discussing how to
     proceed with the document.
...
Discussion on how to proceed. MA suggests fleshing out AG's stack and
possibly using the concept maps. Can also do the individual sections.
SD: can talk about overlapping sections...
DB: maybe bubble up things from the different sections. Add a new
     section or push them up to OGSA.
...
AG: I could try to draw a picture.
DB: that would be good.
...
DB: aim to submit the present document as is.
SD: is it worth putting the use cases as an appendix?
MA: can see if you can implement the use cases based on the technology
described in the document.
AG: also have the BES use cases.... don't think they have been written
as yet.
...
DB: have volunteers to write the storage management section [PK],
write use cases [SD], do a picture of the architecture [AG],
revise the data access section [SM], go through the data transfer
section [BA].
AG: (to PK) would like to synchronize the work that you guys are doing
     with stuff coming out of RNS, BES and stuff ...
     Get people to write up the security requirements that are
     relevant to data only.
+-----------------------------------------------------------------------+
|Mario Antonioletti:EPCC,JCMB,The King's Buildings,Edinburgh EH9 3JZ. |
|Tel:0131 650 5141|mario at epcc.ed.ac.uk|http://www.epcc.ed.ac.uk/~mario/ |
+-----------------------------------------------------------------------+