[ogsa-d-wg] Notes from the London OGSA-Data face-to-face Meeting.

Mario Antonioletti mario at epcc.ed.ac.uk
Wed Jun 1 08:52:06 CDT 2005


OGSA Data face to face - 26/05/05
=================================

Notes: Mario Antonioletti

Attendees:

     Mark Morgan        [MM], University of Virginia
     Hiro Kishimoto     [HK], Fujitsu
     Andrew Grimshaw    [AG], University of Virginia
     Stephen McGough    [SMcG], LeSC
     Ann Chervenak      [AC], USC/ISI
     Bill Allcock       [BA], ANL
     Peter Kunszt       [PK], CERN
     Dave Berry         [DB], NeSC
     Mario Antonioletti [MA], EPCC
     Neil Chue Hong     [NCH], EPCC
     Stephen Davey      [SD], NeSC
     Michael Drescher   [MD], Fujitsu
     Jay Unger          [JU], IBM
     Andreas Savva      [AS], Fujitsu
     Susan Malaika      [SM], IBM (on the phone)
     Tom Maguire        [TM], IBM (on the phone)

Summary of salient points:

 * ByteIO needs a truncate operation, to avoid using truncAppend with
   zero bytes of data.
 * In ByteIO, truncate needs an offset at which the truncation takes
   place.
 * In ByteIO consider stating whether data is coming from a stream
   (or not).
 * Requirement to discuss a streams interface in a future telcon (either
   ByteIO or the Data WG).
 * In the data replication section the distinction between replication
   and caching needs to be made.
 * Talk to the JSDL WG to see if the data movement descriptions can be
   used to describe data movement without over-complicating JSDL or
   requiring it to become a full-blown workflow language.
 * Move away from the volatile/temporary, permanent and durable space
   classification in the storage management section, possibly replacing
   it with QoS and lifetime.

Actions:

[PK] Produce a new draft of the storage management section.

[SD] Write up some of the use cases (for an appendix?).

[AG] Produce a picture of the OGSA Data Architectural stack.

[BA] Go through the data transfer section.

Presentations given at the meeting are available from:

http://forge.gridforum.org/docman2/ViewProperties.php?group_id=150&category_id=1064&document_content_id=3844

Where these are available at the time of posting I have put explicit
links to the corresponding talks.

+---
MM talks about ByteIO.

Presentation:

http://forge.gridforum.org/projects/ogsa-d-wg/document/ByteIO_status_and_introduction/en/1

     - Need for this has been expressed by several groups.
     - Want something simple.
     - ByteIO is an interface, not a service.

HK: in BES we had a discussion about services/interfaces.  If an
    interface supports a capability or has metadata in support of that
    interface then it is a service.

AG: yes, define the interface but also state resource properties -
    when taken together this is a service.

MM: I don't view it that way. That is not a service.

AG: does it have portTypes we can agree on and resource properties?
    The implementation is something else.

MM: We want it to be composable. Do not want it to stand on its own.
    It will have properties.

PK: the client and service APIs do not have to be the same - as you said -
    nevertheless, to implement an interface the service should allow
    the implementation to provide both.

...

HK: Is this WG going to be based on the basic profile?

MM: yes.

MM shows some UML of the interface based on what was presented at GGF.

...

BA: why not have a truncate? It's a hack if you want to truncate and
    append 0 bytes to achieve the same thing.

MM: not going to fight that.

PK: want to keep the service side API as simple and as small as
    possible and have the richness in the client side API.

BA: have a concern that there is no open and close in this ... realise
    that it is intentional.

AG: the assumption here is that you will use RNS to open these files.
    Once you have the name then you do not need to open it - that is
    what basically gives you the inode in Unix. The client side will
    open a file by finding the name - the security aspects are out of
    scope.

MM: open for read and open for write is a client side issue ...

BA: it is authorisation ...

AG: but that will be addressed at the client side.

BA: so I can only determine my level of authorisation by trying it
    out and seeing if it fails?

AG: yes ...

BA: do you have a stat?

MM: yes.

HK: does the truncate need an offset? (truncate to a point).

MM: no.

HK: POSIX does have an offset. How does it map?

MM: that is a good point.
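[Illustrative sketch only: one way a dedicated truncate operation with an
offset might look, mirroring POSIX ftruncate(fd, length). The element and
namespace names below are assumptions for illustration, not taken from the
ByteIO draft.]

    <byteio:truncate>
      <!-- new length of the resource: bytes 0..(offset-1) are kept,
           everything beyond this offset is discarded -->
      <byteio:offset>1048576</byteio:offset>
    </byteio:truncate>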

...

BA: is what you envision here that I send a SOAP message - and ask
    for 1Mb of data - how does it do that?

MM moves on to the Bulk Data Transfer slide.

PK: how much data are you talking about?

MM: whatever the client asks for.

PK: for large amounts of data you also need to concern yourself with
    the network and manage the network resource.

AG: trying to define a simple API for block reads. Different people
    can choose to implement that in different ways.

BA: you don't have GridFTP in your list of protocols ....

MM: not limiting the data transfer protocols ...

BA: I have a system that can do that today ... want to make sure that
    it fits.

AG: this is not an ftp based data transfer ... we are trying to come
    out with a basic interface that offers basic POSIX like operations
    ... you can use GridFTP to implement this ...

...

BA: what I see is 150 ms round-trip, 3 s of deserialisation time, then
    it opens a GridFTP session, brings back the data and then it
    closes the session as you do not have support for sessions. Your
    performance is going to be horrid.

DB: would the end point not be created when you open the GridFTP
    session which has that interface ...

BA: so you save on the session start up but you still have to send a
    message and get the block back ... what is the most common usage
    scenario?

AG: there will be other pieces or other libraries that it will
    interact with that will be able to emulate NFS, CIFS, etc which
    will manage coherence and other such stuff....

BA: you cannot ignore the use case where you take the POSIX interface
    and try to do that on a remote file.

AG: there will be all sorts of latencies.

MM: we have this interface that works with various systems ...

BA: just want to make sure that a GridFTP is taken into account ...
    as long as we can fit this in ...

NCH: this is going to be discussed in the spec.

PK: If you talk about GridFTP then someone will implement on top or
    below GridFTP?

MM: trying to provide a simple interface to allow people to talk to
    files.  If you have high performance requirements you may or may
    not want to use this. This is a general way of doing things.

BA: this is an optional interface ... if you guys want to come up with
    this and if it can be implemented on top of GridFTP then your
    performance will suck but we may be able to work round that and
    try to optimise it. There is nothing that prevents it from
    bypassing this and using another mechanism.

MM: absolutely.

DB: but we should take into account whether there are things that will
    make it performant.

BA: there will probably be a SOAP control channel for GridFTP in the
    near future.

AG: that is similar to an issue here - you do not want to use SOAP to
    do the bulk data transfer. Avaki did this using SOAP for the
    control channel and sockets for the data transfer.

MM: not all of the options offered here are right for everyone.

    Any WS-* mechanism can be applied to the data transfer.

    Added a QName at the end of the functions that allows the transfer
    mechanism to be specified.
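[Rough sketch of that idea; the parameter names and the QName value shown
here are placeholders for illustration, not agreed syntax.]

    <byteio:read>
      <byteio:start-offset>0</byteio:start-offset>
      <byteio:bytes-per-block>1048576</byteio:bytes-per-block>
      <byteio:num-blocks>1</byteio:num-blocks>
      <byteio:stride>1048576</byteio:stride>
      <!-- trailing QName selects how the bytes travel: e.g. inline
           base64 in the response, an attachment, or an out-of-band
           protocol such as GridFTP -->
      <byteio:transfer-mechanism>byteio:simple-inline</byteio:transfer-mechanism>
    </byteio:read>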

DB: the problem with in-line being the default for the transfer is that
    it will crash the server when too much data is required.

PK: but the implementation will protect you from this.

...

BA: how do you know the encoding of in-lined data?

MM: we will profile what that means ...

BA: Could offer RFT.

...

MM: for properties we will start off with POSIX stat and add to that.

    In Linux /proc has to deal with a lot of the same issues.

DB: why would you worry about the file create time?

MM: you may or may not want to worry about this...

PK: does this not come from RNS?

AG: RNS may not have that information. Would rather keep this
    metadata, at least logically, with the file, since in the naming
    scheme you might have different paths to this and you would want
    the metadata to come from only one place.

PK: maybe it should be both. If you have data at the data resource
    where can I get it from a replica....

AG: the EPR that is kept in the directory structure - don't recall if
    there can be multiples of the same thing ... I don't think that
    replica management has been addressed in that ... RNS would have
    an EPR pointing to some service doing the replication as opposed
    to pointing it to the files themselves.

...

DB: The point I was trying to make earlier is that although all these
    properties may be in the EPR, they don't have to be held in the
    ByteIO interface.

    Suppose I want to know whether the data resource is writable.

AG: that depends on who you are ... some things are fundamentally
    unwritable but some depend on who you are.

PK: what are the properties for other data resources? For instance a
    BLOB in a data resource.

NCH: you do not necessarily have to know, and there are other
     interfaces that will hold this.

...

MM goes on to talk about use cases.

File use case.

PK: would it make sense, for streams, to specify in the read that it
    is not seekable?

MM: assuming the interfaces presented what would it give you?

PK: it would be much cleaner for a client.

MM: the reason we shy away from streaming in this WG is that we are
    trying to satisfy what people need right now.

AG: I agree with what Peter is saying - if I read I just consume it
    and the next person will read the same data. This would not be
    true for streams.

NCH: it is a simplification of the interface.

AG: If I want to cat a file that is going out to a service ...

BA: there is always the debate as to whether you ignore capabilities
    you cannot perform on the stream or fault them ...

PK: does it make sense to explicitly state that you are a stream?
    It makes the client's life easier.

AG: extending the interface makes sense but, playing devil's advocate,
    if I wanted to cat do I have to be aware that there are streams?

PK: you can write cat in different ways - you can cat with a stream or
    you can do the cat yourself.

...

     Happy with simplified semantics. 

AG: The stream interface needs to be discussed in a telcon soon.

...

DB: the sensor use case was basically covered by what MM said about
    streaming in his discussion.

+---

NCH takes over to discuss a DAIS use case.

Presentation:

http://forge.gridforum.org/projects/ogsa-d-wg/document/DAIS_Use_Case/en/1

ByteIO in this instance is treating data as a virtual file.

NCH discusses a use case where data to be entered into a data resource
is conveyed to the data service using the ByteIO interface.

PK: in GSM - if you put stuff in storage - you have to say how much
    data you are going to put in. Also, what seems to be missing here
    is your virtualisation description of how you access the data, the
    structure - is there not a need to describe this? In the insert
    you would have to state the format ...

AG: ByteIO is working as an HTTP GET.

BA: this should be an execute insert.

NCH: this is not a use case where you treat a database as a file.

Some discussion as to whether one can push data to a data service which
then puts it into the database. Not a model that is currently supported
by DAIS.

For the "complex insert" use case.

PK: The pull model is simpler for the service provider but from the
    client side it is more difficult as the client does not have a
    service to serve the data.

    For medical imaging they pull an image from the database, they do
    some processing and then want to push the image back to the
    database.

BA: if they pulled the data out of the database they should be able to
    push the data back into the database.

...

Moving on to the "Simple copy" use case.

BA: why would you do this using ByteIO?

NCH: 'cause I needed a use case.

PK: if you do copying around then you need to do it at the transfer
    level - you need to manage the storage and manage the network.

NCH: there are implementation layers that are not explicitly shown
     here.

MM: ByteIO telcons are every second Tuesday at 11am Eastern time, 4pm
    UK time.

...

+---

DB puts a copy of the agenda for the remainder of the meeting up.
Some agenda bashing ensued and a slightly modified agenda was 
agreed for the rest of the meeting.

DB gives a quick synopsis as to what went on at the OGSA face to face
earlier on in the week and the data issues that were presented to the
group (all in the slides).

Presentation:
http://forge.gridforum.org/projects/ogsa-d-wg/document/OGSA-D_status_and_introduction/en/1

A brief summary of what has been put in each of the sections by the
corresponding authors.

+---

MA gives a brief overview of what he has put in relation to data
access in the OGSA Data Architecture straw man document.

Presentation:
http://forge.gridforum.org/projects/ogsa-d-wg/document/Access_section_status/en/1

A discussion follows on what the scope is.

AG: client should be able to manipulate the basic service.  May also
    want clients that access higher level services, e.g. NFS and
    CIF. We have to be clear of as to what layer of the stack we are
    talking about. Some folks will want transparency while other
    people will want the performance level that talks to "bare"
    services.

PK: can use this to structure the document.

DB: the intention has always been to have a document that talks about
    scenarios.


AG draws a picture on a white board:

[Powerpoint version of these diagrams subsequently drawn by DB and  made 
available at 
http://forge.gridforum.org/projects/ogsa-d-wg/document/Architecture_drawing/en/1 
]

        +-------+
        | client|
        +-------+
        | libs  |
        +-------+
            |          +----------------+
            +--------->|Access Services | (Out of scope but 
            |          |(NFS, CIFS)     |  it is a use case)
            |          +----------------+
            |
            |          +--------------------+
            +--------->|caching/replication | (in scope)
            |          +--------------------+
            |
            |          +---------+
            +--------->|transport| (in scope)
            |          +---------+   
            |
            |          +--------------------------+
            +--------->| bare services, files,etc | (in scope)
                       +--------------------------+

Consumers at the top. Producers at the bottom of the stack. A consumer
at the top can talk to any of the layers below depending on the
performance/abstraction required.

DB draws a picture of his conception of the architectural stack:

      +-----------------------+
      | Federation            |
      +-----------------------+
      | Replication/Caching   |
      +-----------------------+
      | Access (client access)|
      +-----------------------+
      | Storage               |
      +-----------------------+
      | Transfer              |
      +-----------------------+
      | Data + Description    |
      +-----------------------+

DB noted that this is more an axis of increasing capability rather than
an architecture per se.

Discussion followed regarding what federation means and whether this
should include transformation.

DB: maybe at the end of each section we should have a diagram of the
    architecture thus far.

JU: have a concern about that ordering - replication and caching can
    be viewed as the replication and caching of the individual sources
    or of virtualised objects. There is no strict layering or you have
    to assume one. In IBM have decided that replication and caching
    takes place above virtualisation.  That ordering gives you more
    flexibility than the other way round.

    Also talk about federation and transformation being infinitely
    recursive.

...

Discussion follows as to whether virtualisation is above or below the
replication and the different pros and cons that the different choices
entail.

...

+---

AC gives a summary of the data replication section.

Presentation:
http://forge.gridforum.org/projects/ogsa-d-wg/document/Replication_section_status/en/1

AG: does copying involve some or all of the data - does that include
    caches?

AC: we do include subsets.

JU: depending on the source type there are a number of ways to
    establish subsets.

AC: yes ... have not made a clear distinction of replication and
    caching.

JU: a way I can look at it is that replication is an explicitly planned
    and metadata-managed activity - someone has explicitly created
    two or more copies and established what the relationship between
    them is through metadata - whilst in caching the data movement is
    automatic and the instance is temporary.

BA: to me, it's the scoping and visibility of it. Caching is not
    visible to you.

AG: but if you make caches first class citizens then you can tie the
    two mechanisms together.

JU: so caching would have no explicit RNS name and lifetime.

    caching is an exploitation of replication that has no RNS name or
    explicit lifetime.

AG: but there will be users who can/want to see that.

JU: It seems like it will be possible to define caching as a limited
    scoping of replication.

DB: I made a distinction between replication and caching ... whether
    the primary source knows of the cache.

PK: they both use the storage resource to do the caching.

JU: want to explicitly define the subset relationship between caching
    and replica management.

    Did you add the peer-to-peer model?

AC: not as well defined, need to flesh this out.

    how to name replicas and partial replicas?

JU: RNS.

AC: the RNS spec is all hierarchical ...

AG: I personally would not do it in RNS. I would create a WS-Name that
    says that this is this file, and the implementation would know
    about the replication.

    What you have been discussing - is there an interface or is this a
    proposal?  So for consistency you could say Unix consistency which,
    when you wrote to a catalogue, would write through.

AC: want to be able to define consistency models that could be
    supported or the service states that it does not support that
    model.

AG: to fit inside OGSA the names should be RNS, the abstract names
    would be WS-Names and the low level addresses would also be
    WS-Names with different IDs.

HK: the third one should not be an address?

AG: the third name can be an address ... you could say that it is an
    EPR and not a name.

PK: RNS is scoped for humans, the replica discovery would be to get
    the replica - what are the semantics? If you make the comparison
    to a file system - one is mutable and one is not?

AG: The first abstract name would be similar to an inode. It depends
    on whether you want replication to be visible to the client.

PK draws a picture on the white board:
                                      +--    -+
                                     /        |
                                    /         |
    Human name ---> "inode"  (GUID) ----      +- Replicas
                                    \         |
                                     \        |
                                      +--    -+  
     User assigned   System assigned      sys/user assigned



AG adds 

    EPR
    <
     To:URL
     AbstractName
     Resolver:EPR
     >

     How do you resolve the GUID to a resolver?
     Can use an AbstractName, which needs to be resolved and which
     then allows you to pick the right replica.
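[A loose sketch of such an EPR in WS-Addressing terms, following the
structure AG drew; the element names for the abstract name and the resolver
are assumptions, since WS-Naming was still being drafted at the time.]

    <wsa:EndpointReference>
      <wsa:Address>http://example.org/byteio/some-file</wsa:Address>
      <!-- globally unique, location-independent name (the "GUID") -->
      <naming:AbstractName>urn:guid:example-0000</naming:AbstractName>
      <!-- EPR of a resolver that maps the abstract name to a concrete
           replica at access time (late binding) -->
      <naming:Resolver>
        <wsa:Address>http://example.org/replica-resolver</wsa:Address>
      </naming:Resolver>
    </wsa:EndpointReference>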

JU: the back end could be WS-Names. WS-Names are EPRs.
    If you want to make the EPRs visible then you map these 
    to human names.

AG: we did it both ways in Legion. Could make both visible to the
    client, which is easier to implement, or make only one version
    visible through a GUID-like thing and make that late binding,
    which is harder to implement.

...

AG: there is a versioning model so that the replicas can check for
    consistency in a lazy way ...

AC: we did not specify it to that level.

AG: used a consistency window in Avaki and that has worked well.

AC: you could do it both ways.

PK: you mentioned that replicas are created via the replica factory
    and that you specify the consistency at the creation point - does
    the factory then manage that?

JU: it should be unimportant who manages it.

PK: as the client I might care.

JU: but you talk to the EPR so you don't care who you talk to. The URL
    may not point to the factory - it could talk to another service.

PK: if I create a new replica where do I go?

JU: the factory could have talked to another service managing the
    replication and got it to create the replica (as an EPR) and sent
    it back to you or the factory could be doing the actual creation
    but this is irrelevant to you - you do not need to know.

...

AG: could have a replica manager in the middle or it could all be
    separate.

JU: if you view the factory as a management interface you could have
    an EPR with only the management interface - you could then use the
    human readable name to actually access the replica.

DB: (to Ann) how do you see this chapter moving to describe specific
    interfaces?

AC: Don't have an answer to that. Allen may have a better view on
    that.

** Lunch

+---

DB introduces the next session on Data Transfer. No slides for this
part of the agenda.

There may be a link in to infod on this. There are a number of issues

   - protocol selection
   - descriptions of source/sink interfaces
   - transfer controlled by a separate service in every case or only
     a subset of cases?
   - transferring bytes with no interpretation

BA: you get a strong argument from me - for data transfer you just
    transfer bytes, period.

...

DB: when setting up a pipeline you may do some negotiation.

BA: that is a higher level thing. It ultimately all translates to
    using a socket.

DB: but the architecture has to translate to the picture that AG
    drew.

BA: there needs to be a distinction on what you are building on.

DB: currently have a byte level view and a structural level view ...
 
    Tried to separate the operation level parameters - need to see
    what is specified in the source interface and what specifies the
    transfer.

BA: at what level are you talking about?

DB: all levels.

BA: so the interface looks the same at all levels?

DB: no ...

BA: when you do this negotiation there should be the inclusion of a
    schema that you use. ... Malcolm and I have talked about
    negotiating the protocol based on locale and if you are in the
    same container then all you need to do is send the pointer to your
    buffer.

DB: so how does this separate from scheduling?

BA: would hope that a lot of these things are hidden under the covers
    as you negotiate with each other.

....

AG: if you pass a pointer then you would have to pass a copy ...  it
    breaks the web service model.

BA: Malcolm would not be happy with that answer ...

AG: but if you are going to use Web Services ...

BA: maybe not a good choice ... the pointer may not work but the
    content exchange model based on locale would be useful.

AG: I would agree but I would think that it is out of scope.

DB: you could negotiate .... if memory-to-memory copy is not part of
    the allowed protocol ....

BA: the hard part of this is how to figure out locale ...

DB: but locale has already been identified as an interesting property
    to have.

    The other issue is how to link this to infod. Infod uses a
    publish/subscribe model. At GGF13 this was pushed to see if it
    fits data transfer. Looked at a use case...

SD: can't you just use WS-Addressing to do this?

DB: I don't know if other people have used WS-Addressing for this.

PK: in the storage resource manager there is a copy operation that
    does data transfer. It internally manages the data transfer.

    Discussing how these additional things are taken into account
    in EGEE.

...

DB: you come from the large data/large data movement side ... you
    mentioned the linking of that into scheduling. OGSA has been
    asking us what information from data should go into scheduling and
    this could be something.

PK: in EGEE want to associate as much as possible in the computation
    and data areas so that you do not have to do the same thing twice.

AG: we have the same in BES - we call these things activities; they
    could be a computation or a data resource. Try to get as much info
    on these to make management decisions.

PK: eventually you map into whatever transfer service you have access
    to ... then you link into a scheduler.

AG: could also link to a transformation....

PK: these transformations are coupled into the data transfer ....

DB: encryption and compression are explicitly mentioned in the document.

...

PK: you can view any computation that has input and output data as a
    transformation. The Griffin project does this - no difference
    between this and encryption.

AG: that was where it was in Avaki...

...

Discussion as to whether the role of an operation is expressed in the
operation name or if it is an attribute.

Need to check whether infod has looked at scheduling the transfer of
large amounts of data.

...

DB: Is it the case that you first need a mechanism to negotiate and define
    the transfer, and then a separate mechanism to actually perform the
    transfer (possibly after some scheduling phase)?
    
...

Talk about the different levels of encapsulation.

...

PK: needed something that can manage the data transfers and book the
    bandwidth.

    (to BA) what is a suitable interface for data transfer?

BA: there is no obvious transfer interface out there... that's why we
    used the XIO interface to GridFTP ... synchronously ..

[ Note from DB: this misses part of the discussion.  PK and BA agreed
  that an API for async transfer was needed.  They said that there
  wasn't an obvious API for synchronous transfer.  ]

DB: Do we need an interface for synchronous transfer, provided we have
    one for async?

AG: In Legion if you had read from one object to write to another
    it would go directly between these objects, no intermediary.

DB: so is there a need for one?

...

PK: submit, cancel, suspend, ... have analogies to data.

AG: a limited JSDL job with no execution.

BA: I do not follow JSDL, but the XML for the GRAM job contains the
    whole RFT description for a job transfer - exactly what would
    happen with no execution.

AG: I don't remember the richness of the JSDL data transfer
    capabilities....

BA: would it be possible to put the profile in the job description
    schema? Could we say that as part of the job you have to describe
    the staging? Can you say the mechanism that is going to be used
    for movement? So if you use a given movement you include a schema
    ...

SMcG: to deal with the simple case you can use a URI. Thought people
      would want more complex types so we put in an xsd:any to allow
      people to put in the information they want.

      Unfortunately most tooling does not check what you put in
      against any schema on submission.
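[For illustration, a hedged sketch of the simple URI case plus richer
information dropped into the xsd:any slot; the ext:TransferHints element and
its namespace are invented for this example, and the JSDL element names
reflect the draft of the time.]

    <jsdl:DataStaging>
      <jsdl:FileName>input.dat</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:Source>
        <jsdl:URI>gsiftp://example.org/data/input.dat</jsdl:URI>
        <!-- extra transfer description carried in the extension slot;
             most tooling will not validate this content -->
        <ext:TransferHints xmlns:ext="http://example.org/transfer-hints">
          <ext:ParallelStreams>4</ext:ParallelStreams>
        </ext:TransferHints>
      </jsdl:Source>
    </jsdl:DataStaging>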

PK: so it might be worthwhile to go through JSDL and possibly see if
    it needs to be extended.

AG: this is one of the things that the BES is looking at to see if
    it can be used and made friendlier. ...

    Now that data transport - as in staging - is back in scope ...

JU: the current BES container model is a 3 phase execution: stage,
    execute, de-stage, and the stage/de-stage specs in JSDL identify any
    sources and/or sinks.
    
AG: you could just send in jobs that just have the stage in or just an
    execute or just the stage out so you could use that for data
    transfer.

JU: Don't want to allow BES to get as complicated as a workflow
    job.  If you want more complicated interactions then you have to
    architect something else or use something like BPEL.

BA: but if you want to treat transport as a job itself then, as long
    as JSDL is rich enough, then you can pass this to GRAM and invoke
    RFT using JSDL.

JU: that is all useful but the problem is that if you want to schedule
    a transfer but want to guarantee that that takes place before or
    after some event then you can't do this in JSDL .... you can use a
    choreography to achieve this or through a third party service but
    I do not want to make JSDL more complicated ...

....
Discussion as to whether BES can be used to describe the data transfer.
....

AG: there is a need for this.

MM: surprised that the GridFTP parameters cannot be put in the URI?

...

AG: maybe can use the execution part to do what you are looking for.

BA: if you know this is going to be there as a capability you can 
    either decide to break this into three stages and do something
    else to do the managing but as you have decided to allow me to
    move data then you need to make it rich enough to allow me to
    do that.

AG: you need to talk to the JSDL people.

...

Need to make data transfer a first class citizen.

...

** Break

+---

PK talks about the data storage management section that he wrote.
He projects a version of his document so no slides.

...

In GSM define three different types of spaces (which have semantics):

   - volatile or temporary space - like /tmp.
   - permanent space - you expect your data to be kept so requires 
     higher QoS (have backups).
   - durable space - something in between. This would go into caching.
     It cannot be removed for an agreed length of time.

AG: what about failure?

PK: for volatile and temporary space, if it's gone, it's gone. For
    durable you try and recover it.

BA: hate this concept.

AC: why not just use lifetime?

PK: the concept is there in the interface. To the user this may be
    more useful. It is probably cleaner to use lifetime management and
    give them a name. It helps - for instance in temporary space if
    you give it a file to keep and ask it to be there for a day you
    then get 0 guaranteed time back (i.e. the file could be removed at
    any point).

    You have 2 lifetimes - the guaranteed lifetime and the requested
    lifetime. 

    Durable would allow pinning on temporary space where you get a
    guaranteed lifetime.
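[A minimal sketch of the two-lifetime idea; the element names are invented
for illustration and are not SRM/GSM syntax.]

    <!-- client: please keep this file for a day -->
    <store:PutRequest>
      <store:RequestedLifetime>P1D</store:RequestedLifetime>
    </store:PutRequest>

    <!-- temporary space replies with zero guaranteed time,
         i.e. the file may be removed at any point -->
    <store:PutResponse>
      <store:GuaranteedLifetime>PT0S</store:GuaranteedLifetime>
    </store:PutResponse>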

AG: lifetimes may not be appropriate - if the machine goes down for a
    week then it might all get wiped when the system comes back up.

PK: durable is only a temporary storage until it moves to another
    space.

AC: in durable it sends you an email if the file is to be deleted?

...

BA: for the cached/temporary space the user should understand - don't
    put it there if it's important to you. The permanent storage then
    asks for a different policy. It's all just QoS and lifetime
    management. It then becomes reasonably straightforward.

PK: personally I agree with you but other people in the community feel
    that the semantics are important. I think that this is a mismatch
    and they should explain to the client what these things mean.

AG: why not make it so that it calls it permanent storage and just
    treat it as temporary space...

BA: they don't trust disk space - permanent space is tapes for them
    ...

AC: do we want to put that concept of permanent space in this
    document?

PK: Have the same issue in SRM.

...

BA: if you ask for 1Tb of permanent space but you only use 100Mb then
    you could allow others to use the rest as temporary space ...
    this allows you to do that. If another user then requires that
    space then the files stored in there just get wiped.

PK: do not have lifetime in the document ... should probably add it.
    Also could add advanced allocation.

AG: don't think that you can regard permanent as boolean. You really
    have a range of options which have different QoS. Classic ones are
    availability, probability of failure, lifetime, ....

...

BA: should have a designation as to whether it is volatile or not -
    whether I can blow it away without your permission or not.

...

PK: average latency, cost/Gb ... how do you represent that?

AG: saying something is durable depends on your fault model. Does
    durability mean survivability or something else?

PK: might be better to have a flag to see whether it is temporary or
    not?

BA: yes, volatile yes or no.

PK: whether there is a back-up?

AG: what is the latency for it going from disk to back-up?

AC: prefer this notation rather than what you have there. Whether it
    stays in SRM is another issue.
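[One hedged way such space properties might be represented; the names and
values are purely illustrative.]

    <store:SpaceProperties>
      <!-- may the provider remove data without the owner's permission? -->
      <store:Volatile>false</store:Volatile>
      <store:BackedUp>true</store:BackedUp>
      <!-- latency for data to reach the back-up medium -->
      <store:BackupLatency>PT4H</store:BackupLatency>
      <store:AverageAccessLatency>PT0.02S</store:AverageAccessLatency>
      <store:CostPerGb currency="USD">0.50</store:CostPerGb>
    </store:SpaceProperties>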

...

PK: advance reservation in the GSM - people thought it should not be
    done. Should it be done in this group? Lots of people do not
    believe in this ... as long as there is no payment it is difficult
    to enforce.

HK: do you have a use case?

PK: all use cases can be re-phrased in other ways.

AC: it would be a nice capability.

AG: you can do that by calling a system administrator .... it should
    be a capability.  You would make an agreement about storage.

PK: you would use WS-Agreement?

AG: you could.

PK: availability - also QoS: does it support quotas? lifetime
    management?  transfer? Trying to convince people in GSM that
    things should not only be done in the storage system. In this
    document the storage could give hints to another transfer service
    to schedule the transfers. Let it know that the files you want to
    transfer out will be on disk ... the storage systems have their
    own staging pools and they have to optimise that. If you have tape
    you need to do all this. Conceptually the capability it should
    have is to expose these hints to the transfer layer.

HK: I think these are important characteristics but these do not quite
    match the title ...

PK: produced this document very quickly.

HK: you also need input data.

DB: maybe we should put this in a different section that states the
    information that is relevant to scheduling.

PK: there is also caching.

    It would be nice to have a concept map.

AC: the job scheduler confuses me a little.

PK: BES

AG: something else other than BES will be doing the scheduling.

...

PK: if you have the ability to do reservation then a job scheduling
    manager would just call these capabilities - have a use case where
    the storage metadata can be used by a scheduler to determine
    whether it can do the job.

AC: the kind of thing that a workflow manager can use.

PK: storage reservation ... how do we expose this in terms of a
    capability of the storage?


AG: you can unify these - go to the storage management factory and
    agree on how much space you will have and for how long. The lease
    of space allows you to extend the reservation, etc; you can relate
    all of this to agreement.
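[A very rough sketch of a space reservation phrased as an agreement offer;
WS-Agreement was still a draft at the time and the term content below is
invented for illustration.]

    <wsag:AgreementOffer>
      <wsag:Name>space-reservation-example</wsag:Name>
      <wsag:Terms>
        <wsag:All>
          <wsag:ServiceDescriptionTerm wsag:Name="space">
            <!-- how much space, and for how long -->
            <store:ReservedSpace>500GB</store:ReservedSpace>
            <store:Duration>P30D</store:Duration>
          </wsag:ServiceDescriptionTerm>
        </wsag:All>
      </wsag:Terms>
    </wsag:AgreementOffer>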

PK: up to now, what has been said has tried to support the semantics
    of mass storage systems but could be supported by any object
    store.

    Have tried to describe the differences between a file store and a
    database store.

AG: a lot of DBs do their own storage management. Trying to impose
    yourself between a database and its storage management would be a
    challenge.

JU: trying to use a file system for a DBMS would be a problem but
    trying to impose yourself between the DBMS and the disk would
    not. How this is done varies from database to database.

BA: it is still writing to disk store.

PK: the pattern I'm talking about is that the client would go to the
    database and reserve some space.

JU: it depends how you do this. DBMSs deal with a virtual disk.
    In lots of NAS and SANs you can request a virtual disk of
    any size (a block server with a disk). This only makes sense
    for a small number of databases.

PK: this simplifies my life as I can then restrict this section to
    files. How would I specify it now?

AG: trying to do this will complicate your life.

PK: so it might be worth throwing out.

DB: worth talking to a couple of other folks.

JU: SANs deal with a virtualized LUN...

AG: this gets you into virtualized SANs....

JU: for grids, if there is little penalty for creating a virtualized
    disk why not use that as opposed to being tied down to file space.

AG: You're allocated a block of storage and a handle to it ..

JU: do you want to write a handle to that or type it? If it is in a
    virtual LUN which only understands block I/O.

PK: it does not matter in that respect as you get the SURL and
    translate that to a file handle. If it is a virtual disk then you
    get ...

JU: so it is typed.

PK: but you can get back a different type.

...

PK: discussion about the type of data that you are retrieving
    through the storage - is it a file? My answer is that it is
    specific to the protocol.  It could be a stream. It does not matter as long as
    you can talk to it. If you just want to access it through POSIX
    I/O then that is fine.

...

JU: this is talking about access to storage .... you can get a chunk
    of storage and, getting a handle, you can access it through a protocol.
    You can also get access to management ops that affect that
    storage.

PK: this is about managing storage as a resource -
    allocating/deallocating...

JU: so there is an I/O performance that is associated with the size.

PK: one of the parameters may be the latency and bandwidth.

JU: if you have an application that is accessing a detector - the
    nature of that application may be that it needs 200 Mb/s to write
    the data from the detector but once it is written the other
    applications may only require 20 Mb/s to access it so saying that
    you slow it down is fine.

AG: that's a QoS issue

JU: so all I'm saying is not to restrict yourself to
    allocation/deallocation... the metadata associated with the
    storage are broader.

AG: that is something that could go into the agreement....

...

HK: using POSIX examples - you have some storage, the user appends to
    it.  The space allocation happened when they accessed the storage
    ....

PK: if you have an app that needs to extend its quota you would have
    to do that before you started writing to the system.

HK: checking when the allocation takes place.

...

DB: might be worth moving to a wrap up stage and discussing how to
    proceed with the document.

...

Discussion on how to proceed. MA suggests fleshing out AG's stack and
possibly using the concept maps. Can also do the individual sections.


SD: can talk about overlapping sections...

DB: maybe bubble up things from the different sections. Add a new
    section or push them up to OGSA.

...

AG: I could try to draw a picture.

DB: that would be good.

...

DB: aim to submit the present document as is.

SD: is it worth putting the use cases as an appendix?

MA: can see if you can implement the use cases based on the technology
    described in the document.

AG: also have the BES use cases.... don't think they have been written
    as yet.

...

DB: have volunteers to write the storage management section [PK],
    write use cases [SD], do a picture of the architecture [AG],
    revise the data access section [SM], go through the data transfer
    section [BA].

AG: (to PK) would like to synchronize the work that you guys are doing
    with stuff coming out of RNS, BES and stuff ...

    Get people to write down the security requirements that are
    relevant to data only.


+---

+-----------------------------------------------------------------------+
|Mario Antonioletti:EPCC,JCMB,The King's Buildings,Edinburgh EH9 3JZ.   |
|Tel:0131 650 5141|mario at epcc.ed.ac.uk|http://www.epcc.ed.ac.uk/~mario/ |
+-----------------------------------------------------------------------+






