[dmis-bof] Vote for submission of Charter

Arie Shoshani shoshani at lbl.gov
Tue Mar 21 17:35:16 CST 2006


Michel, Bill,

Thanks for the clarifications.  That is very helpful.  I think the 
Charter as you have it now articulates the intention sufficiently well.

Now, for the sake of discussion I'd like to pursue the question of 
storage and data movement a bit further.

You say:
"When requesting a data movement, I have certain expectations that I 
would like to be met. It is simply not always true that a service can 
meet these expectations; for example if I request a data transfer to be 
carried out using GridFTP but the service supports only MTOM, then the 
request would fail."

Now, suppose that the expectation is that the data will be moved to some 
storage system and that the entire volume is 1 TB.  Which component will 
take care of making sure that the user is allowed to move a TB, and that 
there is enough space allocated?  I don't think the DMIS should do that. 

It seems from what you suggested below (the "hierarchical model"), that 
in this case the data movement call could be made to an SRM (which will 
allocate the space) and it will call the DMIS for the actual transfer.  
Is this how you envision this working?
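
As a rough sketch of that call order (all class and method names below are
hypothetical and are not taken from any SRM or DMIS interface): the SRM
checks and allocates space, then hands the transfer to the DMIS, which only
moves files and rejects a protocol it does not support.

```python
from dataclasses import dataclass, field


class UnsupportedProtocol(Exception):
    """Raised when the DMIS cannot use the requested transport protocol."""


@dataclass
class Dmis:
    """Hypothetical data movement service: it only moves files."""
    supported_protocols: set = field(default_factory=lambda: {"MTOM"})

    def move(self, files, protocol):
        # files maps a file name to its size in bytes.
        if protocol not in self.supported_protocols:
            raise UnsupportedProtocol(protocol)
        # Stand-in for the real transfer: report total bytes moved.
        return sum(files.values())


@dataclass
class Srm:
    """Hypothetical storage manager: it owns space allocation."""
    free_bytes: int
    dmis: Dmis

    def copy(self, files, protocol):
        needed = sum(files.values())
        if needed > self.free_bytes:
            raise RuntimeError("not enough space to allocate")
        self.free_bytes -= needed              # allocate space first
        try:
            return self.dmis.move(files, protocol)  # then delegate the move
        except UnsupportedProtocol:
            self.free_bytes += needed          # release the reservation
            raise
```

In this sketch the DMIS never calls back into the SRM, so the circular
"SRM calls DMIS calls SRM" situation does not arise.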

Another expectation could be that the TB movement should happen in 24 
hours.  Would the DMIS try to estimate whether this can be achieved?
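
A back-of-the-envelope version of such an estimate, as a sketch only: it
assumes 1 TB means 10^12 bytes and a constant sustained rate, and it ignores
protocol overhead and retries. The function names are made up for
illustration. The 50 Mb/s cap echoes the "50 Mbs" example in the quoted
discussion below, read here as megabits per second, which is also an
assumption.

```python
def required_rate_bits_per_s(total_bytes, deadline_s):
    """Sustained rate needed to move total_bytes within deadline_s seconds."""
    return total_bytes * 8 / deadline_s


def can_meet_deadline(total_bytes, deadline_s, rate_cap_bits_per_s):
    """True if the needed rate fits under the given bandwidth cap."""
    return required_rate_bits_per_s(total_bytes, deadline_s) <= rate_cap_bits_per_s


ONE_TB = 10**12        # bytes, decimal terabyte (assumption)
ONE_DAY = 24 * 3600    # seconds

rate = required_rate_bits_per_s(ONE_TB, ONE_DAY)
print(round(rate / 1e6, 1))                      # 92.6 (Mbit/s)
print(can_meet_deadline(ONE_TB, ONE_DAY, 50e6))  # False
```

So a request capped at 50 Mbit/s could not move 1 TB in 24 hours, even
before any real-world overhead is counted.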

It seems that in order to provide DMIS with "the parameters that 
directly affect the data movement itself sorted out before the data 
movement will actually take place" (to quote what you said below), 
another fairly complex service would be needed to set it up.  Maybe in 
simple use cases, one can assume certain parameters, so they do not have 
to be set up.

Finally, I agree with your comment on what resources the DMIS could 
report on (resources it has consumed to fulfil a data movement request, 
e.g. bandwidth usage, CPU cycles(?), total bytes moved, etc.).  While 
this may not be easy to implement, it will be of great value to the 
component that invokes it, especially if that was an SRM.
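
One possible shape for such a report, purely as an illustration (the class
and field names are hypothetical and do not come from any DMIS draft):

```python
from dataclasses import dataclass


@dataclass
class TransferReport:
    """Hypothetical per-request usage report returned to the caller."""
    bytes_moved: int      # total bytes actually transferred
    elapsed_s: float      # wall-clock duration of the transfer
    cpu_seconds: float    # CPU time spent by the service, if measurable

    @property
    def average_bits_per_s(self):
        # Average bandwidth used over the whole request.
        return self.bytes_moved * 8 / self.elapsed_s


report = TransferReport(bytes_moved=10**9, elapsed_s=100, cpu_seconds=2.5)
print(report.average_bits_per_s)
```

A caller such as an SRM could combine this with its own storage accounting,
since the report covers only what the transfer itself consumed.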

Arie


Michel Drescher wrote:
>> William E. Allcock wrote:
>>>> 1) Regarding the sentence:
>>>> "Setting up a data movement includes the selection of a transport 
>>>> protocol, for example GridFTP, and parameters for reliability, 
>>>> timing, scheduling, resource usage, accounting, billing, etc. The 
>>>> Working Group will explore existing mechanisms to reach such 
>>>> agreement, e.g. WS-Agreement and use them where appropriate."
>>>>
>>>
>>> Perhaps we need to reword this section, but I think the intent of
>>> this part was in response to comments about providing information
>>> for the service to make prioritization decisions amongst files
>>> within a request and between requests.  So, reliability might be
>>> parameters associated with number of retries, backoff, etc.; timing
>>> could be "fail if it takes more than x hours"; scheduling could be
>>> "have it done by x time"; resource usage could be "use no more than
>>> 50 Mb/s"; accounting/billing could be "don't exceed z cost".
>>>
>> I see.  So, you expect other layers (components, services) to 
>> provide such parameters to the DMIS, right?
>
> Perhaps, perhaps not. Where the service gets its information is
> simply out of scope. However, before a data movement can take place,
> all surrounding parameters need to be clarified, because a data
> movement causes temporary changes in the service's environment:
> bandwidth usage, system load, etc. When requesting a data movement, I 
> have certain expectations that I would like to be met. It is simply 
> not always true that a service can meet these expectations; for 
> example if I request a data transfer to be carried out using GridFTP 
> but the service supports only MTOM, then the request would fail.
>
> This section expresses that the DMIS WG wants to have the parameters 
> that directly affect the data movement itself sorted out before the 
> data movement will actually take place. (Don't know if this sentence 
> makes it any clearer, but on the other hand, I don't know how to 
> express this better.)
>
> The DMIS WG will start with a relatively simple model to meet this use 
> case. If it turns out that this is a more fundamental issue, we always 
> have the option to re-charter and spin off another Working Group. In 
> fact, this strategy was discussed at the GGF16 BoF.
>
>>> The issue is that once something has decided the files need to move
>>> (like SRM) and it gives the service a request for say 100 files, the
>>> service could, in theory, move those in any order within a request
>>> (perhaps we should have an option for in order...), or may need to
>>> prioritize between requests.  So, a user might want to put
>>> priorities within a request (though I could argue against that), but
>>> they certainly might put priorities between requests of their own.
>> OK.  This makes sense to me.
>>> This service is intended to be a file movement service, along the
>>> lines of the existing RFT, FTS, etc.  We have no intention of doing
>>> SRM type functionality.  It is something SRM might call.
>>>
>> Good.  This is where I found the above sentence unclear.  It sounded 
>> to me like the DMIS will be responsible to call all the other 
>> services and coordinate various activities including allocating and 
>> reserving space.  But, as long as it is intended as a "file movement 
>> service" without expectations that it will perform space 
>> negotiations, etc. it sounds fine.
>
> Yes, "bull's eye".
>
>>> So, I would argue that we need verbiage that says we will want extra
>>> information available to make prioritization decisions within the
>>> service, and the information is of the type listed.  Do you have a
>>> suggestion for a wording change that makes it clearer?
>>>
>> I'll try a little later, by the end of the day ...
>>>
>>>> Thus, in the setup phase, all kinds of services will have to be 
>>>> contacted, such as components that manage resources, components 
>>>> that perform scheduling, keep track of accounting, etc.  For 
>>>> example, to reserve storage space, the data movement service may 
>>>> want to interact with SRMs.  I think this is a huge undertaking, 
>>>> and too large of a scope.  Even if you go through WS-Agreement, it 
>>>> will have to interact with various resource managers, accounting, 
>>>> billing, etc.  You may want to leave all the setup to a separate 
>>>> component, like WS-Agreement, and then focus on the data movement 
>>>> part.
>>>>
>>>
>>> WS-Agreement is not a component, it is just a pattern for negotiating
>>> agreements, but as I noted above, our intent is that this is just a
>>> file movement service.  The one point I might argue on this would be
>>> reporting to such services vs. negotiating with them.  For instance,
>>> if there were some standard message that could be used to report
>>> "resource consumption", it might be useful to include some kind of
>>> EPR or such which the data movement service could inform about the
>>> resources consumed during the request.  I am not sure this is a good
>>> idea, nor that it fits in the WS-I basic profile because it is a sort
>>> of notification, but it is a thought.  In the WSRF rendering, you
>>> have the alternative of exposing such info as an RP and the
>>> interested service could be directed to subscribe to that RP.
>>>
>> Again, I am confused about the term "resource consumption".  A DMIS 
>> can report on file movement, how many files were moved successfully, 
>> total size moved, how many to go, etc.  It cannot report on storage 
>> consumed, because it does not know what is done with the files after 
>> they are moved (for example they may be used quickly and removed, or 
>> archived).  Am I missing something?
>
> Yes, a DMIS can and should report on the resources it has consumed to 
> fulfil a data movement request, e.g. bandwidth usage, CPU cycles(?), 
> total bytes moved, etc. This is a different set of resources from what 
> a storage management system may report, but it is still a report on 
> which a bill/invoice may be based later on.
>
>>>> 2) A related point, in the "Seven questions: Evaluation Criteria 
>>>> (from GFD.3)" document it says:
>>>> "The most direct overlap would be with the gsm-wg, but they are 
>>>> participating in this group and will, presumably, use what is 
>>>> developed here for the transport portion of their interface."
>>>>
>>>> Yes, it makes sense for SRMs to make use of powerful transport 
>>>> services, since SRMs rely on transport services.  But, when a 
>>>> request is made to an SRM, they take care of managing storage 
>>>> space, keeping track of usage and accounting, as well as performing 
>>>> scheduling, so it does not make sense for the transport services to 
>>>> negotiate storage allocations (including lifetime), scheduling, 
>>>> etc, again, as is suggested in the "setup phase" above.
>>>>
>>>
>>> I think I addressed this above.  We have no intention of the data
>>> movement service to be doing storage allocation.  I strongly
>>> disagree with you though that it does not make sense for the service
>>> to be able to do scheduling of its own data movement tasks.
>>>
>> I do not disagree with you.  I did not mean to say that the DMIS 
>> should not be able to do its own scheduling of data movement.  What I 
>> meant is that if a request such as srmCopy (which later could be 
>> renamed) is made, and the SRMs have reserved space and made their own 
>> scheduling decisions, then this is now passed to the DMIS.  For 
>> example, if you include in the DMIS space allocation, then we get 
>> into a circular situation: call SRM which calls DMIS which calls SRM.
>
> The model would be hierarchical: SRM utilises DMIS to carry out the 
> data movement, passing on its constraints down to a DMIS (like "Hey, 
> service, I need to have the following files transferred with such and 
> such constraints, can you do that?"). If you have only one DMIS in 
> your domain, then this is usually a piece of cake, basically 
> obliterating the need for some sort of negotiating/agreement phase. But 
> consider a horde of DMISs, then you may need or want to contact more 
> than one DMIS and see which one may serve you best.
>
>
> Cheers,
> Michel




