[RUS-WG] [UR-WG] OGF20 Session recommendations

Fri May 4 03:11:56 CDT 2007

Hi,

Xiaoyu Chen wrote:
> Hello, Gilbert and Everyone,
>  
>  
>           There are some comments and recommendations to OGF 20 sessions
>  
>           For Resource Usage Service OGF20 Session:
>           1). Agenda 
>               It would be worth having a short discussion on fault handling and access control or configuration as well.

I don't know if I would use our precious time for that, since these 
issues are implementation-dependent and not directly related to the 
interface (your implementation can handle access control however it 
wants, the same for configuration, etc.). We should focus here only on 
the interface to the service, not how the service is implemented.
That doesn't mean that it isn't interesting how different tools or developers 
handle these things, but that is just a matter of exchange of 
experiences, not the definition of the interface. I would say we can 
talk about that if we have some "spare time" left at the end of the 
session, but I doubt there will be.

>           2). XPath Expression
>                XPath is a good candidate for addressing parts of structured documents such as XML. In the context of RUS, XPath statements are mainly used for query operations
>                (e.g. RUS:extractUsageRecords). In both specifications (current and proposal), extraction of usage records returns only whole URs, which has both pros and cons.
>                cons: A large document set is returned to the client (network-intensive, with a risk of session timeouts for the service as well as low performance).
>                pros: easy to implement.
>  
>                The two suggestions proposed in the slide basically contradict each other, but they are still good to put forward for discussion.

Yes, they are. We should discuss and decide what is the better approach. 
Hopefully we can come to a conclusion at this OGF.

>  
>            3). Core RUS Specification 
>                Batch processing is preferable in most applications, but performance is a big challenge. Clients would like flexibility in processing as well as responsiveness.
>            
>            4). Batch Query
>                 Most applications have moved from XML databases to relational DBs for UR storage because of performance.

Actually nearly all implementations I know of so far use XML databases, 
not relational DBs.

>                 However, the RUS service interface definition is not relational-DB friendly.

It does not need to be, it needs to be only _XML_ friendly (not even 
_XML database_ friendly, only XML friendly, since we are handling XML 
documents). How URs are stored by the underlying application is not an 
issue of the interface.

>                 In this sense, every simple query on a relational UR storage has to explicitly transform the returned results into XML URs to form a well-formed query result.

Yes, but that is not a big problem. The real problem when using a 
relational database is that the XPath query first has to be translated 
into an SQL statement. That's a major research issue and as far as I know 
there is no general recipe for that.
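
A minimal sketch of the easy direction (relational row to XML UR), 
assuming a hypothetical table layout and a simplified, namespace-free UR 
with only a few elements. The table and column names are made up for 
illustration and are not part of any RUS or UR specification. The hard 
direction, translating an arbitrary XPath query into SQL, is deliberately 
not attempted here.

```python
import sqlite3
import xml.etree.ElementTree as ET


def row_to_usage_record(row):
    """Build a simplified UsageRecord element from one relational row."""
    ur = ET.Element("UsageRecord")
    ET.SubElement(ur, "RecordIdentity", {"recordId": row["record_id"]})
    user = ET.SubElement(ur, "UserIdentity")
    ET.SubElement(user, "LocalUserId").text = row["local_user"]
    ET.SubElement(ur, "WallDuration").text = row["wall_duration"]
    return ur


# Hypothetical relational storage (in-memory, for illustration only).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access to columns by name
conn.execute(
    "CREATE TABLE usage (record_id TEXT, local_user TEXT, wall_duration TEXT)"
)
conn.execute("INSERT INTO usage VALUES ('ur-1', 'alice', 'PT1H')")

# The SQL is written by hand here; deriving it from XPath is the open issue.
rows = conn.execute(
    "SELECT * FROM usage WHERE local_user = ?", ("alice",)
).fetchall()
records = [row_to_usage_record(r) for r in rows]
xml_text = ET.tostring(records[0], encoding="unicode")
```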

>                 I don't know how WS-Enumeration reduces the number of 
>                 returned usage records. But adding a parameter for the maximum request size is not a good idea: if the maximum number of returned usage records is set to 10, for example, how are the following usage records returned 
>                 to the client? Each time the user queries for URs, he or she always gets the first 10 usage records. How to put an anchor there? Besides, usage repositories are continuously updated, so setting an 
>                 anchor for a query seems impossible. My option here is to let the RUS query operation return either all matched URs or part of the matched URs to the client, even a huge amount of data, and leave it to 
>                 the RUS client to restrict the number of usage records returned.

The restriction on how many records can be returned does not depend only 
on the client: if the client doesn't care, the server will get into 
trouble. So a specified maximum number is basically ok, but it would 
require an additional method that allows the client to find out that 
limit (something analogous to retrieving the list of mandatory 
elements). Also, the client often cannot know how many URs will be 
selected by its query (if I ask for URs for the last month for a 
specific user I can get anything from 0 to a million ...).
We might think about something like a "TooManyURsSelectedFault" that can 
be returned by the server.
But you are right that limiting the number of records is not a really 
good approach, and we should definitely make sure the client can easily 
get everything it needs even if the number of records exceeds the 
limit (by multiple queries, by a response that is divided into multiple 
parts, by something inspired by TCP/IP that allows the client to 
get multiple pieces and put them together in the right order, or 
whatever ...).
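
A rough sketch of the "multiple pieces" idea, assuming each part of a 
response carries a sequence number and the total number of parts so the 
client can reorder and check completeness. The Chunk type and the 
function names are hypothetical. Only the fault name comes from the 
suggestion above, and none of this is a proposed RUS interface.

```python
from dataclasses import dataclass


class TooManyURsSelectedFault(Exception):
    """Raised when a query selects more URs than the server will return."""


@dataclass
class Chunk:
    seq: int        # position of this part in the full response
    total: int      # total number of parts
    records: list   # the URs carried by this part


def check_limit(records, hard_limit):
    """Server side: refuse queries that select too many records."""
    if len(records) > hard_limit:
        raise TooManyURsSelectedFault(
            f"{len(records)} records selected, limit is {hard_limit}"
        )
    return records


def split_into_chunks(records, chunk_size):
    """Server side: divide a result into numbered parts."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = max(1, -(-len(records) // chunk_size))  # ceiling division
    return [
        Chunk(i, total, records[i * chunk_size:(i + 1) * chunk_size])
        for i in range(total)
    ]


def reassemble(chunks):
    """Client side: put the parts back together in the right order."""
    ordered = sorted(chunks, key=lambda c: c.seq)
    if not ordered or [c.seq for c in ordered] != list(range(ordered[0].total)):
        raise ValueError("missing or duplicate chunks")
    return [record for chunk in ordered for record in chunk.records]
```

Note that this sketch does not address the repository changing between 
requests. A real design would need some snapshot or cursor semantics for 
that.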

>    
>             5). Audit Information
>                 I don't know what "record undo information" really means here. Do you mean usage records that failed to be processed? For example, an extraction operation may match 10 usage records but return only 5 to the user because the user is not authorised to see the other five. Are the five records that were not returned then called "undo" URs?

I guess it refers to modifications of the records, but I don't think an 
explicit undo method is necessary: an undo is nothing other than a 
reverse modification that restores the previous record. A client that 
wants to do so can formulate the correct XUpdate query (or whatever 
we're going to use). But I would avoid forcing implementations to think 
about a built-in undo function.
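
To illustrate "undo as a reverse modification", here is a small sketch 
in which the client remembers the old value when it modifies a record 
and later applies that as an ordinary modification. It uses plain 
ElementTree updates as a stand-in for XUpdate, and the modify() helper 
and the sample record are hypothetical.

```python
import xml.etree.ElementTree as ET


def modify(record, path, new_text):
    """Set the text of one element and return the reverse modification."""
    target = record.find(path)
    if target is None:
        raise KeyError(path)
    reverse = (path, target.text)  # enough to restore the previous value
    target.text = new_text
    return reverse


record = ET.fromstring("<UsageRecord><Charge>10</Charge></UsageRecord>")

# A normal modification; the client keeps the reverse operation.
undo = modify(record, "Charge", "12")
after_change = record.find("Charge").text

# The "undo" is just another modification, no special method needed.
modify(record, *undo)
restored = record.find("Charge").text
```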

>  
>            6). How to get audit information
>                Again, the RUSUsageRecord schema should not be used in RUS. It undermines the UR standardization. It has been removed in RUS proposal version 1.9. Auditing information can be stored separately and reported to the target manager only on request. ExtractRecordHistory or ExtractAuditInformation can be considered to support auditing.

The RUSUsageRecord doesn't undermine the UR at all. A UR is wrapped 
anyway by the XML that is the response from the server. Where is the 
difference between:

<...Response>
   <UsageRecord>
    ...
   </UsageRecord>
</...Response>

and

<...Response>
   <RUSUsageRecord>
     <UsageRecord>
     ...
     </UsageRecord>
   </RUSUsageRecord>
</...Response>

?
In both cases you have to extract the UR from a wrapping XML document. 
It doesn't make much difference whether you extract it from 
"...Response" or from "...Response|RUSUsageRecord" ... The UR standard 
is not affected by that.
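
A short sketch of that point: the same client code can pull the UR out 
of either wrapping. "ExtractResponse" is only a placeholder name for the 
"...Response" element above, and namespaces are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Placeholder responses; the element name "ExtractResponse" is hypothetical.
PLAIN = (
    "<ExtractResponse>"
    "<UsageRecord><Charge>10</Charge></UsageRecord>"
    "</ExtractResponse>"
)
WRAPPED = (
    "<ExtractResponse><RUSUsageRecord>"
    "<UsageRecord><Charge>10</Charge></UsageRecord>"
    "</RUSUsageRecord></ExtractResponse>"
)


def extract_usage_records(response_xml):
    """Return all UsageRecord elements, at whatever depth they are wrapped."""
    root = ET.fromstring(response_xml)
    return list(root.iter("UsageRecord"))


plain_urs = extract_usage_records(PLAIN)
wrapped_urs = extract_usage_records(WRAPPED)
```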

But I agree that the RUSUsageRecord isn't necessary (although it doesn't 
do any harm), so I support your proposal to remove it and give audit 
information only on request. But we still have to take a decision on 
that, so for now it is a (good) proposal. Maybe we can discuss this at 
the OGF.

>  
>            7). Mandatory Elements
>                Mandatory Elements is another issue coming from the UR itself. Should mandatory elements also cover usage record extensions? But the use of extensions is deprecated as it may undermine UR standardization. On the other hand, the UR does not seem powerful enough to accommodate grid usage representation, like VO information. There are no implementations available that use the UR without extensions. I recommend bringing this discussion to the UR joint session as well.

Mandatory elements are UR properties, but the concept of mandatory 
elements doesn't come from the UR, since in the UR nearly nothing is 
mandatory (which is good, because URs must be applicable to many 
different environments), so we need the notion of Mandatory Elements for 
the RUS.
What we need to discuss with the UR people is the insertion of further 
standard elements (e.g. for VO name). But whether we should extend our 
concept of mandatory elements to extensions doesn't really concern the UR 
specification. But you are right that we need to discuss that.
There are advantages and drawbacks to allowing UR "Resource" extensions 
to be declared mandatory for the RUS:
Advantage: more flexibility for RUS implementations that may possibly 
require some data that the UR doesn't foresee as standard properties 
(like the VO, most of us need that).
Disadvantage: it will easily lead to incompatibility and lack of 
interoperability (if my implementation requires a <Resource 
description="VO"> and yours wants <Resource description="VOName"> then 
we are already in trouble).
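
The interoperability problem can be sketched as follows, assuming a 
simplified, namespace-free UR and a hypothetical missing_mandatory() 
check. The same record is accepted by one implementation and rejected by 
the other only because the extension is named differently. The VO value 
is made up.

```python
import xml.etree.ElementTree as ET


def missing_mandatory(record, mandatory_elements, mandatory_resources):
    """List the mandatory elements and Resource extensions a record lacks."""
    missing = [
        name for name in mandatory_elements if record.find(name) is None
    ]
    present = {r.get("description") for r in record.findall("Resource")}
    missing += [
        f'Resource[description="{d}"]'
        for d in mandatory_resources
        if d not in present
    ]
    return missing


record = ET.fromstring(
    "<UsageRecord>"
    "<Status>completed</Status>"
    '<Resource description="VO">examplevo</Resource>'
    "</UsageRecord>"
)

# "My" implementation requires Resource description="VO" ...
mine = missing_mandatory(record, ["Status"], ["VO"])
# ... "yours" requires Resource description="VOName".
yours = missing_mandatory(record, ["Status"], ["VOName"])
```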

>  
>            8). Advanced RUS 
>             
>                As in the RUS roadmap, aggregation and data replication are two possible advanced features for RUS. Aggregation is obviously important and should be put into the advanced features. For data replication, we need to establish whether the advanced RUS specification consists of Service Interface Definition recommendations only, or also covers possible advanced features for RUS interfaces and low-level mechanisms. For example, almost all relational database engines provide data replication or synchronization facilities. Data replication can be simple or complex in various respects: real-time or after events, full or partial replication, etc. From the RUS Service Interface Definition (SID) perspective, data replication can only be invoked on demand (by invoking service interface operations). This results in complex configuration tasks for the replica source and target and for the replication mechanism, and, even more complicated, in the question of how to replicate URs between two different storages (relational DB and XML DB).

I really think that only issues regarding the _interface_ should be 
discussed, not implementation issues. Aggregation is definitely an issue 
of the interface and we need to discuss it, above all also with the 
UR-WG (in order to coordinate our efforts for an aggregated version of 
the UR). But data replication is an implementation matter and doesn't 
affect the interface to the service. We really should avoid defining 
implementation issues since we want as many potential developers as 
possible to use our standard with whatever means they prefer, right?

>  
>              Back to aggregation, we have an aggregate usage record schema available in the RUS working group. However, how do we connect it to job usage records? There are certain gaps between them, for example the VO properties, storage properties, etc. We definitely need to have a discussion with the UR-WG and look at possible solutions.

Careful: the AUR has been proposed to the OGF UR-WG but it has never 
been discussed so far, so it is far from being approved. And as RUS-WG 
we should avoid adopting something on our own and then possibly ending up 
with something that is different from what will be approved by the 
UR-WG. We need to collaborate on that with the UR-WG.

Cheers,

Rosario.

>  
>          PS: Please check the attached complementary slides.
>  
>  
>         Cheers!  
>  
>          X. Chen