[RUS-WG] RUS implemented on relational database

Fri Aug 15 04:31:32 CDT 2008

Shiraz wrote:

>Is GRUS publicly available for download? I am particularly interested in the transformation tool, i.e. XPath2HQL.

GRUS is still under local testing and, unfortunately, is not yet publicly available. I hope we can make it acceptable for inclusion in other open source projects, or as part of a Grid accounting system. I will let you know when it is publicly available. In any case, GRUS is a framework that bridges the gap between relational storage and standards-compatible RUS implementations. Two sub-tools have been developed as well: XPath2HQL, and an entity-model compiler that automatically generates entity models for Hibernate. HyperJaxb is a similar, though not identical, tool. We tried to use HyperJaxb, but it does not work with JAXB 2.0 and is not extensible.

Many thanks for your interest!

X. Chen

________________________________

From: ahmed.shiraz.memon at gmail.com on behalf of Shiraz Memon
Sent: Fri 8/15/2008 09:28
To: Xiaoyu Chen
Cc: rus-wg at ogf.org
Subject: Re: [RUS-WG] RUS implemented on relational database

Hi Xiaoyu,

Is GRUS publicly available for download? I am particularly interested in the transformation tool, i.e. XPath2HQL.

Thanks
Shiraz

On Thu, Aug 14, 2008 at 4:57 PM, Xiaoyu Chen <Xiaoyu.Chen at brunel.ac.uk> wrote:

	Stephen wrote:

	>I'm new to this list but I'm currently contemplating adding a RUS
	>service front end to an existing system that uses a relational
	>database back-end.
	That is what we have implemented, in a fully RUS-compatible way.

	>I'd therefore like to make some comments on what looks like a bit of a
	>long standing issue which is the status of Xpath in the specification.

	>The most recent documents in the tracker add the ability to specify
	>alternate filter dialects which I think is a significant improvement.
	>However they mandate Xpath-1.0 as a supported dialect which I'm not
	>sure is a good idea.

	>Mandating the use of Xpath-1.0 makes it very difficult
	>to fully implement the specification using a relational database rather than
	>a native XML database.

	I cannot agree more. XPath 1.0 is not fully featured, especially with respect to function calls on the xsd:dateTime data type.
	Here is the use case:

	For a user who would like to get the usage records of this month, the only possible way to write the XPath 1.0 expression is as follows:

	/urf:UsageRecord[urf:StartTime>'2008-08-01T00:00:00Z'][urf:EndTime<'2008-08-31T23:59:59Z']
	However, evaluating this XPath returns null. I presume the reason is that the XPath engine (of JDK 5) only returns the values of "urf:StartTime" and "urf:EndTime" as strings and evaluates the predicates by comparing them to the specified values.
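A plausible explanation, going by the XPath 1.0 rules for relational operators: `<`, `<=`, `>` and `>=` convert both operands to numbers, and a timestamp string is not a number, so both sides become NaN and the comparison is false for every record. The sketch below only emulates that rule in Python (it does not call a real XPath engine), and it also shows a commonly used workaround, `number(translate(urf:StartTime,'-:TZ','')) > 20080801000000`. The helper names and the sample timestamp are hypothetical.

```python
import math
import re

# XPath 1.0 number syntax: optional minus, digits with optional fraction.
_XPATH1_NUMBER = re.compile(r"^\s*-?(\d+(\.\d*)?|\.\d+)\s*$")


def xpath1_number(value):
    """Emulate XPath 1.0 number(): non-numeric strings become NaN."""
    return float(value) if _XPATH1_NUMBER.match(value) else math.nan


def xpath1_greater_than(left, right):
    """Emulate 'left > right' on two strings: both are converted to numbers."""
    return xpath1_number(left) > xpath1_number(right)


def xpath1_translate_strip(value, chars):
    """Emulate translate(value, chars, ''): delete every character in chars."""
    return value.translate({ord(c): None for c in chars})


start_time = "2008-08-15T04:31:32Z"   # hypothetical urf:StartTime value
lower_bound = "2008-08-01T00:00:00Z"

# Direct comparison: NaN > NaN, which is always false.
naive = xpath1_greater_than(start_time, lower_bound)

# Workaround: strip the separators so both sides parse as plain numbers.
workaround = xpath1_greater_than(
    xpath1_translate_strip(start_time, "-:TZ"),
    xpath1_translate_strip(lower_bound, "-:TZ"),
)

print(naive, workaround)
```

The workaround only holds if every timestamp uses the same UTC format with a trailing `Z` and no fractional seconds or time zone offsets.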

	>Actually I feel that Xpath-1.0 is not really sufficient anyway. There
	>does not seem to be any easy way of selecting records by EndTime (e.g.
	>records run in a specified month) without using features from Xpath-2.0.
	>I think this is really important as I have over a million accounting
	>records going back over 6 years in one of my databases and almost all
	>operations I perform on it select a small subset of this by date range.
	>Xpath-2.0/Xquery-1.0 have date functions and are fine in this respect.

	XPath 2.0 is indeed more fully featured, with additional function calls (for the dateTime data type, for example). However, for usage records persisted in a relational database, XPath does not do any good for RUS operations.
	In our implementations (GRUS and WLCG-RUS) we developed a lightweight XPath2HQL tool (HQL, the Hibernate Query Language, is SQL-like but object-oriented). Rather than providing a general-purpose transformation
	between XPath and HQL, the tool applies a constrained set of rules, such as not supporting XPath function calls. We also use the Hibernate API for access to heterogeneous relational databases. On top of a high-level abstraction, the Data Access Object layer, custom implementations can be developed to support XML:DB, the file system, etc.
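To illustrate what a constrained XPath-to-HQL rule set could look like, here is a minimal sketch in Python. It is not the XPath2HQL tool itself, whose rules are not described in this thread. The entity name `UsageRecord`, the property names in `PROPERTY_MAP` and the function `xpath_to_hql` are assumptions made for illustration only. The sketch accepts a single `/urf:UsageRecord` step with simple comparison predicates and rejects everything else, including function calls.

```python
import re

# Hypothetical mapping from UR element names to entity property names.
PROPERTY_MAP = {"urf:StartTime": "startTime", "urf:EndTime": "endTime"}

# One location step followed by zero or more non-nested predicates.
STEP = re.compile(r"^/urf:UsageRecord((?:\[[^\[\]]+\])*)$")
# element <op> 'literal', with no function calls allowed.
COMPARISON = re.compile(r"([\w:]+)\s*(<=|>=|!=|=|<|>)\s*'([^']*)'")


def xpath_to_hql(xpath):
    """Translate a restricted XPath into an HQL string plus named parameters."""
    step = STEP.match(xpath.strip())
    if step is None:
        raise ValueError("unsupported XPath: " + xpath)
    clauses, params = [], {}
    for index, body in enumerate(re.findall(r"\[([^\[\]]+)\]", step.group(1))):
        match = COMPARISON.fullmatch(body.strip())
        if match is None or match.group(1) not in PROPERTY_MAP:
            raise ValueError("unsupported predicate: " + body)
        name, op, literal = match.groups()
        key = "p%d" % index
        clauses.append("r.%s %s :%s" % (PROPERTY_MAP[name], op, key))
        params[key] = literal
    hql = "from UsageRecord r"
    if clauses:
        hql += " where " + " and ".join(clauses)
    return hql, params


hql, params = xpath_to_hql(
    "/urf:UsageRecord[urf:StartTime>'2008-08-01T00:00:00Z']"
    "[urf:EndTime<'2008-08-31T23:59:59Z']"
)
print(hql)
print(params)
```

In a real service the literals would presumably be bound as date values rather than strings, so that the database performs a true date comparison instead of the string comparison that fails in XPath 1.0.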

	>My own feeling is that rather than mandate any filter dialect it would
	>be better to allow filters to be specified either by a search-string in a
	>supported query dialect or alternatively by an XML element that encodes
	>the minimal set of filters a RUS implementation needs to support.
	>As this selection language would only exist as part of the RUS it might
	>as well be written in XML and be part of the RUS schema rather than creating
	>an additional specification and custom parsers for a subset of Xpath.
	>Provided this additional selection language can be easily mapped to
	>Xpath-2.0 predicates it should not add significantly to the difficulty of
	>implementing the service on a native XML database.

	>I think a sensible minimal filtering capability is a list of binary
	>comparisons (== != < <= > >=) between
	>leaf elements of the UR (identified in the same way as the mandatory
	>elements) and literal values of the corresponding schema type.
	>The selected records would be those where all the match conditions
	>resolve true.
	>The update function could be supported by supplying a set of element
	>assignments along with the selector.

	Totally agree. I don't know whether it is feasible for update, but it is definitely good for query. [see our extension below]

	>I imagine this would look something like

	><FilterList>
	><MatchCondition match="lt">
	><Target>EndTime</Target>
	><Value>2008-08-01 08:00:00Z</Value>
	></MatchCondition>
	><MatchCondition match="gt">
	><Target>EndTime</Target>
	><Value>2008-07-01 08:00:00Z</Value>
	></MatchCondition>
	></FilterList>

	In GRUS, we defined three header elements, whose schemas are as follows:

	a) wlcgrus:GroupBy

	With this header, the user can interrogate a RUS service endpoint without using XPath, by explicitly identifying the desired usage metrics to be returned.

	The GroupBy element can be used for both job and aggregate/summary usage queries by specifying the //wlcgrus:GroupBy/@aggregate value.

	<xsd:element name="GroupBy">
	 <xsd:complexType>
	  <xsd:sequence>
	   <xsd:element ref="urf:StartTime" minOccurs="0" maxOccurs="1"/>
	   <xsd:element ref="urf:EndTime" minOccurs="0" maxOccurs="1"/>
	   <xsd:element name="usage" type="xsd:QName" minOccurs="0" maxOccurs="unbounded"/>
	   <xsd:any namespace="##other"
	            minOccurs="0"
	            maxOccurs="unbounded"
	            processContents="lax" />
	  </xsd:sequence>
	  <xsd:attribute name="aggregate" type="xsd:boolean" use="optional" default="false" />
	 </xsd:complexType>
	</xsd:element>

	b) wlcgrus:SortBy

	This header enables ordering the returned usage records or usage metrics by a particular usage metric.

	<xsd:element name="SortBy">
	 <xsd:complexType>
	  <xsd:choice>
	   <xsd:element name="usage" type="xsd:QName" minOccurs="0" maxOccurs="1" />
	   <xsd:any namespace="##other"
	            minOccurs="0"
	            maxOccurs="1"
	            processContents="lax" />
	  </xsd:choice>
	  <xsd:attribute name="order" type="wlcgrus:orderType" use="required" />
	 </xsd:complexType>
	</xsd:element>

	<xsd:simpleType name="orderType">
	 <xsd:restriction base="xsd:token">
	  <xsd:enumeration value="asc"/>
	  <xsd:enumeration value="desc"/>
	 </xsd:restriction>
	</xsd:simpleType>

	c) wlcgrus:maxRecords

	This header is used to constrain the maximum number of usage records or usage metrics returned for a specific request.

	<xsd:element name="maxRecords" type="xsd:int" />

	With the above header extensions, it is possible to query usage records through a RUS service without using XPath.

	e.g. 1: get the top 10 job usage records of the 'Atlas' VO for this month, ranked by CPU usage

	The request message is as follows:

	<env:Header ...>

	<wlcgrus:maxRecords>10</wlcgrus:maxRecords>

	<wlcgrus:GroupBy aggregate="false">
	  <urf:StartTime>2008-08-01T00:00:00Z</urf:StartTime>
	  <urf:EndTime>2008-08-31T23:59:59Z</urf:EndTime>
	  <wlcgrus:usage>urf:CpuDuration</wlcgrus:usage>
	  <urf:Resource description="VOName">Atlas</urf:Resource>
	 </wlcgrus:GroupBy>

	<wlcgrus:SortBy order="desc">
	 <wlcgrus:usage>urf:CpuDuration</wlcgrus:usage>
	 </wlcgrus:SortBy>

	</env:Header>

	<env:Body>

	<rus:extractUsageRecordsRequest />

	</env:Body>

	....
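As a rough illustration of how a service might turn the three headers from the request above into a relational query, here is a minimal Python sketch using an in-memory SQLite table. It is not the GRUS implementation. The table and column names, the `HeaderQuery` class, the `build_sql` function and the sample rows are all hypothetical. The sketch assumes that `urf:StartTime` and `urf:EndTime` in `GroupBy` delimit a window that each record must fall inside, and it only covers the `aggregate="false"` case.

```python
import sqlite3
from dataclasses import dataclass

# Hypothetical mapping from usage QNames to column names.
USAGE_COLUMNS = {"urf:CpuDuration": "cpu_duration"}


@dataclass
class HeaderQuery:
    """Values already extracted from the maxRecords, GroupBy and SortBy headers."""
    start_time: str
    end_time: str
    usage: list
    vo_name: str
    sort_usage: str
    order: str        # "asc" or "desc", as in wlcgrus:orderType
    max_records: int


def build_sql(q):
    """Build a parameterised SELECT for a non-aggregate (job-level) query."""
    if q.order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    columns = ", ".join(USAGE_COLUMNS[u] for u in q.usage)
    sql = ("SELECT record_id, %s FROM usage_record "
           "WHERE vo_name = ? AND start_time >= ? AND end_time <= ? "
           "ORDER BY %s %s LIMIT ?"
           % (columns, USAGE_COLUMNS[q.sort_usage], q.order.upper()))
    return sql, [q.vo_name, q.start_time, q.end_time, q.max_records]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usage_record (record_id TEXT, vo_name TEXT, "
             "start_time TEXT, end_time TEXT, cpu_duration INTEGER)")
conn.executemany("INSERT INTO usage_record VALUES (?, ?, ?, ?, ?)", [
    ("job-1", "Atlas", "2008-08-02T10:00:00Z", "2008-08-02T12:00:00Z", 7200),
    ("job-2", "Atlas", "2008-08-03T10:00:00Z", "2008-08-03T10:30:00Z", 1800),
    ("job-3", "CMS", "2008-08-04T10:00:00Z", "2008-08-04T11:00:00Z", 9000),
    ("job-4", "Atlas", "2008-07-30T10:00:00Z", "2008-07-30T11:00:00Z", 3600),
])

query = HeaderQuery(
    start_time="2008-08-01T00:00:00Z", end_time="2008-08-31T23:59:59Z",
    usage=["urf:CpuDuration"], vo_name="Atlas",
    sort_usage="urf:CpuDuration", order="desc", max_records=10,
)
sql, params = build_sql(query)
rows = conn.execute(sql, params).fetchall()
print(rows)
```

Storing the timestamps as uniformly formatted UTC strings keeps the date-range comparison correct in SQLite; a production schema would more likely use a native timestamp type.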

	>The translation of this into predicates is straightforward as is the
	>translation into a SQL select statement for elements that have been
	>extracted into SQL fields, any remaining match conditions could be
	>evaluated by regenerating the XML for the superset of the target records
	>returned by the SQL and applying Xpath.
	>Alternatively we could have a method that queries the permitted target
	>elements for a selector.
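The translation described in the quoted paragraph can be sketched roughly as follows. This is a hypothetical illustration in Python, not part of any RUS specification. The table name `usage_record`, the column mapping and the match names other than `lt` and `gt` (which are the only two used in the proposed example) are assumptions. Conditions on elements that have no SQL column are returned separately, so they could be evaluated afterwards against the regenerated XML.

```python
import xml.etree.ElementTree as ET

# "lt" and "gt" appear in the proposal; the other names are assumed.
OPERATORS = {"eq": "=", "ne": "<>", "lt": "<", "le": "<=", "gt": ">", "ge": ">="}
# Hypothetical set of UR elements that were extracted into SQL columns.
COLUMNS = {"StartTime": "start_time", "EndTime": "end_time"}


def filter_list_to_sql(xml_text):
    """Split a FilterList into a parameterised SQL query and leftover conditions."""
    root = ET.fromstring(xml_text)
    clauses, values, leftover = [], [], []
    for cond in root.findall("MatchCondition"):
        target = cond.findtext("Target").strip()
        op = OPERATORS[cond.get("match")]
        value = cond.findtext("Value").strip()
        if target in COLUMNS:
            clauses.append("%s %s ?" % (COLUMNS[target], op))
            values.append(value)
        else:
            # No SQL column: evaluate later on the regenerated XML.
            leftover.append((target, op, value))
    sql = "SELECT * FROM usage_record"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, values, leftover


FILTER = """
<FilterList>
  <MatchCondition match="lt">
    <Target>EndTime</Target>
    <Value>2008-08-01 08:00:00Z</Value>
  </MatchCondition>
  <MatchCondition match="gt">
    <Target>EndTime</Target>
    <Value>2008-07-01 08:00:00Z</Value>
  </MatchCondition>
</FilterList>
"""

sql, values, leftover = filter_list_to_sql(FILTER)
print(sql)
print(values, leftover)
```

All conditions are joined with AND, matching the proposal that the selected records are those where every match condition resolves true.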

	If you are interested in more information, we can arrange a further meeting (maybe face-to-face).

	X. Chen, A. Khan


-- 
==========================================
Ahmed Shiraz Memon
Jülich Supercomputing Centre (JSC)
Institute of Advanced Simulation
Forschungszentrum Jülich GmbH, Germany

Phone: +49 2461 61 6899
Fax: +49 2461 61 6656

Email: a.memon at fz-juelich.de
