Web pages vs. Schema Translations (was Re: [gin-info] Notes from July 10 telecon)

Wed Aug 23 10:03:28 CDT 2006

Delayed response, but not forgotten!

On Jul 24, 2006, at 4:08 PM, Laura Pearlman wrote:

> I don't think the decision not to translate between schemas was  
> ever (until Jen's recent mail) communicated to anyone who wasn't  
> physically present at the Athens and Tokyo meetings. But more to  
> the point, I think that's the wrong decision, so I'd like to open  
> this for public discussion and present what I think are the two  
> competing proposals, which I'll call the "web proposal" and the  
> "translator proposal". Both proposals start by identifying a  
> minimal set of information necessary for resource selection and end  
> with displaying this information in a web page -- it's what happens  
> in the middle that differs.
>
> 1. The web proposal is to create a web page that will display
> information from various sources by using various APIs (or scripts  
> or whatever) to query different information systems, then combine  
> them and display them in some uniform fashion (probably in an HTML  
> table).

The downside to this approach is that it creates a new information schema
(HTML) different from the sources and possibly with a different API.
Each consumer would have to develop an interface to this schema in addition
to the native schema they already support.

> 2. The translator proposal is to translate that minimal set of  
> information between existing schemas, advertise at least that  
> minimal set of information about all GIN resources in at least one  
> information server of each type, and use one of the existing web- 
> based interfaces to display it.

This is the most attractive approach in my view. It doesn't add a new HTML
schema, but publishes minimal information from each provider schema in each
publishing schema. Consumers can use their existing APIs.
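
To make the translator idea concrete, here is a minimal sketch in Python,
assuming an intermediary record that holds the minimal attribute set. The
schema names ("schema_a", "schema_b"), the field names, and the three
attributes are all hypothetical placeholders, not the actual GIN minimal set
or real GLUE attribute names.

```python
from dataclasses import dataclass


@dataclass
class MinimalResource:
    """Intermediary record; the three attributes are illustrative only."""
    name: str
    scheduler: str
    total_cpus: int


# Hypothetical mappings from intermediary attributes to each schema's field names.
FIELD_MAPS = {
    "schema_a": {"name": "HostName", "scheduler": "LRMSType", "total_cpus": "TotalCPUs"},
    "schema_b": {"name": "resource_id", "scheduler": "batch_system", "total_cpus": "cpu_count"},
}


def to_minimal(schema, record):
    """Read a record published in `schema` into the intermediary form."""
    m = FIELD_MAPS[schema]
    return MinimalResource(
        name=str(record[m["name"]]),
        scheduler=str(record[m["scheduler"]]),
        total_cpus=int(record[m["total_cpus"]]),
    )


def from_minimal(schema, res):
    """Write an intermediary record out using `schema`'s field names."""
    m = FIELD_MAPS[schema]
    return {
        m["name"]: res.name,
        m["scheduler"]: res.scheduler,
        m["total_cpus"]: res.total_cpus,
    }


def translate(src, dst, record):
    """Translate one record from schema `src` to schema `dst`."""
    return from_minimal(dst, to_minimal(src, record))
```

With this shape, adding another schema means adding one mapping entry rather
than one translator per pair of schemas, and the intermediary record is never
exposed to consumers.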

> Each has its own advantages and disadvantages, which I will try to  
> summarize here:
>
> Schema development. I don't want to split hairs about the  
> definition of a schema, but my feeling is that both plans to some  
> extent involve defining a schema and mappings between the existing  
> schemas and the new schema. The translator plan involves creating  
> an intermediary schema that's used by the translating software (but  
> that is not exposed to users) as part of the work plan. The web  
> page plan involves creating a definition of an HTML table to  
> display the collected information in a uniform manner and creating  
> mappings of the existing schemas into that new schema.
>
> Schema flexibility. In the translator plan, users can view  
> information about all GIN resources using whichever existing schema  
> they prefer. In the web page plan, users who want to see  
> information regarding all web pages must either use the web page  
> schema (i.e., look at the web page) or do separate queries and  
> combine data from all existing schemas.
>
> Software development. Creating an "information provider" for one of  
> the web-based display systems is probably more or less equivalent  
> to creating an information provider to translate from a foreign  
> schema to a native one (at least, that's the case in MDS/WebMDS).  
> However, the translator plan probably involves writing more of these.
>
> Software deployment. The web plan requires a single deployment at  
> the site running the web server. The translator plan requires a  
> deployment at each "edge" site.
>
> Query language support. The translator plan allows users to run  
> queries using their native query languages that will return  
> information about all GIN resources (e.g., "show me all queues that  
> use PBS"). The web page plan either requires that users run queries  
> in each of the existing query languages and then combine the output  
> or possibly defines and implements some new query language. The  
> latter option probably involves a fair amount of development work,  
> especially if it supports sorting or aggregation operations.

I think your comparison is correct.
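
As a rough illustration of the query language point, assume translated
records have already been published into a consumer's native schema. The
consumer's existing query then covers all resources unchanged. The records,
field names, and the "origin" marker in this Python sketch are made up for
illustration.

```python
# Made-up records, all expressed in one consumer's native schema.
# "origin" is a hypothetical marker showing how each record arrived.
native_records = [
    {"queue": "batch", "scheduler": "PBS", "origin": "native"},
    {"queue": "short", "scheduler": "LSF", "origin": "native"},
]
translated_records = [
    {"queue": "grid-a", "scheduler": "pbs", "origin": "translated"},
    {"queue": "grid-b", "scheduler": "SGE", "origin": "translated"},
]


def queues_using(records, scheduler):
    """Return the sorted names of queues whose scheduler matches, ignoring case."""
    wanted = scheduler.lower()
    return sorted(r["queue"] for r in records if r["scheduler"].lower() == wanted)


# "Show me all queues that use PBS": one query over native plus translated records.
all_records = native_records + translated_records
pbs_queues = queues_using(all_records, "PBS")
```

Under the web page plan the same question would need one query per
information system followed by a merge step on the consumer's side.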

Translators make the most sense in my view.

JP

>
> I look forward to hearing people's comments.
>
> -- Laura
>
>
> On 07/11/2006 02:00 PM, Jennifer M. Schopf wrote:
>> Fundamentally, yes, this is a big part of what Charlie said when  
>> this work started.
>>
>> That's why we were NOT going to translate between things and  
>> deploy new information providers- the plan had been to leave  
>> things in their native schema and simply have a web page (like  
>> WebMDS or the Pragma monitoring tool) suck up the data and display  
>> it - that piece would need to be built/adapted, but that would be  
>> one centralized thing, not re-deploying/writing large numbers of  
>> information providers at multiple sites.
>>
>> -j
>>
>>
>> At 11:22 11/07/2006, Laura Pearlman wrote:
>>> Jennifer M. Schopf wrote:
>>>> I thought one of the fundamental aspects of GIN was that no new  
>>>> software was to be created and deployed?
>>>
>>> None at all? I think that would severely limit what we could  
>>> accomplish.
>>>
>>> I think we need to try our best to limit the restrictions that we  
>>> put on the individual grids (e.g., by not requiring that everyone  
>>> run the same monitoring system or use the same schema) and that  
>>> we keep the gin-related software development as small as  
>>> possible. But I think we need to balance that against the actual  
>>> requirements of the project. For example, I think it's been  
>>> accepted for quite some time now that we will create software to  
>>> translate specific attributes from one schema into another.
>>>
>>> In this particular case, the issue that we have is that some of  
>>> the proposed minimal attributes are not currently collected by  
>>> TeraGrid. I am proposing that we look at the actual requirements  
>>> and determine whether:
>>>
>>> * We need to collect this information everywhere, because the  
>>> proposed GIN applications require it, or
>>> * We need to advertise this information where available (that is,  
>>> it should not be lost in the translation from one schema to  
>>> another), but we don't need to collect it everywhere.
>>>
>>> If we have a fundamental rule against creating any new software  
>>> (other than the translators we've already talked about), then I  
>>> suppose the decision is made for us. But it seems to me that it  
>>> would make more sense to balance the requirements against the  
>>> effort required to implement them -- and that is what I'm asking  
>>> for the community's help in doing.
>>>
>>> -- Laura
>>>>
>>>> -j
>>>>
>>>>
>>>> At 07:23 11/07/2006, JP Navarro wrote:
>>>>> Laura,
>>>>>
>>>>> See below.
>>>>>
>>>>> On Jul 11, 2006, at 1:35 AM, Laura Pearlman wrote:
>>>>>
>>>>>> Attending: Kazu, Yuji, and Laura.
>>>>>>
>>>>>> TeraGrid resources: after the last meeting, I was going to talk to
>>>>>> Stu Martin about what TeraGrid resources are available for GIN;
>>>>>> however, Stu is on vacation. I'll see whether anyone on tomorrow's
>>>>>> wheels call knows the answer.
>>>>>
>>>>> Stu and I have been working together to perform GIN related
>>>>> activities on the UC/ANL TeraGrid cluster. Let me know if
>>>>> you'd like to implement something while Stu is on vacation.
>>>>>
>>>>>> Schema mapping: the spreadsheet that Kazu sent around looks pretty
>>>>>> clear, but there are some issues using it for TeraGrid. TeraGrid
>>>>>> is using (slightly modified versions of) standard Globus
>>>>>> information providers, which report information in GLUE 1.1 schema,
>>>>>> not GLUE 1.2. This means that a couple of the elements that appear
>>>>>> in the spreadsheet (AuthVO and Software) are not advertised through
>>>>>> TeraGrid's information systems. We have, I think, two options for
>>>>>> dealing with this, depending on what our requirements are:
>>>>>>
>>>>>> 1. We could create extensions to the GLUE 1.1 schema to hold this
>>>>>> information (the structure of these extensions would be the same as
>>>>>> the corresponding elements in GLUE 1.2) and modify the TeraGrid
>>>>>> information providers to provide this information.
>>>>>
>>>>> The TeraGrid schema was extended to meet its own requirement.
>>>>> Extending it further in support of our GIN activities is also
>>>>> good. As long as GIN extensions don't break the TeraGrid's
>>>>> schema we should implement them on the TeraGrid. Also, it
>>>>> would be good to present the other extensions the TeraGrid is
>>>>> planning on to the GIN community to determine if it would make
>>>>> sense to add them to the GIN schema.
>>>>>
>>>>> JP
>>>>>
>>>>>> 2. We could create schema extensions as above, but provide this
>>>>>> information only for non-TeraGrid resources (that is, anyone
>>>>>> looking at any information system, including TeraGrid's, would see
>>>>>> AuthVO and Software information for NAREGI and EGEE resources but
>>>>>> not for TeraGrid resources).
>>>>>>
>>>>>> It would be good to nail down the requirements and choose between
>>>>>> these courses of action fairly soon.
>>>>>>
>>>>>> -- Laura
>>>>
>>>> ------------------------------------------------------------------------
>>>> Dr. Jennifer M. Schopf
>>>> Scientist                      eInfrastructure Policy Advisor
>>>> Distributed Systems Lab        National eScience Centre and JISC
>>>> Argonne National Laboratory    The University of Edinburgh
>>>> jms at mcs.anl.gov             jms at nesc.ac.uk
>>>> http://www.mcs.anl.gov/~jms    http://homepages.nesc.ac.uk/~jms
>>>
>>>
>