[infod-wg] Re: Vocabularies

Thu Aug 11 10:16:17 CDT 2005

Hi,

I am trying to carry out my action of reviewing the vocabulary stuff.

This is a rather rambling note.

My first problem of course is to come up with a precise definition of a
vocabulary (at least in the INFOD context).

There is one reference in the document to a paper by Froese. I found
this on the web, but it is a research paper about one use of
"controlled vocabularies - thesauri" and does not tell me what I need
to know.

I did find some W3C documents:

http://www.w3.org/TR/CCPP-struct-vocab/#Appendix_D

http://www.w3.org/TR/owl-guide/ which defines a vocabulary as a set of
URIs. Perhaps the most useful is:

http://www.w3.org/TR/2004/REC-rdf-schema-20040210/

Unfortunately, in the best tradition, it is recursive: "This
specification describes how to use RDF to describe RDF vocabularies."

There is also the RDF primer:
http://www.w3.org/TR/2004/REC-rdf-primer-20040210/

So are our notions of vocabulary along the lines of RDF?
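
To make the "set of URIs" reading concrete, here is a minimal sketch
in Python. The namespace and the term names are invented for
illustration; nothing here comes from the INFOD document or from the
W3C specifications themselves.

```python
# Hypothetical namespace and term names, for illustration only.
NAMESPACE = "http://example.org/infod/vocab#"

# Under the "set of URIs" reading, the vocabulary is nothing more
# than the collection of term URIs it defines.
vocabulary = frozenset(
    NAMESPACE + term for term in ("Publisher", "Consumer", "Subscription")
)


def in_vocabulary(uri: str) -> bool:
    """Return True if the URI is one of the terms in the vocabulary."""
    return uri in vocabulary
```

With this reading, asking whether two parties share a vocabulary
reduces to comparing sets of URIs; it says nothing about the
modelling style used to define the terms.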

Now referring to our appendix IV: I think we should not mention the
possibility of vocabulary transformations. Leaving them out makes the
problem very much simpler: we no longer need a vocabulary manager.

I find the term "dialect" confusing. A dialect is normally a variant
of a language. The problem here is that we are imagining systems for
which multiple conceptual models can be built. We don't have a single
conceptual model with different representations - but different
conceptual models. To be specific, if you model with ER you have to
think in different terms from when you model with a hierarchy. If you
include the notion of inheritance, it also affects the way in which the
system can be modelled. Some mappings are possible - but typically
only from the richer model to the simpler. For example a pure binary
model can't easily be mapped to anything else. I can accept the term
dialect if I see it in some recognised document.

A transformation could be between vocabularies using the same dialect
or the dialect could also be changed. To make transformations possible
we would need a very rich dialect able to hold all the concepts
expressed by all possible dialects - including those we have not
thought of yet. I think we should just forget about this.

With the dialects listed there are other issues:

 - SQL92 is a proper subset (I hope) of SQL99

 - RDF can be expressed in XML - i.e. the RDF gives the way of
   thinking about the problem and the XML a way of expressing the RDF.

As I said before, the reference [VOC] is not appropriate.

For me the important question is what we store to define a
vocabulary. For me this is the name of the dialect and the name of the
vocabulary, which is where we were at the F2F in Chicago. Alternatively
we could say that the vocabulary includes both the dialect and the
specific schema expressed in that dialect.

If we say that transformations will be forbidden for evermore, the
latter solution is fine. However, if we want to express the fact that
the same system is being modelled with 2 dialects, I would prefer to
use the pair (dialect, vocabulary). In any case we need to be precise
about what we mean by these terms and preferably show that our
terminology is the accepted one.
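
As a rough sketch of the (dialect, vocabulary) pair, assuming we store
only two names plus the schema text, something like the following
Python would do. The class, the registry, and the sample names are all
hypothetical and are not taken from the INFOD document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VocabularyId:
    """Hypothetical identifier: a vocabulary name qualified by its dialect."""
    dialect: str      # e.g. "SQL92" or "RDF"
    vocabulary: str   # name of the system being modelled


# Hypothetical registry mapping each identifier to its schema text.
registry: dict[VocabularyId, str] = {}


def register(dialect: str, vocabulary: str, schema: str) -> VocabularyId:
    """Store a schema under its (dialect, vocabulary) pair."""
    key = VocabularyId(dialect, vocabulary)
    registry[key] = schema
    return key


def dialects_for(vocabulary: str) -> list[str]:
    """List, in sorted order, the dialects in which a vocabulary is modelled."""
    return sorted(k.dialect for k in registry if k.vocabulary == vocabulary)


# The same system modelled twice, once per dialect (placeholder schemas).
register("SQL92", "catalogue", "CREATE TABLE item (id INTEGER)")
register("RDF", "catalogue", "<rdf:RDF/>")
```

Keeping the dialect out of the vocabulary name is what lets
dialects_for("catalogue") report that one system has two models; if
the vocabulary included the dialect, the two entries would look
unrelated.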

Can we just forget about vocabulary transformations for now? Then only
specific INFOD components need to understand the vocabularies.

Steve