[dfdl-wg] New draft of DFDL primer

Robert E. McGrath mcgrath at ncsa.uiuc.edu
Mon Oct 24 09:59:29 CDT 2005


On Thursday 20 October 2005 07:40, Geoff Judd wrote:
> Robert,
>               In the first paragraph of section 1.4 you say that DFDL is used
> to describe XML files. I thought this was not the aim of DFDL as standard
> XML Schema can be used to describe XML files.
>

Yes, that is an error. 

> In the third paragraph of section 1.6 you say that 2 DFDL Schemas could be
> used, one to read the data and the second to write it. If selectors are
> used this would be possible using a single DFDL Schema.
>

Sure.  The important point is that the same logical model can have
multiple IO models.  (And if there are N x N translations of interest,
I probably wouldn't make one DFDL with giant N-way selectors!)
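
To illustrate "one logical model, multiple IO models", here is a rough
Python sketch (not DFDL syntax).  Everything in it is hypothetical and
invented for illustration: the Sample record, the comma separated text
layout, and the big-endian binary layout.

```python
import struct
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical logical model: one record with two fields."""
    id: int
    value: float


def read_text(line: str) -> Sample:
    """IO model A (assumed): 'id,value' as comma separated text."""
    id_text, value_text = line.strip().split(",")
    return Sample(int(id_text), float(value_text))


def write_binary(sample: Sample) -> bytes:
    """IO model B (assumed): big-endian 4-byte int, then 8-byte double."""
    return struct.pack(">id", sample.id, sample.value)


def read_binary(buf: bytes) -> Sample:
    """Inverse of write_binary, returning the same logical record."""
    id_value, value = struct.unpack(">id", buf)
    return Sample(id_value, value)
```

Both readers return the same Sample, so code written against the
logical model does not care which representation the data came from.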

> In section 2.3.1.1 you discuss specifying multiple inputs. As this has
> not been discussed in any detail before I think a new entry is required on
> the issues list.
>

Yes, I wrote out my own capture of the discussions I heard.  My own
impression is that there is a lot of agreement here, though details
haven't been settled.

> In the parsing example in section 6 you say in paragraph 3 that layer 2
> reads bytes from layer 1 until one string has been read. How would you
> know that you have one string for layer 2? In reality I don't think
> layer 2 is necessary. I think what you describe as layer 3 would read
> bytes from layer 1 as characters until a comma was found. This would then
> indicate that this was the end of the string for layer 3.

First, "How would you know?":  think of the layer as a library module.
The author defines the right sequence of modules based on the intended
transformation.  Also, there can be 'shrink wrapped' macros, shorthand
that stands for common sequences.

Second, as to the necessity of layer 2:   There are two separable parsing
goals here (find the string from bytes, and find commas within a string),
so there should be two abstract layers.   There are lots of reasons
why this abstraction is helpful:
   * it is easy to understand what the layers do
   * it is easy to reuse and substitute layers
   * we can deal with nesting, e.g., from the comma separated strings,
     now find colon separated fields in each of the strings.  This would
     be another layer.
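
A minimal Python sketch of this layering, with each layer as a small
reusable function.  The function names are hypothetical, and the choice
of newline as the string terminator and ASCII as the encoding are
assumptions made only for the example; the primer does not fix them.

```python
from typing import Iterable, Iterator, List


def layer1_bytes(data: bytes) -> Iterator[int]:
    """Layer 1: deliver the raw bytes one at a time."""
    yield from data


def layer2_strings(byte_stream: Iterable[int],
                   terminator: int = 0x0A,
                   encoding: str = "ascii") -> Iterator[str]:
    """Layer 2: collect bytes into strings (assumed newline-terminated)."""
    buf = bytearray()
    for b in byte_stream:
        if b == terminator:
            yield buf.decode(encoding)
            buf.clear()
        else:
            buf.append(b)
    if buf:  # final string without a trailing terminator
        yield buf.decode(encoding)


def layer3_commas(strings: Iterable[str]) -> Iterator[List[str]]:
    """Layer 3: find the commas within each string."""
    for s in strings:
        yield s.split(",")


def layer4_colons(records: Iterable[List[str]]) -> Iterator[List[List[str]]]:
    """Layer 4 (nesting): find colon separated fields in each item."""
    for fields in records:
        yield [f.split(":") for f in fields]


def parse(data: bytes) -> List[List[List[str]]]:
    """Compose the layers in the order the author intends."""
    return list(layer4_colons(layer3_commas(layer2_strings(layer1_bytes(data)))))
```

For example, parse(b"a:1,b:2\n") gives [[["a", "1"], ["b", "2"]]].
Swapping layer3_commas for a different separator layer, or dropping
layer4_colons, changes the transformation without touching the others.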

You describe how a parser would optimize the implementation.  This is
quite true:  I wouldn't expect a parser to naively implement a pipeline
like this in order to parse strings. Implementations will collapse the
steps into one-pass searches.
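
As a rough illustration of such a collapse, here is a Python sketch of
a single scan over the bytes that finds strings, commas, and colons in
one pass.  The function name is hypothetical, and the delimiters
(newline, comma, colon) and ASCII encoding are assumptions chosen only
for the example.

```python
from typing import List


def parse_one_pass(data: bytes, encoding: str = "ascii") -> List[List[List[str]]]:
    """Single scan: newline ends a string, comma ends an item, colon ends a field."""
    records: List[List[List[str]]] = []
    items: List[List[str]] = []
    fields: List[str] = []
    token = bytearray()

    def end_field() -> None:
        fields.append(token.decode(encoding))
        token.clear()

    def end_item() -> None:
        nonlocal fields
        end_field()
        items.append(fields)
        fields = []

    def end_record() -> None:
        nonlocal items
        end_item()
        records.append(items)
        items = []

    for b in data:
        if b == 0x3A:      # ':' closes a field
            end_field()
        elif b == 0x2C:    # ',' closes an item
            end_item()
        elif b == 0x0A:    # newline closes the whole string
            end_record()
        else:
            token.append(b)

    # Flush a final string that has no trailing newline.
    if token or fields or items:
        end_record()
    return records
```

The logical result is the same nested structure that a separate
string layer, comma layer, and colon layer would produce; only the
number of passes over the data differs.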



> Regards,
>
> Geoff Judd
> Websphere MQ Integrator Development
> IBM UK Ltd
> Hursley
>
> Telephone:  +44-1962-818461
> E-mail:           JUDDG at uk.ibm.com
>
>
>
>
> "Robert E. McGrath" <mcgrath at ncsa.uiuc.edu>
> Sent by: owner-dfdl-wg at ggf.org
> 18/10/2005 15:48
>
> To
> dfdl-wg at gridforum.org
> cc
>
> Subject
> [dfdl-wg] New draft of DFDL primer
>
>
>
>
>
>
> Greetings,
>
> I have been plugging away on the primer document.  There are still
> major holes which await development of the specification and/or
> input from others.
>
> This draft includes considerable revisions based on what I learned
> from you all in Boston.
>
> I attach my current draft here.  It is also available at:
>
>   http://verbena.ncsa.uiuc.edu/DFDL

-- 
---
Robert E. McGrath, Ph.D.
National Center for Supercomputing Applications
University of Illinois, Urbana-Champaign
1205 West Clark
Urbana, Illinois 61801
(217)-333-6549

mcgrath at ncsa.uiuc.edu




