[DFDL-WG] String literal syntax for hexBinary ?? - Re: String literals - various usage patterns thereof

Mike Beckerle mbeckerle.dfdl at gmail.com
Thu Apr 19 09:00:54 EDT 2012


I don't think % by itself requires any escaping. There is only a need for
escaping when the characters after the % match the syntax for one of our
entities.

I don't expect dfdl:terminator="%done%" to require any escaping.
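
A minimal sketch of that rule, in Python. The pattern below is a simplified assumption about what DFDL entities look like (named entities such as %CR;, decimal and hex character references, raw bytes, and the %% escape); it is not the spec's full grammar, and the name percent_needs_escaping is hypothetical.

```python
import re

# Assumed, simplified entity shapes (not the full DFDL grammar):
#   %%       escaped percent sign
#   %NAME;   named entity such as %CR; or %LF; (also %WSP*; and %WSP+;)
#   %#NNN;   decimal character reference
#   %#xHH;   hexadecimal character reference
#   %#rHH;   raw byte value
ENTITY_LIKE = re.compile(
    r"%(?:%|[A-Z][A-Z0-9]*[*+]?;|#[0-9]+;|#x[0-9A-Fa-f]+;|#r[0-9A-Fa-f]{2};)"
)


def percent_needs_escaping(text: str) -> bool:
    """Return True if some '%' in text begins something that looks like an entity."""
    return ENTITY_LIKE.search(text) is not None
```

Under these assumptions "%done%" and a lone "%" pass through untouched, while "%CR;" would be read as an entity.
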
On Apr 19, 2012 7:16 AM, "Steve Hanson" <smh at uk.ibm.com> wrote:

> Tim, thinking some more on this:
>
> - A DFDL expression is sometimes allowed to *return* a DFDL String Literal.
> In this case, the returned value is an xs:string that conforms to the DFDL
> String Literal syntax
>
> That is indeed how properties like initiator, terminator and separator
> behave today.  But there is a problem. Let's say I have dynamically defined
> a separator at the start of my data. The value in the data is %. My
> dfdl:separator expression therefore returns %. That will give an error as a
> badly formed DFDL entity. DFDL string literal rules say that you must use
> %% to represent a single % character. The expression itself can work around
> this by checking for % and if so substituting %%, but that's a bit
> unfriendly especially as fn:replace() is not in the DFDL XPath subset - I
> think this is because it comes under this category
> http://www.w3.org/TR/xpath-functions/#string.match and not
> http://www.w3.org/TR/xpath-functions/#substring.functions.  Perhaps we
> should include fn:replace() or provide a DFDL function that handles %?
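
The workaround described above amounts to doubling every % before the value is used as a DFDL string literal. A minimal sketch in Python, standing in for what fn:replace() or a dedicated DFDL function might do; the name escape_percent is hypothetical and does not come from the spec.

```python
def escape_percent(value: str) -> str:
    """Double every '%' so the result is read as literal text, not as an entity."""
    return value.replace("%", "%%")


# A separator read from the data as "%" would be returned as "%%".
separator_from_data = "%"
separator_literal = escape_percent(separator_from_data)
```
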
>
> I started wondering whether these properties' expression should return a
> list of String. I can envisage no format that has in its data a value that
> contains DFDL entity syntax and intends it to mean a DFDL entity!  That is
> too contrived to be real.  However I can certainly envisage an expression
> like this:
>
>        dfdl:separator = "{ if (../version eq 1) then '%CR;%LF;' else '%LF;' }"
>
> So I think String is not sufficient and DFDL String Literal must be
> allowed.
>
> Regards
>
> Steve Hanson
> Architect, Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
>
>
>
> From:   Tim Kimber/UK/IBM
> To:     Steve Hanson/UK/IBM at IBMGB
> Cc:     Mike Beckerle <mbeckerle.dfdl at gmail.com>, dfdl-wg at ogf.org
> Date:   19/04/2012 11:32
> Subject:        Re: [DFDL-WG] String literal syntax for hexBinary ?? - Re:
>            String literals - various usage patterns thereof
>
>
> I agree with all of that.  One refinement, though. I don't think it's
> necessary to *require* an implementation to auto-cast the result of a DFDL
> Expression into the target type. If an implementation wants to be picky
> about the return type AND issue a clearly-worded Schema Definition Error
> stating what the problem is then I think we should allow it.
>
> Arguably, this would reduce the portability of DFDL schemas, but there is
> precedent for defining a portable subset of a language while allowing
> conveniences for users who don't need portability (e.g. ANSI C). We
> already take that line for the regular expression syntax, so the DFDL
> specification itself already works this way.
>
> regards,
>
> Tim Kimber, Common Transformation Team,
> Hursley, UK
> Internet:  kimbert at uk.ibm.com
> Tel. 01962-816742
> Internal tel. 246742
>
>
>
>
>
> From:   Steve Hanson/UK/IBM
> To:     Tim Kimber/UK/IBM
> Cc:     Mike Beckerle <mbeckerle.dfdl at gmail.com>, dfdl-wg at ogf.org
> Date:   19/04/2012 11:04
> Subject:        Re: [DFDL-WG] String literal syntax for hexBinary ?? - Re:
>            String literals - various usage patterns thereof
>
>
> Hi Tim
>
> Firstly, both your bulleted assertions are correct, but your conclusion is
> not.
>
> Secondly, let me flesh out my earlier reply about constructor functions.
>
> The last paragraph of Section 23 says: "The result of evaluating the
> expression must be a single atomic value of the type expected by the
> context, and it is a schema definition error otherwise".
>
> This is where the XPath constructors come into play. E.g.:
> <element name="myHexBin" type="xs:hexBinary"
>          dfdl:inputValueCalc="{ xs:hexBinary(...) }"/>
> These xs: constructors, plus the special fn:dateTime() constructor that
> DFDL adds, allow the correct types to be created.
>
> Note that you don't always need the constructors.  An expression that
> returns a quoted value is returning an XPath string literal so that is
> automatically xs:string.  An expression that returns an unquoted number is
> returning an XPath number literal so that can be xs:decimal, xs:integer,
> xs:double (depends whether the number contains a '.' or 'e' or 'E').
> This is described here: http://www.w3.org/TR/xpath20/#id-literals
>
> So simply returning the literal 'DEADBEEF' will return an xs:string and if
> the context requires xs:hexBinary that is a schema definition error
> according to DFDL spec.
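
The literal typing rules described above can be sketched in Python as follows. The regular expressions follow the XPath 2.0 literal productions in simplified form, string literals with embedded quotes are not handled, and the name literal_type is hypothetical.

```python
import re

# Simplified forms of the XPath 2.0 numeric literal productions.
INTEGER = re.compile(r"[0-9]+")
DECIMAL = re.compile(r"\.[0-9]+|[0-9]+\.[0-9]*")
DOUBLE = re.compile(r"(?:\.[0-9]+|[0-9]+(?:\.[0-9]*)?)[eE][+-]?[0-9]+")


def literal_type(token: str) -> str:
    """Return the XPath type that a literal token would have."""
    # A quoted value is an XPath string literal, so it is always xs:string.
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "'\"":
        return "xs:string"
    if INTEGER.fullmatch(token):
        return "xs:integer"
    if DECIMAL.fullmatch(token):
        return "xs:decimal"
    if DOUBLE.fullmatch(token):
        return "xs:double"
    raise ValueError(f"not a literal: {token!r}")
```

With this sketch the quoted 'DEADBEEF' comes out as xs:string, which is why it would not satisfy an xs:hexBinary context without a constructor.
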
>
> A clarification is worthwhile though.  Take the following expression:
> { if (../type eq 'A') then 10000 else 20000 }.  That returns xs:integer.
> - What if my context was xs:decimal?  xs:integer is a restriction of
> xs:decimal so the value will always be in range, so is that 'auto-cast'
> allowed?
> - What if my context was xs:long or another restriction of xs:integer? The
> value may or may not be in range, so is that 'auto-cast' iff value in
> range?
> I think that we should auto-cast when type restrictions are involved, and
> clarify that in the spec.
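
One way to picture the two cases above is the following Python sketch. It assumes the expression result is an xs:integer and the context type is either a base type of xs:integer or one of its restrictions. The range table lists only a few XSD integer types, and the name auto_cast_integer is hypothetical.

```python
# Value ranges of a few XSD types derived from xs:integer.
RANGES = {
    "xs:long": (-2**63, 2**63 - 1),
    "xs:int": (-2**31, 2**31 - 1),
    "xs:short": (-2**15, 2**15 - 1),
    "xs:byte": (-2**7, 2**7 - 1),
    "xs:unsignedInt": (0, 2**32 - 1),
}


def auto_cast_integer(value: int, target: str) -> int:
    """Accept an xs:integer result for the target type, or raise if out of range."""
    # Widening: every xs:integer is also a valid xs:decimal, so it always fits.
    if target in ("xs:integer", "xs:decimal"):
        return value
    # Narrowing: allowed only if the value lies in the restricted range.
    low, high = RANGES[target]
    if not low <= value <= high:
        raise ValueError(f"{value} is out of range for {target}")
    return value
```

Here the ValueError stands in for whatever error the spec would prescribe when the value is out of range; that choice is not settled in this thread.
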
>
> We *could* change the spec to say that the result of the expression is
> always automatically cast to the type expected by the context. That takes
> some of the burden off the modeler and makes it much more likely that
> expressions written by XPath novices will return the correct results. But
> it could also hide accidental errors. Note that I shall call this proposal
> (d), as it is not the same as Mike's (c). If we made this change, then returning
> the literal 'DEADBEEF' for xs:hexBinary would succeed. I don't think it
> affects the desire for expressions to be statically type checkable -
> because it is known whether type X can be cast to type Y, so a cast
> mismatch can be statically detected.
>
> Regards
>
> Steve Hanson
> Architect, Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
>
>
>
>
> From:   Tim Kimber/UK/IBM
> To:     Mike Beckerle <mbeckerle.dfdl at gmail.com>
> Cc:     dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org, Steve
>            Hanson/UK/IBM at IBMGB
> Date:   19/04/2012 09:35
> Subject:        Re: [DFDL-WG] String literal syntax for hexBinary ?? - Re:
>            String literals - various usage patterns thereof
>
>
> I'm pretty sure that the rules are:
> - DFDL expressions must not *contain* DFDL String Literals. They must be
> valid XPath 2.0 expressions except that the list of allowable function
> names includes the DFDL extension functions.
> - A DFDL expression is sometimes allowed to *return* a DFDL String Literal.
> In this case, the returned value is an xs:string that conforms to the DFDL
> String Literal syntax. But that does not apply to your example because the
> dfdl:inputValueCalc must return a value ( an XML value ) that is valid for
> the type of the element.
>
> I think that corresponds to your answer a) ; 'DEADBEEF' is a valid
> xs:hexBinary lexical value.
>
> regards,
>
> Tim Kimber, Common Transformation Team,
> Hursley, UK
> Internet:  kimbert at uk.ibm.com
> Tel. 01962-816742
> Internal tel. 246742
>
>
>
>
>
> From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:     Steve Hanson/UK/IBM at IBMGB
> Cc:     dfdl-wg at ogf.org
> Date:   19/04/2012 07:42
> Subject:        [DFDL-WG] String literal syntax for hexBinary ?? - Re:
>            String literals - various usage patterns thereof
> Sent by:        dfdl-wg-bounces at ogf.org
>
>
>
> What is the DFDL string literal syntax for a hexBinary type value?
>
> E.g.,  I want a hex binary whose value is the 4 bytes described by this
> hex: DE AD BE EF.
>
> <element name="myHexBin" type="xs:hexBinary"
> dfdl:inputValueCalc="{ ... }"/>
>
> So, what can one syntactically put, for literal constant values, in the
> input value calculation expression?
>
> Note that this is legal pure (non-DFDL) XSD (I think)
>
> <element name="aHexBin" type="xs:hexBinary" fixed="DeadBeef"/>
>
> That is, the fixed/default are allowed and one specifies these values as
> just strings of hex digits. Notice no special escaping or anything. You
> just use a string that just so happens to contain hex digits.
>
> I think there are three possibilities:
> (a) we allow "DEADBEEF" i.e., because the type of the expression is
> hexBinary, a string is cast to hexBinary by interpreting it as hex nibbles.
>
> (b) we require a special kind of string literal - a bytes-only string
> literal, so for example: "%#rDE;%#rAD;%#rBE;%#rEF;" is the way you create 4
> bytes. If you just put characters, then that's a processing error - like a
> cast failure. Only raw-bytes allowed.
> (c) Anything you return from the expression is converted to a hexBinary by
> unparsing it to bytes (using current properties), then using the bytes as
> the hexBinary data. So you could have an expression that returns a double,
> and that would create 8 bytes if representation="binary".  In this case the
> decimal number 3735928559 (hex 0xdeadbeef) as a binary bigEndian int would
> produce the 4 bytes I want.
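
The three possibilities can be compared side by side with a short Python sketch. All function names are hypothetical, option (b) assumes the %#rHH; raw-byte entity form shown above, and option (c) models "unparsing" with struct.pack as a stand-in for binary big-endian representation properties.

```python
import re
import struct

TARGET = bytes.fromhex("DEADBEEF")  # the 4 bytes DE AD BE EF


def option_a(text: str) -> bytes:
    """(a) Cast a string to hexBinary by reading it as hex digits."""
    return bytes.fromhex(text)


def option_b(literal: str) -> bytes:
    """(b) Accept only raw-byte entities; any other character is an error."""
    if not re.fullmatch(r"(?:%#r[0-9A-Fa-f]{2};)*", literal):
        raise ValueError("only raw-byte entities are allowed")
    return bytes(int(h, 16) for h in re.findall(r"%#r([0-9A-Fa-f]{2});", literal))


def option_c_int(value: int) -> bytes:
    """(c) Unparse an integer as a 4-byte big-endian unsigned binary int."""
    return struct.pack(">I", value)


def option_c_double(value: float) -> bytes:
    """(c) Unparse a double as 8 big-endian binary bytes."""
    return struct.pack(">d", value)
```

Under these assumptions option_a("DEADBEEF"), option_b("%#rDE;%#rAD;%#rBE;%#rEF;") and option_c_int(3735928559) all yield the same four bytes, while a double always yields eight.
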
>
>
>
> --
>  dfdl-wg mailing list
>  dfdl-wg at ogf.org
>  https://www.ogf.org/mailman/listinfo/dfdl-wg
>
>