[DFDL-WG] Specify field based on the beggining of the

Mike Beckerle mbeckerle.dfdl at gmail.com
Wed Sep 19 11:27:43 EDT 2012


Ok, so I believe I understand the problem.

You have a specification in terms of either absolute offsets, or relative
offsets from absolute anchor fields.

While you could construct a DFDL schema by working out the length of each
field and adding suitable leading/trailing skip to take up any unused
space, your specification is not written in that style, so the
correspondence between it and the DFDL schema would not be obvious.

So, while it is possible to create a schema this way, a schema produced by
translating your specification like that would be fragile: one error in
the middle and all subsequent fields would be wrong.
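
To make the fragility concrete, here is a rough Python sketch (not DFDL;
the helper name sequential_starts and the 1000-byte mistake are made up for
illustration). It lays the fields out back to back using the lengths from
the data description quoted below, and shows how a single wrong length
moves every later field.

```python
from itertools import accumulate

def sequential_starts(lengths):
    """Return the 0-based start offset of each field laid out back to back."""
    return [0] + list(accumulate(lengths))[:-1]

# Field lengths in bytes, taken from the data description in this thread.
names = ["Header", "Field1", "Field2", "Field3"]
correct = [512, 21504, 5, 1]
# Hypothetical transcription error: Field1 entered 1000 bytes too short.
wrong = [512, 20504, 5, 1]

good_starts = sequential_starts(correct)
bad_starts = sequential_starts(wrong)

# Names of the fields whose start offset moved because of the one mistake.
shifted = [n for n, g, b in zip(names, good_starts, bad_starts) if g != b]
print(good_starts)
print(shifted)
```

Every field that follows the mistaken length ends up in the shifted list,
even though its own entry was typed correctly.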

I have seen this problem before, including formats that mix 'back to back'
fields, where each starts immediately after the previous one, with fields
that have absolute or relative offsets specified. Basically, once a data
dictionary gets big enough, you want to use both styles to avoid this
"error in the middle throws everything off" problem.

It would be greatly preferable to have the option to simply put a DFDL
annotation on each element giving its start position as an offset (measured
in lengthUnits) relative to the start of some other element.

Then an error in the middle would throw off the value of that one field
only.
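
A small Python sketch of what I mean, under assumptions of my own: the
(anchor, offset, length) triples and the resolve helper are hypothetical,
not an existing DFDL feature, and offsets are taken to be 0-based. Each
field is positioned relative to the start of the data or of a named anchor
field, so a mistyped offset disturbs only that field.

```python
def resolve(fields):
    """Map each field name to its absolute (start, length) in bytes.

    `fields` maps name -> (anchor, offset, length). An anchor of None means
    the start of the data; otherwise the offset is relative to the start of
    the named anchor field. Anchors must be listed before their users.
    """
    resolved = {}
    for name, (anchor, offset, length) in fields.items():
        base = 0 if anchor is None else resolved[anchor][0]
        resolved[name] = (base + offset, length)
    return resolved

spec = {
    "Header": (None, 0, 512),
    "Field4": ("Header", 316, 3),   # assumes 316 is a 0-based offset
    "Field1": (None, 512, 21504),
    "Field2": (None, 22016, 5),
}
# Hypothetical mistake: Field1's offset is mistyped as 500.
broken = dict(spec, Field1=(None, 500, 21504))

good, bad = resolve(spec), resolve(broken)
affected = [name for name in spec if good[name] != bad[name]]
print(affected)
```

Only the field with the mistyped offset shows up in affected; Field2 and
Field4 resolve to the same positions as before.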

At one time we did have a DFDL feature for such a capability. It was either
dfdl:offset and dfdl:offsetFrom, or dfdl:startPosition, or something like
that. I believe we dropped it because we were desperate to reduce the
feature count in DFDL, as the specification was already so large.


On Wed, Sep 19, 2012 at 4:24 AM, Miguel Vieira <carlos.vieira at gmv.com> wrote:

> Well, yes, this structure is OK.
>
> The problem is that my next fields continue after Field4. For instance,
> Field5 would start right after the end of Field4. Going back to my
> data description:
>
> Header: 512 bytes; (bytes 1-512)
> Field1: 21504 bytes (21 KB); (bytes 513-22016)
> Field2: 5 bytes; (bytes 22017-22021)
> Field3: 1 byte; (byte 22022)
> Field4: 3 bytes - starting at byte 316 counting from the beginning of the
> file; (bytes 316-318)
> Field5: 10 bytes; (bytes 319-328)
> ...; (bytes 329-...)
>
> Basically, I would need to do something like the fseek function in C
> :) (http://www.cplusplus.com/reference/clibrary/cstdio/fseek/).
>
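
For comparison, the fseek style of access looks roughly like this in
Python. This is only a sketch: read_at is a hypothetical helper, the data
is synthetic, and 316 is treated as a 0-based offset, which the
specification above does not confirm.

```python
import io

def read_at(stream, offset, length):
    """Seek to an absolute 0-based offset and read exactly `length` bytes."""
    stream.seek(offset, io.SEEK_SET)
    data = stream.read(length)
    if len(data) != length:
        raise ValueError("not enough data at offset %d" % offset)
    return data

# Synthetic 512-byte header with a recognisable marker at offset 316,
# followed by a few bytes standing in for the rest of the file.
header = bytearray(512)
header[316:319] = b"ABC"
stream = io.BytesIO(bytes(header) + b"rest of file")

field4 = read_at(stream, 316, 3)        # jump into the middle of the header
after_header = read_at(stream, 512, 4)  # then jump to just past the header
print(field4, after_header)
```

Each read is positioned independently, so the order of the reads does not
matter and no read depends on the length of a previous field.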
>
>
>
> Message: 1
> Date: Tue, 18 Sep 2012 09:20:26 +0100
> From: Tim Kimber <KIMBERT at uk.ibm.com>
> To: "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
> Subject: Re: [DFDL-WG] Specify field based on the beggining of the
>         binary  file
> Message-ID:
>         <
> OF826A1ADE.83551CD7-ON80257A7D.002D3AE6-80257A7D.002DD59A at uk.ibm.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Miguel,
>
> I work with Steve, and we were discussing this a moment ago.
>
> Would it be acceptable to get this structure in the infoset:
>
> Document  ( complex element )
> |---Header ( complex element )
> |---|----Field4 ( simple element )
> |---Body
> |---|---Field1 ( simple element )
> |---|---Field2 ( simple element )
> |---|---Field3 ( simple element )
>
> If you really need to get the entire Header field as a BLOB *and* get
> Field4 as a separate field then you could achieve that using
> inputValueCalc. The value of Field4 would be calculated from the BLOB or
> string value of Header using XPath string operations.
>
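
One detail worth spelling out with a sketch: if the string value of Header
is its hex representation, there are two characters per byte, and XPath
substring positions are 1-based. The Python below imitates fn:substring for
integer arguments only; xpath_substring is a hypothetical helper, the data
is synthetic, and 316 is assumed to be a 0-based byte offset.

```python
def xpath_substring(text, start, length):
    """Imitate XPath fn:substring for integer arguments (1-based start).

    Returns the characters at positions p with start <= p < start + length.
    """
    begin = max(start, 1) - 1
    end = max(start + length - 1, 0)
    return text[begin:end]

# Synthetic 512-byte header with the bytes 41 42 43 at offset 316.
header = bytearray(512)
header[316:319] = b"ABC"
header_hex = bytes(header).hex().upper()

byte_offset, byte_length = 316, 3
# Two hex characters per byte, plus 1 to convert to a 1-based position.
field4_hex = xpath_substring(header_hex, 2 * byte_offset + 1, 2 * byte_length)
print(field4_hex)
```

If Header were modelled as a text string with one character per byte, the
factor of two would disappear but the 1-based adjustment would remain.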
> regards,
>
> Tim Kimber, DFDL Team,
> Hursley, UK
> Internet:  kimbert at uk.ibm.com
> Tel. 01962-816742
> Internal tel. 37246742
>
>
>
>
> From:   Steve Hanson/UK/IBM at IBMGB
> To:     Miguel Vieira <carlos.vieira at gmv.com>,
> Cc:     "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
> Date:   17/09/2012 18:29
> Subject:        Re: [DFDL-WG] Specify field based on the beggining of the
> binary  file
> Sent by:        dfdl-wg-bounces at ogf.org
>
>
>
> It sounds like you need two DFDL schemas and two parse operations.
> The first schema models the whole message. It takes the whole data stream
> as input and creates an infoset for Header, Field1, Field2, Field3.
> The second schema models just Field4, using leadingSkip and trailingSkip
> to position it correctly. It takes the Header from the first infoset as
> input, and creates an infoset for Field4.
>
>         <xs:element name="Field4" type="xs:hexBinary"
>                     dfdl:lengthKind="explicit" dfdl:length="3"
>                     dfdl:lengthUnits="bytes"
>                     dfdl:leadingSkip="316" dfdl:trailingSkip="193"
>                     dfdl:alignmentUnits="bytes" />
>
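
The two-pass idea can be sketched in Python as follows. The function names
are hypothetical and the data is synthetic; the point is only that
leadingSkip + length + trailingSkip (316 + 3 + 193) must equal the 512-byte
Header for the second pass to line up.

```python
HEADER_LEN = 512
LAYOUT = [("Header", HEADER_LEN), ("Field1", 21504), ("Field2", 5), ("Field3", 1)]

def parse_whole(data):
    """First pass: split the stream into back-to-back fields."""
    out, pos = {}, 0
    for name, length in LAYOUT:
        out[name] = data[pos:pos + length]
        pos += length
    return out

def parse_field4(header, leading_skip=316, length=3, trailing_skip=193):
    """Second pass: locate Field4 inside the Header via skip/length/skip."""
    if leading_skip + length + trailing_skip != len(header):
        raise ValueError("skips and length do not cover the header")
    return header[leading_skip:leading_skip + length]

# Synthetic input: a header with three marker bytes, then zero-filled fields.
header = bytearray(HEADER_LEN)
header[316:319] = b"\x01\x02\x03"
data = bytes(header) + bytes(21504 + 5 + 1)

first = parse_whole(data)                # infoset of the first parse
field4 = parse_field4(first["Header"])   # second parse over the Header only
print(field4.hex())
```

The length check in parse_field4 plays the role of the trailingSkip: it
catches a skip value that no longer matches the Header size.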
> Regards
>
> Steve Hanson
> Architect, Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
>
>
>
> From:        Miguel Vieira <carlos.vieira at gmv.com>
> To:        Steve Hanson/UK/IBM at IBMGB
> Cc:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
> Date:        17/09/2012 18:04
> Subject:        RE: [DFDL-WG] Specify field based on the beggining of the
> binary file
>
>
>
> Hi Steve!
>
> I need them both.
>
> From: Steve Hanson [mailto:smh at uk.ibm.com]
> Sent: Monday, 17 September 2012 18:00
> To: Miguel Vieira
> Cc: dfdl-wg at ogf.org
> Subject: RE: [DFDL-WG] Specify field based on the beggining of the binary
> file
>
> Is Field4 the only part of the Header that is of interest?
> Do you want the full Header (as a 512 byte BLOB) and Field4 in the
> infoset?
>
> Regards
>
> Steve Hanson
> Architect, Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
>
>
>
> From:        Miguel Vieira <carlos.vieira at gmv.com>
> To:        Steve Hanson/UK/IBM at IBMGB
> Cc:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
> Date:        17/09/2012 16:06
> Subject:        RE: [DFDL-WG] Specify field based on the beggining of the
> binary file
>
>
>
>
>
>
> First of all: Thanks! :)
>
> My binary data specification is:
> Header: 512Bytes;
> Field1: 21504 bytes (21KB);
> Field2: 5Bytes;
> Field3: 1Byte;
> Field4: 3 Bytes - Starting at byte 316 counting from the beginning of the
> file.
>
> This means that Field4 is placed inside the Header field.
>
> "leadingSkipBytes" or "alignment" would do, but can you specify the place
> you start counting?
>
> Thanks.
> Miguel
>
> From: Steve Hanson [mailto:smh at uk.ibm.com]
> Sent: Monday, 17 September 2012 15:41
> To: Miguel Vieira
> Cc: dfdl-wg at ogf.org
> Subject: Re: [DFDL-WG] Specify field based on the beggining of the binary
> file
>
> Miguel, please can you send us a graphic of the data in question? I want
> to make sure I have understood your data correctly.
>
> Regards
>
> Steve Hanson
> Architect, Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
>
>
>
> From:        Miguel Vieira <carlos.vieira at gmv.com>
> To:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
> Date:        17/09/2012 15:03
> Subject:        [DFDL-WG] Specify field based on the beggining of the
> binary file
> Sent by:        dfdl-wg-bounces at ogf.org
>
>
>
>
>
>
>
>
> Hi!
>
> I have a binary file specification that gives the position of some fields
> relative to the beginning of the binary data.
> For instance, imagine that there is a field of 1024 bytes and the next one
> starts at byte 512 counting from the beginning of the data.
>
> Is it possible to do that using DFDL?
> I've seen dfdl:floating, but I can't find much info on that.
>
> Thanks.
>
>
>
>
> Please consider the environment before printing this e-mail.
>
>
>
>
>
> This message including any attachments may contain confidential
> information, according to our Information Security Management System, and
> intended solely for a specific individual to whom they are addressed. Any
> unauthorised copy, disclosure or distribution of this message is strictly
> forbidden. If you have received this transmission in error, please notify
> the sender immediately and delete it.
>
> --
> dfdl-wg mailing list
> dfdl-wg at ogf.org
> https://www.ogf.org/mailman/listinfo/dfdl-wg
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU



-- 
Mike Beckerle | OGF DFDL WG Co-Chair
Tel:  781-330-0412