[DFDL-WG] MS Word Limitations for DFDL Spec Review

Sill, Alan Alan.Sill at ttu.edu
Thu Sep 17 05:06:45 EDT 2020


Thanks, Mike! Based on this, I was able to do an import of this version of the document into the following GitBook area:

https://open-grid-forum.gitbook.io/dfdl-1-05r28-test/

This is set up to automatically synchronize bidirectionally with the corresponding GitHub OGF organizational pages as a second access method to enable repository storage, pull requests, etc. through a matching repository for this test import at the OGF GitHub pages at

https://github.com/OpenGridForum/DFDL-1.05r28-test

Normally we would not embed the version and release number in the repository name, of course, but I wanted to label this clearly as a test to avoid confusing anyone as to its purpose. At the moment that purpose is to test the GitBook display and editing, and the corresponding GitHub alternative access methods for managing and editing the document.

Right now only you, Jens, Wolfgang, Greg, and I have write access to the GitBook area. As an admin you can add team members to the DFDL team on GitBook, which is managed separately from GitHub, with either read, write, or admin privileges for spaces accessible to that team. For convenience now, I have made both the GitBook and GitHub spaces publicly readable, so anyone can browse them, and anyone in the OGF GitHub DFDL team will be able to do pull requests, so we can add GitHub users to that team for convenience in suggesting edits.

I haven’t worked out the pluses and minuses of each way of managing the editing process, but GitBook has both draft editing and merge capabilities, and GitHub pull requests can be proposed (in principle by anyone), discussed, edited, and merged (by admins) through that mechanism as well. Changes made on either side are synchronized automatically, so GitBook and GitHub stay in sync. The difference is basically that GitBook is prettier to look at, read, and navigate, and is really intended to provide a usable online document as a whole. The GitHub representation (which is optional and which we have turned on as a convenience) lets people fork and branch their own copies and propose individual atomic pull requests for discussion. Going through those pull requests can be a very nice way to organize multiple simultaneous proposed changes to the document without actually making changes until everyone in the group is happy. I have seen other groups I work with organize their group meetings around going through the pull requests and suggestions in a GitHub-based document editing workflow.

Caveats: I was unable to figure out how to colorize text in the code representation that seemed best for the XML and data examples, so for now those portions are uncolored. I was not able to handle the fanciest of the colorized, more intricate tables, so I included those as images. Hopefully those do not need to be edited much, and it will be easy enough to keep a copy of the Word source for the tables and redo the images as screenshots whenever necessary. I haven’t figured out how to do alphabetically indexed ordered lists in GitBook Markdown, only numerically indexed ones, so I had to retype the lists as paragraphs in a couple of cases to get the alpha labels in. The import process is non-trivial and loses all of the section and subsection numbers and hierarchy, so I had to retype those all in by hand. The table import process also required some hand editing to get the entries to end up in the right columns, which I think I have done.
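The alpha-label workaround above (retyping lists as separate paragraphs) could in principle be scripted rather than done by hand. The following is only a minimal sketch under that assumption: the function names `alpha_label` and `alpha_paragraphs` are hypothetical, and the "a) " label style is a guess at the desired look, not something taken from the DFDL document or from GitBook.

```python
from string import ascii_lowercase


def alpha_label(index: int) -> str:
    """Return a, b, ..., z, aa, ab, ... for a zero-based index."""
    label = ""
    index += 1  # work with a one-based count internally
    while index > 0:
        index, rem = divmod(index - 1, 26)
        label = ascii_lowercase[rem] + label
    return label


def alpha_paragraphs(items):
    """Render items as blank-line-separated paragraphs with typed-in alpha labels.

    Each item becomes its own paragraph, so the labels are ordinary text
    and do not depend on the Markdown renderer's list numbering.
    """
    return "\n\n".join(
        f"{alpha_label(i)}) {text}" for i, text in enumerate(items)
    )


if __name__ == "__main__":
    print(alpha_paragraphs(["first entry", "second entry", "third entry"]))
```

Because the labels are plain text, they will not renumber themselves if an item is inserted later; the list would have to be regenerated.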

This should be enough to give you and the DFDL folks a feel for what a GitBook + GitHub document production workflow might look like, with a pretty realistic example. If you think this looks promising, the next step would be to simply start adding people to your teams on one or both of these sites. We can either clone these repositories to go from a test to a production version, or rename them if you think this is a good enough starting point. To add the comments you mentioned wanting to add, you can either navigate to the GitBook pages, hover over the text in the area where you want to add a comment, and use the internal GitBook commenting tool there, or, as I mentioned above, use the issues and pull requests within GitHub to organize incremental changes there.

Docs for the GitBook tool are at https://docs.gitbook.com . Let me know if I can help any further in setting this up, and in helping you add members to your teams on GitBook and GitHub.

Hope this helps!

Alan


On Sep 16, 2020, at 4:46 PM, Mike Beckerle <mbeckerle.dfdl at gmail.com> wrote:

Alan,

I will definitely take you up on this RIGHT NOW.

Attached is a MS-Word docx file which is my current working draft, but I've removed all change tracking and deleted all comments.

If you are successful I'd need to re-edit back in the comments, as they are placeholders for reviewers to find, or point out ongoing issues unresolved.

But I am very curious about this gitbook thing.

This MS-Word doc does have cross references in it, and hyperlinks to outside web docs, also footnotes. My hope is that conversion doesn't lose these, as they would be very tedious to recreate.

It is otherwise a big, but not terribly complex document. But MS-Word is just not up to the job any more.

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Owl Cyber Defense | www.owlcyberdefense.com
Please note: Contributions to the DFDL Workgroup's email discussions are subject to the OGF Intellectual Property Policy <http://www.ogf.org/About/abt_policies.php>



On Tue, Sep 15, 2020 at 3:50 PM Sill, Alan <Alan.Sill at ttu.edu> wrote:
I have gotten a lot better at use of GitBook and can now import Word documents more or less seamlessly, with only a small amount of formatting fix-up needed. Changes can be tracked as GitBook comments by simple markup within the text once uploaded, managed by branching and merging within GitBook, or when synchronized with the OGF GitHub, you can manage them with bidirectional syncing through GitHub pull requests and associated discussion tools.

I’m willing to take on uploading the current (or another good baseline) DFDL document(s) and making them available for your evaluation and analysis if you provide me with a clean, non-marked-up copy of any document or documents.

I think this is the future of OGF document production and that moving to tools like this will both allow us to modernize our infrastructure and leave the path forward to any future migration with minimal difficulties.

Thanks,
Alan

On Sep 15, 2020, at 12:26 PM, Mike Beckerle <mbeckerle.dfdl at gmail.com> wrote:

Two topics:

* change tracking
* hyperlinks

So...

Change Tracking:

I have reached the limits of MS-Word with change tracking in this roughly 250 page DFDL spec document.

I believe at this point I am going to have to give up on change tracking, i.e., providing drafts with accumulated changes since a prior major version marked with change bars.
I have found that a PDF export from MS Word with tracked changes, even with only "simple markup", is quite illegible, with heading numbers not even appearing on the same line as their heading, etc. I believe this is an interaction between renumbering of sections and change tracking. Either way, MS Word crashes often and I am worried about losing work or corrupting the document. I found already that in revision r22, a cross-reference to the section about recoverable errors was not inserting a reference to that section, but rather was repeating the entire contents of the section at each point of cross-reference. I had to hand-delete all of these as I encountered them when going through the reviewer comments page by page.

I think at this point we're forced to greatly reduce use of MS-Word change tracking, and if a reader wants to study the changes between two revisions, they have to fire up MS-Word, and use it to compare two working draft versions of the document.
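Outside of MS-Word, comparing two working drafts could be approximated with a plain-text diff. The sketch below assumes each revision has first been exported to plain text (or Markdown); the default file names `r27.txt` and `r28.txt` and the function name `diff_revisions` are hypothetical. It uses Python's standard `difflib` module and says nothing about Word's own compare feature.

```python
import difflib


def diff_revisions(old_text: str, new_text: str,
                   old_name: str = "r27.txt",
                   new_name: str = "r28.txt") -> str:
    """Return a unified diff between two plain-text exports of a draft.

    An empty string means the two texts are identical.
    """
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(old_lines, new_lines,
                             fromfile=old_name, tofile=new_name)
    )


if __name__ == "__main__":
    # Tiny made-up inputs, only to show the shape of the output.
    old = "Section one\nSome draft sentence.\n"
    new = "Section one\nSome revised sentence.\n"
    print(diff_revisions(old, new))
```

A line-based diff like this reports whole changed lines, so it is most readable when the exported text has one sentence or short paragraph per line.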

So the version I am going to push up for consideration soon (which is probably r27 or r28) will have change tracking, and also I will create a version with all changes accepted. Further changes will happen in the one with all (current) changes accepted, creating a new, smaller set of changes, not an accumulated set of all changes since the prior official draft. In addition, for various large changes like section moves, I plan to accept them, and just add a comment bubble to remind reviewers to read the section, as having the whole change visible with strikethrough of the deleted and colored/underlined text for the insertions ruins the flow of the document.

The only reliable viewer for the document, which can show the tracked changes in "simple markup" so that you see change bars on sides of pages only, is MS Word itself. Creating a PDF with "simple markup" doesn't work right.

Hyperlinks:

I have determined that MS-Word cross references are simply NOT converted into navigable hyperlinks when the document is output as an HTML document. This appears to be simply a MS-Word limitation. The same limitation exists in OpenOffice. A PDF gets navigable hyperlinks, but not an HTML output.
Furthermore, I have determined that an MS-Word Index results in a printable index, but again there are no navigable links from the index to the referenced pages/locations.

Based on this I am going to abandon, for now, creating an easily used HTML version of the spec, and stick with just PDF.


Mike Beckerle | OGF DFDL Workgroup Co-Chair | Owl Cyber Defense | www.owlcyberdefense.com
Please note: Contributions to the DFDL Workgroup's email discussions are subject to the OGF Intellectual Property Policy <http://www.ogf.org/About/abt_policies.php>

--
 dfdl-wg mailing list
 dfdl-wg at ogf.org
 https://www.ogf.org/mailman/listinfo/dfdl-wg

<gwdrp-dfdl-v1.0.5-r28-no-changes-no-comments.docx>
