Thursday, December 13, 2007

Converting to XML - Some Point-form Pros and Cons

I recently converted some user documents from MS Word to XML for a medical device company that intends to author its future end-user documentation (printed, embedded, and online) in XML. I want to share some of the triumphs and challenges we met along the way.

Pros
  • Reuse becomes so much easier - The same 15 steps to 'prep' a patient were used in four different manuals, and all we had to do was point to the content. When a change was made in one manual, it was automatically updated in the other three.

  • Editing process is shorter - Chapters or pieces of chapters that are shared between the different models are edited only once, which reduces the cost of editing.

  • Global changes are a snap - When content such as product names, revisions, corporate names, logos, and styles is saved as separate XML files that are referenced instead of embedded, it can be updated in one place and changed in every document that points to it.

    Imagine how much more valuable your company becomes as a salable entity when the purchasing company can re-brand all documentation in a few minutes and then republish everything with the new corporate image.

  • Consistency is easier to enforce - If you are using an editor that validates DITA, it becomes easier to uphold authoring standards. Authors who must validate their content are more likely to follow the standards set by DITA and enforced by the software. Note - Most authors know there are ways to work around validation errors for content that does not fit the standard, but doing so is usually more difficult than following the standard in the first place.

  • OASIS and DITA bring down the costs of using XML - A few years ago I wouldn't even have considered suggesting that my clients tackle XML unless they happened to have an expert on staff with nothing better to do than write XSL and play with the DITA Open Toolkit, or were prepared to invest upwards of $100k in a great publisher and editor.

    With the advances in DITA, the general acceptance of the standard, the specializations that are constantly being improved, and the new tools hitting the market that are priced for SMEs, the benefits of implementing XML can often outweigh the costs and time it takes to embrace the technology.

  • Future cost savings - Once you have your XSLT developed, your authoring standards in place, and the process for authoring and editing understood, there are so many cost savings to be realized over MS Word authoring that they make all the Cons worth the effort.

    When the client said they wanted one manual that had only common content in it, created from the four vertical market manuals, we returned to our publisher, set the conditions to exclude all product-specific content, added a short description to the introduction to describe the manual, and then published. This took a total of 11 minutes and we had a reasonable manual. Try to do that with four 170-page manuals in MS Word.
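
The common-content manual comes down to conditional filtering: anything tagged for a specific product is dropped at publish time. The Python sketch below only illustrates the idea and is not how our publisher works internally. It assumes product-specific elements carry the DITA `product` attribute; the topic content, the product values, and the function names `strip_product_specific` and `common_steps` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic: two steps are common, two are product specific.
TOPIC = """\
<task id="prep_patient">
  <title>Prepare the patient</title>
  <taskbody>
    <steps>
      <step><cmd>Confirm the patient's identity.</cmd></step>
      <step product="model_a"><cmd>Attach the Model A sensor.</cmd></step>
      <step product="model_b"><cmd>Attach the Model B sensor.</cmd></step>
      <step><cmd>Record the start time.</cmd></step>
    </steps>
  </taskbody>
</task>
"""


def strip_product_specific(root):
    """Remove, in place, every element that carries a product attribute."""
    # Snapshot the tree first so removals do not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("product") is not None:
                parent.remove(child)
    return root


def common_steps(xml_text):
    """Return the command text of the steps that survive filtering."""
    root = strip_product_specific(ET.fromstring(xml_text))
    return [cmd.text for cmd in root.iter("cmd")]


if __name__ == "__main__":
    for line in common_steps(TOPIC):
        print(line)
```

In a real DITA workflow the filtering rules would normally live in the publishing tool's condition settings rather than in a script, but the effect on the content is the same.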

Cons
  • Painful - No matter how prepared you are, there is going to be some pain. There are new processes, new tools, and new ways of thinking that must be embraced not only by the person looking after or authoring the documentation but also by all the people in the company who will touch the content as it moves to publication.

    Halfway through the project, the DITA standard for chapter books was released with DITA 1.1, and we had to go through a software upgrade for our editor and publisher as well as a second review of the manual styles, a second quote for style customization, and the issues associated with any release of new software.

    DITA is by no means a mature standard, and there will be many iterations that will cause pain.

  • Editing in PDF format was slower - If the content is published to PDF and then edited, the work is very similar to editing each manual separately. There is a benefit to having the content edited directly at the XML level in a content management system so that changes can be tracked and accepted or rejected. The problem is that the person editing must be able to edit XML code or have the tools to view XML in a WYSIWYG environment.

  • Tools add to costs - Yes, there are loads of free products out there, and many are feature rich and easy to use, but if you were thinking you would publish easily using open-source DITA tools, you might be in for a programming ride. I had a software engineer look at the implementation of the DITA Open Toolkit and claim he had "flashbacks to his Unix coding days", where only the people closest to the program could use the application.

  • XSLT coding required - The XML authoring is the easy part, and once you have determined how you will publish your content, the publishing is quite painless as well. The styling, however, can be very complex and somewhat expensive if you decide that the DITA standard for style does not suit your branded corporate image.

  • DITA standards don't apply to all - If you are like most good companies, you have a brand and image to uphold. Until just recently (August 2007), DITA did not provide chapter numbering as a standard in book publications. There are other features that your company may require for publishing a book or help file that are not part of the DITA standards.

2 comments:

Tom Johnson said...

Barbara, I really enjoyed this post and I'm glad to find your blog. I'm curious to know how you solve the problem of chunking your information into small topics without ending up with a gigantic TOC. I wrote about this problem here: see my blog. Is it feasible to use a lot of conrefs in DITA authoring to solve the problem?

Barbara said...

Hi Tom:
Thanks for the feedback.
I have not used hotspots so I am not really familiar with the issues you are having in XML. As for the TOC getting bloated... it could be a problem if every chunk of content was mapped at the same level. The software I was using to publish only used the top three levels of nested files from the ditamap to create the TOC. The XDocs publisher used the map file hierarchy to create the first, second, and third level titles (maybe one level too many, but it was what the client had). This means you may have to create a top-level introduction to a bunch of grouped, chunked data for that specific heading, or change the XSLT to only publish the top one or two levels of headings. If I had gone to the fourth level headings the TOC would have been crazy long. I am not familiar with how Flare works but you may have the option of only taking the top one or two levels of nested headings.
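
To make the depth limit concrete, here is a rough Python sketch that walks a ditamap and keeps only the top levels of nested `topicref` elements. It is not how the XDocs publisher or Flare works internally; the map content, the file names, and the `toc_entries` function are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical ditamap with four levels of nesting under the first topic.
DITAMAP = """\
<map>
  <title>User Manual</title>
  <topicref navtitle="Getting Started" href="start.dita">
    <topicref navtitle="Preparing the Patient" href="prep.dita">
      <topicref navtitle="Attaching Sensors" href="sensors.dita">
        <topicref navtitle="Sensor Placement Detail" href="placement.dita"/>
      </topicref>
    </topicref>
  </topicref>
  <topicref navtitle="Maintenance" href="maint.dita"/>
</map>
"""


def toc_entries(map_xml, max_depth=3):
    """Return (depth, title) pairs for topicrefs down to max_depth levels."""
    root = ET.fromstring(map_xml)
    entries = []

    def walk(node, depth):
        if depth > max_depth:
            return  # anything nested deeper stays out of the TOC
        for ref in node.findall("topicref"):
            entries.append((depth, ref.get("navtitle")))
            walk(ref, depth + 1)

    walk(root, 1)
    return entries


if __name__ == "__main__":
    for depth, title in toc_entries(DITAMAP, max_depth=3):
        print("  " * (depth - 1) + title)
```

Dropping `max_depth` to 2 would trim the TOC further without touching the topics themselves, which still get published.
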
If the chunk is going to be reused, then yes, I may consider using a conref for that small piece of information, but not likely for an entire task or concept.
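
For anyone new to conrefs, the sketch below shows roughly what happens when one is resolved: the element that carries the `conref` attribute is swapped for a copy of the element it points to. This is a simplified Python illustration, not what a DITA processor actually runs. It assumes targets of the form `file#topicid/elementid`, keeps the "files" in a dictionary, and does not resolve conrefs nested inside pulled-in content. The file name, ids, and `resolve_conrefs` are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical shared file holding the reusable step.
LIBRARY = {
    "shared.dita": """\
<task id="shared_prep">
  <title>Prepare the patient</title>
  <taskbody>
    <steps>
      <step id="confirm_id"><cmd>Confirm the patient's identity.</cmd></step>
    </steps>
  </taskbody>
</task>
""",
}

# Hypothetical manual topic that points to the shared step.
MANUAL = """\
<task id="manual_a_setup">
  <title>Set up the procedure</title>
  <taskbody>
    <steps>
      <step conref="shared.dita#shared_prep/confirm_id"><cmd/></step>
      <step><cmd>Power on the device.</cmd></step>
    </steps>
  </taskbody>
</task>
"""


def resolve_conrefs(root, library):
    """Replace each conref element with a copy of its target, in place."""
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            target = child.get("conref")
            if target is None:
                continue
            filename, _, fragment = target.partition("#")
            element_id = fragment.split("/")[-1]
            source = ET.fromstring(library[filename])
            # Raises StopIteration if the id is missing; fine for a sketch.
            match = next(e for e in source.iter() if e.get("id") == element_id)
            replacement = copy.deepcopy(match)
            replacement.tail = child.tail  # keep surrounding whitespace
            parent[index] = replacement
    return root


if __name__ == "__main__":
    resolved = resolve_conrefs(ET.fromstring(MANUAL), LIBRARY)
    for cmd in resolved.iter("cmd"):
        print(cmd.text)
```
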
As for your hotspots issue, in pure XML I cannot imagine that there is no way to do this. This seems to be the exact reason for using XML. Each link leading to a separate XML file containing one task or concept. Interesting issue though.