Friday, January 11, 2008

CMS or Version Control - which one for documentation?

Version control has been, and still is, used by thousands of companies to manage their software design. Programmers check out files from a server to their local client. They can then make changes and update or check the files back in to the server at any time, as long as the connection between the client and the server is available. The version control software keeps the change history. Version control is also often used by the documentation departments of small companies to track changes in documents. So why would anyone need a Content Management System (CMS) to look after their documentation? I believe that now that CMS tools are becoming more available and cost effective, this question is being seriously asked, but the habits of the past make it hard for us to justify the cost of the change.

In the 1960s and earlier, technical writing was a very narrow field, composed mostly of defense and aerospace documentation writers. It wasn't until the age of the personal computer and the explosion of software applications that the technical writer became an invaluable person in the design, distribution, and marketing of these new products. Before this age, documents were paper based and stored in boxes. Design history was a function of process and procedure, tracked by a paper-driven system. Even as late as eleven years ago, I can clearly remember the physical effort involved in updating a procedure or manual. It required:


  • many trips to the active and archived document filing system
  • walking drafts around for sign-off
  • walking beta versions around for testing
  • filing drafts
  • archiving current documents to a physically different filing cabinet located in a different room
  • holding a release meeting to have the traveler signed off by all officials
  • filing the final released document in the active document location.

The process itself is not unlike what is used today, but we did not even have internal email, so every question, response, check, and issue had to be physically walked from person to person.


As computer memory increased and word processing applications became more accessible, storing documents electronically became more feasible, but the systems available to the average company were built for tracking the history of code changes. Using version control to track document changes was really a stop-gap solution, used for lack of an alternative.

I don't know the exact history of CMS applications, but I do remember witnessing the increase in hype within the technical writing community about XML and CMS roughly 9 years ago. With the responsibilities of a technical writer changing quickly, the hype was more of a plea for a way to organize, share, and search the ever-growing volume of paperless content. Content writers were now often responsible for coding (HTML, XHTML, XML, JavaScript, CSS, XSL), designing (print, web, help, training, blogs, FAQs, knowledge base), doc management, graphics, web marketing, regulatory, usability, user validation, applications expertise and evaluation (Word, FrameMaker, WebWorks, RoboHelp, Flare, CMS software, bug reporting, version control, screen capture, publishing, e-mail) and finally publishing.

And again I ask, "what is the value of a CMS to a company?"

Content is not code. Software code is written for a specific function, used in a specific location, in response to a specific action or inaction of the hardware, user interface, or the software itself. Content is the description of a concept or procedure that can be used in multiple ways to define a product or company. The design of software code cannot be changed without a consequence to the quality of the product. Content can be written in multiple ways to mean the same thing, or sometimes written exactly the same way to mean something completely different. Changing content may or may not affect the quality of a product. At best, the result is merely inconsistent wording.

Kevin Ballard of L-3 Communications says, "CMS allows authors to maintain unique names" as well as "maintain the [relationship of] links" between document files. I believe he is right and here is why. As the volume of the content increases, the location of specific wording becomes impossible to track and remember. Content reuse is only possible if the relationship between the new document and the current content is known.
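
To make "knowing the relationship" concrete, here is a minimal sketch of a where-used index. It is only an illustration, not how any particular CMS is built; the document names and fragment ids are hypothetical.

```python
from collections import defaultdict

# Hypothetical documents and the shared content fragments each one pulls in.
documents = {
    "pump_user_manual": ["warranty_statement", "pump_overview"],
    "monitor_quick_start": ["warranty_statement"],
    "pump_brochure": ["pump_overview"],
}


def build_where_used(docs):
    """Return a mapping from fragment id to the sorted documents that use it."""
    index = defaultdict(set)
    for doc_name, fragment_ids in docs.items():
        for fragment_id in fragment_ids:
            index[fragment_id].add(doc_name)
    return {fragment_id: sorted(names) for fragment_id, names in index.items()}


where_used = build_where_used(documents)
# Every document that would be affected by editing the warranty statement.
print(where_used["warranty_statement"])
```

With an index like this, a writer can see every document that a piece of wording appears in before changing it.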


File Relationships
In a version control system, files are checked in to a file system designed to store code. A file can be checked out to any location on a client's computer. In a CMS, the relationships within the file system are maintained when files are checked out. This means that when ditamaps are created for XML files, or images used in marketing are imported into a user manual, the relationships are the same on the author's computer as they are once the files are checked in. The location of a file is known relative to the files it is linked to, not as an absolute file location.
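
The idea of a link that is relative to the linking file can be sketched as follows. This is an assumption-laden illustration, not the behavior of a specific product: the two root folders, the manual, and the image path are all hypothetical.

```python
import posixpath


def resolve_link(document_path, link):
    """Resolve a link relative to the folder that holds the linking document."""
    folder = posixpath.dirname(document_path)
    return posixpath.normpath(posixpath.join(folder, link))


# Hypothetical locations of the same library on the server and on an author's machine.
server_root = "/cms/library"
author_root = "/home/author/checkout"

# Hypothetical manual that links to a marketing image with a relative path.
manual = "manuals/pump/user_manual.xml"
image_link = "../../images/marketing/pump_front.jpg"

on_server = resolve_link(posixpath.join(server_root, manual), image_link)
on_author = resolve_link(posixpath.join(author_root, manual), image_link)

# Measured from each root, the link points at the same place in both copies.
print(posixpath.relpath(on_server, server_root))
print(posixpath.relpath(on_author, author_root))
```

Because the link is stored relative to the manual, it still resolves as long as the folder structure is checked out intact. A link stored as an absolute server path would not exist on the author's machine.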

Unique Names
Files saved to a version control system use specialized naming structures to meet coding standards. This nomenclature often makes it difficult to determine the actual content or purpose of the document.

Metadata
Another very important function of a CMS is the ability to search for and find information using metadata. The term "information mining" is becoming very prevalent in this age of information. The more products and models there are, the more content is created and the harder it becomes to find pertinent information. Text files and some word processing files can be easily searched, but binary files like jpg, cdr, and Quark files cannot.

In a CMS environment, files can be categorized using metadata so that the information can be quickly grouped and located. For example, let's say we are a moderately sized medical device company that has 80 products on the market. Within those 80 products there are three target markets (hospitals, government, and personal use). How would you search for all products that can be sold to both the hospital and government markets? Well, if it was a Word document and you happened to put the target market in the document content, then you will find those with a simple search using Windows Explorer. But what about marketing documents designed in Quark or InDesign, or jpg images, FAQ responses in XML, and HTML help topics?
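
As a rough sketch of that medical device example, the search can be run against metadata attached to each file rather than against the file contents. The file paths, product names, and the `ContentItem` record below are invented for illustration; a real CMS would have its own schema and query interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentItem:
    """A file in the library plus the metadata attached to it (hypothetical schema)."""
    path: str
    product: str
    markets: frozenset


def find_by_markets(items, required_markets):
    """Return the items whose metadata lists every one of the required markets."""
    required = frozenset(required_markets)
    return [item for item in items if required <= item.markets]


# Hypothetical library mixing text, layout, image, and help files.
library = [
    ContentItem("manuals/pump_a.doc", "Pump A", frozenset({"hospital", "government"})),
    ContentItem("brochures/pump_a.indd", "Pump A", frozenset({"hospital", "government"})),
    ContentItem("images/monitor_b.jpg", "Monitor B", frozenset({"personal"})),
    ContentItem("help/scanner_c.html", "Scanner C", frozenset({"hospital", "government", "personal"})),
    ContentItem("faq/monitor_d.xml", "Monitor D", frozenset({"hospital"})),
]

matches = find_by_markets(library, {"hospital", "government"})
products = sorted({item.product for item in matches})
print(products)
```

The file type no longer matters here: the InDesign brochure and the HTML help topic are found the same way as the Word manual, because the query never opens the files.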


Result
As the amount of content in a company grows over the years, finding specific information becomes more difficult. Without a way to search content across an enterprise, information is lost and then rewritten, causing similar products in current release to have different wording. If technical information is explained differently in different documents, there is a potential for misinterpretation, incorrect product use, product failure, product damage or, worse, injury. If content is updated in one document but not in every location where it resides, the problem can continue to haunt the company. The location of all content must be known to ensure consistent, reliable documentation and to guarantee the reliability and trustworthiness of a company. Companies are built on brand, brand is built on trust, and trust is not implied or expected; it is earned.

A CMS is a tool required for this information age. Without this type of tool there is little hope of finding our way through the mountains of information we create every day.

5 comments:

Anonymous said...

Unit of work is a construct that highlights an important difference between a source code manager (SCM) and a content management system (CMS). In an SCM the unit of work is a release. You'd store all the executables required to build a given release as a stand alone package with no dependencies such that they could be compiled into an executable and released. In a CMS the unit of work is a published output composed from files that are likely living independent lives. The ability to manage "referenced to" and "referenced from" for each file is critically important for a CMS and non-existent in an SCM. As a result a CMS is designed differently than an SCM. An example will highlight the consequences of this design difference. Consider a caution that is used across many products and therefore appears in many documents (e.g., do not submerge power cord in water or severe shock will result). If you’re using an SCM to control your content, that caution would be copied and pasted many times over. In a CMS it would be written once and used by reference. Can you afford the time it takes to track down and change all occurrences of that caution? Can you manage the risk of missing one of those cautions and having it result in a product liability lawsuit? Where the economies and certainty of reuse are required, you really can’t afford to do a CMS’s job with an SCM. As technical communicators we’re under exponentially increasing pressure to “write once and use many”, so we need to articulate which is the right tool for the job at hand.

Anonymous said...

Great post. This gave me a much better understanding of how important a CMS really is for an efficient documentation system.

Yappa said...

I arrived at this post really late (4 years in fact), so this may never be read, but what the hay...

To compare a version control system and CMS, you need to be clear about what you're comparing: not XML in the CMS versus Word in the source control (as seems to be going on here), but XML in both.

Also, doc files in a version control system can and should have meaningful file names. In fact, file names for the code should too! There's no technical reason to have non-meaningful names.

Searching is actually usually easier in version control, because you can do something simple like grep. Likewise, you can easily write scripts to make global changes across a great many files at once.

CMSs should have easy search, but in my experience they don't. They are relational databases and tend to be opaque. Because the doc market for CMSs is relatively small, CMSs are optimized for a wide range of products and they never seem to work well for whatever I'm working on. There is a steep learning curve to using one effectively, especially for things like search.

I have never used a CMS that made it possible or easy to make global changes across a doc set, either.

Responding to a comment: CMSs provide no advantage in the ability to reuse topics, conrefs, file references, etc. in DITA or DocBook.

So when is a CMS useful? Perhaps in large, complex doc environments with lots of writers. For example, if you are a hardware manufacturer with many product models that require heavy reuse of documentation, and the docs are localized into many languages.

If you buy a CMS, you'd better have a skilled tools team to troubleshoot, as well, because (again in my experience) CMSs have many bugs and there are many glitches.

Finally, they're brutally expensive.

Barbara said...

Thanks for the feedback, Yappa. It has been a long while since I posted this and about 2 years since I last worked in the industry. I am sure the technology has changed significantly in those 2 years. I appreciate your view, and even though this is an old post, thanks for taking the time to make your voice heard.
Cheers,
Barb