Single source publishing

Single source publishing, also known as single sourcing, allows the same content to be used in different documents (deliverables) or in various formats. The labour-intensive and expensive work of editing need only be carried out once, on one document. Further transformations are carried out mechanically, by automated tools. New output formats can also be added in the future, as the organization's needs change.

Single source publishing offers benefits in at least three areas. Their relative emphasis depends on the application and the type of information being communicated, and will influence the choice of tools used:

Technical publishing formats

Single sourcing allows the creation of documents in various technical formats from the same content. For example, a company might use the same content in online help, a printed PDF document, a web page and an Interactive Voice Response system. With a single source solution, the company only has to update the one source file for the content and regenerate the four outputs.

Where the differences are purely those of formatting, this is a simple process to implement.
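As a minimal sketch of this idea (not any particular tool's implementation), the Python below keeps one source and regenerates every output from it. The source layout, the `render_html` and `render_text` functions, and the sample text are all hypothetical, chosen only for illustration.

```python
from html import escape

# Hypothetical source: a title plus a list of paragraphs, stored once.
SOURCE = {
    "title": "Resetting the device",
    "paragraphs": [
        "Hold the power button for five seconds.",
        "Release the button when the light blinks.",
    ],
}


def render_html(doc):
    """Render the source as a minimal HTML fragment."""
    parts = [f"<h1>{escape(doc['title'])}</h1>"]
    parts += [f"<p>{escape(p)}</p>" for p in doc["paragraphs"]]
    return "\n".join(parts)


def render_text(doc):
    """Render the same source as plain text, e.g. for a text-only channel."""
    underline = "=" * len(doc["title"])
    return "\n".join([doc["title"], underline, *doc["paragraphs"]])


# One entry per output format; adding a format means adding one renderer.
RENDERERS = {"html": render_html, "text": render_text}


def publish(doc):
    """Regenerate every output from the single source."""
    return {name: render(doc) for name, render in RENDERERS.items()}


outputs = publish(SOURCE)
```

Editing `SOURCE` and calling `publish` again updates all outputs at once; only the renderers know anything about formatting.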

Content assembly

As an example, a company may have several products with individual user guides, all of which share a common procedure. Rather than maintain duplicate versions of this procedure (one in each guide) the guides can share the content, merging it into the document at the time of publication. Eliminating duplicate content reduces the cost of maintaining it, and also improves future consistency. A change to this procedure need only be made once, then it will appear in all of the outputs.
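The shared-procedure case can be sketched as follows. This is an illustrative assumption about how such a system might be organised: the `SHARED` store, the `{"include": ...}` marker, and the guide names are hypothetical and do not come from any specific product.

```python
# Hypothetical store of reusable content, keyed by an identifier.
SHARED = {
    "install-battery": "Open the cover, insert the battery, close the cover.",
}

# Each guide mixes its own text with references to shared content.
GUIDES = {
    "Model A guide": ["Model A has a single dial.", {"include": "install-battery"}],
    "Model B guide": ["Model B has a touch screen.", {"include": "install-battery"}],
}


def assemble(sections, shared):
    """Resolve include markers at publication time and join the sections."""
    resolved = []
    for section in sections:
        if isinstance(section, dict):
            resolved.append(shared[section["include"]])
        else:
            resolved.append(section)
    return "\n\n".join(resolved)


def publish_all(guides, shared):
    """Build every guide from its own sections plus the shared content."""
    return {title: assemble(sections, shared) for title, sections in guides.items()}
```

Because the procedure lives only in `SHARED`, changing it there and republishing updates every guide that includes it.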

Translation and localization

An increasing need in a globalised market is that of localizing products and their documentation to suit local markets. This most obviously encompasses translation, but it may also extend to simple formatting (dates or currency) or to the selection of culturally appropriate examples (more usually, the avoidance of culturally or religiously inappropriate content).

When the text is separated from the overall structure of a document, the individual text units can be translated on their own, independently of that structure. This supports the provision of multiple localizations at lower cost. The structure or formatting may also be modified in the future without the translation work having to be redone.
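One way to picture this separation, as a sketch under assumed names, is a structure that refers to text units only by key, with one string table per language. The keys, the `STRINGS` tables, and the date formats below are hypothetical examples.

```python
from datetime import date

# The document structure refers to text units by key, never by wording.
STRUCTURE = ["heading.warranty", "body.warranty_start"]

# One string table per locale; translators only ever touch these.
STRINGS = {
    "en": {
        "heading.warranty": "Warranty",
        "body.warranty_start": "Your warranty starts on {date}.",
    },
    "de": {
        "heading.warranty": "Garantie",
        "body.warranty_start": "Ihre Garantie beginnt am {date}.",
    },
}

# Simple formatting differences (here, dates) are also kept per locale.
DATE_FORMATS = {"en": "%d/%m/%Y", "de": "%d.%m.%Y"}


def render(locale, start):
    """Fill the shared structure with the text units for one locale."""
    formatted = start.strftime(DATE_FORMATS[locale])
    return [STRINGS[locale][key].format(date=formatted) for key in STRUCTURE]
```

Reordering `STRUCTURE` or changing how the lines are laid out requires no change to the translated strings.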

Context of use

A common mistake in single sourcing systems is to confuse transforms between technical formats (e.g. PDF vs. HTML) with different modalities of use. The technical transformation is mechanical and easily implemented, but the different ways in which a document may be used affect its authoring from the outset, and this is not easy to automate, or to automate well. As an example, a user manual as a PDF document may have a linear narrative of 50 pages, while its comparable online help system may present the same paragraphs restructured into 100 pages, each of which must be usable as a stand-alone topic and requires extensive links to the others. A "tutorial" might present much of the same content, but in less depth and following a linear narrative. This requirement to support multiple contexts is not easily met by mere programming.

Implementation

Choosing the tool

Ideally, the tools used for single-sourcing do not require human intervention to customize the formatting or content for the various outputs. There are many approaches to single-sourcing. The master information can be stored in any number of ways. These might include word processing documents, databases, XML files, content management systems, GUI builders, or spreadsheets. Various tools can then be used to extract the information from the master document and format it into various output formats or modalities.

A number of tools can be used to generate the various modalities from the master document. Programming languages provide the greatest flexibility, but also provide the least amount of direct support. If the output can pass through an XML format during output, then XSLT and XSL-FO can be used to transform the document to various forms (see DocBook XSL). Code generators such as CodeSmith or a graphical stylesheet design tool such as Altova's StyleVision can be used to make transforms more easily than a general purpose programming language. Content management systems may have various output modalities that they support directly.
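To illustrate the trade-off at the general-purpose-language end, the sketch below transforms a small XML master into two outputs using only Python's standard `xml.etree.ElementTree` module. It is a stand-in for what an XSLT stylesheet would normally do, not an XSLT example, and the `topic`, `title`, and `step` element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical master document; the element names are assumptions.
MASTER_XML = """
<topic>
  <title>Saving a file</title>
  <step>Choose File, then Save.</step>
  <step>Enter a name and confirm.</step>
</topic>
""".strip()


def topic_to_html(xml_text):
    """Build an HTML fragment: a heading followed by an ordered list."""
    topic = ET.fromstring(xml_text)
    body = ET.Element("div")
    ET.SubElement(body, "h2").text = topic.findtext("title")
    steps = ET.SubElement(body, "ol")
    for step in topic.findall("step"):
        ET.SubElement(steps, "li").text = step.text
    return ET.tostring(body, encoding="unicode")


def topic_to_text(xml_text):
    """Build a plain-text version with numbered steps."""
    topic = ET.fromstring(xml_text)
    lines = [topic.findtext("title")]
    lines += [
        f"{number}. {step.text}"
        for number, step in enumerate(topic.findall("step"), start=1)
    ]
    return "\n".join(lines)
```

Every output format needs its own hand-written function here, which is the "least direct support" the paragraph above describes; a stylesheet language or dedicated tool moves that work into declarative templates.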

At the top end, in both price and functionality, there are programs specifically designed to support single sourcing. These will require less effort to configure and manage than more general purpose tools. At the very least, they can give you ideas about what sorts of features you would want in your in-house single sourcing solution. (See later section that lists several popular single-sourcing tools.)

Designing the master

One of the more difficult parts of single sourcing is designing the master formats. To do this, look for sets of similar things that all need to be documented: for example, menus and dialogs in an application, classes in an application programming interface (API), widgets in a line of similar products, objects in a museum collection, parts of complex machines, or products for sale in an online store.

Once you have identified a set of similar items, ask what characteristics the items have in common. A computer programmer or analyst skilled at object-oriented programming can be helpful at this stage in identifying and organizing common attributes. Store information about each object and attribute in your master file. How to do this varies considerably depending upon the nature of your master file.

In choosing the master file format, keep in mind those who will maintain the actual words. Often, they will be technical or professional writers. Choose a tool that they are familiar with or can learn quickly. Raw XML, for example, might be difficult for many writers to enter accurately by hand.

Storing your information in small, well-defined units makes it much easier to reshape the data for different output modalities. For example, you don't want your master file to consist of pages upon pages of unorganized text about the object. You generally want to know such things as its name, its category, a short description, a long description, and perhaps how it is used in a given context. For a museum item, for example, you would want its catalog number, the collection it's from, its age, where it was collected from, how it was acquired, its value if known, its use, its provenance, its historical context, etc. With all of these individual pieces of information, you can output cards for use in displays, descriptions for the museum's web site, and printed manuals describing specific collections. If you just have pages of unorganized text, this becomes much more difficult to manage from a single source perspective.
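The museum example can be sketched as a record with one field per piece of information. The `MuseumItem` class, its field selection, and the two output functions are hypothetical and cover only a few of the attributes listed above.

```python
from dataclasses import dataclass


@dataclass
class MuseumItem:
    """One master record; each attribute is stored separately."""
    catalog_number: str
    name: str
    collection: str
    short_description: str
    long_description: str
    provenance: str


def display_card(item):
    """A small card for a display case: name, short text, catalog number."""
    return f"{item.name}\n{item.short_description}\nCat. no. {item.catalog_number}"


def web_entry(item):
    """A fuller entry for the web site, reusing the same record."""
    return {
        "title": item.name,
        "collection": item.collection,
        "body": item.long_description,
        "provenance": item.provenance,
    }


# Invented sample data, for illustration only.
vase = MuseumItem(
    catalog_number="EX-0001",
    name="Example vase",
    collection="Example collection",
    short_description="A glazed ceramic vase.",
    long_description="A glazed ceramic vase with a painted floral band.",
    provenance="Donated by an example donor.",
)
```

Each output picks the fields it needs; with a single block of unorganized text there would be nothing to pick from.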

If you have a database containing information about the objects, study it. It will contain many ideas about what might be interesting about the items you are documenting. A database programmer can also help you to design your master files.

There may be multiple dimensions to the master data. For example, you might have the data translated into various languages. Every time you add a dimension, you make maintenance of the master data exponentially more difficult. However, if the problem you are solving warrants multiple dimensions, then it's also likely a good candidate for single sourcing.
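To get a feel for how quickly dimensions add up, the following sketch counts the variants of a single item that would have to be kept consistent. The dimension names and values are hypothetical.

```python
from itertools import product
from math import prod

# Hypothetical dimensions along which one master item can vary.
DIMENSIONS = {
    "language": ["en", "de", "ja"],
    "format": ["pdf", "html"],
    "audience": ["user", "administrator"],
}


def variant_count(dimensions):
    """The number of variants is the product of the dimension sizes."""
    return prod(len(values) for values in dimensions.values())


def variants(dimensions):
    """Enumerate every combination as a dict, one per variant."""
    names = list(dimensions)
    return [dict(zip(names, combo)) for combo in product(*dimensions.values())]
```

With three languages, two formats, and two audiences, each item already has 12 variants; adding a fourth dimension multiplies that number again.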

Transformation

Once you have identified these objects and attributes, think about how they will be presented in each output modality. Do mock ups of your data for several objects in each modality you are thinking about supporting. If you can translate the data by hand from your master format to each output modality with the control you are interested in having, then you're on the way to a successful system. If you demand ultimate control in one modality, you might consider that modality as your master, or part of your master. For example, if a PDF manual is the most important modality, you might consider FrameMaker as your master format since many technical writers are familiar with it, it does a good job of creating attractive pages, and it has tags that are easily transformed into other modalities using the single source tools mentioned above. You might also keep in mind that translations to other languages are often outsourced, so a common format that can be easily used by translators is frequently important.

Whatever master format you choose, you should provide templates for each type of object you wish to document. This helps maintain consistency over a collection of master objects. It is advisable to design your master format very carefully before beginning to use it. Going through all of your master objects making changes to conform to a change you think of later can be very expensive, tedious, and error-prone. Planning ahead and thinking about your organization's needs are very important to this process.
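A template can be as simple as a list of required fields per object type, checked before a record is accepted into the master. The sketch below assumes records are stored as dictionaries; the `TEMPLATES` table and the field names are hypothetical.

```python
# Hypothetical templates: the required fields for each type of object.
TEMPLATES = {
    "dialog": ["name", "purpose", "fields"],
    "menu": ["name", "items"],
}


def missing_fields(record):
    """Return the required fields that are absent or empty in a record."""
    required = TEMPLATES[record["type"]]
    return [field for field in required if not record.get(field)]


def check_all(records):
    """Map each incomplete record's name to the fields it still needs."""
    problems = {}
    for record in records:
        missing = missing_fields(record)
        if missing:
            problems[record.get("name", "<unnamed>")] = missing
    return problems
```

Running such a check on every master object makes inconsistencies visible early, while they are still cheap to fix.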

Human considerations

Single sourcing can be accomplished successfully in many contexts where you have a large number of similar items to document. It requires considerable up-front planning and careful training of staff members. Alternatively, effort can be put into creating a program or set of online forms for inputting the raw data in a prompted way so that training can be minimized. Balancing these efforts to maximize productivity is not a trivial task, but given a project of reasonable size, it can pay off well in both flexibility and return on investment (ROI). Most single-sourcing projects, if they fail, fail because of inadequate training, poor planning, or resistance from staff members.

Popular tools

The following are commonly used as end-step publishing frameworks that support transformations to multiple formats.

Based on the earlier Cocoon, Forrest can aggregate multiple sources as well as serving multiple targets.
An early example of pipelined processing and a framework for XSLT, Cocoon is still widely used.

The following are applications for designing and structuring multi-format output based on a single source.

Adobe® Technical Communication Suite 3 is a single-source authoring toolkit with multichannel, multidevice publishing capabilities. It combines Adobe FrameMaker® 10 for developing standards-compliant content, Adobe RoboHelp® 9 and Adobe Captivate® 5 workflows for publishing in various formats, reviewable PDF files for collaboration, Adobe Photoshop® CS5 for incorporating images, and Adobe Captivate 5 for adding demos and simulations.
Graphical stylesheet designer used for creating template-based designs for XML, XBRL, and database output to HTML, RTF, PDF, Office Open XML, and Authentic e-Forms[3]
Graphical authoring tool, stylesheet designer, and publishing engine used for creating template-based designs that convert XML, SGML, and other input sources automatically to HTML, RTF, PDF, text, or any other XML-based output format for further compilation (CHM, eBook, XSLT, etc.)
Help authoring tool, stylesheet designer, and publishing engine used for creating user manuals, knowledge bases and online help for technical communication.
