Introduction
The Federation of Animal Science Societies (FASS) is an association management
organization that provides a range of services (membership management and support,
meeting planning, accounting, information technology, and publishing) to its
founding member associations and client associations. The publications department at
FASS produces 4 monthly journals, 1 bimonthly journal, and 1 quarterly journal, as
well as 1 bimonthly magazine, totaling approximately 18,000 published pages in 2009.
In addition, the department produces association newsletters, marketing materials,
conference program books, conference proceedings and abstract books, and a variety
of other print and electronic publications. These additional products account for
another 4,000 published pages each year. Staffing in the department includes 2.75
full-time compositors, 1 graphic artist, 5.5 technical editors, and 1 full-time
proofreader.
In 2007, our publications department was using Miles33 (OASYS) and LaTeX to produce
the journals. These code-driven typesetting systems provided effective
batch-pagination engines for journal publishing; however, the workflow surrounding
them was time-consuming and cost-intensive.
In 2008, we transitioned to an XML-based workflow using eXtyles software (www.inera.com)
and Typéfi Publish (www.Typefi.com). Our technical
editors work in the familiar environment of Word, compositors export XML (validated
to the NLM Journal Publishing DTD) from Word using eXtyles, and the Typéfi
Publish engine uses InDesign server, a journal-specific template, NLM XML, and
graphics files (figures and math equations) to compose and produce an InDesign
article and PDF in minutes. Details of the transition and implementation process
have been published previously (Adam, 2009).
The objective of this paper is to describe our journal publishing workflow, highlight
the benefits of using an XML workflow, and discuss some of the challenges
encountered in an end-to-end XML process using the NLM DTD.
eXtyles and Typéfi
Although some might argue that an XML workflow should begin with content creation in
XML, that is probably not feasible in scholarly publishing today: authors use
the tools they have available, usually Microsoft Word, and they are not
usually concerned about document structure or tagging. Moreover, most web-based manuscript
submission and peer-review systems accept Word or LaTeX files and create PDFs from the
submitted files to facilitate peer review. To go from Word to XML, we use eXtyles
from Inera Inc. eXtyles is a Word plug-in that combines XML creation and export with
a variety of macro-based editorial tools. It was specifically designed to facilitate
publishing workflows in which content starts life in Word and where XML is needed
for composition. eXtyles allows editors to work in a familiar software environment
without ever having to interact with or have specialized knowledge of XML.
eXtyles
To begin the editorial/composition process, we run the original manuscript
through eXtyles (a process called tooling). Tooling runs through several steps to:
Add metadata
Clean up unnecessary formatting
Apply custom Word styles to all elements (including tables and
figure captions)
Run auto-redaction (routine editorial clean-up)
Slice, dice, and style elements in references
Match references to PubMed and/or CrossRef databases
Match references to in-text citations
Compositors run the tooling process; tooling an average manuscript takes about 15 to 20 min. The two figures that follow show a title page and a reference section, respectively, before and after tooling.
Title page of a manuscript before and after tooling using the eXtyles
software. Note the custom Word styles applied to all elements of the
title page after tooling.
Reference section of a manuscript before and after tooling using the
eXtyles software. Note the addition of citation type tags, styling
(tagging) of each element of the reference (e.g., author names, year,
article title), and links to PubMed citation ...
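The reference-to-citation matching step in the tooling list can be illustrated with a minimal Python sketch. This is not how eXtyles works internally; the regular expression, the function names, and the (surname, year) representation of a reference are all assumptions made for illustration, and a naive pattern like this one will produce false positives on ordinary text such as "In 2008".

```python
import re

# Naive pattern for name-year citations such as "(Tufte, 2004)" or
# "Cole and VanRaden (2010)". Illustrative only; real citation styles vary.
CITATION = re.compile(
    r"([A-Z][A-Za-z'-]+)(?: and [A-Z][A-Za-z'-]+| et al\.)?,?\s+\(?(\d{4}[a-z]?)"
)


def cited_keys(text):
    """Return the set of (first author surname, year) pairs found in the text."""
    return {(m.group(1), m.group(2)) for m in CITATION.finditer(text)}


def match_references(text, references):
    """Compare in-text citations with a list of (surname, year) references.

    Returns references that are never cited and citations with no reference.
    """
    cited = cited_keys(text)
    refs = set(references)
    return {
        "uncited": sorted(refs - cited),
        "unmatched": sorted(cited - refs),
    }
```

A copyeditor-facing tool would report both lists so that uncited references and citations without a reference can be queried with the author.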
Now the manuscript is ready for copyediting. One of the most valuable aspects of
using eXtyles in the editorial office is that it automates many mundane
editorial tasks: changing British to American spelling, formatting statistical
terms (e.g., whether the "p" in p-value is lowercase, italic, or capitalized) and
units, and changing spelled-out numbers to numerals, according to house style.
Moreover, having consistent styles applied before editing allows the editors to
focus on content instead of formatting. The bibliographic processing tools add
immense value, too, in terms of time and quality.
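To make the idea of rule-based editorial clean-up concrete, here is a minimal Python sketch of two such rules. It is not the eXtyles implementation; the word lists, unit list, and function names are hypothetical, and a real house-style rule set would be far larger and more careful about context.

```python
import re

# Hypothetical house-style tables, kept tiny for illustration.
BRITISH_TO_AMERICAN = {"colour": "color", "analyse": "analyze", "fibre": "fiber"}
NUMBER_WORDS = {"ten": "10", "eleven": "11", "twelve": "12", "twenty": "20"}
UNITS = ("min", "h", "d", "wk", "kg", "g", "mL", "L")


def _match_case(source, replacement):
    """Carry an initial capital over from the matched word."""
    return replacement.capitalize() if source[0].isupper() else replacement


def americanize(text):
    """Replace whole-word British spellings with American ones."""
    pattern = re.compile(
        r"\b(" + "|".join(BRITISH_TO_AMERICAN) + r")\b", re.IGNORECASE
    )
    return pattern.sub(
        lambda m: _match_case(m.group(0), BRITISH_TO_AMERICAN[m.group(0).lower()]),
        text,
    )


def numerals_before_units(text):
    """Replace a spelled-out number with a numeral when a unit follows it."""
    pattern = re.compile(
        r"\b(" + "|".join(NUMBER_WORDS) + r")(?= (?:" + "|".join(UNITS) + r")\b)",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: NUMBER_WORDS[m.group(0).lower()], text)


def auto_redact(text):
    """Apply the two illustrative rules in sequence."""
    return numerals_before_units(americanize(text))
```

For example, `auto_redact("The fibre colour changed after twenty min.")` returns `"The fiber color changed after 20 min."`, whereas a spelled-out number that is not followed by a unit is left alone.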
Once the manuscript is edited, it is returned to the compositor. The compositor
adds processing instructions in the form of sizing tags (see the figure below) from a custom eXtyles menu. These allow tables and
figures of various sizes to be placed in the appropriate Typéfi templates
(e.g., 1-column, 1.5-column, 2-column, or side-turned format). The custom Word
styles applied during tooling facilitate XML export and tagging. The Word styles
are used to tag different elements of the article according to the NLM Journal
Publishing DTD: article title; authors; affiliations; abstract; text; equations
(as inline or display graphics); table titles, column heads, body, and
footnotes; figure captions; and references.
Figure and table sizing tags (processing instructions) are added to
figure captions (top panel) and table titles (bottom panel),
respectively. The XSLT maps the processing instruction to the
InDesign/Typéfi template variant, ensuring that the ...
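The mapping from a sizing tag to a template variant is done by XSLT in our workflow; the following Python sketch only illustrates the idea. The processing-instruction target (`size`), the variant names, and the function names are hypothetical and are not taken from eXtyles or Typéfi.

```python
from xml.dom import Node, minidom

# Hypothetical mapping from sizing value to template variant name.
TEMPLATE_VARIANTS = {
    "1-column": "float_1col",
    "1.5-column": "float_1_5col",
    "2-column": "float_2col",
    "side-turned": "float_sideturned",
}


def _walk(node):
    """Yield a node and all of its descendants in document order."""
    yield node
    for child in node.childNodes:
        yield from _walk(child)


def _enclosing_float(node):
    """Return the nearest <fig> or <table-wrap> ancestor, or None."""
    parent = node.parentNode
    while parent is not None:
        if getattr(parent, "tagName", None) in ("fig", "table-wrap"):
            return parent
        parent = parent.parentNode
    return None


def templates_for_floats(xml_text, pi_target="size"):
    """Map each float id to a template variant based on its sizing PI."""
    doc = minidom.parseString(xml_text)
    result = {}
    for node in _walk(doc):
        if node.nodeType != Node.PROCESSING_INSTRUCTION_NODE:
            continue
        if node.target != pi_target:
            continue
        size = node.data.strip()
        if size not in TEMPLATE_VARIANTS:
            raise ValueError(f"unknown sizing tag: {size!r}")
        owner = _enclosing_float(node)
        if owner is not None:
            result[owner.getAttribute("id")] = TEMPLATE_VARIANTS[size]
    return result
```

`xml.dom.minidom` is used here because it keeps processing instructions in the parsed tree, which makes the lookup straightforward.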
eXtyles offers many tagging/export options to users; we currently export XML
validated to the NLM Journal Publishing DTD, version 2.3, with the CALS table
model. Because export includes a validation step, errors in the XML can be
corrected before composition. Most parsing errors arise from elements mistagged
in the Word file (i.e., human error) and are easily resolved by our compositors.
The figure below shows a sample of XML exported
from an edited manuscript.
Sample of a journal manuscript XML export showing publisher and article
metadata. Metadata is added to the manuscript file during tooling and
editing and updated throughout the composition process.
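As a rough illustration of checking an export before composition, the Python sketch below tests well-formedness and the presence of a few metadata elements. It is only a lightweight stand-in and does not perform DTD validation, which eXtyles handles during export. The list of required paths and the function name are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of metadata to check; paths are relative to <article>.
REQUIRED_PATHS = [
    "front/journal-meta/journal-title",
    "front/journal-meta/publisher/publisher-name",
    "front/article-meta/title-group/article-title",
]


def check_export(xml_text):
    """Return a list of problems found in an exported article (empty if none)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    problems = []
    if root.tag != "article":
        problems.append(f"unexpected root element: {root.tag}")
    for path in REQUIRED_PATHS:
        elem = root.find(path)
        # Treat an element with no text content as missing.
        if elem is None or not "".join(elem.itertext()).strip():
            problems.append(f"missing or empty: {path}")
    return problems
```

A check like this could catch an untagged title or an empty metadata field early, before the article is sent for composition.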
Typéfi Publish
Typéfi Publish is a design-driven platform that adds batch pagination to
an InDesign workflow. Because the design and composition rules are embedded in
the template, no external coding or scripting is required for basic layout.
However, custom scripting is used to facilitate complex page and column
balancing and complex table data alignment. The FASS workflow includes 2 custom
JavaScript scripts running within InDesign. The first of these is a math alignment
script that is used to correctly adjust the baseline of inline MathType .eps
files; without the script, inline equations would sit above the baseline. The
second is a table alignment script that analyzes table cell content and
automatically adds leading and trailing space to achieve the correct alignment
(e.g., alignment on decimal or ± symbol).
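The alignment logic can be sketched in a few lines of Python. This is not the InDesign JavaScript used in production; it is an illustration under the simplifying assumption of fixed-width characters, whereas a real script must work with measured widths in a proportional font. The function name `align_on` is hypothetical.

```python
def align_on(cells, marker="."):
    """Pad cell strings so that the marker falls in the same column.

    Leading spaces are added before the part left of the marker and
    trailing spaces after the part from the marker onward. Cells without
    the marker are treated as having nothing to its right.
    """
    if not cells:
        return []
    parts = []
    for cell in cells:
        head, sep, tail = cell.partition(marker)
        parts.append((head, sep + tail))
    left = max(len(head) for head, _ in parts)
    right = max(len(rest) for _, rest in parts)
    return [head.rjust(left) + rest.ljust(right) for head, rest in parts]
```

Calling `align_on(["1.25", "10.5", "100"])` pads the values to a common width with the decimal points stacked, and passing `marker="±"` aligns values such as `"12.1 ± 0.3"` and `"8 ± 1.25"` on the ± symbol instead.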
The figure below shows the title page template for
one of our journals. The Typéfi templates use containers to hold specific
elements; for example, the title box contains article title, authors, and
affiliations. The title box is linked, in this case, to the container for
the abstract; as the title box grows to contain the title page information, it pushes
the abstract box further down the page. The container for the title page
footnotes is anchored to the bottom of the page and grows upward as needed,
pushing the introduction (the main story container) out of the way. A set of
templates for a journal includes the opener page, such as this one, main story
template, and templates for figures and tables of different sizes. Two versions
of these templates are required for articles so that they can begin on recto or
verso pages.
Title page template showing Typéfi containers for title, abstract,
key words, and the main story. Some content in the template is fixed
(e.g., copyright line), whereas other content is pulled in from metadata
exported from the Word file (see red ...)
Fixed elements of a page (e.g., copyright line on the title page, slug line,
running heads) are also added to the template. Some containers hold metadata
from the XML file; metadata is added to the Word file during the tooling and
editing processes.
The Typéfi template also embeds rules for handling floating elements, or
floats. Users can set journal-specific priorities for automatic placement of
floats: bottom of page, top or bottom, one float per page, and so on. Floats are
placed according to the corresponding call-out in the text and then according to
the priority rules for that element.
Content XML
Content XML (CXML) is a DocBook-based (http://www.docbook.org/)
schema that underlies the Typéfi engine; it is optimized for composition
rather than for information storage or retrieval. Typéfi uses XSLT to
transform the incoming NLM XML to CXML. The purpose of CXML is to provide input
to the composition process (e.g., for print, HTML, or ePub outputs). For
example, CXML implementation of CALS tables allows paragraphs to appear within a
table, enabling complex table formatting that is not allowed in the base CALS
table model.
Article composition is initiated from a simple browser-based interface. The
compositor selects the journal (thus selecting the correct template and XSLT)
and enters the manuscript number. The Typéfi Engine ingests the XML file
and any graphics (figures and MathType files) associated with the manuscript and
composes the article. Composition takes from 2 to 10 min for the average
article, although manuscripts with many references (>50) or with very
large tables can take 30 to 60 min. The figure below shows a screen capture of recent activity, illustrating the range of
composition times for a sample of jobs.
Typéfi Publish job monitor, showing actual composition time
("Duration" column) for a range of articles. Note the "Job Option"
column specifying "odd" or "even": all articles are initially run (first
build) using the "odd" template (article starts ...)
The Typéfi Publish system produces an InDesign file and a PDF. Our
compositors open the InDesign file and review the composed article. At this
point, changes can be made to the layout; all of the tools available in InDesign
are available for use. Typically, very few changes are required at this point,
and the automatically generated PDF can be sent to authors as a proof.
Less-than-ideal layouts might result from manuscripts that have little text
relative to the number of floats; such a situation might require manual
intervention for a more pleasing layout. That, however, is not a limitation of
Typéfi Publish; similar adjustments were needed with the Miles33 and
LaTeX workflows but without the ease of drag-and-drop that InDesign allows.
Unusual symbols might also require attention. If a symbol is not present in the
base font or is presented as a non-Unicode symbol in the Word file, the symbol
may not display correctly in the InDesign file. eXtyles and Typéfi are
fully Unicode compliant.
Math content
Many of the journals produced at FASS have content with complex math. In
particular, the IEEE Transactions on Ultrasonics, Ferroelectrics,
and Frequency Control (UFFC) is very math-intensive. For complex
math, we use MathType from Design Science (www.dessci.com); MathType is
the full-featured version of the Equation Editor in Word (pre-Word 2007). It
provides a straightforward, point-and-click math editor and allows export of
mathematical content in graphical (.eps) or XML (MathML) formats, as well as TeX
and LaTeX formats. In our current workflow, we export MathType equations as
graphics, and those files are handled by Typéfi in the same way as other
figures. The figure below shows MathType equations in Word,
in XML, and in the composed article.
Complex equations in Word/MathType (top left), XML (bottom), and in the
composed article (top right).
We anticipate that the inclusion of MathML in the HTML5 standard will allow us to
move away from math in a graphical format and toward math rendered in the
browser directly from the MathML. A recent initiative, called MathJax (http://www.mathjax.org/), from Design Science, the American
Mathematical Society, and the Society for Industrial and Applied Mathematics is
an exciting development. MathJax is an open-source, JavaScript display engine
that uses CSS and web fonts to display math in any modern browser and even on
smartphones.
Inline graphics
We can also handle a variety of inline graphics in text, tables, and figure
captions, such as chemical structures, icons for multimedia content, or
biography photos. For example, we recently published an article containing
numerous inline sparklines (Cole and VanRaden,
2010; see the figure below), all
automatically placed within the text and rendered in the XML as
<inline-graphic>. The term “sparkline” was coined by Edward
Tufte (Tufte, 2004) to describe
“data-intense, design-simple, word-sized graphics.” The
flexibility of the NLM Journal Publishing DTD and the eXtyles-Typéfi
workflow allows us to incorporate unusual elements into journal articles without
extensive manual workarounds.
Inline graphics are usually shown inline (top right) but can also be
set off in the same way as a display equation (bottom right).
Corrections and roundtripping
The Typéfi system enables corrections to be made in the InDesign file and
“roundtripped” to the NLM XML. However, InDesign is not a
validating XML editor, so corrections made in this way could disrupt the NLM
structure. This is an inherent limitation of an XML-InDesign workflow (Inera Inc., 2008), and we do not currently
use the roundtripping function. Instead, major corrections are made in the
working Word file, XML is re-exported, and the article is rebuilt in
Typéfi. Minor (late stage) corrections are made in 2 places: Word (and
exported to final XML) and InDesign; checks and balances ensure the fidelity of
this process. All articles are composed at least twice: the first time in draft
mode (the proof may contain author queries) and at least once in final mode
(which cannot be run if author queries remain). The speed and accuracy of the
Typéfi composition process means that recomposing articles from scratch
after the correction stage is still an efficient approach.
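The rule that final mode cannot run while author queries remain can be expressed as a simple gate. The Python sketch below is an illustration only: the query marker format (`[AQ1]`, `[AQ2]`, and so on) and the function names are hypothetical, and the source does not describe how the production system detects queries.

```python
import re

# Hypothetical marker: assume unresolved author queries appear as [AQ1], [AQ2], ...
AUTHOR_QUERY = re.compile(r"\[AQ\d+\]")


def open_queries(xml_text):
    """Return the author-query markers still present in the article."""
    return AUTHOR_QUERY.findall(xml_text)


def build_mode(xml_text, requested="final"):
    """Return the mode to run, refusing 'final' while queries remain."""
    if requested not in ("draft", "final"):
        raise ValueError(f"unknown mode: {requested!r}")
    queries = open_queries(xml_text)
    if requested == "final" and queries:
        raise ValueError(
            "final build blocked; unresolved queries: " + ", ".join(queries)
        )
    return requested
```

A draft build proceeds regardless of open queries, which matches the proof stage described above, whereas a final build is refused until every query has been resolved.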
Online publication
Most of the FASS journals are hosted online by Stanford University
Libraries’ HighWire Press (http://highwire.stanford.edu/). HighWire publishes PDF and
full-text versions of journal articles. In our old workflow, we sent PDF files
to HighWire and a third-party vendor converted the content to full text. Around
the time we moved to an XML-based process, HighWire transitioned from a
proprietary DTD to the NLM Journal Publishing DTD. This has allowed us to submit
PDF and NLM XML to HighWire, facilitating rapid online publication, now within
hours instead of days.
Submitting NLM-validated XML to HighWire was not as simple as anticipated. In
fact, several tweaks to our tagging and export were needed:
Added the <license> element to identify open-access papers
(and define the embargo period) on HighWire and in deposits to PubMed
Added a paragraph style, Related Article, and a metadata field to
tag and link errata and corrected papers; letters and replies; and companion
papers
Added support to map hard and soft returns in table cells to
<break/> during XML export
Added a footnote-type attribute to all <fn> types [fn-type="current-aff"]
Moved the <fn-group> that holds footnotes related to
financial disclosure to <back> matter
Changed
<custom-meta-wrap><custom-meta><meta-name>Primary Audience</meta-name><meta-value><bold>Primary Audience:</bold>Nutritionists, Meat Scientists, Poultry Producers</meta-value></custom-meta></custom-meta-wrap>
to
<notes><p><bold>Primary Audience:</bold>Nutritionists, Meat Scientists, Poultry Producers</p></notes>
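The last change in the list above, from <custom-meta-wrap> to <notes>, can be sketched in Python as follows. In our workflow the change was made in the tagging and export configuration, not with a script like this; the function name is hypothetical and the code only illustrates the element-level transformation.

```python
import copy
import xml.etree.ElementTree as ET


def custom_meta_to_notes(wrap):
    """Build a <notes> element with one <p> per <custom-meta> value.

    The <meta-name> is dropped; the content of <meta-value>, including
    inline markup such as <bold>, is carried over into the paragraph.
    """
    notes = ET.Element("notes")
    for meta in wrap.findall("custom-meta"):
        value = meta.find("meta-value")
        if value is None:
            continue
        para = ET.SubElement(notes, "p")
        para.text = value.text
        for child in value:
            # Copy so the original tree is left untouched.
            para.append(copy.deepcopy(child))
    return notes


def convert(xml_text):
    """Parse a <custom-meta-wrap> fragment and serialize the <notes> result."""
    wrap = ET.fromstring(xml_text)
    return ET.tostring(custom_meta_to_notes(wrap), encoding="unicode")
```

Applied to the <custom-meta-wrap> fragment shown in the list, `convert` produces the corresponding <notes> fragment.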
Overall, great collaboration among our colleagues at HighWire Press and Inera
allowed us to transition to XML delivery to HighWire; we are very happy with the
speed and ease of online publication today.
Results
FASS has seen many benefits of transitioning to an XML-based editorial and
composition workflow. The most important has been a significant reduction in the
time it takes to generate an author proof. The figure below shows the change in proof turnaround time across all journals, for the
math-intensive UFFC journal, and for our largest journal, Journal of Dairy
Science®, from 2008 (mostly Miles33 workflow) to 2009 (first full
year in XML workflow). In that time, the number of pages published increased from
16,148 to 18,259 (total of all journals). We have not increased composition staff
time despite the 13% increase in pages published in this period. The most
noticeable change apparent to authors has been the reduced time to receive a proof
following acceptance.
Production time in the Miles33/LaTeX (2008) and eXtyles-Typéfi (2010) workflows. Production time (acceptance of manuscript until proof sent to authors) is shown for all journals, for the largest journal (Journal of Dairy Science, JDS), and for ...
The impact on our editorial and composition staff has been positive. Our copyeditors
have minimal exposure to the XML, and the powerful editorial tools within eXtyles
have allowed the editors to focus on content instead of routine editorial clean-up
and formatting. Our compositors have more direct interaction with the XML: they
export the XML, resolve parsing errors (with only occasional assistance from eXtyles
Support), build the articles in Typéfi, and troubleshoot any composition
problems that arise. The compositors are also responsible for changes to the
templates (e.g., updating the date for copyright lines each year, changing a font or
other design element). Changes that affect the conversion of NLM XML to
Typéfi’s internal content XML are made by Typéfi support staff,
although this is rare.
We have reduced the costs associated with using freelance typecoders while retaining
control over all aspects of the workflow. Many small publishing operations have made
the choice to outsource the composition process (and sometimes the editorial work)
to a third party. By keeping the editorial and composition work in house, we have
retained complete control over quality, timeliness, and cost-effectiveness of the
publishing operation. Our member societies benefit from the reduction in composition
costs achieved by elimination of typecoding. Our authors, members, and journal
readers benefit from the reduced time to publication of high-quality, accessible
research.
Next steps
FASS continues to explore new ways to leverage the editorial and composition
tools we have. eXtyles is used in many of our non-journal (non-XML) projects
because it allows us to apply consistent editorial and formatting rules,
especially useful for projects with multiple authors. When working on a
non-journal layout project, the production specialist or graphic artist creates
an InDesign template with paragraph styles that match the style names used in
the “eXtyled” Word document.
We use a customized eXtyles module to apply a consistent editorial style to
meeting abstracts submitted for the member society meetings. Again, using
eXtyles with abstracts permits consistent formatting and a consistent set of
editorial rules to be applied without incurring extensive copyediting time. To
do this manually for 2000 abstracts (a typical joint society meeting) would be
time-consuming and cost-prohibitive. By using eXtyles, we are able to produce
better quality programs and abstract books without additional costs to the
member societies.
Although Typéfi’s template-based system is ideal for journal
composition (because the design elements do not change from article to article
or from issue to issue), it can also be used for one-off book designs.
Experienced InDesign users can modify an existing template or create a new one
in a matter of hours. Creating a new template for each book project would be an
efficient approach to automating long-document layout. We plan to integrate the
full eXtyles-Typéfi workflow into book projects in the future.
Another benefit of transitioning to an XML workflow is that our journal content
is stored in an accessible, searchable, and reusable platform-neutral format. We
hope to add a content management system (CMS) to our XML workflow in the next
few years. Addition of a CMS will facilitate greater automation and better
long-term storage and reuse of our content within and between the journals
published by our member societies.
The eXtyles-XML-Typéfi combination is a flexible, scalable, and powerful
workflow. The combination should allow the FASS publications department to
continue producing high-quality journals effectively and to adapt quickly to new
delivery channels (e.g., ePub and applications) and end-user devices (e.g.,
smartphones, iPads, and eReaders) adopted by the STM publishing community.