Crane's ISOSTS Rendering Environment

G. Ken Holman

Crane Softwrights Ltd.

$Date: 2015/07/17 01:00:17 $(UTC)


Table of Contents

1. Introduction
2. Invocation
2.1. Drag-and-drop
2.2. Command-line invocation
2.3. Stylesheets
2.4. PDF stylesheet parameters
3. Graphics
3.1. Indicating supported formats
3.2. MathML rendering
4. Special text characters
5. Installed Java environments
6. Rendering to PDF at no cost
7. Development notes
7.1. Vocabulary coverage
7.2. Side-by-side
7.3. Interpretation notes
7.4. Rendering quality
8. Commercialization and future work
Bibliography

1. Introduction

This package renders a subset of instances of the ISO Standards Tag Set 1.1 [ISOSTS] to PDF (using XSLT 2.0 [XSLT 2.0] and XSL-FO 1.1 [XSL 1.1]). The PDF stylesheets have been tested with XSL-FO processors from Antenna House http://www.AntennaHouse.com and Visual Programming Limited http://www.XMLPDF.com (see Section 7.4, “Rendering quality” for a discussion of differences).

This package also offers validation of XML documents against the ISOSTS v1.1 document model constraints (using RELAX-NG [RELAX-NG] and XSLT 2.0 [XSLT 2.0]), created from ISO's own Schematron [Schematron] expression of constraints.

Figure 1. Crane's ISOSTS environment

[Crane data flow]

The following data flows are identified in the diagram:

  1. Pass 1 of 2 for validation - confirming the input XML does not violate model constraints

  2. Pass 2 of 2 for validation - confirming the input XML does not violate value constraints

  3. Publishing input XML to output PDF through intermediate XSL-FO

  4. Intermediate XSL-FO to output XHTML with all styles in external CSS (in development)

See Section 7.1, “Vocabulary coverage” regarding changes you may need in these stylesheets to be more useful to you.

2. Invocation

2.1. Drag-and-drop

The windows\ subdirectory has the following batch files onto which dragging and dropping an XML file will produce results:

  • ahf-a4-pdf.bat - validate XML and use Antenna House Formatter to create A4-sized PDF results (errors to .pdf.txt)

  • ahf-us-pdf.bat - validate XML and use Antenna House Formatter to create US-sized PDF results (errors to .pdf.txt)

  • ibex-a4-pdf.bat - validate XML and use Ibex to create A4-sized PDF results (errors to .pdf.txt)

  • ibex-us-pdf.bat - validate XML and use Ibex to create US-sized PDF results (errors to .pdf.txt)

  • validate.bat - validate XML and produce no output (errors to .xml.validate.txt)

Similarly, there are five bash scripts with the same names and ".sh" extension.

2.1.1. Windows "Send to" context menu

After installing the Windows drag-and-drop batch files above, one can create shortcuts to the batch files and position them in the system to be recognized by the context menu item "Send to". To create a shortcut, right-click the batch file and select "Create Shortcut". After installing the shortcut as described below, you may wish to rename the shortcut in the "Send to" folder to something more descriptive than the batch file name.

Once installed as below, the user can simply right-click on any ISOSTS XML file, select "Send to" and then select the shortcut of the drag-and-drop batch file desired from above.

2.1.1.1. Windows XP

Open up the XP menu with Ctrl+Esc and select "Run...". In the "Open" box enter simply "sendto" (without the quotes). This brings up the folder of shortcuts to applications that accept a file as an argument. Simply move into this folder each of the desired shortcuts created from the drag-and-drop batch files above.

2.1.1.2. Windows 7/8

Open up the folder C:\Users\"Your User Name"\AppData\Roaming\Microsoft\Windows\SendTo (without the quotes, and replacing the quoted text with your Windows user name). This brings up the folder of shortcuts to applications that accept a file as an argument. Simply move into this folder each of the desired shortcuts created from the drag-and-drop batch files above.

2.2. Command-line invocation

2.2.1. Validation

This supplied batch file is used to validate an XML document as not violating ISOSTS constraints:

  • Crane-ISOSTS-validate.bat (Windows) and Crane-ISOSTS-validate.sh (bash)

    • the return code is zero for success or non-zero for failure

The document is checked in two steps (the resources below were accessed as of the date of Crane's documentation):

  1. against the document model described at http://www.iso.org/schema/isosts/

  2. against the model constraints described at http://www.iso.org/schema/isosts/resources/schematron/ISOSTS_validation.sch
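
Because the validation scripts report success with a zero return code, a calling program can branch on that code. The following Python sketch illustrates only that convention: the helper name `succeeded` is hypothetical, and the two stand-in commands simply exit with a chosen code in place of the real script, whose argument shape is not assumed here.

```python
import subprocess
import sys


def succeeded(command):
    """Run a command and report whether its return code was zero."""
    completed = subprocess.run(command)
    return completed.returncode == 0


# Stand-in commands that exit with a chosen code. In practice the first
# item would be the validation script (Crane-ISOSTS-validate.bat or .sh).
ok_command = [sys.executable, "-c", "raise SystemExit(0)"]
bad_command = [sys.executable, "-c", "raise SystemExit(2)"]

if __name__ == "__main__":
    print(succeeded(ok_command))
    print(succeeded(bad_command))
```

Any non-zero code is treated as failure, which matches the documented convention without relying on a particular error value.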

2.2.2. PDF generation

In the following scripts there are three invocation arguments:

  1. the input XML document name

  2. either "a4" or "us" (without quotes) to indicate the page size

  3. the output PDF document name

The ibex/ directory has this supplied batch file used to validate and then transform XML to PDF using Ibex:

  • Crane-isosts2pdf4ibex.bat (Windows) and Crane-isosts2pdf4ibex.sh (bash)

    • the return code is zero for success or non-zero for failure

The ahf/ directory has this supplied batch file used to validate and then transform XML to XSL-FO and PDF using Antenna House Formatter:

  • Crane-isosts2pdf4ahf.bat (Windows) and Crane-isosts2pdf4ahf.sh (bash)

    • the return code is zero for success or non-zero for failure
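
The three invocation arguments can be assembled programmatically before a script is launched. The following Python sketch shows one way to do so; the function name `build_pdf_command` and the file names `standard.xml` and `standard.pdf` are hypothetical, and the check on the page size is an added convenience rather than a documented behaviour of the scripts.

```python
def build_pdf_command(script, xml_in, page_size, pdf_out):
    """Return the argument list: input XML, page size, output PDF."""
    if page_size not in ("a4", "us"):
        raise ValueError("page size must be 'a4' or 'us'")
    return [script, xml_in, page_size, pdf_out]


# Hypothetical file names, used only to illustrate the argument order.
command = build_pdf_command(
    "ibex/Crane-isosts2pdf4ibex.sh", "standard.xml", "a4", "standard.pdf"
)

if __name__ == "__main__":
    print(" ".join(command))
```

The resulting list can be passed to a process launcher, with the return code then checked for zero as described above.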

2.3. Stylesheets

These top-level XSLT 2.0 stylesheets are included in this package (the lower level included and imported stylesheet fragments in the support/ directory are not listed here):

  • Crane-isosts2fo-a4.xsl to create an XSL-FO production result for the A4 page size; and

  • Crane-isosts2fo-us.xsl to create an XSL-FO production result for the US-letter page size.

These top-level XSLT 2.0 stylesheets are included in the ibex/ subdirectory:

  • Crane-isosts2fo-a4-ibex.xsl a specialization supporting the graphic formats available in the Ibex XSL-FO engine; and

  • Crane-isosts2fo-us-ibex.xsl a specialization supporting the US-letter page size for the Ibex XSL-FO engine.

A PDF file is created by an XSL-FO engine that interprets the directives in the XSL-FO file. The demonstration environment includes an XSL-FO engine. This environment is described in Section 6, “Rendering to PDF at no cost”.

2.4. PDF stylesheet parameters

There are numerous stylesheet parameters available to customize the PDF generated by the print stylesheets. These can be specified either on the XSLT invocation command line as individual parameter arguments, or they can be specified in a parameter XML file with the following structure (see the included Crane-ISOSTS-params.xml as an example):

<parameters>
  <parameter-name-1>parameter-value-1</parameter-name-1>
  <parameter-name-2>parameter-value-2</parameter-name-2>
  <parameter-name-3>parameter-value-3</parameter-name-3>
</parameters>

There is no order to the parameters in a parameter file. A parameter specified on the command line takes precedence over a parameter specified in the parameter file. When neither is specified, a parameter's value in a wrapper stylesheet (such as stylesheets with the US-letter-sized geometry) takes precedence over that parameter's value in the base stylesheet (with the A4-sized geometry).
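
The precedence rules above can be modelled as a chain of lookups. The following Python sketch is an illustration only and not how the stylesheets are implemented: the function names are hypothetical, the wrapper values are illustrative stand-ins for a US-letter geometry, and only the base values shown (210mm, 297mm and 3) are taken from the documented defaults.

```python
from collections import ChainMap
import xml.etree.ElementTree as ET

PARAMETER_FILE = """\
<parameters>
  <page-width>8.5in</page-width>
  <toc-max-depth>2</toc-max-depth>
</parameters>
"""


def read_parameter_file(xml_text):
    """Map each child element name of <parameters> to its text value."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "") for child in root}


def effective_parameters(command_line, parameter_file, wrapper, base):
    """Earlier mappings win: command line, file, wrapper, then base."""
    return dict(ChainMap(command_line, parameter_file, wrapper, base))


effective = effective_parameters(
    command_line={"toc-max-depth": "4"},
    parameter_file=read_parameter_file(PARAMETER_FILE),
    # Illustrative wrapper values (assumed, not taken from the package).
    wrapper={"page-width": "215.9mm", "page-height": "279.4mm"},
    base={"page-width": "210mm", "page-height": "297mm", "toc-max-depth": "3"},
)

if __name__ == "__main__":
    for name in sorted(effective):
        print(name, effective[name])
```

In this sketch the command line supplies toc-max-depth, the parameter file supplies page-width, and the wrapper supplies page-height, because each value is taken from the first source that defines it.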

One can incorporate a parameter file indirectly at invocation time on the command line by indicating a file name relative to the input ISOSTS XML:

  • Crane-ISOSTS-params-uri="uri-relative-to-ISOSTS-XML-file"

    • a relative URI is resolved relative to the input XML; an absolute URI is resolved independent of the input XML

    • this is the common method of specifying multiple parameter files (all with the same name) when the parameters are specific to a given ISOSTS XML document

    • when not specified, the relative URI "Crane-ISOSTS-params.xml" is assumed (thus it will be expected to be in the same directory as the input XML)

    • this is ignored when the "+Crane-ISOSTS-params" parameter is used

One can incorporate a parameter file directly at invocation time on the command line by specifying the file name:

  • +Crane-ISOSTS-params="uri-relative-to-invocation-directory"

    • note the use of the "+" at the beginning of the name is mandatory for this parameter (and only for this parameter) when using Saxon as the XSLT processor (other XSLT processors may have a different convention for specifying the passing of a document node as a parameter value)

    • a relative URI is resolved relative to the invocation directory; an absolute URI is resolved independent of the invocation directory

    • when not specified, the Crane-ISOSTS-params-uri parameter is used
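
The resolution of Crane-ISOSTS-params-uri against the input XML follows ordinary relative-URI rules. The following Python sketch demonstrates those rules with the standard library; the function name `resolve_params_uri` and the example locations are hypothetical, and the sketch does not claim to reproduce the stylesheets' own code.

```python
from urllib.parse import urljoin

# The documented default when the parameter is not specified.
DEFAULT_PARAMS_URI = "Crane-ISOSTS-params.xml"


def resolve_params_uri(input_xml_uri, params_uri=DEFAULT_PARAMS_URI):
    """Resolve a parameter-file URI against the input XML document's URI."""
    return urljoin(input_xml_uri, params_uri)


if __name__ == "__main__":
    base = "file:///work/docs/standard.xml"  # hypothetical input location
    print(resolve_params_uri(base))
    print(resolve_params_uri(base, "settings/params.xml"))
    print(resolve_params_uri(base, "file:///shared/params.xml"))
```

A relative value lands beside the input XML, while an absolute value is returned unchanged regardless of where the input XML is.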

The following parameters can be specified in the parameter file but are overridden by specified values on the invocation command line:

  • copyright-suppress="yes-or-no-default-yes"

    • use this to strike through all of the text of the copyright badges

  • fallback-font-family-names="comma-separated-list-of-font-family-names"

    • use this to change the fallback fonts from "Arial Unicode MS" and Helvetica

    • names with spaces must be quoted

  • graphic-content-types="space-separated-list-of-MIME-types"

    • use this to indicate to the stylesheets which MIME types are recognized by the XSL-FO engine that will process the resulting XSL-FO file, so as to flag a file of a format that is not supported in advance of invoking the engine

  • graphic-image-extension="assumed-extension-including-dot"

    • use this to change the assumed extension from ".png"

  • graphic-uri-prefix="string-value"

    • use this to prefix the graphic file name created from the XML attributes of a figure

    • an example would be when, say, all of the image files are in a subdirectory relative to the input XML: one would use graphic-uri-prefix="images/"

    • this parameter is not needed when the images are in the same directory as the input XML

  • links-with-blue="yes-or-no-default-yes"

    • use this to change the colour of links to black

  • links-with-underline="yes-or-no-default-yes"

    • use this to remove the underline under links

  • metadata-suppress="yes-or-no-default-no"

    • use this to remove the metadata exposition added after the final page of the document

  • page-height="length-value-default-297mm"

    • use this to change the page geometry

  • page-width="length-value-default-210mm"

    • use this to change the page geometry (only 'cm', 'mm' and 'in' are allowed as units of measure)

  • preview="yes-or-no-default-no"

    • use this to abbreviate the content to only the foreword, introduction, scope, terms and definitions and normative references

    • the table of contents is complete, but other sections are presented in a grayed-out font

    • the page count on the back cover is presented only if there is the "price-ref-pages" custom metadata in the XML

  • serif-font="yes-or-no-default-yes"

    • use this to change the font families to sans-serif for the document

  • toc-indent-length="length-value-default-12mm"

    • use this to change the length of the indent for nested table of contents members

    • use toc-indent-length="0mm" to turn off the indentation and present the table of contents flush left

  • toc-max-depth="scalar-value-default-3"

    • use this to change how many sections show up in the table of contents
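
A parameter file using several of the parameters listed above can be generated rather than written by hand. The following Python sketch builds the documented <parameters> structure with the standard library; the function name `build_parameter_file` is hypothetical and the chosen values are examples only.

```python
import xml.etree.ElementTree as ET


def build_parameter_file(parameters):
    """Serialize a dict as <parameters> with one child element per entry."""
    root = ET.Element("parameters")
    for name, value in parameters.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


# Example values only; each name comes from the parameter list above.
document = build_parameter_file({
    "graphic-uri-prefix": "images/",
    "links-with-blue": "no",
    "toc-indent-length": "0mm",
    "toc-max-depth": "2",
})

if __name__ == "__main__":
    print(document)
```

The serialized text can then be saved as Crane-ISOSTS-params.xml beside the input XML, which is where the default relative URI expects to find it.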

3. Graphics

3.1. Indicating supported formats

For the <graphic> element, the base stylesheets assume that the XSL-FO engine supports the following MIME types (the order of the list is not significant):

  • image/cgm (using: mimetype="image" mime-subtype="cgm")

  • image/gif

  • image/jpeg

  • image/png

  • image/svg+xml

  • image/tiff

  • application/mathml+xml

  • application/x-bmp

  • application/x-eps

  • application/x-wmf

  • image/x-bmp

  • image/x-eps

  • image/x-wmf

The list of supported formats can be overridden with an importing stylesheet. See Crane-isosts2fo-a4-ibex.xsl for an example of overriding the $graphic-content-types variable with the white-space-separated-list subset supported by the Ibex XSL-FO engine.
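
The check of a graphic's format against the supported list amounts to a membership test on a white-space-separated string. The following Python sketch shows that test in isolation; the function name `is_supported` and the example subset are hypothetical and do not represent the actual list used by any particular engine.

```python
def is_supported(mimetype, mime_subtype, graphic_content_types):
    """Check a graphic's MIME type against a white-space-separated list."""
    return f"{mimetype}/{mime_subtype}" in graphic_content_types.split()


# A hypothetical subset, shaped as an engine-specific override might be.
subset = "image/gif image/jpeg image/png image/svg+xml"

if __name__ == "__main__":
    print(is_supported("image", "png", subset))
    print(is_supported("image", "cgm", subset))
```

Splitting on white space before testing avoids false matches on partial strings, such as "image/svg" matching inside "image/svg+xml".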

3.2. MathML rendering

The <mml:math> element is assumed to be supported only if application/mathml+xml is in the list of supported formats in the $graphic-content-types variable. If the MIME type is not in the list, the element is not included in the output.

4. Special text characters

Hexadecimal numeric character references are used in the XML as a safe and explicit encoding of special characters. Two examples are &#x2014; for the em-dash and &#xa0; for the non-breaking space.

An official cross-reference of Unicode code positions and their character definitions can be found by saving and inspecting the file at ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt.

Some useful values from this table are:

00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;
2011;NON-BREAKING HYPHEN;Pd;0;ON;<noBreak> 2010;;;;N;;;;;
2013;EN DASH;Pd;0;ON;;;;;N;;;;;
2014;EM DASH;Pd;0;ON;;;;;N;;;;;
2020;DAGGER;Po;0;ON;;;;;N;;;;;
2021;DOUBLE DAGGER;Po;0;ON;;;;;N;;;;;
2022;BULLET;Po;0;ON;;;;;N;;;;;

An example would be the use of the no-break space and en-dash in title components in order to format the title of a document such that the en-dashes and the part number do not get orphaned:

  <title-wrap xml:lang='en'>
    <intro>Information Technology</intro>
    <main>Business Operational View</main>
    <compl>Linking business operational view to functional 
service view</compl>
    <full>
      Information Technology&#xa0;&#x2013; Business Operational 
View&#xa0;&#x2013; Part&#xa0;20:&#xa0;Linking business 
operational view (BOV) to functional service view (FSV)</full>
  </title-wrap>
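
To confirm which characters a set of numeric character references produces, the text can be parsed and each special character looked up by name. The following Python sketch does this for a shortened, hypothetical version of the <full> element above, using only the standard library.

```python
import unicodedata
import xml.etree.ElementTree as ET

# A shortened, hypothetical fragment modelled on the title example.
fragment = "<full>Information Technology&#xa0;&#x2013; Part&#xa0;20</full>"
text = ET.fromstring(fragment).text

# Report the code position and Unicode name of each non-ASCII character.
special = [
    (f"{ord(ch):04X}", unicodedata.name(ch)) for ch in text if ord(ch) > 127
]

if __name__ == "__main__":
    for code, name in special:
        print(code, name)
```

The XML parser resolves each reference to its character, so the names reported correspond to the second field of the matching UnicodeData.txt records.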

5. Installed Java environments

This package includes installations of the following Java environments available for free (note that each package directory has the details of all licenses and applicable notices for each package):

/jar/ibex

  • an XSL-FO processor used for rendering Crane's ISOSTS stylesheets that are unchanged from delivery (note that making any changes to the stylesheets will prevent this engine from executing)

  • see http://www.xmlpdf.com/ibex-downloads-signed.html for details

/jar/jing

/jar/saxon

/jar/saxon9he

6. Rendering to PDF at no cost

All XSL-FO vendors have no-cost evaluation licenses with which to run these ISOSTS stylesheets. The evaluation license stamps an indelible mark of some kind on each page, but the page content is still useful for review.

Additionally, an un-stamped version of the Ibex XSL-FO commercial engine from Visual Programming Limited is available for free use with selected stylesheets from Crane Softwrights Ltd. To transform an XML instance into PDF using Ibex in a single step, the package includes the example invocation files Crane-isosts2pdf4ibex.bat and Crane-isosts2pdf4ibex.sh. This invocation takes three arguments (in order):

  1. the XML input instance;

  2. "us" or "a4" (no quotes) to indicate the page size; and

  3. the name of the PDF output file.

Note

Some rendering interpretation issues have been identified and Crane is working actively with Visual Programming Limited to address the issues. Please check regularly for updates to the Ibex Signature Edition for Crane Softwrights Ltd. See Section 7.4, “Rendering quality” for more information in this regard.

The digital signature manifest Crane-isosts2fo-manifest.xml is tied to the stylesheets as published, thus the processor will not work with these stylesheets if they are modified in any way.

The support file Crane-isosts-ibexconfig.xml is used to configure font information for a typical Windows installation.

When the font configuration file is properly specified for a Linux environment, these stylesheets, manifest files and .jar files will all work as expected (acknowledging that some issues are still to be addressed by the vendor).

Note

The two Ibex specialization stylesheets Crane-isosts2fo-a4-ibex.xsl and Crane-isosts2fo-us-ibex.xsl are illustrations of how one would create specialization stylesheets for other XSL-FO products.

7. Development notes

7.1. Vocabulary coverage

At this time there is no attempt to cover the entire ISOSTS vocabulary with these stylesheets. They were developed to meet the specific needs of publishing a particular ISO/IEC project document. Moreover, those elements that have been implemented may not have been implemented in full for every attribute they can carry or every context in which they can appear.

Unsupported elements are rendered as a pair of start and end tags in green, indicating the name of the construct. There is no visual indication of unsupported attributes.

The XSLT transformation error port posts a message for every encountered construct that is not supported by the stylesheets.

7.1.1. Requesting enhancements and further vocabulary coverage

We are open to enhancing this stylesheet library as an evolving free developer resource, but given the magnitude of the ISOSTS vocabulary, we cannot spend our time guessing what is important to users.

If you need a change to these stylesheets, either in layout or in the list of supported constructs, please let us know your requirements at info@CraneSoftwrights.com. This might be a request to change how we have implemented a particular construct, or you may need more of the vocabulary to be supported by the stylesheets.

When requesting a change to these stylesheets, please take the time to include an XML document with as many differing examples of your requirement as possible. Also include illustrative examples of the layout you need so we know what is to be expected. Any information you can give us to make the job of updating the library easier for us will help us get the updated stylesheets posted as soon as possible.

Better still, if you could hack your additions into the included testall.xml example and send it back to us, that will illustrate its use for all users. You must do this only on the condition that you not include any private or proprietary information in the example you provide.

7.2. Side-by-side

These stylesheets are not written for side-by-side multi-lingual rendering. All content is rendered in the order authored. There is nothing inherent about ISOSTS that prevents a side-by-side rendering; it simply was not a priority in meeting the needs for which this library was created.

7.3. Interpretation notes

During development certain observations were made regarding ISOSTS and its implementation with XSL-FO.

7.3.1. TBX-ISO-TML

A subset of the TBX-compliant terminology markup language [TBX] (in the urn:iso:std:iso:30042:ed-1 namespace) for ISO standards is implemented. These stylesheets render terms in the order edited, without rearranging the terms in any particular order.

The <term-sec> element wraps either or both of a TBX <tbx:termEntry> construct and a generic ISOSTS <term-display> construct, with preference given to the ISOSTS construct when both are found in the terminology section.

An example of using TBX-ISO-TML is as follows:

     <term-sec id="t_OeCI">
       <label>3.18</label>
       <tbx:termEntry>
         <tbx:langSet xml:lang="en">
           <tbx:definition>information exchanged among 
<xref rid="t_OeSE"><bold>Open-edi Support Entities</bold></xref> 
to co-ordinate their operation</tbx:definition>
           <tbx:tig>
             <tbx:term>Open-edi Control Information</tbx:term>
             <tbx:partOfSpeech value="noun"/>
             <tbx:normativeAuthorization value="preferredTerm"/>
             <tbx:termType value="fullForm"/>
           </tbx:tig>
           <tbx:tig>
             <tbx:term>OeCI</tbx:term>
             <tbx:partOfSpeech value="noun"/>
             <tbx:normativeAuthorization value="admittedTerm"/>
             <tbx:termType value="acronym"/>
           </tbx:tig>
         </tbx:langSet>
       </tbx:termEntry>
     </term-sec>
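
The stated preference for the ISOSTS construct over the TBX construct can be expressed as a simple selection. The following Python sketch illustrates that rule outside of XSLT; the function name `preferred_term_construct` is hypothetical, and the sketch assumes both constructs appear as direct children of <term-sec>.

```python
import xml.etree.ElementTree as ET

# The TBX namespace named in this section.
TBX_NS = "urn:iso:std:iso:30042:ed-1"


def preferred_term_construct(term_sec):
    """Return <term-display> if present, else <tbx:termEntry>, else None."""
    display = term_sec.find("term-display")
    if display is not None:
        return display
    return term_sec.find(f"{{{TBX_NS}}}termEntry")


if __name__ == "__main__":
    sample = ET.fromstring(
        f'<term-sec xmlns:tbx="{TBX_NS}">'
        "<tbx:termEntry/><term-display/></term-sec>"
    )
    print(preferred_term_construct(sample).tag)
```

When only the TBX construct is present, as in the example above, the same function falls back to <tbx:termEntry>.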

7.3.2. Floating objects and footnotes

XSL-FO does not allow a footnote to be in a floating object. Thus whenever a construct is asked to be floated, that construct is checked for having any footnotes. If footnotes are found, the object is anchored, not floated, in order for the footnotes to function as footnotes.
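
The anchoring decision described above can be sketched as a check for descendant footnotes. The following Python illustration is written under stated assumptions: it supposes that a float request is expressed with a position attribute whose value is "float" or "anchor" and that footnotes are <fn> elements, and the function name `effective_position` is hypothetical.

```python
import xml.etree.ElementTree as ET


def effective_position(construct):
    """Anchor a floated construct when it contains any footnote."""
    # Assumed markup: position="float" requests floating; <fn> is a footnote.
    requested = construct.get("position", "anchor")
    has_footnote = construct.find(".//fn") is not None
    if requested == "float" and has_footnote:
        return "anchor"
    return requested


if __name__ == "__main__":
    table = ET.fromstring(
        '<table-wrap position="float"><caption><fn/></caption></table-wrap>'
    )
    print(effective_position(table))
```

A floated construct without footnotes keeps its requested position, so only the constructs that would otherwise lose their footnotes are affected.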

7.4. Rendering quality

The different XSL-FO processors interpret some of the constructs used to specify the rendering differently. Crane is in discussion with vendors regarding these interpretation issues, so the rendering will continue to improve in future versions of this package.

Please follow Crane's RSS feed for free developer resources to be notified of updates to the Ibex Signature Edition included in this package: http://www.CraneSoftwrights.com/resources/crane-resources.rss

Most XSL-FO vendors offer a watermarked evaluation version of their software, and using these versions of the software will produce results for your evaluation of Crane's stylesheets.

8. Commercialization and future work

These stylesheets are being incorporated, without the watermarks, as the publishing component of a standards management suite of work-flow tools. Please see http://setare.no/?page_id=222 for more details in this regard.

Bibliography

[ISOSTS] ISO Central Secretariat, ISO Standards Tag Set, June 2013

[RELAX-NG] James Clark, Makoto Murata, ISO/IEC 19757-2 RELAX-NG (Regular Language for XML)

[XSL 1.1] Anders Berglund, Extensible Stylesheet Language Version 1.1, 2006-12-05

[XSLT 1.0] James Clark, XSL Transformations (XSLT) Version 1.0, 1999-11-16

[XSLT 2.0] Michael Kay, XSL Transformations (XSLT) Version 2.0, 2007-01-23