[Accessibility conventions are described at the bottom of the page]
2. The context of XSL-FO
2.0 Overview
This chapter reviews four technologies that form the context of XSL-FO. Later chapters set the stage for, and then delve into,
the semantics of paginated formatting through detailed examples, definitions of terminology, and an examination of each of the
formatting objects.
Extensible Markup Language (XML)
[[1] - hierarchically describes an instance of information
[[2] - using embedded markup according to rules specified in the Recommendation
[2] - according to a vocabulary (a set of element types each with a name, a structure and optionally some attributes) described
by the user
][1] - optionally specifies a mechanism for the formal definition of a vocabulary
[[2] - controls the instantiation of new information
[2] - validates existing information
]]
Document Style Semantics and Specification Language (DSSSL)
[[1] - an internationally standardized collection of style semantics
[1] - a specification language for the transformation of structured information and the application of the standardized internationalized
style semantics for paginating structured information
]
Cascading Stylesheets (CSS)
[[1] - a set of formatting properties to attach to structured documents
[[2] - defines a cascade of property sources from the markup to the embedded stylesheet to the external stylesheet to the default
presentation semantics of the processor
][1] - browsers supporting CSS can render the document structure of HTML documents or XML documents according to the properties
]
Extensible Stylesheet Language Family (XSLT/XSL/XSL-FO)
[[1] - XSL Transformations (XSLT)
[[2] - specifies the transformation of XML-encoded information into a hierarchy using the same or a different document model primarily
for the kinds of transformations for use with XSL
][1] - XSL (Formatting Semantics, a.k.a. XSL-FO)
[[2] - specifies the vocabulary and semantics of the formatting of information for paginated presentation
[2] - colloquially referred to at times as XSL Formatting Objects
]]
2.1 The XML family of Recommendations
2.1.1 Extensible Markup Language (XML)
[[1] - [http://www.w3.org/TR/REC-xml]
]
A Recommendation fulfilling two objectives for information representation
[[1] - capturing information in a hierarchical form in markup according to basic XML-defined constraints
[[2] - creating well-formed documents of elements, attributes and other constructs
][1] - restricting and/or validating hierarchical information in XML to arbitrary user-specified constraints
[[2] - defining a model or grammar for the structure and content of a document
[[3] - collection of available element types and their respective attributes
[[4] - inherent relationships between information in the hierarchy
][3] - vocabulary can be expressed formally in XML 1.0 as a Document Type Definition (DTD)
]]]
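The first objective, well-formedness, can be checked without any document model. The following is a minimal sketch in Python, assuming the standard library's non-validating xml.etree.ElementTree parser. The helper name is_well_formed and the sample invoice markup are illustrative only, and the sketch does not check a document against a DTD.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True when the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Hypothetical sample instances of a user-defined vocabulary.
good = "<invoice><number>123</number></invoice>"
bad = "<invoice><number>123</invoice></number>"  # improperly nested end tags

print(is_well_formed(good), is_well_formed(bad))
```

Checking the second objective, validity against a user-specified model, would need a validating parser or a separate schema tool.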
Nothing in XML is related to presentation or rendition
[[1] - no inferred semantics for the display or formatting of information when using XML
[1] - xml:space is used only for the significance of the white space in the document
]
The vocabulary of elements and attributes used in an instance can be validated
[[1] - at a grammar level by a declarative document model
[[2] - structural validation
[[3] - the nesting and order of elements and their use of attributes
][2] - lexical validation and integrity
[[3] - certain aspects of content and the allowable string values of attributes
][2] - a distinct process separate from the applications acting on the information
[[3] - analysis against the DTD
[3] - analysis using other validation mechanisms (e.g. XML Schema, RELAX-NG Schema, Schematron, etc.)
]][1] - at a semantic level by the application processing the information
[[2] - the semantics of information is not defined by the grammar or structure of the information
[2] - information "means" exactly what any application processing the information wants it to mean
[2] - the application analyzes the structure and content of the information for appropriateness to its purpose
[[3] - can test conditions or constraints that cannot be expressed in a formal document model syntax
[3] - can algorithmically determine validity to support requirements not easily expressed declaratively
]]]
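As an illustration of semantic-level validation by the application, here is a minimal sketch in Python of a constraint that a DTD content model cannot express: the stated total must equal the sum of the line items. The invoice vocabulary, the attribute names and the function name total_is_consistent are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical instance; structurally fine, but is the total right?
INVOICE = """<invoice>
  <item price="10.00" quantity="2"/>
  <item price="5.50" quantity="1"/>
  <total>25.50</total>
</invoice>"""

def total_is_consistent(xml_text):
    """Algorithmically test a rule that the grammar cannot state."""
    root = ET.fromstring(xml_text)
    computed = sum(
        Decimal(item.get("price")) * int(item.get("quantity"))
        for item in root.findall("item")
    )
    return computed == Decimal(root.findtext("total"))

print(total_is_consistent(INVOICE))
```

An instance can therefore be valid at the grammar level and still be rejected by the application that processes it.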
XML vocabularies can be translated to an application's specific vocabulary
[[1] - our XML vocabularies should be designed according to the business processes acting on the information, not around the appearance
[[2] - flexibility to have many different appearances for the same information
][1] - a presentation application vocabulary could just be attribute values added to our own vocabularies without changing element
names (e.g. Cascading Stylesheets (CSS))
[[2] - we can add properties that are recognized by a browser
[2] - our information is then rendered visually or aurally according to the properties
][1] - a presentation application vocabulary could be elements and attributes expressing the semantics of browsing information (e.g.
Hypertext Markup Language (HTML) or Scalable Vector Graphics (SVG))
[[2] - browsers have adopted common presentation semantics for HTML constructs
[2] - if the presentation semantics are sufficient, we need only use HTML
[2] - if the presentation semantics are insufficient, we can use both HTML and CSS
[[3] - CSS-aware browsers would present the information as desired
[3] - non-CSS-aware browsers would at least present the information using the presentation semantics of HTML
]]]
Namespaces distinguish constructs in information from different vocabularies
[[1] - a single instance can contain information from different document models
[1] - the recognition of element types by an application is through the combination of the namespace URI string and the un-prefixed
element type name
[[2] - the prefix used in the instance is irrelevant to the application
]]
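The irrelevance of the prefix can be seen with a namespace-aware parser. This is a minimal sketch in Python using the standard library's xml.etree.ElementTree, which reports each element as the namespace URI combined with the un-prefixed name. The three sample instances are illustrative only.

```python
import xml.etree.ElementTree as ET

# Three instances using different prefixes (or none) for the same namespace URI.
doc_a = '<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format">Hi</fo:block>'
doc_b = '<x:block xmlns:x="http://www.w3.org/1999/XSL/Format">Hi</x:block>'
doc_c = '<block xmlns="http://www.w3.org/1999/XSL/Format">Hi</block>'

# ElementTree exposes the expanded name as "{namespace-uri}local-name".
tags = [ET.fromstring(doc).tag for doc in (doc_a, doc_b, doc_c)]
print(tags)
```

All three instances yield the same expanded name, so an application keyed on that name treats them identically.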
An application can only act on the vocabulary it recognizes and must have a behavior for vocabulary it doesn't recognize
[[1] - a web browser understands the HTML and CSS vocabularies
[[2] - displays understood constructs accordingly
[2] - ignores unrecognized constructs while passing content through to the canvas
][1] - an e-commerce application understands the vocabulary designed to trigger behavior
[[2] - performs the functionality accordingly
[2] - could exit with an error or warning for unrecognized constructs
][1] - a namespace-aware application can recognize constructs from different vocabularies
[[2] - using namespaces gives us better labels for the elements and attributes in our XML information than simple names
[2] - the semantics of our information is assumed by the application we use to process our information through the labels we use
[[3] - using better labels results in more successful application processing
]]]
2.1.2 XML information links
Links to useful information
[[1] - [http://www.xml.com/axml/axml.html] - annotated version of XML 1.0
[1] - [http://xml.coverpages.org/xml.html] - Robin Cover's famous resource collection
[1] - [http://xml.coverpages.org/xll.html] - Extensible Linking Language
[1] - [http://xml.silmaril.ie/] - Peter Flynn's FAQ
[1] - [http://www.xmlbooks.com/] - a summary of available printed books
[1] - [http://www.CraneSoftwrights.com/links/trn-20080127.htm] - training material
[1] - [http://www.CraneSoftwrights.com/resources] - free resources
[1] - [http://XMLGuild.info] - consulting and training expertise
[1] - [http://xml.coverpages.org/elementsAndAttrs.html] - a summary of opinions
]
Related initiatives and specifications
[[1] - [http://www.w3.org/TR/2004/REC-xml-infoset-20040204] - XML Information Set
[1] - [http://www.w3.org/TR/xmlschema-0/] - W3C XML Schema
[1] - [http://www.relax-ng.org] - ISO/IEC 19757-2 RELAX NG (based on RELAX and TREX)
[1] - [http://www.schematron.com] - ISO/IEC 19757-3 Schematron
[1] - [http://www.nvdl.org] - ISO/IEC 19757-4 Namespace-based Validation Dispatching Language (NVDL)
[1] - [http://www.w3.org/TR/DOM-Level-2/] - Document Object Model Level 2
[1] - [http://www.saxproject.org] - Simple API for XML
]
2.1.3 Document Style Semantics and Specification Language (DSSSL)
[[1] - ISO/IEC 10179:1996
[[2] - [http://www.y12.doe.gov/sgml/wg8/dsssl/readme.htm]
]]
Transforming and formatting structured information
[[1] - distinguishes the separate behaviors required to style structured information
[1] - a transformation language to rearrange structured information
[1] - pagination semantics for presenting information in fixed-sized folios
[1] - includes an extension mechanism for arbitrary formatting semantics
]
A programming language for transforming structured information
[[1] - specifies relationships between multiple input documents in the transformation to zero or more output documents
[1] - uses a side-effect-free dialect of Scheme (itself a derivative of LISP) for the expression language
]
A standardized set of formatting semantics for paginated output
[[1] - specifies the intent of the result of a formatting process
[1] - does not specify the rendering process
[1] - no bias in constructs to any particular writing direction
]
A framework for implementation-defined sets of formatting semantics
[[1] - there exists a set of semantics for formatting into SGML/XML markup, thus implementing a transformation function through the
use of formatting facilities
]
Custody of ISO/IEC JTC 1/SC 34/WG 2
[[1] - formerly ISO/IEC JTC 1/WG 4
[1] - formerly ISO/IEC JTC 1/SC 18/WG 8
]
2.1.4 Cascading Stylesheets (CSS)
[[1] - [http://www.w3.org/TR/REC-CSS1]
[1] - [http://www.w3.org/TR/REC-CSS2]
]
Formatting property assignment for web documents (HTML and XML)
[[1] - no document manipulation capabilities
[1] - width and length of presentation are not fixed
[[2] - can be changed dynamically by the reader of the information
][1] - developed to address incompatible vendor extensions for formatting that were being added to browsers
]
Ornamentation of the document tree
[[1] - attaching stylistic information to nodes
[1] - simple prefixing and suffixing of nodes with text
[1] - control of white space around information
[1] - overlapping and transparent rectangular regions
[1] - significant use of inheritance of formatting properties from ancestral tree locations
[[2] - defines the "cascade" of application of inheritable formatting properties
]]
Multiple media type support
[[1] - character display presentation properties
[1] - tabular presentation properties
[1] - aural presentation properties for visually impaired browsing
[[2] - disabled users
[2] - mobile users
]]
Doesn't (shouldn't) interfere with legacy browsers not supporting CSS
[[1] - values expressed in attributes and document metadata
[1] - expression can be external to the document itself
[[2] - required to be external for XML files not using namespaces
][1] - can be introduced to namespace-aware XML documents through the HTML vocabulary
]
Working group is producing a common formatting model for web documents
[[1] - all W3C Recommendations needing presentation properties should use the CSS semantics and associated property names where applicable
]
2.1.5 Styling structured information
Styling is transforming and formatting information
[[1] - the application of two processes to information to create a rendered result
[1] - the ordering of information for creation isn't necessarily (or shouldn't be constrained to) the ordering of information for
presentation or other downstream processes
[[2] - it is a common (though misdirected) first step for people working with these technologies to focus on presentation
[2] - the ordering should be based on business rules and inherent information properties, not on artificial presentation requirements
[2] - downstream arrangements can be derived from constraints imposed upstream in the process
[2] - information created richly upstream can be manipulated into less-richly distinguished information downstream, but not easily
the other way around
[2] - exception when the business rules are presentation or appearance oriented (e.g. book publishing)
][1] - the need to present information in more than one arrangement requires transformation
[1] - the need to present information in more than one appearance requires formatting
]
W3C XSL Working Group
[[1] - chartered to define a style specification language that covers at least the formatting functionality of both CSS and DSSSL
[1] - not intended to replace CSS, but to provide functionality beyond that defined by CSS
[[2] - e.g. add element reordering and pagination semantics
]]
Two W3C Recommendations
[[1] - designed to work together to fulfill these two objectives
[1] - XSL Transformations (XSLT) - versions 1.0 and 2.0
[[2] - transforming information obtained from a source into a particular reorganization of that information to be used as a result
][1] - Extensible Stylesheet Language (XSL/XSL-FO) - versions 1.0 and 1.1
[[2] - specifying and interpreting formatting semantics for the rendering of paginated information
[2] - the acronym XSL-FO is unofficial but in wide use, including at the W3C, for just the formatting objects, properties and property
values
[2] - XSL normatively includes XSLT by reference in chapter 2
[[3] - XSLT has specific features designed to be used for XSL-FO
]]]
XSLT and XSL-FO are endorsed by members of WSSSL
[[1] - an association of researchers and developers passionate about markup technologies
]
2.1.6 Extensible Stylesheet Language Transformations (XSLT)
[[1] - [T1.0][http://www.w3.org/TR/xslt]
[1] - [T2.0][http://www.w3.org/TR/xslt20]
]
Transformation using construction by example
[[1] - a vocabulary for specifying templates of the result that are filled-in with information from the source
[[2] - the stylesheet includes examples of each of the components of the result
[2] - the stylesheet writer declares how the XSLT processor builds the result from the supplied examples
][1] - the primary memory management and manipulation (node traversal and node creation) is handled by the XSLT processor using declarative
constructs, in contrast to a transformation programming language or interface (e.g. the DOM - Document Object Model) where
the programmer is responsible for handling low-level manipulation using imperative constructs
[1] - includes constructs to iterate over structures and information found in the source
[1] - the information being transformed can be traversed in different ways any number of times required to construct the desired
result
[1] - straightforward problems are solved in straightforward ways without needing to know programming
[[2] - useful, commonly-required facilities are implemented by the processor and can be triggered by the stylesheet
[2] - the language is Turing complete, thus arbitrarily complex algorithms can be implemented (though not necessarily in a pretty
fashion)
][1] - includes constructs to manage stylesheets by sharing components in different fragments
[1] - [T2.0]XSLT 2.0 has many more programming features and function calls than XSLT 1.0
]
Not intended for syntactic general purpose XML transformations
[[1] - designed for downstream-processing transformations suited for use with XSL formatting vocabulary
[[2] - includes facilities for working with the XSL vocabulary easily
][1] - still powerful enough for most downstream-processing transformation needs
[[2] - an XSLT stylesheet can be (and is) called a transformation script
[2] - absolutely general purpose when the output from XSLT is going to be input to an XML processor
][1] - does not include certain features appropriate for syntax-level general purpose transformations
[[2] - unsuitable for original markup syntax preservation requirements
][1] - [T2.0]XSLT 2.0 has many more syntax serialization features than XSLT 1.0
]
Illustration of triggered templates constructing a result:
[Figure 2.1: Construction of result tree by triggered stylesheet templates
The figure is split in three vertical panes: a source node tree on the left, a stylesheet of tree fragments in the middle,
and three incrementally-building result trees on the right.
Arrows connect the source tree nodes to the tree fragments in the stylesheet, and other arrows connect the tree fragments
with their use in the result tree.
]
Of note:
[[1] - the source tree contains nodes of six different types, labeled "1" through "6"
[[2] - a number of nodes are found multiple times in the source tree
][1] - the stylesheet contains fragmented examples of the result tree
[[2] - each example template is associated with a node in the source tree
][1] - the nodes in the source tree trigger the building of the result from the example templates
[[2] - some examples are used multiple times in the result
][1] - in this example, the source tree is visited strictly in parse order to generate the result tree
[[2] - the stylesheet can visit the source tree in whatever order is required to trigger the assembly of the result tree in result
parse order
[2] - result parse order is indicated by the letters "A" through "Z"
][1] - the node at the very top of the source tree and the result tree is a root node that does not represent any actual information
]
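The idea of source nodes triggering example templates can be imitated outside of XSLT. The following Python sketch is only an analogy under stated assumptions, not XSLT itself: each hypothetical source element name is associated with a function that contributes a fragment to the result tree, and the source is visited in parse order. The doc, title and para vocabulary and all function names are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical source instance.
SOURCE = "<doc><title>Report</title><para>One</para><para>Two</para></doc>"

def apply_templates(node, parent):
    """Visit each child in parse order and trigger its template."""
    for child in node:
        TEMPLATES[child.tag](child, parent)

def doc_template(node, parent):
    body = ET.SubElement(parent, "body")
    apply_templates(node, body)

def title_template(node, parent):
    ET.SubElement(parent, "h1").text = node.text

def para_template(node, parent):
    # This one template is triggered once per para node in the source.
    ET.SubElement(parent, "p").text = node.text

# Each template is associated with a kind of node in the source tree.
TEMPLATES = {"doc": doc_template, "title": title_template, "para": para_template}

def transform(xml_text):
    source_root = ET.fromstring(xml_text)
    result = ET.Element("html")
    TEMPLATES[source_root.tag](source_root, result)
    return ET.tostring(result, encoding="unicode")

print(transform(SOURCE))
```

In XSLT the traversal and node creation shown here are handled by the processor, and the stylesheet writer supplies only the declarative templates.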
2.1.7 Extensible Stylesheet Language (XSL/XSL-FO)
[[1] - [F1.0][http://www.w3.org/TR/2001/REC-xsl-20011015/]
[1] - [F1.1][http://www.w3.org/TR/xsl11] ([http://www.w3.org/TR/xsl])
]
Paginated flow and formatting semantics vocabulary
[[1] - capturing agreed-upon formatting semantics for rendering information in a paginated form on different types of media
[1] - XSLT is normatively referenced as an integral component of XSL as a language to transform an instance of an arbitrary vocabulary
into the XSL-FO XML vocabulary
[1] - XSL-FO can be regarded simply as a "pagination markup language"
[1] - flow semantics from the DSSSL heritage
[[2] - e.g. headers, footers, page numbers, page number citations, columns, etc.
][1] - formatting semantics from the CSS heritage
[[2] - e.g. visual properties (font, color, etc.) and aural properties (speak, volume, etc.)
]]
Target of transformation
[[1] - the stylesheet writer transforms a source document into a hierarchy that uses only the formatting vocabulary in the result
tree
[1] - stylesheet is responsible for constructing the result tree that expresses the desired rendering of the information found in
the source tree
[[2] - the XML document gets transformed into its appearance
][1] - stylesheet cannot use any user constructs as they would not be recognized by an XSL rendering processor
[[2] - for example, the rendering engine doesn't know what an invoice number or customer number is that may be represented in the
source XML
[2] - the rendering engine does know what a block of text is and what properties of the block can be manipulated for appearance's
sake
[2] - the stylesheet transforms the invoice number and customer number into two blocks of text with specified spacing, font metrics,
and area geometry
]]
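The invoice example above can be sketched in Python, standing in for the stylesheet. The source element names (invoice, number, customer), the label text and the property values are assumptions chosen for illustration. The point shown is that the result contains only formatting vocabulary and none of the user's constructs.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)

# Hypothetical source vocabulary; a rendering engine knows none of these names.
SOURCE = "<invoice><number>INV-001</number><customer>C-42</customer></invoice>"

def invoice_to_blocks(xml_text):
    """Return an fo:flow whose children are fo:block elements only."""
    source = ET.fromstring(xml_text)
    flow = ET.Element(f"{{{FO_NS}}}flow", {"flow-name": "xsl-region-body"})
    for label, name in (("Invoice number: ", "number"),
                        ("Customer number: ", "customer")):
        # Illustrative property values for spacing and font size.
        block = ET.SubElement(flow, f"{{{FO_NS}}}block",
                              {"font-size": "12pt", "space-after": "6pt"})
        block.text = label + source.findtext(name)
    return flow

flow = invoice_to_blocks(SOURCE)
print(ET.tostring(flow, encoding="unicode"))
```

A complete XSL-FO instance would wrap this flow in a page sequence and declare the page geometry; those parts are omitted here.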
Device-independent formatting constructs
[[1] - the XSL-FO vocabulary describes two media interpretations for objects and properties:
[[2] - visual media
[2] - aural media
[2] - a further distinction is also made at times for interactive media
][1] - the results of applying a single stylesheet can be rendered on different types of rendering devices, e.g.: print, display,
audio, etc.
[1] - may still be appropriate to have separate stylesheets for dissimilar media
[[2] - device independence allows the information to be rendered on different media, but a given rendering may not be conducive to
consumption
]]
2.1.8 Styling semantics and vocabularies
XSLT and XSL-FO processors implement styling semantics
[[1] - recognize standardized constructs by their labels in the two respective namespaces
[[2] - elements and their attributes represent semantic concepts
[[3] - XSLT instructions and their controls
[3] - XSL-FO formatting objects and their properties
]][1] - recognize extension constructs by their labels in namespaces recognized by the processor
[1] - accommodate constructs by their labels in unrecognized namespaces
]
XSLT and XSL-FO document type definitions are described using prose
[[1] - there are no standardized XML 1.0 DTD representations of the grammar of the vocabularies
[[2] - DTD semantics and syntax unable to fully express all of the grammatical constraints
][1] - XSLT 1.0 and XSL 1.0 Recommendations describe the document type definitions
[1] - processors do all aspects of validation and interpretation according to the respective document type
[1] - snippets of DTD content model syntax with Kleene operators are used in documentation because readers are familiar with them
[[2] - "?" for zero or one
[2] - "*" for zero or more
[2] - "+" for one or more
][1] - additional constraints not expressible in DTD content model syntax are described in prose
]
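For readers less familiar with the operators, a content model behaves much like a regular expression over the sequence of child element names. This Python sketch assumes a made-up content model, (title?, para+, note*), which is not taken from either Recommendation, and encodes it with the standard re module.

```python
import re

# Hypothetical content model written DTD-style: (title?, para+, note*)
# Each child name is followed by a space so names cannot run together.
CONTENT_MODEL = re.compile(r"(title )?(para )+(note )*")

def matches(children):
    """Return True when the list of child names satisfies the model."""
    sequence = "".join(name + " " for name in children)
    return CONTENT_MODEL.fullmatch(sequence) is not None

print(matches(["title", "para", "para", "note"]))
print(matches(["title"]))
```

The "?" makes title optional, "+" requires at least one para, and "*" permits any number of note children including none.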
2.1.9 Outboard XSLT and XSL-FO processes
The XSL-FO and foreign object vocabularies can be used in a standalone XML instance, perhaps as the result of an XSLT transformation
using an outboard XSLT processor:
[Figure 2.2: Creating standalone XML instances of XSL vocabulary
A flow diagram is grouped into three areas: transformation on the left, formatting in the middle, and rendering on the right.
The transformation area shows an XML source document and an XSLT/XSL-FO transformation script feeding an XSLT process, producing
a stand-alone XSL-FO instance labeled "result formatted document".
The formatting area shows the XSL-FO instance feeding independent processes labeled "Aural Process", "Visual Process" and
"Interactive Process". The formatting area contains most of each process, but the right edge of each process extends beyond
the formatting area.
The rendering area encompasses the right edges of the processes and the final medium to which each of these processes projects
its output: a speaker for aural, a paginated output for visual, and a user's screen for interactive.
]
Note the same three distinct phases as when XSLT and XSL-FO processors are combined in a single application:
[[1] - transformation creates XSL-FO expressing our intent for formatting the source XML
[1] - XSL-FO process interprets our intent into the information that is to be rendered on the target device
[1] - XSL-FO process effects the rendering to reify the result
]
2.1.10 Transforming and rendering XML information using XSLT and XSL-FO
When the XSLT result tree is specified to utilize the XSL-FO formatting vocabulary:
[[1] - the normative behavior is to interpret the result tree according to the formatting semantics defined in XSL for the XSL-FO
formatting vocabulary
[1] - an inboard XSLT processor can effect the transformation to an XSL-FO result tree
[1] - the XSL-FO result tree need not be serialized in XML markup to conform to the Recommendation (though serialization is useful
for diagnostics to evaluate the results of transformation)
]
[Figure 2.3: Transformation from XML to XSL Formatting Semantics
A large block represents an XSL-FO process. Two triangle inputs from the left are the source file and the stylesheet file,
the stylesheet file indicates it contains only XSLT and XSL-FO vocabularies.
The first block inside the XSL-FO process is an XSLT process taking the two inputs and producing a dotted triangle XSL-FO
result tree output. A solid line leads from this result tree out the bottom of the large box to an XML serialization of the
XSL-FO tree. Three arrows also lead from the result tree to three process boxes, one each for aural, print and display interpretation
of the XSL formatting and flow object semantics in each domain. Each such process box has an arrow leading out of the large
box to a depiction of a speaker, a piece of paper, and the electronic display.
]
Of note:
[[1] - the stylesheet contains only the XSLT transformation vocabulary, the XSL formatting vocabulary, and extension transformation
or foreign object vocabularies
[1] - the source XML contains the user's vocabularies
[1] - the result of transformation contains exclusively the XSL formatting vocabulary and any extension formatting vocabularies
[[2] - does not contain any constructs of the source XML or XSLT vocabularies
][1] - the rendering processes implement for each medium the common formatting semantics described by the XSL Recommendation
[[2] - for example, space specified before blocks of text can be rendered visually as a vertical gap between left-to-right line-oriented
paragraphs or aurally as timed silence before vocalized content
]]
2.1.11 Using XSL-FO as an intermediate form
XSL-FO can be used as the basis upon which to build an HTML result:
[[1] - the HTML page is an echo of the PDF page
[1] - the HTML page is painted using only <div> and <span> and formatting properties to reproduce the appearance of the document
[1] - no attempt to utilize HTML constructs such as <h1> or <li>
[1] - one can introduce augmentations into the XSL-FO (referred to in the diagram as XSL-FO++) using a foreign namespace not recognized
by an XSL-FO processor
[1] - the augmentations are ignored by the standard processor
[[2] - errors are not triggered
][1] - the augmentations can be interpreted by a script that recognizes the foreign namespace
[1] - the standard XSL-FO is translated to HTML using the publicly-available script where not overridden by the augmentation script
[1] - the use of augmentations is entirely optional
[[2] - the standard XSL-FO to HTML conversion handles much of XSL-FO as is
]]
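The painting of HTML with only div and span can be sketched as follows. This Python code is not the fo2html.xsl script and makes no claim about how that script works; it is a minimal illustration that handles just fo:block and fo:inline, and it assumes that every property on those elements can be copied unchanged into a CSS style attribute.

```python
import xml.etree.ElementTree as ET

FO = "{http://www.w3.org/1999/XSL/Format}"
# Only two formatting objects are handled in this sketch.
HTML_NAMES = {FO + "block": "div", FO + "inline": "span"}

def fo_to_html(fo_elem):
    """Echo a formatting object as a div or span carrying its properties."""
    html = ET.Element(HTML_NAMES[fo_elem.tag])
    if fo_elem.attrib:
        style = "; ".join(f"{name}: {value}" for name, value in fo_elem.attrib.items())
        html.set("style", style)
    html.text = fo_elem.text
    for child in fo_elem:
        converted = fo_to_html(child)
        converted.tail = child.tail  # keep text that follows the child
        html.append(converted)
    return html

# Illustrative XSL-FO fragment.
SAMPLE = ('<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" font-size="12pt">'
          'Total: <fo:inline font-weight="bold">42</fo:inline> units</fo:block>')

print(ET.tostring(fo_to_html(ET.fromstring(SAMPLE)), encoding="unicode"))
```

No structural HTML constructs such as headings or list items are produced, which mirrors the approach described above.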
[Figure 2.4: Creating HTML from XSL-FO
Two triangle inputs from the left are the source file and the stylesheet file, the stylesheet file indicates it contains only
XSLT and XSL-FO++ vocabularies, where XSL-FO++ represents standard XSL-FO plus augmentations not defined by XSL-FO.
An XSLT process transforms the inputs to produce a triangle output labeled XSL-FO++. This is passed to an XSL-FO Visual Process
to produce a paginated result. This is also passed to a second XSLT process, whose stylesheet is the augmentation script
pp2html.xsl, which in turn uses the publicly-available script fo2html.xsl, to produce a browsed HTML result.
]
Of note:
[[1] - find the latest version of the fo2html.xsl stylesheet using a Google search with the following criteria:
[[2] - fo2html site:renderx.com
]]
2.1.12 Generating XSL-FO instances
The XSL vocabulary need not be created using XSLT
[[1] - high-volume HTML is often generated directly from applications
[1] - XSLT can be used to transform XML into any vocabulary
[1] - nothing in XSL-FO prevents it from being generated directly from applications
]
[Figure 2.5: Generating XML instances of XSL vocabulary
A large flow diagram depicts a database on the left with three arrows leaving it to the right.
The first arrow leads to an application process that puts out an HTML instance to an HTML browser, ultimately to a display.
The second arrow leads to an XML instance that, in turn, leads to (1) an XSLT process that puts out the same HTML instance
as described above, (2) an XML/XSLT browser to a display, and (3) to a second XSLT process to an XSL-FO instance from which
to each of aural, print, and display processes and their associated physical projections.
The third arrow leads to an application process that puts out the same XSL-FO instance described above.
]
Sole requirement of instance is the use of the XSL vocabulary namespace:
[[1] - http://www.w3.org/1999/XSL/Format
[1] - can be the default namespace
[[2] - no XSL-FO attribute problems as exhibited with XSLT attributes when using the default namespace for XSLT
]]
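An application can write an XSL-FO instance directly, without XSLT. The following minimal Python sketch uses the standard library's xml.etree.ElementTree and serializes the XSL vocabulary as the default namespace. The page dimensions, the master name "page" and the function names are assumptions for illustration, and whether a given formatter accepts so small an instance should be confirmed with that formatter.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("", FO_NS)  # serialize the vocabulary as the default namespace

def fo(name):
    """Return the expanded name of a formatting object."""
    return f"{{{FO_NS}}}{name}"

def build_instance(text):
    root = ET.Element(fo("root"))
    # Page geometry: one simple page master with a body region.
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"),
                         {"master-name": "page",
                          "page-height": "297mm", "page-width": "210mm"})
    ET.SubElement(page, fo("region-body"))
    # Content: one page sequence flowing a single block into the body.
    sequence = ET.SubElement(root, fo("page-sequence"), {"master-reference": "page"})
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})
    ET.SubElement(flow, fo("block")).text = text
    return ET.tostring(root, encoding="unicode")

instance = build_instance("Hello world")
print(instance)
```

Because XSL-FO properties are un-prefixed attributes, they need no special treatment when the elements are in the default namespace.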
2.1.13 Using XSL-FO on a server
A typical web-based use of XSL-FO is the server delivery of "printable versions" of information found on a web page
[[1] - a transformation from XML to HTML creates the static information for a browser
[1] - a transformation from XML to XSL-FO to PDF creates the static information in printable form
[[2] - user would use the PDF reader on their system to produce the paper through the system printer
]]
[Figure 2.6: Using XSL-FO in a server environment
A flow diagram depicts a three-tiered environment with creation and storage on the left, the server in the middle and the
user on the right. A single XML file is shown to flow through two paths, the first in combination with an XSLT+HTML stylesheet
through an XSLT process to produce the static HTML for the user's display, and the second in combination with an XSLT+XSL-FO
stylesheet through an XSLT process to produce an XSL-FO structure which, in turn, goes through an XSL-FO process to produce
the static PDF for the user's printer.
]
A dynamic web-based use of XSL-FO is the on-the-fly synthesis and delivery of "printable versions" of information found on
a web page
[[1] - static information is combined with user information based on the session engaged with the user
[[2] - user information based on a predefined user profile
[2] - application information based on a dynamic request from the user
][1] - a transformation from XML to HTML is used to deliver the information to a browser
[1] - a transformation from XML to XSL-FO to PDF is used to deliver a paginated form of the information to the user
[[2] - user would use the PDF reader on their system to produce the paper through the system printer
]]
[Figure 2.7: Using XSL-FO in a server environment with user profiled or requested information
A flow diagram depicts a three-tiered environment with storage only on the left, the server in the middle and the user on
the right. Two XML files are shown, one with common source information and the other with user information, both flowing through
two paths, the first in combination with an XSLT+HTML stylesheet through an XSLT process on the server to produce HTML for
the user's display, and the second in combination with an XSLT+XSL-FO stylesheet through an XSLT process on the server to
produce an XSL-FO structure which, in turn, goes through an XSL-FO process on the server to produce PDF for the user's printer.
]
A distributed web-based use of XSL-FO is to transform and produce on the target client or send XSL-FO "on the wire" to the
target client software for processing into the resulting PDF
[[1] - similar benefits to dynamic web-based on-the-fly synthesis are enjoyed
[1] - a transformation from XML to HTML is used to deliver the information to a browser
[1] - XSL-FO-aware client software
[1] - a disadvantage is that XSL-FO files are somewhat large if one chooses to transmit them
[1] - advantages are that the XSL-FO processing on the client can be fast and the server can use existing hardware assists for pure
XSLT transformation
]
[Figure 2.8: Using XSL-FO in a distributed environment with user profiled or requested information
A flow diagram depicts a three-tiered environment with storage only on the left, the server in the middle and the user on
the right. Two XML files are shown, one with common source information and the other with user information, both flowing through
three paths, the first in combination with an XSLT+HTML stylesheet through an XSLT process on the server to produce HTML for
the user's display, the second in combination with an XSLT+XSL-FO through an XSLT process on the client, and the third in
combination with an XSLT+XSL-FO stylesheet through an XSLT process on the server to produce an XSL-FO structure which, in
turn, is transmitted to the client software run by the user. The client software runs the XSL-FO process to produce PDF for
the user's printer in the latter two paths.
]
2.1.14 Historical development of the XSL and XSLT Recommendations
[> 2.1.15][> 2.2][> 3.][< 2.1.13][^][^^][^^^]
Recommendation release history:
[[1] - first concept description floated in August 1997 with no official status within the World Wide Web Consortium (W3C)
[[2] - [http://www.w3.org/TR/NOTE-XSL.html]
][1] - the XSL Working Group officially chartered in early 1998
[[2] - [http://www.w3.org/Style/XSL/]
][1] - agreed-upon requirements for XSL by the Working Group:
[[2] - [http://www.w3.org/TR/WD-XSLReq]
][1] - the XSL 1.0 Recommendation (XSL-FO) published October 15, 2001
[[2] - [http://www.w3.org/TR/2001/REC-xsl-20011015/]
][1] - the XSL 1.1 Recommendation (XSL-FO) published December 5, 2006
[[2] - [http://www.w3.org/TR/2006/REC-xsl11-20061205/]
][1] - the XSLT/XPath 1.0 Recommendations published November 16, 1999
[[2] - [http://www.w3.org/TR/1999/REC-xslt-19991116]
[[3] - [http://www.w3.org/1999/11/REC-xslt-19991116-errata] - errata
][2] - [http://www.w3.org/TR/1999/REC-xpath-19991116]
[[3] - [http://www.w3.org/1999/11/REC-xpath-19991116-errata] - errata
]][1] - XSLT 1.1 (work abandoned)
[[2] - [http://www.w3.org/TR/2000/WD-xslt11req-20000825] - requirements
[2] - [http://www.w3.org/TR/2001/WD-xslt11-20010824]
[2] - no incompatible changes to XSLT 1.0 in XSLT 1.1, only additional functionality
[2] - too many interactions with plans for XSLT 2.0, so functionality to be folded into XSLT 2.0 release
][1] - the XSLT 2.0/XPath 2.0/XQuery 1.0 Recommendations published January 23, 2007
[[2] - [http://www.w3.org/TR/2007/REC-xslt20-20070123/]
[2] - [http://www.w3.org/TR/2007/REC-xpath20-20070123/]
[2] - [http://www.w3.org/TR/2007/REC-xpath-datamodel-20070123/]
[2] - [http://www.w3.org/TR/2007/REC-xpath-functions-20070123/]
[2] - [http://www.w3.org/TR/2007/REC-xslt-xquery-serialization-20070123/]
[2] - [http://www.w3.org/TR/2007/REC-xquery-20070123/]
[2] - [http://www.w3.org/TR/2007/REC-xquery-semantics-20070123/]
[2] - [http://www.w3.org/TR/2007/REC-xqueryx-20070123]
]]
2.1.15 XSL information links
[> 2.2][> 3.][< 2.1.14][^][^^][^^^]
Links to useful information
[[1] - [http://xml.coverpages.org/xsl.html] - Robin Cover
[1] - [http://www.mulberrytech.com/xsl/xsl-list/] - mail list
[1] - [http://www.dpawson.co.uk] - an XSL/XSLT FAQ
[1] - [http://www.zvon.org/HTMLonly/XSLTutorial/Books/Book1/index.html] - numerous example XSLT scripts and fragments
[1] - [http://www.openmath.org/cocoon/openmath/] - OpenMath project work by David Carlisle
[1] - [http://www.CraneSoftwrights.com/links/trn-20080127.htm] - comprehensive XSLT/XPath and XSL-FO training material
[1] - [http://www.CraneSoftwrights.com/resources] - free XSLT and XSL-FO resources
[1] - [http://incrementaldevelopment.com/xsltrick/] - "Stupid XSLT Tricks"
[1] - [http://xml.coverpages.org/xslSoftware.html] - list of tools
[1] - [http://www.exslt.org/] - community effort for XSLT extensions
[1] - [http://exslfo.sf.net] - community effort for XSL-FO extensions
[1] - [http://foa.sourceforge.net/] - open source FO GUI authoring tool
[1] - [http://www.xslfast.com/] - commercial FO GUI authoring tool
[1] - [http://www.inventivedesigners.com/] - commercial FO GUI authoring tool
[1] - [http://www.abisource.com/] - word processing with "Save As..." for XSL-FO
[1] - [http://www.AntennaHouse.com/XSLsample/XSLsample.htm] - paginating XHTML
[1] - ISBN 1-56609-159-4 - "The Non-Designer's Design Book", Robin Williams, Peachpit Press, Inc., 1994
[1] - ISBN 0-8230-2121-1/0-8230-2122-X - "Graphic design for the electronic age; The manual for traditional and desktop publishing", Jan V. White, Xerox Press,
1988 (out of print but worthwhile to search for as a used book)
]
Examples of XSLT processors
[[1] - [http://www.jclark.com/xml/xt.html] - James Clark
[1] - [http://saxon.sourceforge.net] - Mike Kay
[1] - [http://msdn.microsoft.com/downloads/webtechnology/xml/msxml.asp] - updated web release of XML/XSLT processor for Internet Explorer 5 (IE6 follows the W3C specifications)
[[2] - [http://www.netcrucible.com/xslt/msxml-faq.htm] - useful FAQ
][1] - [http://www.altova.com/] - Altova
[1] - [http://technet.oracle.com/tech/xml/] - Oracle
[1] - [http://xml.apache.org/xalan/index.html] - Apache Project Java-based implementation (originally from IBM/Lotus AlphaWorks)
[1] - [http://alphaworks.ibm.com/tech/LotusXSL] - IBM/Lotus AlphaWorks wrapper for Xalan
[1] - [http://www.xmlsoft.org] - XSLT for Gnome
[1] - [http://www.SolaceSystems.com] - XSLT-dedicated hardware (board/chip)
[1] - [http://www.DataPower.com] - XSLT-dedicated hardware (box)
[1] - [http://www.sarvega.com] - XSLT-dedicated hardware (box)
[1] - [http://www.intel.com/software/xml] - Intel XML Appliance
[1] - [http://www.ambrosoft.com/gregor.html] - XSLT compiler
[1] - [http://www.infoteria.com] - iXSLT - commercial implementation
[1] - [http://www.unicorn-enterprises.com/] - Unicorn XSLT Processor
[1] - [http://www.a-dos.com] - XSLT processor and associated tools
]
The above list names just some of the early or interesting processors among the very many that are available commercially and publicly.
Examples of XSL formatting object rendering processors
[[1] - [http://www.AntennaHouse.com/] - AntennaHouse Windows-based and multi-platform versions
[1] - [http://www.RenderX.com/] - RenderX - direct to PDF
[1] - [http://www.xmlpdf.com/ibex.html] - Ibex - direct to PDF
[1] - [http://www.compart.net] - DOPE - direct to AFP or PDF
[1] - [http://xml.apache.org/fop/] - FOP - direct to PDF, PCL, others
[1] - [http://xmlroff.sourceforge.net/] - open source to PDF
[1] - [http://www.Arbortext.com] - Epic and 3B2 composition tools
[1] - [http://www.Adobe.com [http://www.adobe.com]] - Adobe Document Server
[1] - [http://www.alt-soft.com] - .Net - direct to PDF
[1] - [http://www.Lunasil.com] - Java/COM - direct to PDF
[1] - [http://www.tei-c.org.uk/Software/passivetex/] - Passive TeX - TeX to PDF
[1] - [http://www.unicorn-enterprises.com/] - Unicorn UFO - TeX to PDF
[1] - [http://www.alphaworks.ibm.com/tech/xfc] - IBM XFC - direct to PDF
[1] - [http://www.xmlmind.com/foconverter] - Pixware XFC - XSL-FO to RTF
[1] - [http://www.jfor.org/] - XSL-FO to RTF
[1] - [http://www.xsmiles.org/] - XML browser using FOP
]
The above list names just some of the early processors; very many more are anticipated to become available commercially
and publicly.
2.2 Examples
[> 3.][< 2.1.15][^^][^^^]
2.2.1 Hello world example
[> 2.2.2][> 3.][< 2.1.15][^^][^^^]
Consider a simple, but complete, XSL-FO instance hellofo.fo for an A4 page report:
[Example 2-1: A simple example
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns="http://www.w3.org/1999/XSL/Format"
      font-size="16pt">
  <layout-master-set>
    <simple-page-master
        margin-right="15mm" margin-left="15mm"
        margin-bottom="15mm" margin-top="15mm"
        page-width="210mm" page-height="297mm"
        master-name="bookpage">
      <region-body region-name="bookpage-body"
                   margin-bottom="5mm" margin-top="5mm" />
    </simple-page-master>
  </layout-master-set>
  <page-sequence master-reference="bookpage">
    <title>Hello world example</title>
    <flow flow-name="bookpage-body">
      <block>Hello XSL-FO!</block>
    </flow>
  </page-sequence>
</root>
]
All examples illustrate instances of the XSL-FO vocabulary
[[1] - how the instance is created is not material to the semantics of the vocabulary
[[2] - could be hand-authored in a simple text or XML editor
[2] - could be the result of an XSLT transformation from another XML vocabulary
[2] - could be output from any application
][1] - the default namespace is used in the examples for brevity and clarity
]
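As an illustration of the XSLT possibility listed above, the following is a minimal sketch of an XSLT 1.0 stylesheet that would produce an instance equivalent to hellofo.fo. The source vocabulary is an assumption made only for illustration: a hypothetical document element named greeting with a title attribute. Neither that vocabulary nor this stylesheet comes from the original material.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: transforms a hypothetical <greeting> document into XSL-FO -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns="http://www.w3.org/1999/XSL/Format">

  <xsl:output method="xml" encoding="UTF-8"/>

  <!-- the hypothetical document element produces the whole FO instance -->
  <xsl:template match="/greeting">
    <root font-size="16pt">
      <layout-master-set>
        <simple-page-master master-name="bookpage"
                            page-width="210mm" page-height="297mm"
                            margin-top="15mm" margin-bottom="15mm"
                            margin-left="15mm" margin-right="15mm">
          <region-body region-name="bookpage-body"
                       margin-top="5mm" margin-bottom="5mm"/>
        </simple-page-master>
      </layout-master-set>
      <page-sequence master-reference="bookpage">
        <!-- the title attribute of the source becomes the FO title -->
        <title><xsl:value-of select="@title"/></title>
        <flow flow-name="bookpage-body">
          <!-- the text of the source becomes the single block on the page -->
          <block><xsl:value-of select="."/></block>
        </flow>
      </page-sequence>
    </root>
  </xsl:template>

</xsl:stylesheet>
```

Applied to a source such as <greeting title="Hello world example">Hello XSL-FO!</greeting>, the literal result elements are written out in the XSL-FO namespace declared as the default on the stylesheet element, so the result uses the default namespace in the same way as the examples here.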
Rendered on the screen in two adjacent windows using conforming XSL-FO processors:
[[1] - Antenna House XSL Formatter (an interactive XSL-FO rendering tool)
[1] - Adobe Acrobat (a Portable Document Format (PDF) display tool)
[[2] - PDF created by RenderX XEP (a batch XSL-FO rendering tool)
]]
[Figure 2.9: A simple XSL-FO instance example
A screen shot of two application windows shows the Antenna House XSL Formatter on the left and Adobe Acrobat on the right.
Both applications show a page rendered with the phrase "Hello XSL-FO!".
The text in the left window is in a serif font, while the text in the right window is in a sans-serif font. Otherwise there
are no differences in the pages rendered in the windows.
The title bar of the window on the left reads "XSL Formatter - Hello world example", while the title bar of the window on the right reads "Adobe Acrobat - [hellofo.pdf]".
]
Note the two renderings are not identical
[[1] - XSL-FO only specifies the formatting process, not the rendering process
[1] - the XSL-FO instance may be insufficient in describing the entire formatting intent
[1] - page fidelity is not guaranteed if the instance does not express the entire intent
[[2] - XSL-FO semantics are extensive, but not necessarily comprehensive for certain nuances of formatting
][1] - the rendering tool may apply certain property values of its own choosing
]
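One way to narrow such differences is to express more of the formatting intent in the instance. As a minimal sketch, the block from hellofo.fo could request a generic font family explicitly. The choice of serif is an assumption for illustration and is not part of the original example; the actual typeface selected still depends on each processor's font configuration.

```xml
<!-- Sketch only: the font-family value is an illustrative assumption -->
<block xmlns="http://www.w3.org/1999/XSL/Format"
       font-family="serif">Hello XSL-FO!</block>
```

Because font-family is an inherited property, the same request could instead be placed on an ancestor such as the root element, alongside the existing font-size.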
This is no different from two web browsers with different user settings for fonts
[[1] - a simple web page not using CSS properties relies on the browser settings
[1] - the HTML constructs represent the intent of what is to be formatted
[1] - absent formatting properties are satisfied using the tool option defaults
]
2.2.2 A detailed example of flowed content
[> 2.2.3][> 3.][< 2.2.1][^][^^][^^^]
Consider a page of content from some instructor-led training material that contains a mixture of a table, a list, a proportionally-spaced
paragraph, and mono-spaced paragraphs:
[Figure 2.10: A page of handouts rendered in XSL-FO
A printed page is shown reflecting the formatted result of the example PDF. A company logo is shown on the top right of the
page, the page title information on the top left, a horizontal rule, followed by the main content of the page.
This content is a paragraph, followed by a single item list, followed by a lengthy listing of XSL-FO XML markup, with each
line prefaced by a line number in half-sized font.
]
2.2.3 Training material example
[> 3.][< 2.2.2][^][^^][^^^]
This page's material as an instructor-led handout:
[[1] - excerpt of formatting objects created using an XSLT stylesheet and an XSLT stylesheet processor
]
[Example 2-2: Formatting objects (excerpt) for a page of handout material
<flow flow-name="pages-body"><table>
<table-column column-width="( 210mm - 2 * 15mm ) - 2in"/>
<table-column column-width="1in"/>
<table-column column-width="1in"/>
<table-body><table-row><table-cell><block text-align="start">
<block font-size="19pt">Training material example</block>
<block font-size="10pt" space-before="10pt">Module
2 - The context of XSL-FO</block>
<block font-size="10pt">Lesson 2 - Examples</block></block>
</table-cell>
<table-cell><block text-align="end"><external-graphic
src="url(&quot;..\whitesml.bmp&quot;)"/></block></table-cell>
<table-cell><block text-align="start"><external-graphic
src="url(&quot;..\cranesml.bmp&quot;)"/></block></table-cell>
</table-row></table-body></table>
<block line-height="3px"><leader leader-pattern="rule"
leader-length="100%" rule-thickness="1pt"/></block>
<block space-before="6pt" font-size="14pt">
This page's material as an instructor-led handout:</block>
<list-block provisional-distance-between-starts=".43in"
provisional-label-separation=".1in" space-before="6pt">
<list-item relative-align="baseline">
<list-item-label text-align="end" end-indent="label-end()">
<block>-</block></list-item-label>
<list-item-body start-indent="body-start()">
<block font-size="14pt">excerpt of formatting objects created
using an XSLT stylesheet and an XSLT stylesheet processor</block>
</list-item-body></list-item></list-block>
<block space-before="12pt div 2" font-family="Courier"
linefeed-treatment="preserve" white-space-collapse="false"
white-space-treatment="preserve" font-size="12pt"><inline
font-size="inherited-property-value(font-size) div 2">01 </inline
>&lt;flow flow-name="pages-body"&gt;&lt;table&gt;
<inline font-size="inherited-property-value(font-size) div 2"
>02 </inline> &lt;table-column column-width...
]
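The length expressions in the excerpt can be evaluated by hand. The following sketch restates the three column widths with their computed values, using 1in = 25.4mm. It presumes that the 210mm and 15mm terms stand for the page width and the side margins, as in Example 2-1; the excerpt itself does not say so. The cell contents are placeholders.

```xml
<table xmlns="http://www.w3.org/1999/XSL/Format">
  <!-- ( 210mm - 2 * 15mm ) - 2in = 180mm - 50.8mm = 129.2mm -->
  <table-column column-width="129.2mm"/>
  <!-- 1in = 25.4mm each, so the columns total 129.2 + 25.4 + 25.4 = 180mm -->
  <table-column column-width="25.4mm"/>
  <table-column column-width="25.4mm"/>
  <table-body>
    <table-row>
      <!-- placeholder cell content, standing in for the title and logos -->
      <table-cell><block>title</block></table-cell>
      <table-cell><block>logo</block></table-cell>
      <table-cell><block>logo</block></table-cell>
    </table-row>
  </table-body>
</table>
```

Similarly, space-before="12pt div 2" evaluates to 6pt, and inherited-property-value(font-size) div 2 evaluates to 6pt inside the block whose font-size is 12pt, which is consistent with the half-sized line numbers described for Figure 2.10.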
The nesting of the hierarchy of the formatting objects in the example page:
[Figure 2.11: The nesting of XSL-FO constructs in the example
A series of nested boxes is shown, one box labeled for each formatting object in the example. Child boxes are wholly contained
within parent boxes throughout the hierarchy.
Ancestral elements are noted in dotted-line boxes as they are not present in the fragment of markup of the example: the document
element is named <[root]>, containing two elements named <[layout-master-set]> and <[page-sequence]>. The <[flow]> element of the example is a child of the <[page-sequence]> element.
]
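The ancestry described for Figure 2.11 can be sketched as a skeleton instance. This is an outline under assumptions, not the original file: the master name handout is hypothetical, the page geometry is borrowed from Example 2-1, and the content of the flow is reduced to a placeholder block.

```xml
<root xmlns="http://www.w3.org/1999/XSL/Format">
  <!-- ancestor not shown in Example 2-2: page geometry definitions -->
  <layout-master-set>
    <!-- the master name "handout" is a hypothetical name -->
    <simple-page-master master-name="handout"
                        page-width="210mm" page-height="297mm"
                        margin-left="15mm" margin-right="15mm"
                        margin-top="15mm" margin-bottom="15mm">
      <region-body region-name="pages-body"/>
    </simple-page-master>
  </layout-master-set>
  <!-- ancestor not shown in Example 2-2: the sequence of pages -->
  <page-sequence master-reference="handout">
    <!-- the excerpt in Example 2-2 begins with this element -->
    <flow flow-name="pages-body">
      <!-- placeholder for the table, rule, paragraph, list and listing -->
      <block>...</block>
    </flow>
  </page-sequence>
</root>
```

The flow-name on the flow matches the region-name on the region-body, which is how the flowed content is directed to the body region of the page, just as bookpage-body is used in Example 2-1.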
This is an accessible version of Crane's commercial training material.
The content has been specifically designed to assist screen reader software
in viewing the entire textual content. Figures are replaced with text
narratives.
Navigation hints are in square brackets:
[Tx.x] and [Fx.x] are textual representations of the applicability icons;
[digit] indicates list depth for nested lists;
[link [URL]] indicates the URL of a hyperlink if different than link;
[EXAMPLE] indicates an example listing of code;
[FIGURE] indicates the presence of a figure replaced by its description;
[>] jumps forward;
[<] jumps backward;
[^] jumps to start of the section;
[^^] jumps to the start of the chapter;
[^^^] jumps to the table of contents.
Suggestions for improvement are welcome:
[info@CraneSoftwrights.com]
Book sales: [http://www.CraneSoftwrights.com/links/trn-acc.htm]
Information: [http://www.CraneSoftwrights.com/links/info-acc.htm]
This content is protected by copyright and, as there are no means to protect
this accessible version from plagiarism, please do not make any
commercial edition available to others.
+//ISBN 978-1-894049::CSL::Courses::PFUX//DOCUMENT Practical Formatting Using XSL-FO 2008-01-27 17:30UTC//EN
Practical Formatting Using XSL-FO
Seventh Edition - 2008-01-27
ISBN 978-1-894049-19-1
Copyright © Crane Softwrights Ltd.