Copyright © 2003-2019 Crane Softwrights Ltd.
$Date: 2019/04/18 02:40:59 $(UTC)
LiterateXSLT™ is a methodology, an XSLT stylesheet and supplemental validation and maintenance tools for the synthesis of production XSLT stylesheets out of prototypical transformation results that are seeded with the necessary information to build the desired transformation. Both XSLT 1.0 [XSLT 1.0] and XSLT 2.0 [XSLT 2.0] are supported.
The use of "™" is not meant to restrict the open use of this tool, but rather only to protect its name.
While some XSLT terminology is defined and used, no active knowledge of XSLT is needed for basic black-box operation (see Section 4.1, “Black-box simple operation”) of LiterateXSLT. However, knowledge of XPath is required and is critical to specifying the result operation. Restrict your expressions to either XPath 1.0 [XPath 1.0] or XPath 2.0 [XPath 2.0], according to whether the production environment uses XSLT 1.0 or XSLT 2.0. With knowledge of XSLT one can take advantage of many nuanced behaviors in this environment (see Section 4.2, “Nuanced operation”).
This tool produces an XSLT stylesheet that can then be run against source documents to produce target documents using the same structures as those found in the prototypical result document. The resulting XSLT stylesheet builds in knowledge of the source document vocabulary written as XPath expressions by the user. The stylesheet can be treated as a monolithic implementation of a black box, or as a nuanced participating stylesheet fragment in a family of stylesheet fragments.
Contrast this to the forced modular approach in Crane Softwrights Ltd.'s ResultXSLT™[Crane Resources], which takes a similar approach of annotating a prototypical result of transformation. The resulting XSLT stylesheet, however, does not have any built-in knowledge of the source document vocabulary and must be used in conjunction with other stylesheet fragments that supply this source tree knowledge using XSLT and XPath.
Deciding on which of these two environments to use depends on the modularity decided for the end result stylesheets. If you wish to create a modular stylesheet to accommodate multiple source vocabularies, then ResultXSLT™ is used and the accompanying XSLT fragments are written by the user to supply the XPath of the source documents. If you wish to create a monolithic XSLT stylesheet without knowledge of XSLT, then LiterateXSLT™ is used to embed the XPath of the source documents into the prototypical result.
Copyright (C) - Crane Softwrights Ltd. - http://www.CraneSoftwrights.com/links/res-lxslts.htm
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- The name of the author may not be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Note: for your reference, the above is the "Modified BSD license", this text was obtained 2003-07-26 at http://www.xfree86.org/3.3.6/COPYRIGHT2.html#5
It is often difficult for a stylesheet writer to conceptualize the result tree order when writing an XSLT stylesheet.
Yet, when writing an XSLT stylesheet, the author is obliged to ensure the transformation result is produced in result parse order. The author must orient their primary thinking around the result tree and treat the source tree with a secondary attitude. Having determined "what belongs next" in the result tree, the author goes to the source tree to find the information that is necessary to produce the result.
A common "good practice" when approaching writing an XSLT stylesheet is to mock up the result as a result instance. This helps the author of the stylesheet determine precisely the correct result structure that is needed in order to produce the result rendering. XSL-FO [XSL 1.1] processors accept hand-authored XML instances of the XSL-FO vocabulary as the input to the rendering process. This allows the stylesheet author to create a complete mockup of the result as a result instance, using an XSL-FO processor to render the mockup and the author to visually confirm the desired result.
At this stage of stylesheet development, the stylesheet author must "deconstruct" the result instance into template rules, divine the XPath patterns to match each of these template rules, divine the XPath select expressions to apply the template rules to the source nodes of the input data, establish components of the stylesheet to be localized for maintenance purposes, and organize the stylesheet into an order that promotes successful long-term maintenance.
Because XSLT is a Turing-complete functional language, it is theoretically impossible to validate the use of XPath expressions and patterns without running the stylesheet and checking the results. Unfortunately, validly written XPath expressions that do not address constructs found in the source return empty node-set results rather than errors, thus compounding the stylesheet author's responsibility to correctly enter all steps of all XPath location paths used in expressions and patterns.
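The silent-failure problem can be sketched with a small hand-written stylesheet. This is an illustration only: the source element names "greetings" and "greeting" are assumptions about a hypothetical source document. The second expression contains a misspelled step, yet it is a perfectly valid XPath expression, so a processor would be expected to report nothing and simply emit no text for it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical stylesheet for illustration only; the element names
     "greetings" and "greeting" are assumptions about the source. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- Correct location path: emits the string value of the first greeting. -->
    <xsl:value-of select="greetings/greeting"/>
    <!-- Misspelled step ("greting"): a valid expression that selects an empty
         node-set, so nothing is emitted and no error is reported. -->
    <xsl:value-of select="greetings/greting"/>
  </xsl:template>
</xsl:stylesheet>
```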
Recognizing that the nested application of templates produces the nested result instance, Crane Softwrights Ltd. observed that the presence of the applied templates in the result instance can be indications of the granularity of the template rules used in a stylesheet. It was further hypothesized that annotations seeded in the result instance can formally record such indications as having to be triggered from particular points in the source node tree.
The XSL-FO rules for foreign namespaces promote this approach of benign annotation.
XSL-FO processors are obliged to ignore elements and attributes of foreign namespaces that are not found inside of the <instream-foreign-graphic> construct. This allows much of an annotated result XSL-FO instance to be directly viewed in an XSL-FO processor. Unfortunately, once a portion of the result tree is embedded within a foreign namespace (as happens when conditional result nodes are wrapped in a fragment of XSLT), the wrapped fragment is not processed by the XSL-FO engine. However, for much of the content of the stylesheet the attribute annotations are perfectly benign and do not interfere with the presentation process. Such is also true for the HTML result vocabulary, though even HTML constructs inside of foreign namespace constructs are presented.
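As a minimal sketch of such benign annotation, the XSL-FO fragment below carries LiterateXSLT attributes alongside ordinary formatting properties. The source element names "order" and "buyer/name" are hypothetical, and the fragment would need to sit inside a complete XSL-FO instance to be rendered. Because the annotations are attributes in a foreign namespace, an XSL-FO processor is expected to ignore them and render the sample text as written.

```xml
<!-- Illustrative fragment only; the source element names "order" and
     "buyer/name" are hypothetical. -->
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format"
          xmlns:x="http://www.CraneSoftwrights.com/ns/literate-xslt"
          x:match="order"
          font-size="12pt" space-after="6pt">
  Buyer:
  <fo:inline font-weight="bold" x:content-value="buyer/name">
    Sample Buyer Ltd.
  </fo:inline>
</fo:block>
```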
Early experiments produced very exciting results, and the LiterateXSLT process became formalized. To address various cosmetic issues of the synthesized stylesheet, additional annotations were divined and added to the environment. The end result is an effective stylesheet synthesis methodology and toolset deployed in a real-world project. Nevertheless, using only a handful of constructs can satisfy many requirements.
Moreover, unique opportunities for validating the XPath expressions utilized in stylesheets were identified. Such validation opportunities cannot exist for XSLT stylesheets yet were algorithmically available when approaching the task of writing the stylesheet while working from the perspective of the desired result. Unfortunately, the practical complexity of implementing the theoretically simple process is limiting the successful deployment of this validation technique.
This methodology involves transforming a prototypical result XML document into an XSLT stylesheet. The prototypical document has knowledge of the structure of source documents, annotated by the user by using XPath expressions. The synthesized stylesheet can then be run against source documents in order to produce bona fide result instances transformed as directed by the XPath expressions. Figure 1, “The LiterateXSLT™ environment” illustrates the many process paths, starting with the prototypical literate result drawn at the left of the diagram.
The typical path is drawn in the center of the diagram using thick lines and labeled "1". Here the prototypical literate result is transformed using the Crane-LiterateXSLT.xsl stylesheet to create the synthesized stylesheet in the preparation step. This is then used in the runtime step to transform production sources into production results for use in a production process.
The path labeled "2" wraps the literate result instance in an envelope instance, where the envelope instance can contain supplemental stylesheet content for the synthesized result. This path is needed when the production process is not tolerant of foreign elements, or when you wish to divorce raw XSLT content from the literate result, perhaps because the custody of the literate result is in the hands of someone unfamiliar with XSLT. This path also utilizes Crane-LiterateXSLT.xsl to create the synthesized stylesheet.
The path labeled "3" validates that there are no violations in the use of the annotation vocabulary in the literate result instance. The LiterateXSLT vocabulary is agnostic to the XML vocabulary it is annotating and can accommodate any result elements or attributes.
The path labeled "4" runs the production process on the prototypical instance in order to diagnose aspects of the prototypical instance. This path can only be followed if the production process is tolerant of foreign namespaces. The path labeled "5" strips the prototypical instance of LiterateXSLT namespaces, thus allowing an un-annotated version of the prototypical instance to be diagnosed with the production process. This path must be followed if the production process is not tolerant of foreign namespaces.
The Crane-LiterateXSLT.xsl file is an XSLT 2.0 stylesheet used to transform the prototypical XML input into the synthesized XSLT stylesheet output. This is shown in Figure 1, “The LiterateXSLT™ environment” in paths "1" and "2". XSLT is also shown being used in the diagnostic path "5" using striplit.xsl and at runtime to transform the production sources to the production result instances using the synthesized stylesheet.
The free Saxon XSLT 2.0 processor [Saxon] has been used to test these stylesheets. The xslt2.bat invocation of Saxon in the support/ subdirectory is invoked as follows:
..\support\xslt2.bat source.xml stylesheet.xsl result.xml
The invocation assumes that the saxon9.jar file has been installed in the support\ subdirectory.
The Crane-LiterateXSLT.rnc file is a RELAX-NG grammar [RELAX-NG] in its compact syntax. It is agnostic to any XML vocabulary being used in the prototypical result instance and only validates the use of the LiterateXSLT constructs. This is shown in Figure 1, “The LiterateXSLT™ environment” in path "3".
The free Jing [Jing] validating processor has been used to test this grammar expression. The rnc.bat invocation of Jing in the support/ subdirectory is invoked as follows:
..\support\rnc.bat ..\support\Crane-LiterateXSLT.rnc myinstance.xml
The invocation assumes that the jing.jar file has been installed in the support\ subdirectory.
See Section 6, “Included demonstration files in scenario/ subdirectory” and Section 7, “Regression test files” for examples of the invocation of the processes.
The synthesized stylesheet is assumed to use the xsl: prefix for the XSLT vocabulary. You are welcome to use any other prefix as well for the XSLT vocabulary as it is acceptable that two prefixes be mapped to the same URI. If your literate result happens to use the prefix xsl: for something other than the XSLT URI "http://www.w3.org/1999/XSL/Transform", or some other prefix for the XSLT URI, everything will still work but your synthesized stylesheet will have a number of necessary namespace declarations that will appear to clutter the instance.
The prefixes "x:" and "a:" are reserved for use by Crane in the synthesized stylesheet and annotated XPath expressions.
All namespaces of the prototypical instance are used in the synthesized stylesheet and all XPath addresses are assumed to be using the prefixes assigned in the prototypical instance. The absence of a prefix is assumed to make reference to elements in no namespace, not the default namespace. There is no way in this environment to specify the XPath default namespace feature of XSLT 2.0. For these reasons, all XPath addresses to elements that are in a namespace must use names that are prefixed.
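A minimal sketch of the prefix rule follows. The namespace URI "urn:example:greetings" and the prefix "g" are hypothetical stand-ins for a real source vocabulary. Note that the default namespace declaration for the result vocabulary has no effect on the XPath addresses.

```xml
<!-- Sketch only: the namespace URI "urn:example:greetings" and the prefix
     "g" are hypothetical stand-ins for a real source vocabulary. -->
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:x="http://www.CraneSoftwrights.com/ns/literate-xslt"
      xmlns:g="urn:example:greetings"
      x:match="g:greetings">
  <head>
    <title>Namespaced source example</title>
  </head>
  <body>
    <!-- "g:greeting" addresses an element in the source namespace; an
         un-prefixed "greeting" would address an element in no namespace. -->
    <p x:match="g:greeting" x:content-value=".">Sample greeting</p>
  </body>
</html>
```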
The Crane-LiterateXSLT.xsl stylesheet runs using XSLT 2.0 and declares the use of XSLT 2.0 in the synthesized stylesheet. The synthesized stylesheet can still run using an XSLT 1.0 processor provided that the XPath expressions in the annotations are restricted to XSLT/XPath 1.0 functions.
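The restriction can be illustrated with two annotations that aim at the same result. This is a sketch under the assumption that an ancestral x:match= attribute has established a hypothetical "greeting" element as the current source node. The first annotation relies on an XPath 2.0 function, while the second uses only a function that exists in both versions.

```xml
<!-- Fragment for illustration; both paragraphs assume an ancestral x:match=
     has established a "greeting" element as the current source node. -->
<div xmlns="http://www.w3.org/1999/xhtml"
     xmlns:x="http://www.CraneSoftwrights.com/ns/literate-xslt">
  <!-- upper-case() is an XPath 2.0 function: unavailable in XPath 1.0. -->
  <p x:content-value="upper-case(.)">SAMPLE</p>
  <!-- translate() is available in both XPath 1.0 and 2.0. -->
  <p x:content-value="translate(., 'abcdefghijklmnopqrstuvwxyz',
                                   'ABCDEFGHIJKLMNOPQRSTUVWXYZ')">SAMPLE</p>
</div>
```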
At its simplest, LiterateXSLT can be used as a black-box "fill in the blanks" process where the annotations address the source content to replace in the target content. With only three annotations, a result can be built from source without regard for the structure of the synthesized XSLT stylesheet.
At its most full-featured, LiterateXSLT can be used to manage the creation of the resulting stylesheet in a lot of nuanced detail. Such detail is important when the end result is going to be incorporated as part of an amalgam stylesheet, managed as an integral resource, with identified entry points and distinguished top-level stylesheet constructs.
The following few constructs are the basics needed to seed a prototypical result instance with enough information to synthesize an operable production stylesheet. The other elements and attributes of the vocabulary address nuances of the structure of the synthesized stylesheet, which are not important if the stylesheet is not going to be exploited by other stylesheets in the system.
Typical black-box operation is satisfied solely with the use of a few of the attributes:
- Section 8.5.1, “x:condition=” adds an element and its descendants to the result tree only after testing a condition to be true; if the condition evaluates to false the result element is not added to the result tree
- Section 8.8.1, “x:content-value=” replaces the content of a result element with the string value of the evaluation of an XPath address that obtains information from the source file
- Section 8.6.1, “x:match=” copies the result tree for every node matched by this attribute; also establishes the basis for relative XPath addresses used in LiterateXSLT attributes of elements that are descendants of the element with this attribute
  - note it will be necessary to distinguish between two or more different parts of the result that are triggered on any one node that is being matched correspondingly twice or more: the distinctions are made by using a different Section 8.6.2, “x:mode=” attribute, or within the same mode by using a different Section 8.6.3, “x:priority=” attribute
- Section 8.7.1, “a:*=” specifies prototypical instance attributes containing XPath expressions to override attributes that have the same local name as found in the same start tag
One element that may need to be used is Section 8.2.1, “<x:branch>”, but only in situations such as populating mixed content from the prototypical instance with source content, as described in Section 9.1, “Mixed content”, and when wrapping a number of sibling result elements under a single Section 8.5.1, “x:condition=”.
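The basic attributes listed above might be combined as in the following sketch. The source element names and the use of xml:lang are borrowed from the hello example in the scenario/ subdirectory. Combining the attributes in exactly this way is an assumption made for illustration, not a pattern taken from the distributed files.

```xml
<!-- Sketch only: the source element names and the use of xml:lang are taken
     from the hello example; combining the attributes this way is an assumption. -->
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:x="http://www.CraneSoftwrights.com/ns/literate-xslt"
      xmlns:a="http://www.CraneSoftwrights.com/ns/literate-xslt/attribute"
      x:match="greetings">
  <head>
    <title>Greetings</title>
  </head>
  <body>
    <!-- x:condition=: the heading is kept only if a greeting child exists. -->
    <h1 x:condition="greeting">All greetings</h1>
    <!-- x:match=: one paragraph per greeting element.
         a:lang=: overrides the prototype value of lang= from the source.
         x:content-value=: replaces the sample text with the source value. -->
    <p x:match="greeting" lang="en" a:lang="@xml:lang" x:content-value=".">
      Sample greeting text
    </p>
  </body>
</html>
```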
The LiterateXSLT process can create a nuanced standalone XSLT stylesheet fragment from the prototypical result instance seeded with process signals. These nuances would be important to XSLT stylesheet developers needing the synthesized stylesheet to exhibit certain properties, unlike black-box operation where such nuances are irrelevant.
When used as a participating stylesheet in a family of stylesheets importing and including the synthesized stylesheet, a number of nuanced attributes will ensure needed entry points are reified as part of the stylesheet interface.
When used as the initial stylesheet syntheses for independent maintenance and support, the prototypical result instance is abandoned and the synthesized stylesheet is the starting point for stylesheet development. Using a number of the nuanced attributes will create the base stylesheet with constructs in a particular order or with a particular granularity that can then be utilized in development.
A relative XPath address is an address not beginning with "/". Such an address is resolved to the closest ancestral Section 8.6.1, “x:match=” or Section 8.10.1, “x:for-each=” XPath pattern address that establishes a current position in the source tree. A relative XPath address used outside the context of any such ancestral pattern address is interpreted as an absolute XPath address when the stylesheet is used standalone in production.
An absolute XPath address is an address beginning with "/". Such an address is resolved regardless of any current position established in the source tree.
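The difference between the two kinds of address can be sketched as follows, assuming hypothetical source elements "greetings" and "greeting". The first paragraph changes with each matched node, while the second is expected to yield the same value every time.

```xml
<!-- Fragment for illustration; the source element names are assumptions. -->
<div xmlns="http://www.w3.org/1999/xhtml"
     xmlns:x="http://www.CraneSoftwrights.com/ns/literate-xslt"
     x:match="greetings">
  <div x:match="greeting">
    <!-- Relative: "." is the greeting matched by the closest ancestral
         x:match= attribute. -->
    <p x:content-value=".">This greeting</p>
    <!-- Absolute: always the first greeting in the document, whichever
         greeting is currently being processed. -->
    <p x:content-value="/greetings/greeting[1]">First greeting</p>
  </div>
</div>
```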
One XHTML and three XSL-FO examples are included in the LiterateXSLT environment scenario/ subdirectory: a simple example demonstrating the process and two complex real-world examples from the Universal Business Language (UBL) [UBLTC] project used as the basis of the development of LiterateXSLT.
The testhtml.bat file will run on a Windows command line and will invoke the HTML browser through the association of the .html file extension and the use of the start command. When the contrived comparison instance is present, the literate result instance XPath expressions are validated against the instance.
The testfo.bat file will run on a Windows command line and will invoke the XSL-FO processor through the association of the .fo file extension and the use of the start command. When the contrived comparison instance is present, the literate result instance XPath expressions are validated against the instance.
This is a simple example that generates a small XSLT stylesheet. Note that no attempts at including documentation-oriented LiterateXSLT constructs have been made.
No contrived comparison instance is provided for this example.
The hello.html prototypical result instance has a simple presentation of paragraphs:
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:x="http://www.CraneSoftwrights.com/ns/literate-xslt"
      x:match="greetings" x:indent="yes">
  <head>
    <title>A demonstration of LiterateXSLT™</title>
  </head>
  <body>
    <p x:match="greeting" x:content-value="." x:name="each-greeting">
      The content of greeting goes here!
    </p>
    <p x:see="each-greeting">
      More here
    </p>
    <p x:see="each-greeting">
      More here
    </p>
  </body>
</html>
Note how three paragraphs are included for debug purposes: when this file is presented in a browser the LiterateXSLT attributes are ignored and the presentation is as expected with three paragraphs. However, when the XSLT stylesheet is created, the Section 8.11.3, “x:see=” attributes indicate the second and third paragraphs have already been accommodated in the given named template rule indicated using Section 8.10.2, “x:name=” (in this case "each-greeting"), thus these two are not to be included in the stylesheet. Therefore, only a single paragraph is accommodated in the stylesheet, and it is instantiated for each element named "greeting".
Note that the relative XPath address "greeting" is resolved based on the closest ancestral Section 8.6.1, “x:match=” attribute, that being "greetings", which itself is relative to the root node because there is no ancestral x:match= attribute. The Section 8.8.1, “x:content-value=” indicates that the string value of the element (the XPath address ".") is to be injected as the content of the "p" element.
The prototypical result instance is converted into the synthesized XSLT stylesheet using Crane-LiterateXSLT.xsl:
..\support\xslt2 hello.html ..\support\Crane-LiterateXSLT.xsl hello.xsl
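For orientation, the following hand-written stylesheet approximates the effect described for the hello example. It is not the actual output of Crane-LiterateXSLT.xsl, which may differ in structure, naming and detail (for instance, this sketch makes no attempt to reflect the x:name= named template). It only shows how the x:match= and x:content-value= annotations correspond to conventional template rules.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hand-written approximation for illustration; NOT the actual output of
     Crane-LiterateXSLT.xsl, which may differ in structure and detail. -->
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns="http://www.w3.org/1999/xhtml">
  <xsl:output indent="yes"/>

  <!-- Corresponds to x:match="greetings" on the html element. -->
  <xsl:template match="greetings">
    <html>
      <head>
        <title>A demonstration of LiterateXSLT™</title>
      </head>
      <body>
        <!-- The annotated p element becomes a push of the greeting nodes. -->
        <xsl:apply-templates select="greeting"/>
      </body>
    </html>
  </xsl:template>

  <!-- Corresponds to x:match="greeting" with x:content-value=".". -->
  <xsl:template match="greeting">
    <p><xsl:value-of select="."/></p>
  </xsl:template>
</xsl:stylesheet>
```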
The production data file hello.xml has the following structure:
<greetings>
  <greeting xml:lang="en">Hello! / Hi!</greeting>
  <greeting xml:lang="fr">Salut / Bonjour</greeting>
  ...
  <greeting xml:lang="sv">Hej</greeting>
  <greeting xml:lang="tr">Merhaba</greeting>
</greetings>
The production data file is processed using the synthesized XSLT stylesheet into the production result:
..\support\xslt2 hello.xml hello.xsl hello-result.html
The resulting hello-result.html document shows how the value of each of the greeting elements has been injected into copies of the "p" element:
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>A demonstration of LiterateXSLT</title>
  </head>
  <body>
    <p>Hello! / Hi!</p>
    <p>Salut / Bonjour</p>
    ...
    <p>Hej</p>
    <p>Merhaba</p>
  </body>
</html>
To run this test for HTML, execute:
testhtml hello
To run this test for XSL-FO, execute:
testfo hello
Both of these examples demonstrate the use of a complex LiterateXSLT literate result instance to produce an ordered and documented XSLT stylesheet.
The contrived comparison instance Order.xml is synthesized from the UBL 0p70 W3C Schema document model. This instance does not validate against the document model, but every element and attribute found in this contrived instance is a possible element or attribute in a valid instance because the document model explicitly references the construct in the content models.
To run these tests, execute:
testfo office Order
testfo UN220Order Order
The test.bat file calls testone.bat for each of the *-proto.xml regression tests found in the test/ subdirectory. The testone.bat file recreates the *-out.xml file for each regression test and compares the result against the expected result in the test/expected/ subdirectory. This comparison is done using the deepEqual.xsl stylesheet instead of a typical diff utility because of the intelligent interpretation of XML rules (such as the irrelevant order of attributes).
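The point about attribute order can be seen in the sketch below, where the element and attribute names are hypothetical. A line-based diff would report the two "item" elements as different text, whereas a comparison based on XML rules treats them as equal because the order of attributes in a start tag is not significant.

```xml
<!-- Two hypothetical serializations of the same element; only the order of
     the attributes differs. -->
<samples>
  <item code="A1" lang="en">Hello</item>
  <item lang="en" code="A1">Hello</item>
</samples>
```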
The vocabulary is based primarily on attributes defined in the LiterateXSLT namespaces, though there are also a few elements. Attributes tend to be more benign to XML processing than interfering with structure using embedded elements. This places a burden on the literate stylesheet writer to express much of the desired functionality as a collection of attributes in the pro-forma result.
Two namespaces are defined in this vocabulary:
- http://www.CraneSoftwrights.com/ns/literate-xslt for elements and attributes that are predefined facilities
  - throughout this document this is the assumed URI for the namespace with the prefix "x"
- http://www.CraneSoftwrights.com/ns/literate-xslt/attribute for the specific facility of changing the value of an attribute that is not in any namespace
  - throughout this document this is the assumed URI for the namespace with the prefix "a"
this instruction is deprecated in favor of using a benign result vocabulary construct like the HTML <span> or the XSL-FO <inline> and one of the content attributes described in Section 8.8, “Content replacement facilities” or one of the branch attributes described in Section 8.9, “Branch control attributes”
when such a construct is not available, then this instruction is used and is replaced with the result of interpreting the attributes
contains verbatim synthesized stylesheet content to be added arbitrarily to the synthesized stylesheet
can be anywhere before the payload in the envelope instance
can only be in the literate result instance as the repeatable first element children of an element with an x:match=, x:name= or x:content-name= attribute
the content is placed before, inside and after the synthesized xsl:template instruction based on the use of x:prefix=, x:infix= and x:suffix= for the template
no content of this element can contain any LiterateXSLT elements or attributes
label="value" attribute is required for referencing purposes
note that to preserve all white-space in the entire x:verbatim element use the standard xml:space="preserve" attribute
contains verbatim synthesized stylesheet content that replaces all sibling prototypical content
no content of this element can contain any LiterateXSLT elements or attributes
this is necessary in order for the literate result to contain example output for analysis purposes but not for runtime purposes
when generating the synthesized stylesheet the sibling example output is discarded and only the descendants of <x:replace> are used
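A minimal sketch of this replacement facility follows. The source element names are assumptions, the sorting is only one example of logic that an author might prefer to write as raw XSLT, and the position of <x:replace> among its siblings is likewise an assumption. The example list items remain visible when the prototype is viewed directly, while only the content of <x:replace> is described as reaching the synthesized stylesheet.

```xml
<!-- Sketch only: the source element names are assumptions, and sorting is
     just one example of logic an author might write as raw XSLT. -->
<ul xmlns="http://www.w3.org/1999/xhtml"
    xmlns:x="http://www.CraneSoftwrights.com/ns/literate-xslt"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    x:match="greetings">
  <!-- Used when synthesizing the stylesheet. -->
  <x:replace>
    <xsl:for-each select="greeting">
      <xsl:sort select="@xml:lang"/>
      <li><xsl:value-of select="."/></li>
    </xsl:for-each>
  </x:replace>
  <!-- Example output for analysis only; discarded during synthesis. -->
  <li>Hello! / Hi!</li>
  <li>Salut / Bonjour</li>
</ul>
```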
document element of an envelope instance
used when it is necessary not to include any foreign elements in the literate result instance
not allowed to be used inside the literate result instance
only in the envelope instance
comprised of any number of x:verbatim constructs followed by a single x:payload construct
envelope wrapper of a single element child being the document element of the literate result
not allowed to be used inside the literate result instance
only in the envelope instance
declare a global variable for referencing in XPath
name= specifies the name of the variable
select= specifies the evaluation of the variable; mutually exclusive with the non-empty content of the declaration
as= specifies the memory orientation for the information
declare a local variable for referencing in XPath
name= specifies the name of the variable
select= specifies the evaluation of the variable; mutually exclusive with the non-empty content of the declaration
as= specifies the memory orientation for the information
this is respected only when following a construct that instantiates a template in the result stylesheet
declare a local template parameter for referencing in XPath
name= specifies the name of the variable
select= specifies the evaluation of the variable; mutually exclusive with the non-empty content of the declaration
as= specifies the memory orientation for the information
tunnel= specifies the automatic passing of the parameter
required= specifies the necessity of the parameter
this is respected only when following a construct that instantiates a call to a template in the result stylesheet
declare a local template parameter for referencing in XPath
name= specifies the name of the variable
select= specifies the evaluation of the variable; mutually exclusive with the non-empty content of the declaration
as= specifies the memory orientation for the information
tunnel= specifies the automatic passing of the parameter
this will send the content of the element to the XSLT processor’s reporting channel (e.g. standard error) if the effective boolean evaluation of the condition XPath expression evaluates as true
a common need when composing such messages is to quote content from the input document; this is achieved using constructs such as <x:branch x:value-of="{XPath}"/>
terminate="yes-or-no-default-no" when "yes" the XSLT processor is signaled to stop processing (most processors will stop with an error indication)
condition="effective-boolean-expression-default-true" an XPath expression tested as true or false; when the expression itself is absent (as opposed to its evaluation being empty) it is assumed to be true
By default, white-space in the source tree is controlled by the whim of the XSLT processor and how it has been invoked by the production or workflow environment. Typically for XSLT 1.0 processors all white-space-only text nodes are preserved, and it is awkward or impossible to modify the behavior. Typically for XSLT 2.0 processors all white-space-only text nodes are ignored, and it is an easy command-line invocation to modify the behavior.
These attributes provide nuanced interpretation after the source tree has been built by the XSLT processor. These attributes do not provide any access to the invocation method of the XSLT processor as that is the responsibility of the production or workflow processes.
x:preserve-space="white-space-separated-list-of-element-patterns"
indicate the list of element types where white-space-only child text nodes are not to be ignored
only allowed on the document element of the literate result
By default, the result tree is not indented and any white-space indentation used in the prototypical instance is ignored. Use these attributes for nuances that need to be different.
x:exclude-result-prefixes="white-space-separated-list-of-prefixes"
indicate which synthesized stylesheet namespaces are not included in the production result
only allowed on the document element of the literate result
note that this will not prune unwanted namespaces copied from the source document, only those placed in the stylesheet as part of its synthesis
x:xpath-default-namespace="namespace-uri"
declare a namespace to use for un-prefixed names in XPath addresses
x:indent="yes-or-no-default-no"
indicate that the resulting instance is to be indented according to the whims of the XSLT processor
only allowed on the document element of the literate result
xml:space="default"
standard XML facility for marking all descendant white-space-only text nodes insignificant and able to be discarded
note that all non-white-space-only text nodes are automatically encapsulated in the synthesized stylesheet with xsl:text
xml:space="preserve"
standard XML facility for marking all descendant white-space-only text nodes significant
those nodes not in x:verbatim= are placed in xsl:text instructions so as not to be disturbed by any post-processing
note that all non-white-space-only text nodes are automatically encapsulated in the synthesized stylesheet with xsl:text
x:condition="effective-boolean-expression"
the element to which this attribute is attached is only added to the result tree if the effective boolean evaluation of the XPath expression evaluates as true.
in the situation where there are many adjacent sibling elements each needing to be under the same boolean condition, use this attribute on a Section 8.2.1, “<x:branch>” element as the parent of the siblings
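The grouping described above might look like the following sketch, where "greeting" is an assumed source element name and an ancestral x:match= attribute is assumed to establish the context for the relative address. A single condition governs all three sibling result elements.

```xml
<!-- Fragment for illustration; "greeting" is an assumed source element name
     and an ancestral x:match= is assumed to establish the context. -->
<body xmlns="http://www.w3.org/1999/xhtml"
      xmlns:x="http://www.CraneSoftwrights.com/ns/literate-xslt">
  <!-- One condition governs all three siblings. -->
  <x:branch x:condition="count(greeting) &gt; 1">
    <h2>Greetings</h2>
    <p>More than one greeting was supplied.</p>
    <hr/>
  </x:branch>
</body>
```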
These attributes establish bases for the use of relative XPath addresses in the attributes of descendent result elements.
x:match="pattern-expression"
create a template rule whose top element is the element with this attribute and the match pattern is the value of this attribute (except for the <x:branch> element where all the children of <x:branch> become the children of this template rule)
if no x:apply-templates= attribute is present, then also replace the element with an xsl:apply-templates instruction using the pattern as the select expression
this is mutually exclusive with x:call-template=
note that the same pattern expression cannot be used in two or more x:match= attributes without some kind of distinction using x:mode= and/or x:priority=
x:mode="mode-name"
used when creating a template rule and when pushing nodes for the standard xsl:template and xsl:apply-templates semantics for mode=
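As a sketch of distinguishing two uses of the same pattern, the fragment below matches "greeting" twice: once in a mode and once in the default mode. The mode name "toc" and the source element names are assumptions chosen for illustration.

```xml
<!-- Sketch only: the mode name "toc" and the source element names are
     assumptions chosen for illustration. -->
<body xmlns="http://www.w3.org/1999/xhtml"
      xmlns:x="http://www.CraneSoftwrights.com/ns/literate-xslt"
      x:match="greetings">
  <ul>
    <!-- First use of the pattern "greeting", distinguished by a mode. -->
    <li x:match="greeting" x:mode="toc" x:content-value="@xml:lang">en</li>
  </ul>
  <!-- Second use of the same pattern, in the default (unnamed) mode. -->
  <p x:match="greeting" x:content-value=".">Sample greeting</p>
</body>
```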
a:any-NCName-here="result-stylesheet-value-for-named-attr"
any number of attributes in this namespace can be used in an element to supply the replacement string value for the prototype value of an attribute (of the same name and in any namespace) in the synthesized stylesheet
if there is a default value for the attribute, that is specified in the example element and only replaced with the calculated value for the annotation if there is a value calculated (which may not happen if the XPath expression addresses an absent node)
if there is no default value for the attribute, this annotation assumes the attribute to be created is in no namespace and the attribute is only added if there is a value calculated for the annotation
note: to replace the prototype values of a number of fixed attributes that have the same name but are in multiple namespaces, utilize named attribute sets and throw away the annotated instance attributes
note: it is not possible in this version of the vocabulary to specify replacing the prototype values of a number of variable attributes that have the same name but are in multiple namespaces
x:keep="space-separated-list-of-prefixed-attribute-names"
any attributes on the current element that are not on this list are not included in the synthesized stylesheet
x:keep=""
throws away all input attributes
attributes from the LiterateXSLT environment are automatically removed from the synthesized stylesheet
this is mutually exclusive with x:lose=
x:lose="space-separated-list-of-prefixed-attribute-names"
any attributes on the current element that are on this list are not included in the synthesized stylesheet
x:lose=""
keeps all input attributes
attributes from the LiterateXSLT environment are automatically removed from the synthesized stylesheet
this is mutually exclusive with x:keep=
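The filtering rules for x:keep= and x:lose= can be illustrated with a minimal Python sketch. This is not the tool's implementation (LiterateXSLT is an XSLT stylesheet); the function name filter_attributes and the literate_prefixes parameter are hypothetical, and matching by prefix rather than by namespace is a simplifying assumption.

```python
def filter_attributes(attributes, keep=None, lose=None, literate_prefixes=("x",)):
    """Return the attributes that would survive x:keep= or x:lose= filtering.

    attributes: dict of prefixed attribute name -> value on the current element
    keep / lose: the space-separated list from x:keep= / x:lose=, or None if absent
    literate_prefixes: hypothetical stand-in for "attributes from the
        LiterateXSLT environment" (the real tool works on namespaces)
    """
    if keep is not None and lose is not None:
        raise ValueError("x:keep= and x:lose= are mutually exclusive")

    def is_literate(name):
        return ":" in name and name.split(":", 1)[0] in literate_prefixes

    # LiterateXSLT attributes are always removed from the result
    survivors = {n: v for n, v in attributes.items() if not is_literate(n)}
    if keep is not None:
        kept = set(keep.split())  # an empty list keeps nothing
        return {n: v for n, v in survivors.items() if n in kept}
    if lose is not None:
        lost = set(lose.split())  # an empty list loses nothing
        return {n: v for n, v in survivors.items() if n not in lost}
    return survivors
```

Under these assumptions, keep="" returns an empty dictionary and lose="" returns every attribute that is not a LiterateXSLT attribute, mirroring the two empty-string cases described above.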
x:set="unique-name-of-attribute-set"
define an attribute set with the given name to contain all of the kept attributes of the element
any attributes lost with x:lose= or kept with x:keep= are assumed to not belong to the set
all those in the defined set are automatically removed from the element with this attribute
if the list of attributes is different from the set determined by the above conditions, use x:verbatim= elsewhere in the instance for this definition
x:content-value="XPath-expression"
replace the content of the element with an xsl:value-of instruction
this is mutually exclusive with other x:content-*= attributes
any child content of the element is ignored
x:content-apply="select-expression"
replace the content of the element with an xsl:apply-templates instruction with the given expression
an empty string value indicates no select attribute
this is mutually exclusive with other x:content-*= attributes except x:content-mode=
any child content of the element is ignored
x:content-mode="mode-name"
this adds a mode for the x:content-apply= application of templates
this is mutually exclusive with other x:content-*= attributes except x:content-apply=
x:content-name="template-name"
create a template rule of the given name using the contents of the element as the template content
replace the content of the element with an xsl:call-template instruction
this is mutually exclusive with other x:content-*= attributes
x:content-call="template-name"
replace the content of the element with an xsl:call-template instruction
this is mutually exclusive with other x:content-*= attributes
x:content-copy="select-expression"
replace the content of the element with an xsl:copy-of instruction with the given expression
an empty string value indicates no select attribute
this is mutually exclusive with other x:content-*= attributes
x:content-variable="variable-name"
declare a top-level variable with the given name being the value of the element
replace the content of the element with an xsl:value-of instruction to the variable named in the argument
this is mutually exclusive with other x:content-*= attributes
x:content-verbatim="label-of-verbatim-element"
this replaces the content of the template with the labeled verbatim content declared elsewhere
this does not interfere with x:infix= injection before the template content
this is mutually exclusive with other x:content-*= attributes
x:content-post="template-name"
all generated content is passed through a post-processing template rule to be added to the result tree
the post-processing template needs to accept a parameter of the name “value” that contains the value to be post-processed
this does not interfere with x:infix= injection before the template content
Note that these are allowed on any result element and are not the only attributes allowed on <x:branch>. The following attributes also are allowed on <x:branch> in order to encompass a set of siblings with a single stylesheet construct: Section 8.5.1, “x:condition=”, Section 8.6.1, “x:match=”, Section 8.10.1, “x:for-each=” and Section 8.10.2, “x:name=”.
x:copy-of="select-expression"
replace the element and its descendants with the result of evaluating the XPath expression (which may be a set of nodes)
See also Section 8.6.1, “x:match=” for a template creation attribute.
x:for-each="select-expression"
replace the element with an xsl:for-each instruction and use the element as the child of the instruction (except for the <x:branch> element where all the children of <x:branch> become the children of this instruction)
the XPath expression attribute value cannot be empty
this is mutually exclusive with x:apply-templates= and x:call-template=
x:name="template-name"
create a template rule whose top element is the element with this attribute and the name is the value of this attribute (except for the <x:branch> element where all the children of <x:branch> become the children of this template rule)
note that the presence of an x:match= attribute also might affect the creation of this template rule
if no x:match= attribute is present, then also replace the element with an xsl:call-template instruction using the given template name
this also identifies a particular element with x:match= for referencing by x:see=
this is mutually exclusive with x:call-template=
x:apply-templates="select-expression"
replace the element with an xsl:apply-templates instruction and ignore the children of this element
an empty string value indicates no select attribute
this is mutually exclusive with x:call-template= and x:for-each=
x:call-template="template-name"
replace the element with an xsl:call-template instruction
this is mutually exclusive with any of x:name=, x:apply-templates= or x:match=
x:see="template-name-with-match" or x:see=""
throw away the element and all of its descendants
the label points to another instance of the element, or to some other element elsewhere in the instance, that handles the logic for this element
not having a check and balance may produce an undesirable result without any errors being reported
x:content-infix="label-of-verbatim-elements"
add a copy of the verbatim element to the synthesized stylesheet after the start tag of the template created with x:content-name=
also see x:content-prefix= and x:content-suffix=
x:content-prefix="label-of-verbatim-elements"
add a copy of the verbatim element to the synthesized stylesheet before the start tag of the template created with x:content-name=
also see x:content-infix= and x:content-suffix=
x:content-suffix="label-of-verbatim-elements"
add a copy of the verbatim element to the synthesized stylesheet after the start tag of the template created with x:content-name=
also see x:content-infix= and x:content-prefix=
x:infix="label-of-verbatim-elements"
add a copy of the verbatim element to the synthesized stylesheet after the start tag of the element with this attribute
also see x:prefix= and x:suffix=
x:prefix="label-of-verbatim-elements"
add a copy of the verbatim element to the synthesized stylesheet before the element with this attribute
also see x:infix= and x:suffix=
x:comment="string"
create a comment in the synthesized stylesheet with this content placed before the element with this attribute is otherwise processed
if this happens to be in an element that creates a template rule, the line with the reference to the template rule gets the comment, not the template rule itself
use x:prefix= to set the comment before the template rule
x:order="numeric-order-of-template-in-result-stylesheet"
used when x:match= or x:name= creates a template in order to place the template in relative order to other templates in the file
x:ignore=""
the element is ignored entirely when generating a stylesheet
this is useful to include the element for the purposes of validating the prototypical instance after the literate properties have been stripped
x:strip=""
this doesn’t change the resulting stylesheet but is useful in combination with the stripping stylesheet (Section 10.4, “striplit.xsl”) to elide undesired elements from the resulting schema check
Consider the following source tree in the regression test branch-value-of-in.xml:
<in-file>XYZ</in-file>
Suppose the goal is to inject that content into the middle of a mixed-content element with other string data, as in branch-value-of-out.xml:
<out-file>abcXYZdef</out-file>
The <x:branch> element (Section 8.2.1, “<x:branch>”) is used as the placeholder in mixed content that is replaced with the content from the source file, as in branch-value-of-proto.xml:
<out-file xmlns:x="http://www.CraneSoftwrights.com/ns/literate-xslt" >abc<x:branch x:value-of="/in-file"/>def</out-file>
When elements cannot be used, a packaging mechanism is available to use with XML external parsed general entities to envelop the LiterateXSLT environment around any well-formed XML instance that does not have an internal declaration subset. If your instance does have an internal declaration subset, move the entire DOCTYPE declaration to the envelope instance.
This technique can be used to utilize a literate result instance without modifying it:
<!--example envelope for LiterateXSLT™-->
<!DOCTYPE x:envelope [
  <!ENTITY literateXSLTpayload SYSTEM "test.fo"><!--point to instance-->
]>
<x:envelope xmlns:x="http://www.CraneSoftwrights.com/ns/literate-xslt">
  <x:verbatim label="stuff-1"> ... </x:verbatim>
  <x:verbatim label="stuff-2"> ... </x:verbatim>
  <x:payload>
    &literateXSLTpayload;<!--bring instance into the XML process-->
  </x:payload>
</x:envelope>
Refer to Section 8.2.4, “<x:envelope>”, Section 8.2.2, “<x:verbatim>” and Section 8.2.5, “<x:payload>” for details.
A number of stylesheets are included in the package to help with creating the literate result instance.
Performs an XPath deep equal comparison of the source tree and the tree of the file opened as a document node in the parameter named other. Terminates with an error message if the two trees are not XML deep equal. When using Saxon such a parameter is opened as a document node by specifying: "+other=uri"
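To illustrate the idea of a deep-equal comparison outside of XSLT, here is a minimal Python sketch using the standard library. It only approximates the XPath notion of deep equality: the function name deep_equal is hypothetical, white-space is compared exactly, and the default ElementTree parser discards comments and processing instructions before the comparison.

```python
import xml.etree.ElementTree as ET


def deep_equal(a, b):
    """Recursively compare two elements: name, attributes, text and children."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    # ElementTree reports missing text as None; treat that as the empty string
    if (a.text or "") != (b.text or "") or (a.tail or "") != (b.tail or ""):
        return False
    if len(a) != len(b):
        return False
    return all(deep_equal(x, y) for x, y in zip(a, b))


left = ET.fromstring("<out-file>abc<b id='1'>XYZ</b>def</out-file>")
right = ET.fromstring('<out-file>abc<b id="1">XYZ</b>def</out-file>')
same = deep_equal(left, right)
```

Attribute order and quoting style do not affect the result, because the comparison is made on the parsed trees rather than on the serialized text.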
Replaces all elements in the source document element's default namespace with elements in the same namespace but using the namespace prefix supplied in the prefix= invocation parameter binding.
This stylesheet checks all elements for use of non-namespace attributes of the same name as LiterateXSLT attributes in elements other than XSLT elements. This is the sanity check for having mistyped the literate result, accidentally leaving off the namespace prefix for LiterateXSLT for a synthesized stylesheet element. This turns out to be more common than one might think.
When the literate result has legitimate non-namespace attributes of the same name as LiterateXSLT attributes, it is necessary to customize a check stylesheet that imports checkuse.xsl while overriding all legitimate attributes in the context of legitimate elements with empty template rules.
This strips any LiterateXSLT vocabulary from an instance. Run this on the literate result instance to remove all LiterateXSLT elements and attributes when the production process cannot tolerate their presence. Figure 1, “The LiterateXSLT™ environment” illustrates the use of this in the path labeled "5".
Exposes the XPath address of all non-white-space-only text nodes, attributes and empty elements of the source document. Note that text nodes are normalized before being tested, such that any white-space-only text nodes that might have been significant are not included in the report.
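The kind of report described here can be sketched in Python with the standard library. This is an illustration only, not the stylesheet's logic: the function name xpath_report and the exact form of the addresses (positional predicates on every step, element names without namespace handling) are assumptions.

```python
import xml.etree.ElementTree as ET


def xpath_report(xml_text):
    """List addresses of attributes, non-white-space-only text and empty elements."""
    report = []

    def walk(elem, path):
        for name in elem.attrib:
            report.append(path + "/@" + name)
        children = list(elem)
        # text directly inside elem: its leading text plus each child's tail
        texts = [elem.text] + [child.tail for child in children]
        if any(t and t.strip() for t in texts):
            report.append(path + "/text()")
        elif not children:
            report.append(path)  # no children and no significant text
        counts = {}
        for child in children:
            counts[child.tag] = counts.get(child.tag, 0) + 1
            walk(child, "%s/%s[%d]" % (path, child.tag, counts[child.tag]))

    root = ET.fromstring(xml_text)
    walk(root, "/" + root.tag)
    return report
```

As in the description above, an element holding only white-space is treated as having no text, so it is reported as an empty element rather than as a text node.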
These diagnostic stylesheets work in series: first keyxml.xsl takes a sample production source instance and produces a mimicked instance with the content replaced with key information, then a text or HTML cross reference (or both) is produced, summarizing all of the XPath addresses to the information along with a copy of the information.
Example of use:
rem Produce keyed version of sample XML file
..\support\xslt2 hello.xml ..\support\keyxml.xsl hello-key.xml
rem Produce html version of key report
..\support\xslt2 hello-key.xml ..\support\key2html.xsl hello-key.html
rem Produce text version of key report
..\support\xslt2 hello-key.xml ..\support\key2text.xsl hello-key.txt
Replaces all attributes and non-white-space-only text nodes of an XML instance with an ordinal number (the key). The number is surrounded with exclamation marks to help distinguish the content. The original value of child text nodes is stored in the x:value= attribute. The path of the element node is stored in the x:path= attribute. All non-namespace attributes are copied into the a: namespace using the same local name.
Run this program on a copy of your production source instance to produce a filled instance. Run the synthesized stylesheet with the filled instance and examine the results for missing or incorrect content.
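The keying idea can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a port of keyxml.xsl: the function name key_instance is hypothetical, the original values are returned in a dictionary instead of being stored in x:value= and x:path= attributes, and the numbering follows this sketch's traversal order rather than anything the stylesheet guarantees.

```python
import xml.etree.ElementTree as ET


def key_instance(xml_text):
    """Replace attribute values and significant text with !n! keys."""
    root = ET.fromstring(xml_text)
    originals = {}  # key number -> original value

    def next_key(value):
        number = len(originals) + 1
        originals[number] = value
        return "!%d!" % number

    for elem in root.iter():
        for name, value in list(elem.attrib.items()):
            elem.set(name, next_key(value))
        if elem.text and elem.text.strip():
            elem.text = next_key(elem.text)
        if elem.tail and elem.tail.strip():
            elem.tail = next_key(elem.tail)
    return ET.tostring(root, encoding="unicode"), originals
```

Running a transformation over such a keyed instance makes each piece of source content recognizable in the output, which is what allows missing or misplaced content to be spotted by eye.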
Reads a keyed instance (see keyxml.xsl) and produces in either HTML or text a cross reference reporting the key value and the XPath address of the data represented by the key and the value for that key from the test file.
The parameter repeat=no will suppress like sibling elements from being reported.
The parameter value=no will suppress the value column from being reported.
The parameter label=some text will add a label of the given text to the top of the file for auditing purposes.
During the development of the LiterateXSLT environment, it quickly became obvious that simple typographical errors entered by the author of the literate result instance in XPath expressions and patterns can waste a lot of time in diagnostics. A syntactically valid XPath expression or pattern is not an error in XPath if the XML construct addressed by the expression is not present in the instance. This is a common mistake of stylesheet writers and the source of a lot of grief in debugging stylesheets, but is theoretically impossible to validate within an XSLT stylesheet.
The unique nature of a literate result instance compared to an XSLT stylesheet opens the opportunity to check for typographical errors in XPath expressions and patterns in an algorithmic and unambiguous fashion, resulting in a tremendous saving of time and detection of simple errors on the part of the literate result instance author.
This trio of diagnostic tools (two stylesheets and a program) works in series and takes two inputs: a literate result instance in the first stage and a sample production source instance in the third stage. The third stage produces a report of XPath expressions found in LiterateXSLT attributes that cannot be found as nodes in the sample production source instance.
The theory behind these validation tools is based on the XSL-FO literate result instance not being algorithmic. Because of the computing science theory called "the halting problem", an XSLT stylesheet cannot be analyzed for correct use of XPath expressions because XSLT is Turing Complete: the only way to check if a program works is to run the program. This means that the XPath expressions in an XSLT stylesheet cannot be independently analyzed for being valid XPath expressions in the source XML.
This theoretical halting problem is not a characteristic of the literate result instance. The literate result instance is the result of an XSLT stylesheet and is concrete; it is not algorithmic. Given that the nested portions of the literate result instance are created as a result of the triggering of XPath expressions, the nested use of XPath expressions in the literate result instance unambiguously represents absolute XPath expressions in the source XML documents.
Unfortunately the XPath expression language has restrictions on the first step of multiple step location paths. Many location paths are nested in the literate result instance. When these nested location paths are simply concatenated together, the resulting expression is not a valid XPath expression.
The essence of the lit2xpath.xsl XSLT stylesheet is to obtain and concatenate all of the nested XPath expressions from a literate result instance. The essence of the xpathall.py Python [Python] application is to reduce XPath expressions with invalid intermediate location steps into sets of equivalent XPath expressions with valid intermediate location steps. The essence of the xml4lit.xsl XSLT stylesheet is to compare the collection of reduced XPath expressions against all valid XPath expressions allowed for the document type.
If there is an XPath expression in the reduced set that does not address any of the constructs in the instance of all valid XPath expressions, it is assumed there is an error in the literate result instance and the author has mistyped the appropriate XPath select expression or match pattern.
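The overall check can be sketched in Python for the simplest case only. This sketch is not the logic of lit2xpath.xsl, xpathall.py or xml4lit.xsl: it assumes that only x:for-each= establishes the context for nested x:value-of= attributes, it handles only plain element paths that ElementTree understands, and it skips the reduction step entirely. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

X = "{http://www.CraneSoftwrights.com/ns/literate-xslt}"


def concatenated_paths(elem, base=""):
    """Yield each x:value-of path joined onto the enclosing x:for-each paths."""
    step = elem.get(X + "for-each")
    if step is not None:
        base = step if step.startswith("/") else base + "/" + step
    value = elem.get(X + "value-of")
    if value is not None:
        yield value if value.startswith("/") else base + "/" + value
    for child in elem:
        yield from concatenated_paths(child, base)


def unaddressed(literate_xml, sample_xml):
    """Return the concatenated paths that address nothing in the sample."""
    # Wrap the sample so an absolute path can be evaluated as "./..."
    wrapper = ET.Element("wrapper")
    wrapper.append(ET.fromstring(sample_xml))
    missing = []
    for path in concatenated_paths(ET.fromstring(literate_xml)):
        if not wrapper.findall("." + path):
            missing.append(path)
    return missing
```

With a literate instance whose x:for-each="/order/item" encloses x:value-of="name" and a mistyped x:value-of="prise", and a sample instance containing name and price elements, only the mistyped path is reported. Paths using functions, predicates on attributes or other syntax beyond ElementTree's limited XPath support would raise an error in this sketch.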
Analyze a literate result instance input, emitting a summary instance of concatenated LiterateXSLT attribute values of XPath expressions, with the XPath locations where the end of the expression is used in the literate result instance.
Analyze the output from lit2xpath.xsl, creating a nested hierarchy of element and attribute descriptions of constructs required to satisfy the XPath expressions found in the original literate result instance.
Please note this diagnostic covers many of the types of XPath expressions but does not attempt to implement and verify all the many and varied possible combinations and expressions one can write using the powerful XML Path Language. Unfortunately, as of the time of writing of this document, this utility does not successfully reduce all possible concatenations of XPath expressions and patterns into valid XPath expressions. This is an area of continued development.
To protect XPath expressions that are not properly processed by these diagnostic tools, wrap your literate result instance fragments in an x:verbatim or x:replace construct, as the contents of these constructs are not analyzed by these diagnostic processes.
Analyze the output from xpathall.py, reporting those XPath expressions found in the original literate result instance that do not have a counterpart in the comparison XML instance passed via the invocation parameter named "xmlfile".
By default this is considered a fatal error with the report going to the message port using the message termination facility in XSLT. Setting the invocation parameter named "noerrors" to any non-empty string will cause the program to send the report to the standard output and not use the message termination facility in XSLT.
Note that the comparison XML instance file is a contrivance created from an analysis of the document model against which instances are written. This contrived instance is not meant to validate against the document model because document models often offer choices and conditionality to the presence of constructs in an instance. The contrived instance is synthesized to contain each and every possible element and attribute allowed anywhere by the document model, thus would never validate against the document model. It is a common practice when using stylesheets that the input data is validated against the document model before being styled, thus it is not the responsibility of the stylesheet to perform validation of the content. Therefore, the contrived comparison instance can be used as a benchmark by the literate result instance for the theoretical presence of the element or attribute addressed in an XPath expression.
This last stage, therefore, can confirm that an XPath expression or pattern written in the literate result instance is impossible if the expression is unable to address anything in the contrived comparison instance. However, this last stage cannot confirm that an XPath expression or pattern is impossible due to constraints expressed in the document model that governs the creation of the source instances.
The conceptual foundation of a graphical user interface to the LiterateXSLT environment is described in the document guilit.htm.
2019-04-18 - bug fixes
2019-04-11 - add explicit error exit with a message using x:message
2019-02-17 - add declaration of parameters to template rules
2019-02-16 - add declaration of global and local variables
2018-04-17 - x:condition anywhere, add post-processing of obtained values, add default namespace
2012-07-31 - accommodated important nuances of using xml:space and other attribute handling
2008-11-16 - changes after five years of feedback and use
updated for XSLT 2.0
introduced new keywords and summarized the "simple black-box operation" in contrast to the "full-featured operation"
introduced one backward incompatibility: literate attributes are now assumed to be standalone XPath expressions and not attribute value templates; where the old system accepted an attribute value template with calculations in brace brackets, these must now be changed to XPath expressions (perhaps needing to take advantage of concat() where there used to be two or more XPath expressions in one attribute, or adding quotes around where there used to be a simple text value); introduced because it seemed that every use of an overriding attribute was with an attribute value template, and XSLT 2.0 makes attribute handling easier.
no changes at this time to the utility and diagnostic stylesheets (due to lack of feedback); except for Saxon warning that these XSLT 1.0 stylesheets are being run with an XSLT 2.0 processor, the operation should be unchanged, though this has not been verified
2003-07-30 - first release
[Crane Resources] Crane Softwrights Ltd. Free developer resources.
[Jing] James Clark. Home page.
[Python] Python. http://www.python.org
[RELAX-NG] James Clark, Makoto Murata. ISO/IEC 19757-2 RELAX-NG (Regular Language for XML).
[Saxon] Michael Kay. Saxon.
[UBLTC] Jon Bosak, Tim McGrath. OASIS UBL Technical Committee. 2001.
[XPath 1.0] James Clark, Steve DeRose. XML Path Language (XPath) Version 1.0. 1999-11-16.
[XPath 2.0] Anders Berglund, et al. XML Path Language (XPath) Version 2.0. 2007-01-23.
[XSL 1.1] Anders Berglund. Extensible Stylesheet Language Version 1.1. 2006-12-05.
[XSLT 1.0] James Clark. XSL Transformations (XSLT) Version 1.0. 1999-11-16.
[XSLT 2.0] Michael Kay. XSL Transformations (XSLT) Version 2.0. 2007-01-23.