Crane's UBL 2 instance filters

$Date: 2008/10/26 14:20:05 $(UTC)


Table of Contents

1. Instance pruning with filters
2. UBL 2.0 version filters
2.1. UBL 2.0 XSLT filters
2.2. UBL 2.0 Python filters
Bibliography

1. Instance pruning with filters

The OASIS Universal Business Language [UBL 2.0] defines a family of powerful, flexible and large document types to provide for many possible uses of information in business documents.

Figure 1. Original UBL process model


As future minor versions are released, new information items or new allowed quantities of existing information items may be defined in the UBL document models. Instances of later minor versions may then have more information items than allowed for instances of earlier minor versions.

When a deployment of UBL receives an instance, that instance may be of a minor version defined after UBL was deployed by the recipient. If that instance violates the model constraints of the deployed version in a first-pass schema validation, the application may be unable to inspect the instance for information that might still be valuable.

An instance is created by a system that may be at any version of UBL, earlier or later than the version supported by the recipient. The recipient filters the instance only if it fails first-pass schema validation. If that failure is indicated to the application, the application knows that the instance it is processing contains less than the instance that was received. If initial validation succeeds, the creator has not included any information items not recognized by the recipient, regardless of the version of the UBL models the creator used. This is depicted in the following figure:

Figure 2. Instance pruning process with a version filter


These UBL 2 instance filters from Crane Softwrights Ltd. will prune a UBL instance of all information items not defined for a particular revision of the UBL models. The end filtered result can then be checked for model constraints with a second-pass schema validation, and if there are no violations, the application can then work with the content.
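The two-pass process described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of the filter package: the names Outcome, receive, is_valid and prune are hypothetical, and the schema validator and the version filter are passed in as plain callables because this document does not prescribe how either is invoked.

```python
from typing import Callable, NamedTuple


class Outcome(NamedTuple):
    """Hypothetical result type for this sketch."""
    instance: str      # the XML the application should work with
    was_pruned: bool   # True when the version filter had to be applied


def receive(instance: str,
            is_valid: Callable[[str], bool],
            prune: Callable[[str], str]) -> Outcome:
    """Two-pass validation with a version filter in between.

    is_valid stands in for schema validation against the deployed
    UBL version; prune stands in for one of the version filters.
    """
    # First pass: validate the instance exactly as received.
    if is_valid(instance):
        return Outcome(instance, was_pruned=False)

    # Remove constructs unknown to the deployed version of UBL.
    filtered = prune(instance)

    # Second pass: the filtered result must satisfy the model.
    if not is_valid(filtered):
        raise ValueError("instance violates the model even after filtering")
    return Outcome(filtered, was_pruned=True)
```

The was_pruned flag is how this sketch tells the application that the instance it is working with contains less than the instance that was received.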

Note

It is the instance creator's responsibility to populate the UBLVersionID item with the particular minor version of UBL against which the instance is known to validate. While it is ideal that the minor version quoted is the "high-water mark" of the constructs used, this analysis may be impractical and the minor version may merely be the installed minor version of UBL being used by the instance creator.
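For illustration, a recipient could read the UBLVersionID item with a few lines of Python. This is a minimal sketch rather than anything supplied in the filter package: the function name ubl_version_id is hypothetical, and the sketch assumes a modern Python 3 with the standard xml.etree.ElementTree module and that UBLVersionID appears as a child of the document element in the UBL 2 common basic components namespace.

```python
import xml.etree.ElementTree as ET

# Namespace of the UBL 2 common basic components (cbc) library.
CBC = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"


def ubl_version_id(xml_text):
    """Return the text of the document's UBLVersionID child, or None."""
    root = ET.fromstring(xml_text)
    node = root.find("{%s}UBLVersionID" % CBC)
    if node is None or node.text is None:
        return None
    return node.text.strip()
```

A return value of None means only that the creator did not supply the item; it says nothing about which minor version the instance actually conforms to.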

In this version of the filter package, only filters for UBL 2.0 are supported.

Note

These filters have been synthesized from UBL document models. Not every filter has been individually tested. Should you find anything questionable regarding these filters, please contact us at info@CraneSoftwrights.com with the details and some sample data.

2. UBL 2.0 version filters

These are version filters supporting the update release 2.0 of UBL. All constructs allowed by a subsequent minor version of UBL 2 will be removed if not supported by this release of UBL 2. These filters are fully compatible with those of the original release of 2.0, but now properly support UBL constructs for extensions.

2.1. UBL 2.0 XSLT filters

The Extensible Stylesheet Language Transformations 1.0 [XSLT 1.0] implements a tree-based instance construction processing paradigm. The source XML being processed is read into an in-memory tree. The result XML stream is not maintained in memory but is serialized as a construction resulting from an analysis of the source in-memory tree.

These filters have been tested with Saxon 6.5.5 [Saxon].

The filter CraneUBL2Filter.xsl supports all document models simultaneously.

The 31 other filters are each named for a particular document model, and each supports only those constructs expected in that document model.

2.2. UBL 2.0 Python filters

The Simple API for XML [SAX] interface implements a stream-based processing paradigm. The source XML being processed is read not as an in-memory tree but as a series of real-time events, thus not requiring a memory footprint whose size is related to the instance size. The result XML stream is constructed by the programming logic written in the language that uses the SAX interface.
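The stream-based approach can be sketched as follows. This is not the code of the filters in this package: it is a minimal illustration written for a modern Python 3 using only the standard xml.sax library, and the names PruningFilter, prune and allowed are hypothetical. The sketch forwards each SAX event to a serializer unless the event belongs to an element that is absent from a set of allowed (namespace URI, local name) pairs, in which case the element and everything inside it are dropped.

```python
import io
import xml.sax
from xml.sax.handler import feature_namespaces
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class PruningFilter(XMLFilterBase):
    """Drop every element whose (namespace URI, local name) pair is not
    in the allowed set, together with its descendants and text."""

    def __init__(self, allowed, parent=None):
        super().__init__(parent)
        self._allowed = allowed
        self._skip_depth = 0  # greater than 0 while inside a pruned subtree

    def startElementNS(self, name, qname, attrs):
        if self._skip_depth or name not in self._allowed:
            self._skip_depth += 1          # start or deepen a pruned subtree
        else:
            super().startElementNS(name, qname, attrs)

    def endElementNS(self, name, qname):
        if self._skip_depth:
            self._skip_depth -= 1          # leaving a pruned element
        else:
            super().endElementNS(name, qname)

    def characters(self, content):
        if not self._skip_depth:
            super().characters(content)


def prune(xml_text, allowed):
    """Return xml_text re-serialized with all disallowed elements removed."""
    out = io.StringIO()
    reader = PruningFilter(allowed, xml.sax.make_parser())
    reader.setFeature(feature_namespaces, True)   # report (uri, local) names
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()
```

Only a depth counter is kept while events stream through, so the memory used by the filtering logic does not grow with the size of the instance. For brevity the sketch takes and returns whole strings and does not suppress namespace declarations made on pruned elements; a production filter would stream from and to files.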

The Python [Python] programming language version 2.5.1 has a functioning SAX interface (earlier versions have bugs preventing these filters from working properly; later versions are assumed to continue supporting the interface correctly). Python is an interpreted language that does not natively support XML but uses a library implementation of SAX.

Note

The performance of the interpreted Python/SAX 2.5.1 implementation pales in comparison to the speedy Saxon/XSLT native-XML implementation. Nevertheless, if the memory footprint of XSLT precludes its use in your environment, then this implementation or a SAX implementation in another programming language may better suit your needs.

The filter CraneUBL2Filter.py supports all document models simultaneously.

The 31 other filters are each named for a particular document model, and each supports only those constructs expected in that document model.

Each of these 32 filters imports the CraneUBLFilterBase.py common module. This common module is not designed to be invoked in a standalone fashion.

Bibliography

[UBL 2.0] Jon Bosak, Tim McGrath, G. Ken Holman. Universal Business Language Version 2.0 (documentation) (ZIP). 2006-12-12.

[Python] Python. http://www.python.org

[SAX] David Megginson. Simple API for XML.

[Saxon] Michael Kay. Saxon 6.5.5 for XSLT 1.0.

[XSLT 1.0] James Clark. XSL Transformations (XSLT) Version 1.0. 1999-11-16.