Character dump program using XSL-FO

This Python utility dumps the character contents of a supplied file, assuming a particular character encoding for the bytes in the file. Each character in the file is displayed in a large font, with its hexadecimal Unicode code point displayed below it.
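The decoding step described above can be sketched as follows. This is a minimal illustration in modern Python 3, not the utility's own code; the function name `dump_code_points` is hypothetical.

```python
def dump_code_points(data, encoding="latin-1"):
    """Decode raw bytes and pair each character with its hex code point."""
    text = data.decode(encoding)
    # Four hex digits minimum, as in the usual U+XXXX notation.
    return [(ch, f"{ord(ch):04X}") for ch in text]


# The same two bytes read under two different encoding assumptions.
raw = "\u00e9".encode("utf-8")            # b"\xc3\xa9"
as_latin1 = dump_code_points(raw)          # two characters: 00C3, 00A9
as_utf8 = dump_code_points(raw, "utf-8")   # one character: 00E9

for label, cells in (("latin-1", as_latin1), ("utf-8", as_utf8)):
    print(label, [code for _, code in cells])
```

Decoding the same bytes with the wrong encoding does not necessarily fail; it can silently yield different characters, which is the kind of problem a character dump makes visible.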

Some non-displayed Unicode characters, such as the C0 control characters and the Unicode directionality characters LRE, RLE, LRO, RLO and PDF, are rendered using their abbreviated names. This utility was created specifically to debug streams of characters incorporating these direction-changing special characters.
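One way to map such characters to abbreviated names is a simple lookup table. The sketch below assumes the standard C0 abbreviations and the code points U+202A through U+202E for the directional characters; the names `ABBREVIATIONS` and `display_form` are hypothetical and the utility itself may organise this differently.

```python
# Standard abbreviations for the 32 C0 control characters, U+0000..U+001F.
C0_NAMES = (
    "NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI "
    "DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US"
).split()

ABBREVIATIONS = {chr(i): name for i, name in enumerate(C0_NAMES)}

# Explicit directional embedding and override characters.
ABBREVIATIONS.update({
    "\u202a": "LRE",  # left-to-right embedding
    "\u202b": "RLE",  # right-to-left embedding
    "\u202c": "PDF",  # pop directional formatting
    "\u202d": "LRO",  # left-to-right override
    "\u202e": "RLO",  # right-to-left override
})


def display_form(ch):
    """Return the abbreviated name for a non-displayed character, else the character."""
    return ABBREVIATIONS.get(ch, ch)


print([display_form(ch) for ch in "a\u202eb\u202c\n"])
```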

To take advantage of the powerful text formatting features of XSL-FO, the output is an XSL-FO 1.0 conformant XML instance ready for processing by an XSL-FO engine. No vendor extensions are used. This is an example of using XSL-FO independently of XSLT: the Python program generates the XSL-FO directly.
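To illustrate generating XSL-FO directly from Python without XSLT, here is a minimal sketch. The page geometry, font sizes, master name and the function name `fo_document` are all assumptions made for this example; the layout the actual utility produces is not shown on this page and is likely more elaborate.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

FO_NS = "http://www.w3.org/1999/XSL/Format"


def fo_document(cells, page_width="297mm", page_height="210mm"):
    """Return an XSL-FO instance showing each (label, hex_code) pair.

    Labels must already be displayable text (for example "LF" rather than
    a raw linefeed), since XML 1.0 cannot carry most control characters.
    """
    blocks = []
    for label, code in cells:
        blocks.append(
            f'      <fo:block font-size="36pt" text-align="center">'
            f"{escape(label)}</fo:block>\n"
            f'      <fo:block font-size="8pt" text-align="center" '
            f'space-after="12pt">{escape(code)}</fo:block>\n'
        )
    if not blocks:
        # fo:flow requires at least one block-level child.
        blocks.append("      <fo:block/>\n")
    return (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        f'<fo:root xmlns:fo="{FO_NS}">\n'
        "  <fo:layout-master-set>\n"
        f'    <fo:simple-page-master master-name="dump" '
        f'page-width="{page_width}" page-height="{page_height}" '
        'margin-top="15mm" margin-bottom="15mm" '
        'margin-left="15mm" margin-right="15mm">\n'
        "      <fo:region-body/>\n"
        "    </fo:simple-page-master>\n"
        "  </fo:layout-master-set>\n"
        '  <fo:page-sequence master-reference="dump">\n'
        '    <fo:flow flow-name="xsl-region-body">\n'
        + "".join(blocks)
        + "    </fo:flow>\n"
        "  </fo:page-sequence>\n"
        "</fo:root>\n"
    )


doc = fo_document([("A", "0041"), ("LRE", "202A"), ("<", "003C")])
# Parsing the result is a quick well-formedness check, not a conformance check.
root = ET.fromstring(doc.encode("utf-8"))
```

The resulting string can be written to a file and handed to any XSL-FO engine. Escaping the label matters because characters such as `<` and `&` are legitimate input characters but are markup in XML.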

The file can be downloaded here:

An example session using the test files would be along the lines of:


Usage:  [-opt]* filename

 -lf           = act on linefeed as well as interpret linefeed
 -a4           = A4 page size
 -a4l          = A4 landscape page size (default)
 -us           = US letter page size
 -usl          = US letter landscape page size
 -e {encoding} = character encoding system used in the input file
               = default: Latin1
               = other typical values: utf-8, utf-16, utf-16-le, utf-16-be

T:\ftemp>python -lf dtest1.txt >t:\

T:\ftemp>python -lf dtest1.txt >

T:\ftemp>python -lf dtest2.txt >

T:\ftemp>python -lf -e utf-8 dtest2.txt >

2002-11-11  09:03             6,056
2002-11-10  12:54                34 dtest1.txt
2002-11-11  09:03             6,199
2002-11-10  12:57                35 dtest2.txt
2002-11-11  09:03             6,056
2002-11-11  08:58             6,566
               6 File(s)         24,946 bytes
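The option syntax in the usage summary above could be handled along the following lines. This is a hypothetical sketch: the function name `parse_args`, the dictionary keys and the error handling are assumptions, and only the option names, the defaults and the standard A4 and US letter dimensions come from the usage text.

```python
# (page-width, page-height) for each page-size option.
PAGE_SIZES = {
    "-a4": ("210mm", "297mm"),
    "-a4l": ("297mm", "210mm"),
    "-us": ("8.5in", "11in"),
    "-usl": ("11in", "8.5in"),
}


def parse_args(argv):
    """Parse '[-opt]* filename' into a dictionary of settings."""
    opts = {"linefeed": False, "page": PAGE_SIZES["-a4l"], "encoding": "latin-1"}
    args = list(argv)
    while len(args) > 1:  # the final argument is always the filename
        opt = args.pop(0)
        if opt == "-lf":
            opts["linefeed"] = True
        elif opt in PAGE_SIZES:
            opts["page"] = PAGE_SIZES[opt]
        elif opt == "-e":
            if len(args) < 2:  # need both an encoding and a filename
                raise ValueError("-e requires an encoding before the filename")
            opts["encoding"] = args.pop(0)
        else:
            raise ValueError(f"unknown option: {opt}")
    if len(args) != 1:
        raise ValueError("missing filename")
    opts["filename"] = args[0]
    return opts


print(parse_args(["-lf", "-e", "utf-8", "dtest2.txt"]))
```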


Limitation: only those character encodings supported by the codecs (encoders and decoders) registered in your Python installation can be used.
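Whether a given encoding name is available can be checked with the standard `codecs.lookup` function before running the dump. The helper name `codec_available` is hypothetical.

```python
import codecs


def codec_available(name):
    """Return True if Python's codec registry recognises this encoding name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


for name in ("Latin1", "utf-8", "utf-16-le", "no-such-encoding"):
    print(name, codec_available(name))
```

Lookup is case-insensitive and accepts common aliases, so `Latin1` and `latin-1` refer to the same codec.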

Side note: the verb "to dump", applied to file contents, is an old computing term used in diagnostics. The noun "a dump" of a file or memory is a diagnostic formatting of the file or memory contents. Dumps typically reveal hidden information that is not visible in standard listings.

Small print: All use of this web site and all business conducted with Crane Softwrights Ltd. is subject to the legal disclaimers detailed at ... please contact us if you have any questions. All trademarks, servicemarks, registered trademarks, and registered servicemarks are the property of their respective owners.

Last changed: 2006/12/28 00:05:31 (UTC)