This article was published in 1999, before XSL became a W3C Recommendation. Since then, the transformation part of XSL has been referred to as XSLT (not "XTL", as suggested below), and the presentation vocabulary is called XSL-FO (not "XFO", as suggested below). The arguments against Formatting Objects on the web, however, are just as valid now as in 1999.

Formatting Objects considered harmful

Håkon W Lie, April 15, 1999


The W3C Working Group on XSL is currently producing two specifications: a transformation language (called "XTL" in this document) and a set of formatting objects written in XML (called "XFO" in this document). The idea is for XTL to transform XML data and documents into a set of formatting objects which can subsequently be rendered. On the ladder of abstraction from presentation to semantics, XFO is at the level of presentational HTML elements. A Web of XFO documents can be compared to a Web of HTML documents with only FONT and BR tags. Although XFO is not intended to be used on the Web, it is unlikely that such use can be prevented. XFO is therefore a threat to accessibility, device-independence and the dream of a semantic Web. The note ends with some suggestions on how to solve the problem.


XML holds the promise of being the cornerstone in the building of a semantic Web. By capturing semantics which is outside the scope of HTML, new formats written in XML will facilitate, amongst other things, better document cataloging and discovery services for authors and users.

The XSL Working Group is working on two specifications which, if successful, will change this picture. The first is a transformation language (called "XTL" in this document), and the second is a DTD for formatting objects written in XML (called "XFO" in this document).

A common use of XTL is to transform XML data and documents into HTML on the server side. Several experimental implementations support XTL and they allow content providers to use their favorite DTDs internally while serving HTML to the huge installed base of Web browsers. XTL provides a declarative way of specifying simple transformations, and this is a good thing.

XTL can also be used to generate XFO. Formatting objects describe how chunks of information are formatted before being presented to a human user. The push for XFO comes from vendors with a noble goal: they would like to improve the quality of printed material from the Web. Unfortunately, when documents are transformed into XFO, all semantics is removed and only the human presentation is left. Moreover, the presentation is tied to a certain output medium (which most likely is visual).

If XFO is deployed on the Web, accessibility, device-independence and semantics will be the victims. It's important to note that this problem only arises when XFO is shipped across the Web. When contained within formatters, formatting objects are not harmful.

Code examples

This section will give three examples of how XTL can be used. The first example transforms from XML to HTML, the second transforms from XML to XFO and the third transforms from XML to HTML/CSS. All examples use this simple XML element as input:

<Heading1>The headline</Heading1>

Example 1: XML to HTML

The first XTL sheet [1] transforms the XML element into HTML:

<xsl:template match="Heading1">
  <H1><xsl:apply-templates/></H1>
</xsl:template>

The result is:

<H1>The headline</H1>

The resulting HTML is at a high enough level of abstraction that device-independence and accessibility are preserved. What is lacking is information about how to present it.

Example 2: XML to XFO

In this example, the XTL sheet transforms the XML element into a formatting object:

<xsl:template match="Heading1">
  <fo:block font-size="1.3em" margin-top="1.5em" margin-bottom="0.4em">
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

The result is:

<fo:block font-size="1.3em" margin-top="1.5em" margin-bottom="0.4em">
  The headline
</fo:block>

The difference between example 1 and example 2 is one of semantics vs. presentation. When transformed into HTML, the semantics of the XML is preserved since the H1 element is globally recognized as being a headline of level 1. When transformed into XFO, semantics is removed and replaced by presentational properties.

Example 3: XML to HTML/CSS

The last example transforms XML into an HTML element with associated CSS stylistic properties:

<xsl:template match="Heading1">
  <H1 STYLE="font-size:1.3em; margin-top:1.5em; margin-bottom:0.4em">
    <xsl:apply-templates/>
  </H1>
</xsl:template>

The result is:

<H1 STYLE="font-size:1.3em; margin-top:1.5em; margin-bottom:0.4em">
   The headline
</H1>

The result preserves the semantics while also containing information on how to present the content. This is the best of both worlds.

(When authoring with CSS, one would normally move the stylistic properties into a separate style sheet. This eases maintenance and makes documents smaller. However, both forms are valid and one can programmatically convert between the two.)
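The conversion mentioned in the parenthesis can be sketched in a few lines. The following Python sketch is not part of the original article: the function name extract_inline_styles and the generated class names (s1, s2, ...) are assumptions made for illustration, and the code assumes well-formed markup so that the standard-library XML parser can read it.

```python
import xml.etree.ElementTree as ET


def extract_inline_styles(markup):
    """Move STYLE attributes into class-based CSS rules.

    Returns (rewritten_markup, css_text). Elements with identical
    declarations share one generated class name.
    """
    root = ET.fromstring(markup)
    rules = {}  # normalised declaration text -> generated class name
    for element in root.iter():
        style = element.attrib.pop("STYLE", None)
        if style is None:
            continue
        # Normalise "a:b; c:d" into "a: b; c: d".
        declarations = []
        for part in style.split(";"):
            if part.strip():
                name, _, value = part.partition(":")
                declarations.append(f"{name.strip()}: {value.strip()}")
        key = "; ".join(declarations)
        class_name = rules.setdefault(key, f"s{len(rules) + 1}")
        # Keep any class the element already had.
        existing = element.get("CLASS")
        element.set("CLASS", f"{existing} {class_name}" if existing else class_name)
    css = "\n".join(f".{name} {{ {decl} }}" for decl, name in rules.items())
    return ET.tostring(root, encoding="unicode"), css


html, css = extract_inline_styles(
    '<H1 STYLE="font-size:1.3em; margin-top:1.5em; margin-bottom:0.4em">'
    "The headline</H1>"
)
print(html)  # <H1 CLASS="s1">The headline</H1>
print(css)   # .s1 { font-size: 1.3em; margin-top: 1.5em; margin-bottom: 0.4em }
```

One caveat of this sketch: a class rule has lower specificity than an inline STYLE attribute, so in a document that already has other style sheets the cascade may resolve differently after the conversion.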

A demonstration

XFO was not designed to be used over the Web, and most people interested enough to read this far will agree that the use of XTL+XFO described in this document is abuse. However, it seems that there is no way to stop the abuse. Rather, it seems like conforming implementations are required to support XFO on the Web.

Here's a demonstration:

  1. download the XML/XSL browser from InDelv
  2. point it to a document which only contains XFO

The document referred to in the second step in the above list contains:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="dummy.xsl"?>
<fo:block start-indent="24pt" font-size="10pt">A document 
  <fo:inline-link font-size="18pt">without</fo:inline-link> 

The linked XTL sheet contains:

<xsl:stylesheet xmlns:xsl=""
  xmlns:fo="" result-ns="fo">
  <xsl:template match='/'>

Here is another XTL sheet which is equally good at doing nothing:

<xsl:stylesheet xmlns:xsl=""
     xmlns:fo="" result-ns="fo">
  <xsl:template match="*|@*">
      <xsl:apply-templates select="*|@*|text()"/>

So, straight out of the box, XTL+XFO browsers will display XFO documents from the Web.
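The sheets above are described as doing nothing, i.e. they pass the formatting objects through unchanged. A rough Python analogue of such an identity transform is sketched below, using only the standard library. It is an illustration rather than a rendering of the XSL draft: the function name identity_copy, the sample text and the namespace URI urn:example:fo are all placeholders.

```python
import xml.etree.ElementTree as ET


def identity_copy(element):
    """Return a deep copy of element, rebuilt node by node."""
    copy = ET.Element(element.tag, dict(element.attrib))  # copy the attributes
    copy.text = element.text                              # text before the first child
    for child in element:                                 # recurse into child elements
        child_copy = identity_copy(child)
        child_copy.tail = child.tail                      # text following the child
        copy.append(child_copy)
    return copy


# Hypothetical XFO input; the namespace URI is a placeholder.
SAMPLE = (
    '<fo:block xmlns:fo="urn:example:fo" font-size="10pt">Some '
    '<fo:inline-link font-size="18pt">linked</fo:inline-link> text</fo:block>'
)

source = ET.fromstring(SAMPLE)
result = identity_copy(source)
# The formatting objects come out exactly as they went in.
print(ET.tostring(result, encoding="unicode") == ET.tostring(source, encoding="unicode"))  # True
```

Since the output equals the input, a browser that runs such a sheet and then renders the result is in effect rendering whatever XFO the server sent.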


When document semantics is removed and replaced with presentational properties, the content moves downwards on the ladder of abstraction and important information is lost. For example, generating an aural presentation based on the output of example 2 is much harder than basing the aural rendition on the semantic markup present in the output of examples 1 and 3. In a scenario where only visual formatting objects are published on the Web, aural renderers are forced back into the role of the "screen reader", i.e. special software that uses heuristics to decode information meant for another medium. Also, other services which take advantage of semantic markup -- e.g. search engines -- will perform worse.

For these reasons, I believe W3C should encourage authors to publish documents in semantically rich HTML and XML [2] with attached style sheets. The style sheets should be evaluated on the client side. This gives us the best of both worlds: rich applications and rich presentations.
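The point about search engines and other services can be made concrete with a small sketch. The Python below is illustrative only: find_headlines is a hypothetical helper, and the namespace URI urn:example:fo is a placeholder added so that the formatting-object sample parses. It shows how a tool that relies on element names finds the headline in HTML output but finds nothing in formatting objects.

```python
import xml.etree.ElementTree as ET

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def find_headlines(markup):
    """Return the text of every HTML heading element in well-formed markup."""
    root = ET.fromstring(markup)
    return [
        "".join(element.itertext()).strip()
        for element in root.iter()
        if element.tag.lower() in HEADING_TAGS
    ]


html_output = "<H1>The headline</H1>"
html_css_output = '<H1 STYLE="font-size:1.3em">The headline</H1>'
# Placeholder namespace URI; formatting-object output in the style of example 2.
xfo_output = (
    '<fo:block xmlns:fo="urn:example:fo" font-size="1.3em" '
    'margin-top="1.5em">The headline</fo:block>'
)

print(find_headlines(html_output))      # ['The headline']
print(find_headlines(html_css_output))  # ['The headline']
print(find_headlines(xfo_output))       # []
```

To recover the headline from the third input, a tool would have to guess from properties such as font-size, which is the kind of heuristic decoding a screen reader is forced to do.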


This section contains answers to questions that often come up in discussions about the use of XFO on the Web.

The XTL to XFO transformations don't have to take place on the server side. Can't you preserve semantics by performing the transformation in the client?

Yes. Given the XML source and the XTL sheet, the transformation can take place on the client side. This preserves semantics, and the number of bytes sent over the Web will generally be smaller. In this scenario, however, there is no need for an XML vocabulary to express formatting objects since the client will both transform and present the content. This highlights an important point which isn't clear from the title of the document: it's not formatting objects per se that are harmful (any system that does formatting uses some kind of formatting objects). The harm is done when formatting objects are stored and shipped over the Web.

Ok, but even if transformations take place on the server, accessibility can still be preserved. By defining formatting objects for all media types, presentations for all sorts of devices can be generated, no?

In theory, yes. In practice, no. For example, to successfully present content aurally, there are four prerequisites:

  1. there must be a specification for aural formatting objects
  2. there must be implementations of aural formatting objects
  3. the fact that the user has an aural client must be known to the server
  4. all web sites must install XTL sheets to transform content into aural formatting objects

Among these, the first two will require much time and work. The third is undesirable, while the fourth is impossible in practice. Besides, caching suffers.
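The caching remark can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not a description of any real proxy: hit_rate is a hypothetical function, and it assumes that a shared cache must store one copy per (URL, medium) pair when the server performs the transformation, but only one copy per URL when the same semantic document is sent to every client.

```python
def hit_rate(requests, vary_by_medium):
    """Fraction of requests a shared cache can answer.

    requests is a sequence of (url, medium) pairs. When vary_by_medium is
    true, the server output depends on the client's medium, so the cache
    key must include it.
    """
    if not requests:
        return 0.0
    cache = set()
    hits = 0
    for url, medium in requests:
        key = (url, medium) if vary_by_medium else url
        if key in cache:
            hits += 1
        else:
            cache.add(key)
    return hits / len(requests)


requests = [
    ("/a", "visual"),
    ("/a", "aural"),
    ("/a", "braille"),
    ("/a", "visual"),
]
print(hit_rate(requests, vary_by_medium=True))   # 0.25
print(hit_rate(requests, vary_by_medium=False))  # 0.75
```

In this model, the more media types the server tailors its output to, the fewer requests for the same document can be answered from the cache.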

Semantic markup is dead. We should use RDF instead, no?

RDF is important, but will not replace semantic markup. That has never been the intention of W3C Metadata efforts.

But, in order to do high-quality printing we need transformations. CSS doesn't have transformations and is therefore unusable. Isn't that why XTL and XFO were developed?

High-quality printing is very hard, and can't be done without looking at the shape of the glyphs. Neither XFO nor CSS takes this approach. Instead, they both have the same property/value model, and as long as the properties and values are the same, their potential for improving printing is the same. Transformations are, if not a prerequisite of printing, at least a helpful tool. The transformation step comes before the styling step, and XTL can be used just as well with CSS as with XFO.

In my organization, I have thousands of legacy word processing documents where styles have been used inconsistently. Don't you think it's better to use XFO and admit that there is no semantics rather than using HTML and claim there is?

Use presentational HTML or PDF for your documents. We can't risk losing the semantic Web due to legacy documents.

W3C is developing SVG and the elements defined in the SVG WD don't have much semantics. They're more like formatting objects. Aren't they just as harmful?

No. Compared to the GIF images SVG will replace, the move represents an upwards climb on the ladder of abstraction. XFO, on the other hand, represents a steep downwards step compared to a CSS-based solution.

Most of the HTML documents on the Web are presentational. Why do you think XFO can make the situation worse?

W3C has been working hard to deprecate presentational elements in HTML, but it's true that they are still widely used. However, the crucial difference between using HTML and XFO is that HTML has the ability to represent semantics as well. HTML allows authors to do the right thing, but this is not the case with XFO.

But the semantics HTML can capture is so shallow. What's the point of using it?

Consider one example from Braille renderings. Since Braille characters take up much space, words are often contracted to fit more text on one page. However, some words -- for example program variables -- should not be contracted. HTML gives you the ability to express this (using the VAR element), and this is crucial to improving Braille renderings. XFO, on the other hand, gives access to the text, but the information needed to decide whether a word can be contracted is lost.
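The Braille example can be sketched as follows. This is a toy illustration only: the contraction table is invented (real Braille contraction rules are far richer), and the names contract and render_for_braille are hypothetical. It shows that a renderer which can see the VAR element is able to leave variable names untouched, while the same text without markup gives it nothing to go on.

```python
import re
import xml.etree.ElementTree as ET

# Invented contraction table, for illustration only.
CONTRACTIONS = {"value": "val", "program": "prg"}


def contract(text):
    """Replace every word found in the toy table, keeping spacing intact."""
    return re.sub(
        r"[A-Za-z]+",
        lambda match: CONTRACTIONS.get(match.group().lower(), match.group()),
        text,
    )


def render_for_braille(element, protected=False):
    """Contract all text except text inside a VAR element."""
    protected = protected or element.tag.upper() == "VAR"
    text = element.text or ""
    parts = [text if protected else contract(text)]
    for child in element:
        parts.append(render_for_braille(child, protected))
        # Text after a child belongs to this element, not to the child.
        tail = child.tail or ""
        parts.append(tail if protected else contract(tail))
    return "".join(parts)


semantic = ET.fromstring("<P>Set the value of <VAR>value</VAR> first</P>")
print(render_for_braille(semantic))              # Set the val of value first
print(contract("Set the value of value first"))  # Set the val of val first
```

In the second call, which stands in for text extracted from formatting objects, the variable name is contracted along with everything else.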

Avoiding disaster

Here are some commented ideas on how to avoid the disaster scenario which has been outlined above:


The use of XFO on the Web is a threat to accessibility, device-independence and a semantic Web. Although XFO was not designed for this use, the current architecture does not prevent it. Technical barriers should be put in place to prevent XFO from being used as a document format on the Web. If technical barriers aren't possible, XFO should not become a W3C Recommendation.


[1] I call it an "XTL sheet" rather than an "XTL style sheet" since the XTL language has no notion of style.

[2] Publishing semantically rich XML should be encouraged when the semantics is globally known, e.g. MathML. Publishing arbitrary XML should be discouraged.