Tangled in the Threads

Jon Udell, June 7, 2000

Math on the Web

Goodbye compound documents, hello universal data

Scientists find it painfully ironic that the Web, which was invented expressly to help them collaborate, still lacks decent support for one of the primary forms of scientific discourse: mathematical equations. As a result, mathematical content is typically delivered on the Web in one of a number of unsatisfactory ways. Equations can be viewed as bitmapped images included within HTML files, or within PDF files, or in a Java viewer (e.g., WebEQ), or in a free or commercial plug-in (e.g., LiveMath, IBM techExplorer). In all of these cases, however, equations end up as foreign data types. Things you can't do with this foreign content include:

For years there's been discussion of a more HTML-native way to create and display math. Happily, those efforts are finally bearing fruit. A second version of the W3C's working draft of MathML, a math-oriented application of XML, is in the "last call" stage, and both commercial vendors and the open source community are lining up in support of it.

Why the slow start? Well, pre-existing tools, notably Donald Knuth's TeX (a superlative and widely used system for math typesetting and composition), weren't easy to integrate with the Web's original model. Only recently has XML become firmly established as a general strategy for extending the Web. It's one thing to embed a foreign chunk of math into an HTML page, like this:

Example 1: foreign math
As you can see from this example:

<embed src="SomeMathPlugin">

, the result...

It's something else again to weave the math right into the fabric of the page, like this:

Example 2: native math
As you can see from this example:
  <mfrac style="font-size: 14pt">
      <mi href="/definitions#k" xml:link="simple">k</mi>
, the result...

Note the style attribute attached to the <mfrac> element. Note also how the term k links to an address (/definitions#k), so that the reader of this equation can click on the k to find out more about it. This is the kind of deep and fluid integration that the phrase "math for the Web" conjures in our imaginations. And it's now tantalizingly close to being real. I didn't just pull the markup in Example 2 out of thin air. I captured it from the View Source window of Amaya, the W3C's testbed browser. Here's Amaya displaying Example 2:

Clearly Amaya is rendering the font style attribute attached to the fraction. But a couple of things aren't obvious in this screen shot. First, the hyperlink wrapped around the k really is there, and really works. Second, Amaya isn't just a viewer, it's an editor. Everything that it's displaying -- the text and the equation -- is also live for editing.
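To make the contrast between the two examples concrete, here is a minimal Python sketch (my illustration, not anything Amaya does internally) that parses both fragments with the standard library's ElementTree and lists what a program can see in each. Example 2's fragment is closed off so that it parses; the denominator `<mi>n</mi>` and the `describe` helper are assumptions made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Example 2's fragment, closed off so that it parses. The denominator
# (<mi>n</mi>) is a hypothetical placeholder, not part of the original.
NATIVE = """
<mfrac style="font-size: 14pt">
  <mi href="/definitions#k" xml:link="simple">k</mi>
  <mi>n</mi>
</mfrac>
"""

# Example 1's foreign math: a single opaque element pointing at a plug-in.
FOREIGN = '<embed src="SomeMathPlugin"/>'


def describe(markup):
    """Return (tag, attributes) for every element in the markup."""
    root = ET.fromstring(markup)
    return [(el.tag, dict(el.attrib)) for el in root.iter()]


native = describe(NATIVE)
foreign = describe(FOREIGN)

# The native fragment exposes its structure, its style, and its link.
for tag, attrs in native:
    print(tag, attrs)

# The foreign fragment exposes one element; the math itself stays hidden.
print(len(foreign), "element visible in the foreign example")
```

The style and the link target are ordinary attributes that any XML-aware tool can read, whereas the plug-in's equation never appears in the page at all.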

Currently at version 3.1, Amaya isn't nearly stable enough or complete enough for real work, but it's an intoxicating glimpse of the way things ought to be. In addition to MathML, incidentally, its XHTML capabilities are noteworthy. You can use its "Save as XHTML" feature to automate the grunge work of transforming a file of normal HTML into a file of well-formed and more maintainable XHTML.

Beyond compound documents

Since the advent of OLE and OpenDoc, we've aspired to "document-centric" computing. We no longer want GUI windows to represent monolithic documents backed by single applications. Rather, we want them to contain heterogeneous documents that represent task-focused activities and that are supported by a suite of cooperating applications. While technologies like OLE embedding enabled a window to contain a mixture of textual, graphical, and tabular components, each of these was still backed by an application that represented the data in its own proprietary way. You could mix the components on the page, and pass data among them using DDE or the clipboard, but you couldn't really blend them together. The model was more like Example 1 than like Example 2.

What justifies all the XML hype is that it promises to fundamentally unify the representation of data. When a page contains a chunk of XHTML text, and a chunk of MathML equations, and a chunk of SVG graphics, it isn't just a canvas with three things stuck onto it. It's a fabric woven from three kinds of threads. Textual elements belonging to each thread are subject to the same search and styling mechanisms. Every element is part of the DOM (document object model) and, as such, available to controlling scripts.
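Here is a rough Python sketch of what "the same search mechanism" could mean in practice. The page below is a hypothetical one invented for illustration, and `find_text` is a made-up helper; the three namespace URIs are the ones the W3C assigns to XHTML, MathML, and SVG. One loop walks all three vocabularies without knowing or caring which is which.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
MATHML = "http://www.w3.org/1998/Math/MathML"
SVG = "http://www.w3.org/2000/svg"

# A hypothetical page woven from three kinds of threads.
PAGE = f"""
<html xmlns="{XHTML}">
  <body>
    <p>The ratio of k to n is</p>
    <math xmlns="{MATHML}">
      <mfrac><mi>k</mi><mi>n</mi></mfrac>
    </math>
    <svg xmlns="{SVG}"><text x="10" y="20">k versus n</text></svg>
  </body>
</html>
"""


def find_text(root, needle):
    """Return (namespace, element name) for each element whose own text contains needle."""
    hits = []
    for el in root.iter():
        if el.text and needle in el.text:
            # ElementTree spells qualified names as "{namespace}name".
            namespace, _, name = el.tag[1:].partition("}")
            hits.append((namespace, name))
    return hits


root = ET.fromstring(PAGE)
hits = find_text(root, "k")

for namespace, name in hits:
    print(name, "from", namespace)
```

The search turns up the paragraph, the math identifier, and the graphic's label in a single pass, which is exactly what a canvas with three foreign objects stuck onto it cannot offer.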

Greg Wilson, coordinator of the Software Carpentry Project, has a nice explanation for why universal data representation matters so much. He asks: "Why did UNIX succeed?" The stock answer is that UNIX invented the idea of software components that could easily be arranged in many useful ways. But what made that possible? The answer, says Greg, is that these components agreed on a way of representing data -- as line-oriented text made of tokens delimited by whitespace -- that enabled a host of glue technologies (the shells, sed, awk, Perl). We're now seeing a new generation of glue technologies -- XML-RPC, SOAP -- that agree on the much richer data representation afforded by XML. For interconnecting Web services, these XML-over-HTTP protocols have rapidly become the de facto standard. The next frontier will be Web applications that move beyond Example 1's compound document model, and agree on standard XML data representation that allows different datatypes to blend and interact. Exciting stuff!
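The difference between the two generations of glue can be sketched in a few lines of Python, using the standard library's xmlrpc.client module to build and parse an XML-RPC request without touching the network. The record, its field names, and the method name `log.add` are all hypothetical, chosen only to illustrate the point.

```python
import xmlrpc.client

# UNIX-style glue: one record per line, fields delimited by whitespace.
line = "udell 2000-06-07 42"
fields = line.split()  # every field comes back as a plain string

# XML-RPC-style glue: the same kind of record as typed, nested data.
# "log.add" is a hypothetical method name used only for illustration.
record = {"author": "udell", "hits": 42, "tags": ["math", "xml"]}
payload = xmlrpc.client.dumps((record,), methodname="log.add")

# The receiving side recovers the structure and the types, not just tokens.
params, method = xmlrpc.client.loads(payload)

print(fields)
print(method, params[0])
```

In the whitespace-delimited version, the reader has to know by convention that the third token is a number. In the XML version, the integer, the list, and the field names all survive the trip.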

Mozilla and MathML

The Mozilla project is taking MathML very seriously. The MathML support isn't in the standard build yet, but if you want to try it you can download a MathML-enabled version of Mozilla. Unlike Amaya, Mozilla isn't (yet) a MathML editor, only a viewer, but as such it's making terrific progress. What's especially interesting is that, as noted in a progress report by Roger Sidje, a key (and non-Netscape-employed) contributor, MathML is the first major Mozilla component driven entirely by outside interests unaligned with Netscape's commercial priorities.

The initial goal of the Mozilla MathML effort is to leverage -- and stress-test! -- the browser's new layout engine ("Gecko"). To that end, it will focus on presentation markup as opposed to content markup. What's the difference? The former says things like "put a row of elements here, then underneath, a box containing these elements." The latter says things like "apply this operation to these subexpressions." The W3C's working draft explains thusly:

Both content and presentation tags are necessary in order to provide the full expressive capability one would expect in a mathematical markup language. Often the same mathematical notation is used to represent several completely different concepts. For example, the notation x^i may be intended (in polynomial algebra) as the i-th power of the variable x, or as the i-th component of a vector x (in tensor calculus). In other cases, the same mathematical concept may be displayed in one of various notations. For instance, the factorial of a number might be expressed with an exclamation mark, a Gamma function, or a Pochhammer symbol.

Thus the same notation may represent several mathematical ideas, and, conversely, the same mathematical idea often has several notations. In order to provide authors with the ability to precisely control notation while at the same time encoding meanings in a machine-readable way, both content and presentation markup are needed.
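The draft's x-with-a-raised-i example can be sketched as follows. This is my own illustration rather than markup taken from the draft: one presentation fragment that only describes layout, and two content fragments that pin down the two meanings. The element names (msup, apply, power, selector, ci) come from the MathML vocabulary, but the exact fragments and the helper functions are assumptions.

```python
import xml.etree.ElementTree as ET

# Presentation markup: says only "an x with a raised i".
PRESENTATION = "<msup><mi>x</mi><mi>i</mi></msup>"

# Content markup: two different meanings for that one notation.
POWER = "<apply><power/><ci>x</ci><ci>i</ci></apply>"
COMPONENT = "<apply><selector/><ci type='vector'>x</ci><ci>i</ci></apply>"


def layout(presentation_markup):
    """Return the layout element's name and the text of its children."""
    root = ET.fromstring(presentation_markup)
    return root.tag, [child.text for child in root]


def operator(content_markup):
    """Return the name of the operation that a content <apply> element applies."""
    apply_element = ET.fromstring(content_markup)
    return apply_element[0].tag


print(layout(PRESENTATION))
print(operator(POWER))
print(operator(COMPONENT))
```

A renderer needs only the first form. A program that wants to compute with the expression, or search for it by meaning, needs one of the other two, and it can tell them apart by looking at the first child of the apply element.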

There are deep subtleties here. Sometimes presentation markup embeds within content markup so that the application gets hints about how to render equations; sometimes content markup embeds within presentation markup, to specify which mathematical meaning is intended; sometimes the two occur in parallel, when the relationship between content and presentation cannot easily be inferred. Even for specialized math software, mastering these subtleties will take a long time. So how can Mozilla compete? It doesn't have to. The purpose of its MathML support is to help people communicate mathematically on the Web, in accordance with this excellent mission statement from the W3C:

In practical terms, the observation that mathematics on the Web should provide for both specialized and general needs naturally leads to the idea of a layered architecture. One layer consists of powerful, general software tools exchanging, processing and rendering suitably encoded mathematical data. A second layer consists of specialized software tools, aimed at specific user groups, which are capable of easily generating encoded mathematical data which can then be shared with a particular audience.

MathML is designed to provide the encoding of mathematical information for the bottom, more general layer in a two-layer architecture. It is intended to encode complex notational and semantic structure in an explicit, regular, and easy-to-process way for renderers, searching and indexing software, and other mathematical applications.

In light of this goal, consider this screenshot posted to news://news.mozilla.org/netscape.public.mozilla.mathml, which illustrates the composition of an email message that intermixes textual and mathematical content:

Now that's a picture of the way things ought to be! And not just for math on the Web, but for all kinds of content. Scientific collaboration is an ongoing discourse involving text, graphics, math, and other datatypes. Why shouldn't email, the medium through which so much of that discourse is conducted, natively understand those datatypes? Clearly it should. This inspiring glimpse of the future reminds us that the days of the 65-character-per-line ASCII-art email message are, inevitably, numbered.

Jon Udell (http://udell.roninhouse.com/) was BYTE Magazine's executive editor for new media, the architect of the original www.byte.com, and author of BYTE's Web Project column. He's now an independent Web/Internet consultant, and is the author of Practical Internet Groupware, from O'Reilly and Associates. His recent BYTE.com columns are archived at http://www.byte.com/index/threads

This work is licensed under a Creative Commons License.