The TeX/MathML Map File Specification

## The structure of the map file

• Namespaces:
  • local: xmlns:pat = "http://www.orcca.on.ca/mathml/tex2mml.xml"
  • general: xmlns = "http://www.w3.org/1998/Math/MathML"

• Root element: pat:tex2mmlmap

• Allowed children of pat:tex2mmlmap:
  • pat:template

• Allowed children of pat:template:
  • pat:tex
  • pat:mml
  • pat:img

• Allowed children of pat:tex:
  • none

• Allowed children of pat:mml:
  • MathML elements
  • pat:rep
  • pat:variable

• Allowed children of pat:img:
  • none

• Table of attributes used with the above elements:

| Element name | Attribute(s) | Purpose |
|---|---|---|
| pat:tex2mmlmap | version | Map file version |
| pat:template | - | Contains the tuple defining the mapping |
| pat:tex | op | Matching TeX macro/symbol name |
| | params (optional) | TeX macro parameters (if any) |
| | prec (optional) | Template's precedence (TeX to MathML only) |
| pat:mml | op | Matching MathML main operation |
| pat:variable | name | Identifies a variable by its name |
| | attribute (optional) | Means that the result should be added as an attribute to its parent element |
| | map (optional) | Specifies how the attribute values should be mapped |
| pat:rep | - | Contains the pattern to be repeated |

Concrete examples and explanations are given in the sections below.
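
The containment rules listed above can be checked mechanically. The following is only a sketch, assuming Python's standard `xml.etree.ElementTree`; the function name `check_structure`, the sample map and its `version="1.0"` value are illustrative and not part of the specification. It checks the root element, the children of pat:template, and that pat:tex and pat:img are empty.

```python
import xml.etree.ElementTree as ET

PAT = "http://www.orcca.on.ca/mathml/tex2mml.xml"
ROOT = f"{{{PAT}}}tex2mmlmap"
TEMPLATE = f"{{{PAT}}}template"
TEX, MML, IMG = (f"{{{PAT}}}{name}" for name in ("tex", "mml", "img"))


def check_structure(xml_text):
    """Return a list of structural problems (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    if root.tag != ROOT:
        return [f"unexpected root element: {root.tag}"]
    problems = []
    for template in root:
        if template.tag != TEMPLATE:
            problems.append(f"unexpected child of tex2mmlmap: {template.tag}")
            continue
        for child in template:
            if child.tag not in (TEX, MML, IMG):
                problems.append(f"unexpected child of template: {child.tag}")
            elif child.tag in (TEX, IMG) and len(child) > 0:
                problems.append(f"{child.tag} must not have children")
    return problems


# Illustrative map file; the version value is an assumption.
SAMPLE = """\
<pat:tex2mmlmap version="1.0"
    xmlns:pat="http://www.orcca.on.ca/mathml/tex2mml.xml"
    xmlns="http://www.w3.org/1998/Math/MathML">
  <pat:template>
    <pat:tex op="("/>
    <pat:mml op="("><mo>(</mo></pat:mml>
  </pat:template>
</pat:tex2mmlmap>
"""

print(check_structure(SAMPLE))  # prints []
```

The contents of pat:mml are deliberately not inspected here, since the specification allows arbitrary MathML there.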

## Templates

In essence, the map file is a collection of templates. The purpose of each template is to specify how a particular TeX or MathML fragment can be mapped into other formats. Therefore, each template is a tuple (or, more precisely, it is a triple at the moment) specifying how pieces of TeX and MathML (and possibly an image) correspond to each other.

When mapping from TeX to MathML, a template is chosen based on the op attribute of its pat:tex element. The value of op can be a TeX symbol, such as + or *, any other token, such as \\, or a macro name, e.g. \cr, \root or \matrix. More than one template may exist for a particular op; in that case the prec attribute decides between them. Precedence is specified by an integer, and for a given TeX operator/macro the template with the highest precedence is chosen first. If the precedence of a template is not specified explicitly, it defaults to zero. Templates with the same precedence are considered in the order in which they appear in the map file.
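
The ordering rule can be illustrated with a short sketch. It is not the Tex2Mml implementation: the `Template` record, its `label` field and the function `candidates` are hypothetical, introduced only to show highest precedence first, a default of zero, and file order among equal precedences.

```python
from typing import NamedTuple, Optional


class Template(NamedTuple):
    op: str               # value of the op attribute of pat:tex
    prec: Optional[int]   # None when the prec attribute is absent
    label: str            # hypothetical tag, used only to tell templates apart


def candidates(templates, op):
    """Return the templates for `op` in the order they would be considered."""
    matching = [t for t in templates if t.op == op]
    # sorted() is stable, so templates with equal precedence keep file order.
    # A missing precedence counts as zero.
    return sorted(matching, key=lambda t: -(t.prec or 0))


# Templates listed in map-file order (illustrative data).
map_file = [
    Template("\\gcd", None, "bare"),
    Template("\\gcd", 5, "with-args"),
    Template("\\gcd", 0, "explicit-zero"),
    Template("\\alpha", None, "alpha"),
]

print([t.label for t in candidates(map_file, "\\gcd")])
# prints ['with-args', 'bare', 'explicit-zero']
```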

Bear in mind that infix operators (such as _, ^, etc.) have the op attribute set to the special value \PSEUDO, and that templates for infix operators normally have higher precedence than other templates. Take this into account when choosing precedences for new templates.

### Special conventions for the syntax used inside the params attribute of the pat:tex element

An expression found in the params attribute of the pat:tex element is treated as normal TeX, except for the following special "macros", which are recognized and preprocessed by the Tex2Mml application:

• \patVAR[!|*|+|empty]{varName}

Several types of variables are recognized:

• Variables matching exactly one token - \patVAR!{varName}
• Variables matching zero or more tokens - \patVAR*{varName}
• Variables matching one or more tokens - \patVAR+{varName}
• Variables matching zero or one token - \patVAR{varName}

Names of variables must be at least one character long. A name starts with a letter, which may be followed by any number of letters, digits or underscores. A name may also carry an optional tilde prefix. The regular expression describing a variable name is:

varName := (~ | empty) letter (letter | digit | _)*
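
As a sketch, the grammar above translates directly into a Python regular expression. The assumption that "letter" and "digit" mean ASCII characters is ours (the specification does not say), and `is_valid_var_name` is a hypothetical helper name.

```python
import re

# Optional tilde, one letter, then any number of letters, digits or underscores.
# Assumption: letters and digits are restricted to ASCII.
VAR_NAME = re.compile(r"~?[A-Za-z][A-Za-z0-9_]*")


def is_valid_var_name(name):
    """Return True if `name` matches the varName grammar."""
    return VAR_NAME.fullmatch(name) is not None


for name in ("a", "~argI", "firstCol", "row_2", "2nd", "_x", ""):
    print(repr(name), is_valid_var_name(name))
```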

Besides the number of tokens it matches, a variable also has a type: it is either scalar or non-scalar. A variable is non-scalar if it occurs within a repetition pattern; otherwise it is scalar. A repetition pattern in the "extended" TeX syntax is created with the following "macro":

• \patREP[*|+]{pattern}

Two types of patterns are recognized:

• A pattern which occurs zero or more times - \patREP*{...}
• A pattern which occurs one or more times - \patREP+{...}

A \patVAR that occurs inside a \patREP pattern automatically becomes non-scalar (see the examples below). Such a variable must also be a descendant of a pat:rep element in the pat:mml element; the next section gives the important details of the correspondence between the macros \patREP and \patVAR and the elements pat:rep and pat:variable.
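
The scalar/non-scalar distinction can be computed by scanning a params string while tracking which open braces belong to a \patREP. The sketch below is a simplified illustration, not the Tex2Mml preprocessor: `classify_variables` is a hypothetical name, and the scanner assumes well-balanced braces and treats any backslash not starting \patVAR or \patREP as an escape of the next character.

```python
import re

VAR = re.compile(r"\\patVAR([!*+]?)\{(~?[A-Za-z][A-Za-z0-9_]*)\}")
REP = re.compile(r"\\patREP([*+])\{")


def classify_variables(params):
    """Map each variable name in `params` to 'scalar' or 'non-scalar'."""
    kinds = {}
    open_braces = []  # one entry per open brace: True if opened by \patREP
    i = 0
    while i < len(params):
        var = VAR.match(params, i)
        rep = REP.match(params, i)
        if var:
            inside_rep = any(open_braces)
            kinds[var.group(2)] = "non-scalar" if inside_rep else "scalar"
            i = var.end()
        elif rep:
            open_braces.append(True)
            i = rep.end()
        elif params[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
        elif params[i] == "{":
            open_braces.append(False)
            i += 1
        elif params[i] == "}":
            if open_braces:
                open_braces.pop()
            i += 1
        else:
            i += 1
    return kinds


print(classify_variables(r"(\patVAR+{argA}\patREP*{,\patVAR+{argI}})"))
# prints {'argA': 'scalar', 'argI': 'non-scalar'}
```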

#### Examples

<pat:tex op="\PSEUDO" params="\patVAR+{num} \over \patVAR+{den}" prec="666"/>

<pat:tex op="\matrix" params="{\patREP+{\patVAR+{a}\patREP*{&amp;\patVAR+{b}}\cr}}"/>

<pat:tex op="\gcd" params="(\patVAR+{argA}\patREP*{,\patVAR+{argI}})"/>

<pat:tex op="\gcd"/>

The following can be observed in the above examples:

Nested \patREP's are allowed, to arbitrary depth. Note that certain characters cannot appear literally in XML attribute values and must be escaped using XML entities; for example, & must be written as "&amp;".

### Special conventions for the syntax used inside the pat:mml element

The XML markup contained within the pat:mml element is the MathML markup corresponding to the TeX found in the op and params attributes of the pat:tex element. Because the document has no DTD, in theory any well-formed XML can appear under pat:mml. However, two elements have special meanings (all other elements are assumed to be valid MathML).

The first of these elements represents a variable and is named pat:variable. It must be an empty element and must carry the name attribute (the variable name). The name must be one of the names appearing in the params attribute of the pat:tex element.

If a variable is non-scalar (e.g. variables a, b and argI in the examples above), it must be a descendant of the other special element, pat:rep. Just like \patREP, pat:rep corresponds to a repetition pattern, but this time in MathML. Every non-scalar variable must have a pat:rep as an ancestor. Conversely, every pat:rep must have at least one non-scalar variable as its descendant, not counting variables that occur inside nested pat:rep's. In the \matrix example above, therefore, variable a and the outer \patREP+ form a pair, as do variable b and the inner \patREP*, but variable b and the outer \patREP+ do not. Scalar variables may appear anywhere, as long as they are descendants of pat:mml.
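
On the MathML side, the same nesting can be measured by counting the pat:rep ancestors of each pat:variable. This is a minimal sketch using Python's `xml.etree.ElementTree`; `rep_depths` is a hypothetical helper, and the fragment is an illustrative pat:mml body using the variable names a and b from the \matrix example. A depth of 0 corresponds to a scalar variable, and a non-scalar variable needs a depth of at least 1.

```python
import xml.etree.ElementTree as ET

PAT = "http://www.orcca.on.ca/mathml/tex2mml.xml"
REP = f"{{{PAT}}}rep"
VARIABLE = f"{{{PAT}}}variable"


def rep_depths(mml):
    """Map each variable name under `mml` to its number of pat:rep ancestors."""
    depths = {}

    def walk(elem, depth):
        for child in elem:
            if child.tag == REP:
                walk(child, depth + 1)
            elif child.tag == VARIABLE:
                depths[child.get("name")] = depth
            else:
                walk(child, depth)

    walk(mml, 0)
    return depths


# Illustrative pat:mml fragment for the \matrix example.
FRAGMENT = """\
<pat:mml op="mtable"
    xmlns:pat="http://www.orcca.on.ca/mathml/tex2mml.xml"
    xmlns="http://www.w3.org/1998/Math/MathML">
  <mtable>
    <pat:rep>
      <mtr>
        <mtd><pat:variable name="a"/></mtd>
        <pat:rep>
          <mtd><pat:variable name="b"/></mtd>
        </pat:rep>
      </mtr>
    </pat:rep>
  </mtable>
</pat:mml>
"""

print(rep_depths(ET.fromstring(FRAGMENT)))  # prints {'a': 1, 'b': 2}
```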

### Examples of complete templates

The following template will match the opening parenthesis in TeX, and map it into the <mo>(</mo> element. By default, this template's precedence is 0:
<pat:template>
  <pat:tex op="("/>
  <pat:mml op="(">
    <mo> ( </mo>
  </pat:mml>
</pat:template>

The second template will transform the TeX macro \alpha into the corresponding Unicode character:
<pat:template>
  <pat:tex op="\alpha"/>
  <pat:mml op="&#x03B1;">
    <mo> &#x03B1; </mo>
  </pat:mml>
</pat:template>

The next template is an example of an infix TeX macro. Because it should be processed before any other macro/symbol, its precedence is set higher than most other templates in the map file. This template also features two scalar variables num and den:
<pat:template>
  <pat:tex op="\PSEUDO" params="\patVAR+{num}\over\patVAR+{den}" prec="666"/>
  <pat:mml op="mfrac">
    <mfrac>
      <pat:variable name="num"/>
      <pat:variable name="den"/>
    </mfrac>
  </pat:mml>
</pat:template>

Here is a more complicated example involving non-scalar variables firstCol and rest; it shows how a matrix can be transformed:
<pat:template>
  <pat:tex op="\matrix"
           params="{\patREP+{\patVAR+{firstCol}\patREP*{&amp;\patVAR+{rest}}\cr}}"/>
  <pat:mml op="mtable">
    <mtable>
      <pat:rep>
        <mtr>
          <mtd> <pat:variable name="firstCol"/> </mtd>
          <pat:rep>
            <mtd> <pat:variable name="rest"/> </mtd>
          </pat:rep>
        </mtr>
      </pat:rep>
    </mtable>
  </pat:mml>
</pat:template>

The following example shows that attributes, and not only tags, can be generated. If pat:variable carries the attribute named attribute, the matched result is added to the parent element (in this case mfenced) as the attribute it names. Only the fact that pat:variable is a child of mfenced matters; its position among the other children does not.
<pat:template>
  <pat:tex op="\left" params="\patVAR!{lDelim} \patVAR*{expr} \right\patVAR!{rDelim}"/>
  <pat:mml op="">
    <mfenced separators="">
      <pat:variable name="lDelim" attribute="open"/>
      <pat:variable name="expr"/>
      <pat:variable name="rDelim" attribute="close"/>
    </mfenced>
  </pat:mml>
</pat:template>
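
How such attribute variables might be resolved can be sketched as follows. This is an illustration only, assuming the matched values are already available as plain strings in a `bindings` dictionary; `lift_attributes` and `bindings` are hypothetical names, and content variables such as expr are left untouched.

```python
import xml.etree.ElementTree as ET

PAT = "http://www.orcca.on.ca/mathml/tex2mml.xml"
VARIABLE = f"{{{PAT}}}variable"


def lift_attributes(elem, bindings):
    """Turn pat:variable children that carry `attribute` into parent attributes."""
    for child in list(elem):  # copy: children are removed while looping
        if child.tag == VARIABLE and child.get("attribute") is not None:
            elem.set(child.get("attribute"), bindings[child.get("name")])
            elem.remove(child)
        else:
            lift_attributes(child, bindings)


# Illustrative fragment taken from the \left ... \right template.
FRAGMENT = """\
<mfenced separators=""
    xmlns:pat="http://www.orcca.on.ca/mathml/tex2mml.xml"
    xmlns="http://www.w3.org/1998/Math/MathML">
  <pat:variable name="lDelim" attribute="open"/>
  <pat:variable name="expr"/>
  <pat:variable name="rDelim" attribute="close"/>
</mfenced>
"""

mfenced = ET.fromstring(FRAGMENT)
lift_attributes(mfenced, {"lDelim": "(", "rDelim": ")"})
print(mfenced.attrib)
```

After the call, mfenced carries open and close attributes alongside separators, and only the expr variable remains as a child, regardless of where the attribute variables were placed among the children.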

Finally, here is an example showing how attribute values can also be mapped. This often comes in handy when one needs to map TeX environments, such as array, tabular, etc. While TeX usually uses one-letter specifiers for justification (e.g. l means "left-justify") and alignment values, MathML is oftentimes more verbose, and hence a correspondence between them has to be specified:
<pat:template>
  <pat:tex op="\begin" params="{array} {\patREP*{\patVAR!{hjust}}} \patREP*{\patVAR*{firstCol}\patREP*{&amp;\patVAR*{rest}}\\} \end{array}"/>
  <pat:mml op="">
    <mtable>
      <pat:rep>
        <pat:variable name="hjust" attribute="columnalign" map="l=left c=center r=right"/>
      </pat:rep>
      <pat:rep>
        <mtr>
          <mtd> <pat:variable name="firstCol"/> </mtd>
          <pat:rep>
            <mtd> <pat:variable name="rest"/> </mtd>
          </pat:rep>
        </mtr>
      </pat:rep>
    </mtable>
  </pat:mml>
</pat:template>
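
The map attribute reads as a whitespace-separated list of source=target pairs. The sketch below shows one way to interpret it; `parse_map` and `map_values` are hypothetical names. Two behaviours are assumptions not stated in the specification: values without an entry in the map pass through unchanged, and repeated values are joined with spaces (the form MathML uses for columnalign).

```python
def parse_map(spec):
    """Parse a map attribute such as 'l=left c=center r=right' into a dict."""
    return dict(pair.split("=", 1) for pair in spec.split())


def map_values(spec, values):
    """Translate each matched value and join the results with spaces.

    Assumptions: unmapped values are passed through unchanged, and
    space-joining is the desired output form.
    """
    table = parse_map(spec)
    return " ".join(table.get(value, value) for value in values)


spec = "l=left c=center r=right"
print(parse_map(spec))
# prints {'l': 'left', 'c': 'center', 'r': 'right'}
print(map_values(spec, ["l", "c", "r"]))
# prints left center right
```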