The TeX/MathML Map File Specification


The structure of the map file



Templates

In essence, the map file is a collection of templates. The purpose of each template is to specify how a particular TeX or MathML fragment can be mapped into other formats. Therefore, each template is a tuple (or, more precisely, it is a triple at the moment) specifying how pieces of TeX and MathML (and possibly an image) correspond to each other.

When mapping from TeX to MathML, a template is chosen based on the op attribute of its pat:tex element. The value of op can be a TeX symbol, such as + or *, any other token, such as \\, or a macro name, e.g. \cr, \root or \matrix. More than one template may exist for a particular op; in that case the prec attribute decides between them. Precedence is specified by an integer, and for a given TeX operator or macro the template with the highest precedence is tried first. If the precedence of a template is not specified explicitly, it defaults to zero. Templates with the same precedence are considered in the order in which they appear in the map file.
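The selection order described above can be sketched in a few lines of Python. This is only an illustration, not the Tex2Mml implementation: the Template class, the candidates function, the label field and the precedence value 5 are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Template:
    """Hypothetical in-memory form of one pat:template (not part of the spec)."""
    op: str          # value of the op attribute of pat:tex
    prec: int = 0    # precedence; zero when the prec attribute is absent
    label: str = ""  # illustrative tag so the result is easy to read


def candidates(templates, op):
    """Return the templates for `op`, highest precedence first.

    sorted() is stable, so templates with equal precedence keep the
    order in which they appear in the map file.
    """
    matching = [t for t in templates if t.op == op]
    return sorted(matching, key=lambda t: -t.prec)


# Templates listed in map-file order; the precedence 5 is made up.
map_file = [
    Template(r"\gcd", label="bare"),
    Template(r"\gcd", prec=5, label="with-args"),
    Template(r"\gcd", label="bare-duplicate"),
    Template("+", label="plus"),
]

order = [t.label for t in candidates(map_file, r"\gcd")]
print(order)  # ['with-args', 'bare', 'bare-duplicate']
```

The two templates without an explicit precedence both default to zero, so they are tried in file order after the one with the higher precedence.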

Bear in mind that infix operators (such as _, ^, etc.) have the op attribute set to the special value \PSEUDO, and that templates for infix operators normally have higher precedence than other templates. Take this into account when choosing precedences for new templates.

Special conventions for the syntax used inside the params attribute of the pat:tex element

An expression found in the params attribute of the pat:tex element is treated as normal TeX, except for the following special "macros", which are recognized and preprocessed by the Tex2Mml application:


Examples

<pat:tex op="\PSEUDO" params="\patVAR+{num} \over \patVAR+{den}" prec="666"/>

<pat:tex op="\matrix" params="{\patREP+{\patVAR+{a}\patREP*{&amp;\patVAR+{b}}\cr}}"/>

<pat:tex op="\gcd" params="(\patVAR+{argA}\patREP*{,\patVAR+{argI}})"/>

<pat:tex op="\gcd"/>
The following can be observed in the above examples:

Nested \patREP's are allowed, to arbitrary depth. Note that certain characters cannot appear literally in XML attribute values or character data and must be escaped using XML entities; for example, & must be written as "&amp;".
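As a small illustration of the escaping rule, the following Python sketch turns a TeX pattern containing a raw ampersand into a safely quoted attribute value. It relies only on quoteattr from the standard library; the variable names are made up for the example, and the pattern is the \matrix pattern with variables a and b.

```python
from xml.sax.saxutils import quoteattr

# TeX pattern as one would type it in TeX, with a raw ampersand.
params = r"{\patREP+{\patVAR+{a}\patREP*{&\patVAR+{b}}\cr}}"

# quoteattr escapes &, < and > and wraps the value in quotes,
# so the result can be pasted after params= as it stands.
attr = quoteattr(params)
print('<pat:tex op="\\matrix" params=' + attr + "/>")
```

Backslashes and braces are left alone; only the characters that XML reserves are replaced.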

Special conventions for the syntax used inside the pat:mml element

The XML markup contained within the pat:mml element is the MathML markup corresponding to the TeX found in the op and params attributes of the pat:tex element. Because the document has no DTD, in theory any well-formed XML can appear under pat:mml. However, two elements have special meanings (all other elements are assumed to be valid MathML). The first represents a variable and is named pat:variable. It must be an empty element and must carry the name attribute (the variable name). The name must be one of the variable names appearing in the params attribute of the pat:tex element. If a variable is non-scalar (e.g. variables a, b and argI in the examples above), it must be a descendant of the other special element, pat:rep. Just like \patREP, pat:rep denotes a repetition pattern, but this time in MathML. Every non-scalar variable must have pat:rep as an ancestor. Conversely, every pat:rep must have at least one non-scalar variable as its descendant. In the case of nested pat:rep's, this does not include variables occurring inside the nested pat:rep's; in the \matrix example above, variable a and the outer \patREP+ form a pair, as do variable b and the inner \patREP*, but variable b and the outer \patREP+ do not. Scalar variables may appear anywhere, as long as they are descendants of pat:mml.
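One way to see which variables are scalar and which are not is to count how many \patREP groups enclose each variable in the params string. The Python sketch below does this under two assumptions that this text does not state outright: that a variable is non-scalar exactly when it sits inside at least one \patREP, and that the single character following \patVAR or \patREP is a modifier that can be skipped. The function name rep_depths is hypothetical, and TeX-escaped braces (\{ and \}) are not handled.

```python
import re

# One token at a time: a \patREP or \patVAR opener (with its one-character
# modifier, which is skipped rather than interpreted), or a plain brace.
TOKEN = re.compile(r"\\patREP.\{|\\patVAR.\{(\w+)\}|[{}]")


def rep_depths(params):
    """Map each variable name to the number of \\patREP groups enclosing it."""
    depths = {}
    stack = []  # kinds of currently open braces: "rep" or "plain"
    for m in TOKEN.finditer(params):
        tok = m.group(0)
        if tok.startswith("\\patREP"):
            stack.append("rep")
        elif tok.startswith("\\patVAR"):
            depths[m.group(1)] = stack.count("rep")
        elif tok == "{":
            stack.append("plain")
        elif stack:  # a closing brace ends the innermost open group
            stack.pop()
    return depths


matrix = r"{\patREP+{\patVAR+{a}\patREP*{&\patVAR+{b}}\cr}}"
print(rep_depths(matrix))  # {'a': 1, 'b': 2}
```

A depth of zero means the variable is scalar. A depth of n suggests that the matching pat:variable needs n enclosing pat:rep elements, which is consistent with variable a pairing with the outer repetition and variable b with the inner one.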

Examples of complete templates

The following template will match the opening parenthesis in TeX, and map it into the <mo>(</mo> element. By default, this template's precedence is 0:
<pat:template>
  <pat:tex op="("/>
  <pat:mml op="(">
    <mo> ( </mo>
  </pat:mml>
</pat:template>
The second template will transform the TeX macro \alpha into the corresponding Unicode character:
<pat:template>
  <pat:tex op="\alpha"/>
  <pat:mml op="&#x03B1;">
    <mo> &#x03B1; </mo>
  </pat:mml>
</pat:template>
The next template is an example of an infix TeX macro. Because it should be processed before any other macro/symbol, its precedence is set higher than most other templates in the map file. This template also features two scalar variables num and den:
<pat:template>
  <pat:tex op="\PSEUDO" params="\patVAR+{num}\over\patVAR+{den}" prec="666"/>
  <pat:mml op="mfrac">
    <mfrac>
      <pat:variable name="num"/>
      <pat:variable name="den"/>
    </mfrac>
  </pat:mml>
</pat:template>
Here is a more complicated example involving non-scalar variables firstCol and rest; it shows how a matrix can be transformed:
<pat:template>
  <pat:tex op="\matrix"
       params="{\patREP+{\patVAR+{firstCol}\patREP*{&amp;\patVAR+{rest}}\cr}}"/>
  <pat:mml op="mtable">
    <mtable>
      <pat:rep>
        <mtr>
          <mtd> <pat:variable name="firstCol"/> </mtd>
          <pat:rep>
            <mtd> <pat:variable name="rest"/> </mtd>
          </pat:rep>
        </mtr>
      </pat:rep>
    </mtable>
  </pat:mml>
</pat:template>
The following example shows that attributes, not only elements, can be generated. If pat:variable carries an attribute named attribute, the variable's value is placed in the attribute of that name on the parent element (here mfenced). Only the fact that pat:variable is a child of mfenced matters; its position among the other children does not.
<pat:template>
  <pat:tex op="\left" params="\patVAR!{lDelim} \patVAR*{expr} \right\patVAR!{rDelim}"/>
  <pat:mml op="">
    <mfenced separators="">
      <pat:variable name="lDelim" attribute="open"/>
      <pat:variable name="expr"/>
      <pat:variable name="rDelim" attribute="close"/>
    </mfenced>
  </pat:mml>
</pat:template>
Finally, here is an example showing how attribute values can also be mapped. This comes in handy when mapping TeX environments such as array, tabular, etc. While TeX usually uses one-letter specifiers for justification and alignment (e.g. l means "left-justify"), MathML is often more verbose, and hence a correspondence between the two has to be specified:
<pat:template>
  <pat:tex op="\begin" params="{array} {\patREP*{\patVAR!{hjust}}} \patREP*{\patVAR*{firstCol}\patREP*{&amp;\patVAR*{rest}}\\} \end{array}"/>
  <pat:mml op="">
    <mtable>
      <pat:rep>
        <pat:variable name="hjust" attribute="columnalign" map="l=left c=center r=right"/>
      </pat:rep>
      <pat:rep>
        <mtr>
          <mtd> <pat:variable name="firstCol"/> </mtd>
          <pat:rep>
            <mtd> <pat:variable name="rest"/> </mtd>
          </pat:rep>
        </mtr>
      </pat:rep>
    </mtable>
  </pat:mml>
</pat:template>
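The effect of the map attribute in the template above can be sketched in Python as follows. The function names parse_map and column_align are hypothetical, and the sketch assumes that the repeated values are joined with spaces, which is the form the MathML columnalign attribute takes; how Tex2Mml itself joins them is not stated here.

```python
def parse_map(spec):
    """Turn 'l=left c=center r=right' into {'l': 'left', ...}."""
    return dict(pair.split("=", 1) for pair in spec.split())


def column_align(tex_spec, spec="l=left c=center r=right"):
    """Translate a TeX column spec such as 'lcr' into a columnalign value.

    Assumption: the repeated values are joined with single spaces.
    A letter missing from the map raises KeyError.
    """
    table = parse_map(spec)
    return " ".join(table[letter] for letter in tex_spec)


print(column_align("lcr"))  # left center right
```

Under these assumptions, \begin{array}{lcr} would yield an mtable element with columnalign="left center right".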