# EnhanceNLMTeXwMathML

## What is this?

This is a demo page for the EnhanceNLMTeXwMML tool. It takes as input a valid XML file that can contain mathematical
formulae either as textual TeX formulae within the text data of various elements, or
formatted as `<inline-formula>` or `<disp-formula>` elements with their internal EuDML v1.0
DTD structure, containing a TeX-encoded version of the formula in the `<tex-math>`
element. It returns the same file with unchanged content, except that an `<mml:math>`
element is added as an alternative within each `<*-formula>`, with a MathML representation of
the formula derived from the provided TeX version; a similar structure is created for
textual TeX formulae.
It is a batch tool that upgrades any existing metadata with (presentation)
MathML for any formula written in TeX, as long as the TeX functions are known
to the compiler.

The tool is written in Java and can either be called directly from another Java
program, or can be made available as a REST service.
EnhanceNLMTeXwMML relies on TeX2NLM for the actual conversion from TeX
strings to MathML (a demo is available here).
This tool is free software governed by the CeCILL-C license that can be found at
http://www.cecill.info/.

## How to use it?

The assumptions for this tool are the following:

- It is applied to a Java DOM document holding XML that conforms to the EuDML specification v1.0 (it can also be just a fragment thereof).
- This XML document can contain formulae in TeX either
  - within the text data of various elements,
  - or formatted with the NLM JATS basic structure:
    - a mandatory `<inline-formula>` (or `<disp-formula>` for displayed math);
    - an optional `<alternatives>` element if the formula already has multiple versions;
    - a mandatory `<tex-math>` element holding valid TeX code (with XML special characters escaped, for example `&` and `<`
      escaped using the `&amp;` and `&lt;` entities).

- Requirements for the TeX2NLM tool apply to the character strings that contain TeX formulae, whether in a `<tex-math>` element or not.
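
The input assumptions above can be illustrated with a minimal sketch that uses only the standard JDK DOM API. It does not call EnhanceNLMTeXwMML itself, whose Java entry point is not described on this page; the class name `TexMathInputSketch` and the sample fragment are made up for illustration. It parses an XML fragment into a DOM document and lists the TeX source held in each `<tex-math>` element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TexMathInputSketch {

    // Hypothetical input fragment: one inline formula, and one displayed
    // formula that already has an <alternatives> wrapper. The XML special
    // characters inside the TeX code are escaped as entities.
    static final String SAMPLE =
        "<p>Assume <inline-formula><tex-math>a &lt; b</tex-math></inline-formula>, then"
        + "<disp-formula><alternatives>"
        + "<tex-math>\\begin{matrix} a &amp; b \\end{matrix}</tex-math>"
        + "</alternatives></disp-formula></p>";

    // Parses an XML string into a namespace-aware DOM document.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    // Collects the text content of every <tex-math> element, in document order.
    static List<String> texSources(Document doc) {
        List<String> result = new ArrayList<>();
        NodeList nodes = doc.getElementsByTagName("tex-math");
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) {
        for (String tex : texSources(parse(SAMPLE))) {
            System.out.println(tex);
        }
    }
}
```

Once parsed, the entities are resolved, so the strings collected here contain the literal `<` and `&` characters of the TeX code. The escaping only matters in the serialized XML.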

The result is another similar Java DOM document, where every `<*-formula>` element that
had a TeX version now has an `<alternatives>` child, containing both TeX and MathML
versions (and other alternatives that were previously present), and every TeX formula
found within the text data of other elements is replaced by the appropriate `<*-formula>`
element containing both the original TeX and its MathML equivalent. If no such formulae
are present in the document, the result is identical to the source.
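
To make the shape of the result concrete, here is a hedged sketch of the DOM rearrangement described above, for a single formula element. It is not the tool's implementation: the class and method names are hypothetical, and the `<mml:math>` content is a hand-written placeholder standing in for whatever TeX2NLM would actually produce. Only the standard JDK DOM API is used.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class AlternativesShapeSketch {

    static final String MATHML_NS = "http://www.w3.org/1998/Math/MathML";

    // Parses an XML string into a namespace-aware DOM document.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    // Returns the first direct child element with the given name, or null.
    static Element childElement(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    // Ensures <tex-math> sits under <alternatives> and appends a placeholder
    // <mml:math> next to it. Returns false, leaving the formula untouched,
    // when there is no TeX version.
    static boolean addMathMLAlternative(Element formula) {
        Document doc = formula.getOwnerDocument();
        Element alternatives = childElement(formula, "alternatives");
        Element tex = childElement(alternatives != null ? alternatives : formula, "tex-math");
        if (tex == null) {
            return false;
        }
        if (alternatives == null) {
            alternatives = doc.createElementNS(null, "alternatives");
            formula.replaceChild(alternatives, tex); // put the wrapper where <tex-math> was
            alternatives.appendChild(tex);           // then move <tex-math> inside it
        }
        // Placeholder MathML for the TeX source "x"; a real conversion is out of scope here.
        Element math = doc.createElementNS(MATHML_NS, "mml:math");
        Element mi = doc.createElementNS(MATHML_NS, "mml:mi");
        mi.setTextContent("x");
        math.appendChild(mi);
        alternatives.appendChild(math);
        return true;
    }

    public static void main(String[] args) {
        Document doc = parse("<inline-formula><tex-math>x</tex-math></inline-formula>");
        Element formula = doc.getDocumentElement();
        addMathMLAlternative(formula);
        Element alternatives = childElement(formula, "alternatives");
        for (Node n = alternatives.getFirstChild(); n != null; n = n.getNextSibling()) {
            System.out.println(n.getNodeName());
        }
    }
}
```

The sketch covers only formulae that are already marked up. Detecting TeX formulae inside plain text data depends on conventions that are not specified on this page, so that part is left out.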