# TeX2NLM

## What is this?

This page is a demo for the TeX2NLM tool. It takes as input a UTF-8 encoded string that is
expected to be valid TeX code and returns the same content with formulas identified as such
and conforming to the EuDML NLM structure: each formula becomes a `<*-formula>` element
(that is, a `<disp-formula>` or `<inline-formula>` element) containing an `<alternatives>`
element, which holds a child `<tex-math>` element with the original TeX code and a
`<mml:math>` element with a MathML representation of the formula derived from the provided TeX.
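
To make the nesting described above concrete, here is a minimal sketch that builds the wrapper for one formula with the standard Java DOM API. It is not TeX2NLM's own code: the class name `NlmFormulaSketch` and the method `wrap` are hypothetical, the `<mml:math>` element is left empty because the real MathML comes from Tralics, and whether `<tex-math>` keeps the math-mode delimiters is an assumption.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration class, not part of the TeX2NLM library.
public class NlmFormulaSketch {
    static final String MATHML_NS = "http://www.w3.org/1998/Math/MathML";

    /** Builds <disp-formula> or <inline-formula> around the given TeX source. */
    public static Element wrap(String tex, boolean display) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            Element formula = doc.createElement(display ? "disp-formula" : "inline-formula");
            Element alternatives = doc.createElement("alternatives");

            // The original TeX is stored as given (assumption: delimiters are the caller's choice).
            Element texMath = doc.createElement("tex-math");
            texMath.setTextContent(tex);

            // Placeholder only: the real tool fills this with MathML derived from the TeX.
            Element math = doc.createElementNS(MATHML_NS, "mml:math");

            alternatives.appendChild(texMath);
            alternatives.appendChild(math);
            formula.appendChild(alternatives);
            return formula;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        Element formula = wrap("x^2", false);
        System.out.println(formula.getTagName() + " > " + formula.getFirstChild().getNodeName());
    }
}
```

The two children of `<alternatives>` carry the same formula in two notations, so a consumer can pick whichever one it understands.
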

The tool is written in Java and is provided as a standalone library that embeds a Linux
version of Tralics. It can be configured to use another version, appropriate to the system
the tool is executed on (version 2.14 or higher is required). It also provides a set of
configuration files, and more can be added. Once the tool is configured properly, these
details are completely transparent to the rest of the system.

This library is free software governed by the CeCILL-C license, which can be found at
http://www.cecill.info/.

This tool was made to be used primarily by EnhanceNLMTeXwMML, which adds MathML versions
of TeX formulas to NLM documents (a demo is also available). It is also expected that a
formula search function will use this tool by converting TeX entered by the website user to
MathML on the fly, and then using the EuDML tool for Mathematical Indexing and Search (MIaS)
to build a search query from the MathML.

## How to use it?

This tool makes the following assumptions:
- It is applied to a UTF-8 encoded string of characters (typically, the content of
a childless XML element holding textual information with
TeX-encoded mathematical formulas).
- The TeX commands switching to math mode must be explicit in the TeX code, such as `$`
in the supplied example (it could also be `\[..\]`, `\begin{align}`, etc.).
- Once unescaped, the TeX code contains no unspecified
macros and compiles with the allowed and configured
TeX commands.

The tool identifies each formula in the input string,
generates a standard NLM structure for each of them,
and returns a Java DOM Element containing the result.
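
As a rough illustration of the first step, identifying formulas through explicit math-mode delimiters, the sketch below scans a string with a regular expression. This is an assumption-laden stand-in, not how TeX2NLM works internally (the real tool relies on Tralics). The class `FormulaFinderSketch` and the method `findFormulas` are hypothetical names. Only `$...$`, `$$...$$` and `\[...\]` are recognised, and escaped dollar signs and environments such as `\begin{align}` are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class, not part of the TeX2NLM library.
public class FormulaFinderSketch {
    // Display delimiters come first so "$$...$$" is not misread as inline math.
    private static final Pattern MATH = Pattern.compile(
            "\\$\\$(.+?)\\$\\$"          // $$ ... $$
            + "|\\\\\\[(.+?)\\\\\\]"     // \[ ... \]
            + "|\\$(.+?)\\$",            // $ ... $
            Pattern.DOTALL);

    /** Returns every delimited formula, delimiters included, in order of appearance. */
    public static List<String> findFormulas(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = MATH.matcher(input);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Let $x^2$ be given, then \\[a+b=c\\] holds.";
        for (String formula : findFormulas(text)) {
            System.out.println(formula);
        }
    }
}
```

Each match would then be handed to the converter and replaced by its NLM formula element, while the text between matches is kept as is.
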