LaTeXML

LaTeXML is a comprehensive LaTeX to XML converter written by Bruce Miller for the DLMF project at NIST. Its source code can be found on GitHub.

In this post I will show how to install and use it.

Install

First, download the source code and switch to the KWARC branch:

$ git clone https://github.com/brucemiller/LaTeXML.git
$ cd LaTeXML
$ git remote add kwarc https://github.com/KWARC/LaTeXML.git
$ git fetch --all
$ git checkout kwarc/master

Note

We use the version from KWARC because the changes needed to generate EPUB are not yet present in the official release.

Then resolve the Perl module dependencies. If you are using Debian:

$ sudo apt-get install libarchive-zip-perl
$ sudo apt-get install libimage-magick-perl
$ sudo apt-get install libimage-size-perl
$ sudo apt-get install libxml-libxml-perl
$ sudo apt-get install libxml-libxslt-perl
$ sudo apt-get install libparse-recdescent-perl

Otherwise:

$ cpanm -S XML::LibXML
$ cpanm -S XML::LibXSLT
$ cpanm -S Parse::RecDescent
$ cpanm -S Image::Magick

After resolving the dependencies, build, test and install LaTeXML (the last step as root):

$ perl Makefile.PL
$ make
$ make test
# make install
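
If make test fails, a missing Perl module is a likely cause. The sketch below is one way to check which modules cannot be loaded before building. The function name missing_perl_modules is my own invention, not part of LaTeXML, and it assumes perl is on your PATH.

```shell
#!/bin/sh
# Hypothetical helper: print every Perl module from the argument list
# that cannot be loaded. Returns 0 only when all of them load.
missing_perl_modules() {
    missing=0
    for module in "$@"; do
        # `perl -M<Module> -e 1` exits non-zero when the module is absent.
        if ! perl -M"$module" -e 1 >/dev/null 2>&1; then
            echo "$module"
            missing=1
        fi
    done
    return "$missing"
}

# Usage, with the module names from the cpanm list above:
# missing_perl_modules XML::LibXML XML::LibXSLT Parse::RecDescent Image::Magick
```

An empty output means every listed module was found. Anything printed still has to be installed with apt-get or cpanm.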

Usage

Some options of latexml:

--destination=file
specifies destination file (default stdout).
--noparse
suppresses parsing math

Convert a small LaTeX file to XML:

$ cat sample.tex
\documentclass{article}
\begin{document}
Some text.
\end{document}
$ latexml --destination=sample.xml sample.tex
$ cat sample.xml
<?xml version="1.0" encoding="UTF-8"?>
<?latexml searchpaths="/tmp/tmp.RBD3ySAca3"?>
<?latexml class="article"?>
<?latexml RelaxNGSchema="LaTeXML"?>
<document xmlns="http://dlmf.nist.gov/LaTeXML">
<resource src="LaTeXML.css" type="text/css"/>
<resource src="ltx-article.css" type="text/css"/>
<para xml:id="p1">
<p>Some text.</p>
</para>
</document>

To get an HTML file we have to use latexmlpost, which post-processes the XML produced by latexml. Some options of latexmlpost:

--format=html|html5|xhtml|xml
requests the output format.
--destination=file
specifies output file (and directory).

Convert a small LaTeX file to HTML5:

$ latexml --destination=sample.xml sample.tex
$ latexmlpost --format=html5 --destination=sample.html sample.xml
$ cat sample.html
<!DOCTYPE html><html>
<head>
<title></title>
<!--Generated on Thu Jan 23 12:33:25 2014 by LaTeXML (version 0.7.99)
http://dlmf.nist.gov/LaTeXML/.-->

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link rel="stylesheet" href="LaTeXML.css" type="text/css">
<link rel="stylesheet" href="ltx-article.css" type="text/css">
</head>
<body>
<div class="ltx_page_main">
<div class="ltx_page_content">
<section class="ltx_document">
<div id="p1" class="ltx_para">
<p class="ltx_p">Some text.</p>
</div>
</section>
</div>
<footer class="ltx_page_footer">
<div class="ltx_page_logo">Generated  on Thu Jan 23 12:33:25 2014 by <a
href="http://dlmf.nist.gov/LaTeXML/">LaTeXML <img
src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAsAAAAOCAYAAAD5YeaVAAAAAXNSR0IArs4c6QAAAAZiS0dEAP8A/wD/oL2nkwAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9wKExQZLWTEaOUAAAAddEVYdENvbW1lbnQAQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q72QlbgAAAdpJREFUKM9tkL+L2nAARz9fPZNCKFapUn8kyI0e4iRHSR1Kb8ng0lJw6FYHFwv2LwhOpcWxTjeUunYqOmqd6hEoRDhtDWdA8ApRYsSUCDHNt5ul13vz4w0vWCgUnnEc975arX6ORqN3VqtVZbfbTQC4uEHANM3jSqXymFI6yWazP2KxWAXAL9zCUa1Wy2tXVxheKA9YNoR8Pt+aTqe4FVVVvz05O6MBhqUIBGk8Hn8HAOVy+T+XLJfLS4ZhTiRJgqIoVBRFIoric47jPnmeB1mW/9rr9ZpSSn3Lsmir1fJZlqWlUonKsvwWwD8ymc/nXwVBeLjf7xEKhdBut9Hr9WgmkyGEkJwsy5eHG5vN5g0AKIoCAEgkEkin0wQAfN9/cXPdheu6P33fBwB4ngcAcByHJpPJl+fn54mD3Gg0NrquXxeLRQAAwzAYj8cwTZPwPH9/sVg8PXweDAauqqr2cDjEer1GJBLBZDJBs9mE4zjwfZ85lAGg2+06hmGgXq+j3+/DsixYlgVN03a9Xu8jgCNCyIegIAgx13Vfd7vdu+FweG8YRkjXdWy329+dTgeSJD3ieZ7RNO0VAXAPwDEAO5VKndi2fWrb9jWl9Esul6PZbDY9Go1OZ7PZ9z/lyuD3OozU2wAAAABJRU5ErkJggg=="
alt="[LOGO]"></a>
</div></footer>
</div>
</body>
</html>
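
Since the HTML5 conversion always takes the same two steps, they can be wrapped in a small shell function. This is only a sketch: swap_ext and tex_to_html5 are names I made up, the input is assumed to end in .tex, and the only options used are the ones shown above.

```shell
#!/bin/sh
# Hypothetical helper: replace the extension of a path,
# e.g. `swap_ext sample.tex xml` prints `sample.xml`.
# Assumes the file name has an extension.
swap_ext() {
    printf '%s.%s\n' "${1%.*}" "$2"
}

# Hypothetical helper: run latexml and then latexmlpost on one file,
# writing the intermediate XML and the final HTML next to the input.
tex_to_html5() {
    tex_file=$1
    xml_file=$(swap_ext "$tex_file" xml)
    html_file=$(swap_ext "$tex_file" html)
    # Run latexmlpost only if latexml succeeded.
    latexml --destination="$xml_file" "$tex_file" &&
        latexmlpost --format=html5 --destination="$html_file" "$xml_file"
}

# Usage:
# tex_to_html5 sample.tex
```

Keeping the intermediate XML file around is deliberate here: it lets you rerun latexmlpost with another --format without converting the LaTeX source again.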

Convert a small LaTeX file to EPUB3:

$ latexmlc sample.tex --destination=book.epub
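
If you have several documents, the same latexmlc call can be repeated in a loop. The function below is a rough sketch with a made-up name (tex_to_epub_all). It assumes each input ends in .tex and writes the EPUB next to it.

```shell
#!/bin/sh
# Hypothetical helper: convert each given .tex file to an EPUB with the
# same base name (sample.tex -> sample.epub).
# Returns 1 if any conversion failed, 0 otherwise.
tex_to_epub_all() {
    any_failed=0
    for tex_file in "$@"; do
        epub_file="${tex_file%.tex}.epub"
        if ! latexmlc "$tex_file" --destination="$epub_file"; then
            # Report the failure on stderr and continue with the next file.
            echo "failed: $tex_file" >&2
            any_failed=1
        fi
    done
    return "$any_failed"
}

# Usage:
# tex_to_epub_all chapter1.tex chapter2.tex
```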

References

[1] How to install CPAN modules, from Perl.org.