In academia, many use the TeX typesetting system for their writing needs. The route from LaTeX to PDF is well documented. I also use LaTeX to write this blog, which carries content in HTML format. Together, these routes are sufficient to serve all my writing needs – research paper submissions, class notes, a textbook (that I have begun writing), lecture presentations and internet writing. Some notes on the LaTeX to HTML route for blogging:
If you use WordPress as your blog software, typing in LaTeX is already enabled if your blog is hosted through their free http://wordpress.com account. You need to type your LaTeX commands between $latex… $. If you have a WordPress blog hosted on a separate domain, there are several options to enable math in your blog through LaTeX. If your server allows a latex installation (Dreamhost, my webhost, already runs an installation), server-side conversion of latex symbols to images can be enabled through the LaTeXRender wordpress plugin. Other ways exist when a server-side install of latex is not possible – see, for instance, LaTeX Render as an offline tool or using the CodeCogs equation editor.
The disadvantage of this method is that once the internet writing is done (and the blog posted), you are left with a text containing LaTeX markup along with other HTML markups (for list, images, links etc.). If you want this text to be reused in your class notes or research papers (as I require), you probably need to clean it up and reformat to suit those needs.
A better way is to write the content once in LaTeX on a local computer that runs a TeX installation, and then convert it to either PDF or HTML as required.
Assuming we know how to write in LaTeX, we would have our content in, say, an example.tex file. Using the default article class, this file contains all the standard TeX markup: math equations using standard packages like amsmath, the hyperref package to create hyperlinks and the graphicx package for including and aligning images.
For converting this example.tex to example.html, we can use the TeX4ht package. This package comes with MikTeX 2.7, the TeX program for Windows. For earlier versions of MikTeX, a separate install of TeX4ht is necessary.
TeX4ht converts all LaTeX math markup into PNG images (default) using the Image Magick software – which should be installed before running TeX4ht. The other original images used in the LaTeX file are retained in their formats. Hyperlink conversion is smooth.
The command line usage of TeX4ht can get tedious (but will definitely work) if you want to use the various features of TeX4ht while generating the HTML files. There is a TeXConverter program developed by Steve Meyer that allows TeX to HTML conversion through mouse clicks, with easy access to additional TeX4ht features including separate image directories and style files. The TeX Converter needs an initial configuration update, after which it should work fine with Win XP. In Windows Vista, for some reason the paths in the .ini file are not recognized. I am yet to find a workaround.
But using TeX Converter or the command line is optional. Another way is to hack your TeX editor program itself and add optional commands that call the TeX4ht tools to perform the conversion. I do this.
Following one of these conversion methods, we now have a file called example.html.
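The command line route can also be wrapped in a small script, which is roughly what an editor hot key would call. Here is a minimal sketch in Python, assuming the htlatex command that ships with TeX4ht is on your PATH. The function names and the optional "xhtml" option string are my own choices for illustration, not something TeX4ht prescribes.

```python
import shutil
import subprocess
from pathlib import Path


def htlatex_command(tex_file, html_options=""):
    """Build the argument list for an htlatex call.

    html_options is the optional second argument that htlatex accepts
    (for example "xhtml"); it is left out when empty.
    """
    cmd = ["htlatex", str(tex_file)]
    if html_options:
        cmd.append(html_options)
    return cmd


def tex_to_html(tex_file, html_options=""):
    """Run htlatex in the folder of the .tex file; return the expected .html path."""
    tex_path = Path(tex_file).resolve()
    exe = shutil.which("htlatex")  # assumption: TeX4ht is installed and on PATH
    if exe is None:
        raise RuntimeError("htlatex was not found on PATH")
    cmd = htlatex_command(tex_path.name, html_options)
    cmd[0] = exe  # use the resolved path (helps with .bat wrappers on Windows)
    subprocess.run(cmd, cwd=tex_path.parent, check=True)
    return tex_path.with_suffix(".html")
```

Calling tex_to_html("example.tex") should leave example.html, its style file and the generated images next to the source file, provided the TeX4ht run itself succeeds.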
If you have a simple HTML based website or one that uses Blosxom to show the files in a blog format, you can simply ftp the example.html and the associated style files and images into the blog root folder and you are done.
Unfortunately, other feature-rich blog or content management software like WordPress, Drupal and Joomla use editors that don’t recognize some of the XHTML tags and indents generated by TeX4ht. If you copy and paste the HTML source of example.html into the text editor of these programs, the XHTML is not cleaned and the final page looks ugly, with hanging markup. To clean this XHTML properly, we could write separate Python scripts, if we are as smart as John Hawks. There are also some workarounds.
The simplest is this: open example.html in a web browser, copy the content directly from the browser, paste it into the WYSIWYG type rich text editor of the blog or CMS software and save. For this you need to enable the rich text editor mode in these CMS tools. The content should now look fine. Of course, if you look at the content through the HTML editor (not the rich text editor), you can see that the ugly indents and tags are retained. But the direct copy paste allows proper auto wrapping of the text.
Another way is to remove the TeX4ht generated tabs in the example.html source. Removing these tabs and indent wraps manually could be tedious. But there are text editors that do this cleaning job for us. I recommend Notepad++, a remarkably powerful yet open source program. A crude way is to open the example.html file in Notepad++, choose Select All and apply the Join Lines command from the Edit menu. Copy the resulting content into your blog or CMS editor (in the HTML mode, not the rich text mode) and save. Your content should look fine. Of course, all the indent tags are retained, but they will not be visible in the browser.
Another way, if you use Notepad++, is to use an HTML tidying script through the built-in TextFX plugin.
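If you would rather script this step, the Join Lines trick is easy to approximate. The following is only a rough Python sketch of the idea, not a copy of what Notepad++ does internally: it strips the tabs and indents from every line and joins the lines with single spaces. The function name is hypothetical, and the sketch would also flatten any preformatted (pre) blocks, so use it only on pages without them.

```python
def join_lines(html_source):
    """Strip indentation from every line and join the lines with single spaces.

    Caution: this also flattens <pre> blocks, where line breaks matter.
    """
    stripped = (line.strip() for line in html_source.splitlines())
    # Drop lines that were empty or whitespace-only, then join the rest.
    return " ".join(line for line in stripped if line)


sample = "<p>\n\tFirst line\n\t  second line\n</p>\n\n"
print(join_lines(sample))  # prints: <p> First line second line </p>
```

The resulting single-line source can then be pasted into the HTML mode of the blog or CMS editor as described above.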
(If you use other text editors, HTML tidying is available as scripts separately in the HTML tidy project.)
To round off, you can use Notepad++ to Find/Replace the image folder link before each image filename that appears in example.html to match your blog image folder location (say, wp-content/uploads as default in WordPress). This action can be set as a macro. So once your example.html is ready, you open it in Notepad++, make two clicks, copy and paste the content, and you are done.
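The same Find/Replace can be done outside the editor. Below is a minimal Python sketch, assuming the images appear in ordinary img tags with a quoted src attribute. The function name and the sample file name example0x.png are made up for illustration; wp-content/uploads is the WordPress default mentioned above.

```python
import re

# Matches the start of an <img> tag up to src=, the quote character, and the path.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc\s*=\s*)(["\'])(.*?)\2',
                     re.IGNORECASE | re.DOTALL)


def retarget_images(html_source, new_folder):
    """Point every <img src="..."> at new_folder, keeping only the file name."""
    folder = new_folder.rstrip("/")

    def swap(match):
        prefix, quote, old_path = match.group(1), match.group(2), match.group(3)
        # Keep the bare file name, whatever folder it was in before.
        filename = old_path.replace("\\", "/").rsplit("/", 1)[-1]
        return f"{prefix}{quote}{folder}/{filename}{quote}"

    return IMG_SRC.sub(swap, html_source)


tag = '<img src="images/example0x.png" alt="x" class="math" />'
print(retarget_images(tag, "wp-content/uploads"))
# prints: <img src="wp-content/uploads/example0x.png" alt="x" class="math" />
```

A regular expression is enough for the predictable markup of a single converted page; for arbitrary HTML a proper parser would be the safer choice.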
For geeks: Notepad++ itself handles a host of files including TeX files with decent color markup. Custom user commands can be set in Notepad++ (through hot keys) for running latex or pdflatex for TeX to PDF conversions and running htrun or htlatex for TeX to HTML conversion. This way, Notepad++ serves as your single content generator and manipulator – at least in Windows.
Programs/scripts mentioned above
Code cogs wordpress plugin; LaTeX Render, an offline LaTeX to image conversion tool; CodeCogs, a LaTeX equation editor; Image Magick image editing software; TeX4ht package for TeX to HTML conversion using MikTeX; MikTeX 2.7, Windows TeX installation; TeXConverter program; Notepad++ text editor; TextFX plugin for Notepad++; HTML tidy scripts for cleaning HTML documents.