LaTeX via HTML to Word using tex4ht/htlatex

I write my thesis in LaTeX using Kile. It is great, and compared to using a WYSIWYG office suite, I actually have fun writing. Kile is by far the best LaTeX editor for my needs, although Texmaker comes pretty close.

Once in a while I send drafts to my tutor for correction. This would be no problem if I could just send the .tex and .bib files. However, my tutor is used to the MS Word correction and review system, so I looked for a straightforward way to convert my .tex files into a .doc/.docx file. After 3 days(!) of complete frustration I finally succeeded. Here is a quick report on what I did to get a .doc file with the following things working:

• tables
• figures
• bibtex references
• in-document cross-references
• equations

What doesn’t work:

• figure dimensions (ignored)
• figure crops (ignored)
• minipages

My setup consists of 2 .tex files containing tables, equations and lots of figures, plus a BibTeX file. What finally worked for me was converting the .tex files into an HTML document using htlatex.

Create the file myhtml.cfg with the following content:

\Preamble{html}
\Configure{graphics*}
{pdf}
{\Needs{"convert \csname Gin@base\endcsname.pdf
\csname Gin@base\endcsname.png"}%
\Picture[pict]{\csname Gin@base\endcsname.png}%
}
\begin{document}
\EndPreamble
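
The \Needs line in myhtml.cfg asks tex4ht to call ImageMagick's convert once per PDF figure. If that automatic step does not produce the PNG files, the same conversion can be tried by hand. The following is only a rough sketch: FIG_DIR and the helper names png_name and convert_figures are made up for illustration, and by default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch: convert every PDF figure in a directory to PNG, mirroring the
# "convert NAME.pdf NAME.png" call that myhtml.cfg requests via \Needs.
# FIG_DIR is a hypothetical variable; DRY_RUN=1 (the default) only prints.

png_name() {
    # Swap a trailing .pdf for .png, e.g. figures/plot.pdf -> figures/plot.png
    printf '%s\n' "${1%.pdf}.png"
}

convert_figures() {
    dir=$1
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue      # the glob matched nothing
        png=$(png_name "$pdf")
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "would run: convert $pdf $png"
        else
            convert "$pdf" "$png" || return 1
        fi
    done
}

convert_figures "${FIG_DIR:-.}"
```

Run it with DRY_RUN=0 to actually call convert. This assumes ImageMagick is installed and that the PNG files should sit next to the PDF files, which is where the configuration above expects them.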


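The myhtml.cfg file was necessary in my case to display my PDF figures in the HTML document: it tells tex4ht to convert each PDF figure to PNG with the convert command (ImageMagick) and to embed the PNG instead. Then execute these commands one after another: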

pdflatex document.tex
bibtex document.aux
htlatex document.tex myhtml
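
If you convert often, the three commands can be wrapped in a small script. This is a minimal sketch, not a tested build tool: the helper names run and convert_doc are my own, the arguments "document" and "myhtml" are the placeholder names used in this post, and by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: run the three conversion steps in order and stop at the first
# failure. DRY_RUN=1 (the default) prints the commands without running them.

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

convert_doc() {
    doc=$1    # base name of the main .tex file, without extension
    cfg=$2    # name of the tex4ht config file, without .cfg
    run pdflatex "$doc.tex" &&
    run bibtex "$doc.aux" &&
    run htlatex "$doc.tex" "$cfg"
}

convert_doc document myhtml
```

Set DRY_RUN=0 to really run the commands. The && chain means htlatex is not started if pdflatex or bibtex fails.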

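The first command can also be plain “latex document.tex”; I use pdflatex. The last argument of the htlatex call, myhtml, is the name of the config file without its .cfg extension. Now open the resulting HTML document with either LibreOffice or MS Office and save it in the desired format. That should be it.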
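
The last step can probably be scripted as well, since LibreOffice has a headless mode. Treat the following as an untested sketch: it assumes soffice is on your PATH, the export filter name "MS Word 2007 XML" is an assumption that may differ between LibreOffice versions, and html_to_docx is a name I made up. By default it only prints the command.

```shell
#!/bin/sh
# Sketch: convert the HTML file written by htlatex to .docx without the GUI.
# Assumptions: soffice (LibreOffice) is on PATH; the filter name
# "MS Word 2007 XML" may need adjusting. DRY_RUN=1 (the default) only prints.

html_to_docx() {
    html=$1
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "would run: soffice --headless --convert-to 'docx:MS Word 2007 XML' $html"
    else
        soffice --headless --convert-to 'docx:MS Word 2007 XML' "$html"
    fi
}

html_to_docx document.html
```

Check the result by opening the .docx once by hand; figures and equations are the parts most likely to go wrong.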

Here’s the catch. If it works smoothly for you, I’m more than happy. If you run into problems, I should mention that in the course of my frustration I followed a guide on ubuntuforums.org on how to update tex4ht, the package that provides htlatex. I can no longer reconstruct whether updating was what finally did the trick. The update process is a real pain in the ass, but if you really need to convert, I think tex4ht/htlatex is your best shot.


3 Responses to LaTeX via HTML to Word using tex4ht/htlatex

1. Yves Paris says:

I am experimenting with direct conversion from LaTeX source to the OpenOffice format (odt) using the following command:

htlatex filename.tex "xhtml,ooffice" "ooffice/! -cmozhtf" "-coo -cvalidate"
It seems to do a fairly decent conversion job.
The pictures need to have their file extensions in the call that includes them, otherwise the conversion routine will not find them.
Then it is easy to convert the odt file to docx.

2. duru says:

Sorry for the trivial questions, I am a real novice.
As far as I understand, the following part needs to be tailored to our file names if we have any png files:
{\Needs{"convert \csname Gin@base\endcsname.pdf
\csname Gin@base\endcsname.png"}%
\Picture[pict]{\csname Gin@base\endcsname.png}%
}

If we have no png files, can we remove this part? Then we add this to the beginning of our TeX document and run it, and then run the second set of commands. Do we change “document” there to our file name, or is it generic?

pdflatex document.tex
htlatex document.tex myhtml

Many thanks!

• gastarbeiten says:

Hi duru,

You only need this configuration if you are having problems with your figures in the conversion process. The text you quoted above is the content of a separate file you have to create; it does not go into your .tex document. And yes, in the end you have to run these commands adjusted to your filenames, e.g.: pdflatex /path/to/yourtexfile/yourfilename.tex

best regards!