Tony Zeng edited this page Apr 18, 2017 · 4 revisions

XWPFDocument 2 XHTML

org.apache.poi.xwpf.converter.xhtml provides the DOCX 2 XHTML converter based on Apache POI XWPF.

You can test this converter with the REST Converter service: http://xdocreport-converter.opensagres.cloudbees.net/

Download

Download this converter with:

  • Maven:
<dependency>
  <groupId>fr.opensagres.xdocreport</groupId>
  <artifactId>org.apache.poi.xwpf.converter.xhtml</artifactId>
  <version>XDOCREPORT_VERSION</version>
</dependency>

where XDOCREPORT_VERSION is the XDocReport version (e.g. 1.0.0).

Sample

Here is a sample that converts an org.apache.poi.xwpf.usermodel.XWPFDocument to XHTML:

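import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.poi.xwpf.converter.xhtml.XHTMLConverter;
import org.apache.poi.xwpf.converter.xhtml.XHTMLOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;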

...

// 1) Load DOCX into XWPFDocument
InputStream in = new FileInputStream(new File("HelloWord.docx"));
XWPFDocument document = new XWPFDocument(in);

// 2) Prepare XHTML options (here we set the `ImageManager` to store images and resolve the image src)
// `root` is the path of the base directory under which the images are stored
XHTMLOptions options = XHTMLOptions.create().setImageManager( new ImageManager( new File(root), "images" ) );

// 3) Convert XWPFDocument to XHTML
OutputStream out = new FileOutputStream(new File("HelloWord.htm"));
XHTMLConverter.getInstance().convert(document, out, options);

XHTML Settings

Extract image

If your DOCX contains images and you want to display them in the HTML, configure an ImageManager on the XHTMLOptions:

options.setImageManager( new ImageManager( new File(baseDir), "images" ) );

By default, this will:

  • extract the images under baseDir/imageSubDir/ (here, baseDir/images/)
  • resolve the image src attributes in the HTML

You can see a sample in our JUnit test XHTMLConverterTestCase.
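
To make the two default behaviours above more concrete, here is a minimal sketch of how an extraction target and a relative src value could be derived from baseDir and the image sub-directory. It uses only the JDK and is not the ImageManager implementation: the class ImagePathSketch, its methods, and the file name image1.png are hypothetical, and it assumes the HTML file is written into baseDir.

```java
import java.io.File;

// Hypothetical illustration only; not part of XDocReport.
final class ImagePathSketch {

    // File an extracted image would be written to: baseDir/imageSubDir/imageName
    static File extractTarget(File baseDir, String imageSubDir, String imageName) {
        return new File(new File(baseDir, imageSubDir), imageName);
    }

    // src value relative to the HTML file, assuming the HTML is written into baseDir
    static String resolveSrc(String imageSubDir, String imageName) {
        return imageSubDir + "/" + imageName;
    }

    public static void main(String[] args) {
        File baseDir = new File("out"); // assumed output directory
        System.out.println(extractTarget(baseDir, "images", "image1.png").getPath());
        System.out.println(resolveSrc("images", "image1.png"));
    }
}
```

The actual file names and sub-paths depend on the DOCX content and on the converter, so treat the values above as placeholders.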

Embed Image

Using Base64

If you want to embed images into the HTML using Base64, you can use Base64EmbedImgManager:

XHTMLOptions options = XHTMLOptions.create().indent( 4 ).setImageManager(new Base64EmbedImgManager());

You can find a full example here: XHTMLConverterEmbedImgTest
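
For readers unfamiliar with Base64 embedding, the following sketch shows the general data URI technique using only the JDK. It is not the code of Base64EmbedImgManager, and the exact markup the converter emits may differ; the class DataUriSketch, its methods, and the placeholder bytes are assumptions made for illustration.

```java
import java.util.Base64;

// Hypothetical illustration of the data URI technique; not part of XDocReport.
final class DataUriSketch {

    // Builds a data URI of the form "data:<mimeType>;base64,<encoded bytes>"
    static String toDataUri(String mimeType, byte[] imageBytes) {
        String encoded = Base64.getEncoder().encodeToString(imageBytes);
        return "data:" + mimeType + ";base64," + encoded;
    }

    // Wraps the data URI in an <img> element, so no separate image file is needed
    static String toImgTag(String mimeType, byte[] imageBytes) {
        return "<img src=\"" + toDataUri(mimeType, imageBytes) + "\" />";
    }

    public static void main(String[] args) {
        byte[] placeholder = {1, 2, 3}; // placeholder bytes, not a real image
        System.out.println(toImgTag("image/png", placeholder));
    }
}
```

Embedding keeps the HTML self-contained, at the cost of a larger file, since Base64 encoding expands binary data by roughly one third.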
