angelozerr edited this page Aug 4, 2015 · 1 revision

MS Office Converters

Docx Converters

Here is the list of XDocReport fr.opensagres.xdocreport.converter.IConverter implementations for docx, which can be used with ConverterRegistry:

Stable Docx Converter

| ID | Output format | OSGi fragment | Dependencies | Status |
|----|---------------|---------------|--------------|--------|
| Docx 2 PDF via POI-IText | PDF | fr.opensagres.xdocreport.converter.docx.xwpf | Apache POI + iText | Stable |
| Docx 2 XHTML via POI | XHTML | fr.opensagres.xdocreport.converter.docx.xwpf | Apache POI | Stable |
| Docx 2 PDF via docx4j | PDF | fr.opensagres.xdocreport.converter.docx.docx4j | docx4j + FOP | Stable |
| Docx 2 XHTML via docx4j | XHTML | fr.opensagres.xdocreport.converter.docx.docx4j | docx4j | Stable |

After several experiments (see the experimental converters below), we decided to keep "Docx 2 PDF via POI-IText". The "Docx 2 PDF via FOP" converter is slower than "Docx 2 PDF via POI-IText" because its steps are docx -> XSL-FO -> FO -> FOP.

fr.opensagres.xdocreport.converter.docx.xwpf is based on org.apache.poi.xwpf.converter, which is able to convert:

  • docx to PDF.
  • docx to XHTML.

Here is the process that the org.apache.poi.xwpf.converter converter follows for PDF conversion:

  1. Load the docx into the Apache POI XWPF Java structure, org.apache.poi.xwpf.usermodel.XWPFDocument.
  2. Loop over each POI Java model (paragraph, table, etc.) and create the corresponding iText PDF model for each one.
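
The two steps above can be sketched as a simple model-to-model mapping. This is only an illustration of the idea: every type and method name below (SourceParagraph, SourceTable, toPdfElement, loadSample) is a hypothetical stand-in, not an Apache POI or iText class, and the "PDF model" is reduced to a descriptive string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins only: none of these types come from Apache POI or iText.
final class ModelMappingSketch {

    /** Stand-in for one body element of the loaded docx model. */
    interface SourceElement {}

    static final class SourceParagraph implements SourceElement {
        final String text;
        SourceParagraph(String text) { this.text = text; }
    }

    static final class SourceTable implements SourceElement {
        final int rows;
        final int columns;
        SourceTable(int rows, int columns) { this.rows = rows; this.columns = columns; }
    }

    /** Step 1 stand-in: a real converter would parse the docx file here. */
    static List<SourceElement> loadSample() {
        return Arrays.<SourceElement>asList(
                new SourceParagraph("Hello"),
                new SourceTable(2, 3));
    }

    /** Step 2, for a single element: build a description of its PDF-side counterpart. */
    static String toPdfElement(SourceElement element) {
        if (element instanceof SourceParagraph) {
            return "PdfParagraph[" + ((SourceParagraph) element).text + "]";
        }
        if (element instanceof SourceTable) {
            SourceTable table = (SourceTable) element;
            return "PdfTable[" + table.rows + "x" + table.columns + "]";
        }
        throw new IllegalArgumentException("Unsupported element: " + element.getClass().getName());
    }

    /** Step 2: loop over the source model in document order and map each element. */
    static List<String> convert(List<SourceElement> body) {
        List<String> pdfElements = new ArrayList<>();
        for (SourceElement element : body) {
            pdfElements.add(toPdfElement(element));
        }
        return pdfElements;
    }
}
```

Calling ModelMappingSketch.convert(ModelMappingSketch.loadSample()) yields one PDF-side entry per source element, in document order. Because the mapping works directly on the in-memory model, there is no intermediate XSL-FO or FO stage.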

Experimental Docx Converter

These converters use FOP and XSL to transform document.xml (using styles.xml) from the docx into FO or XHTML.

| ID | Output format | OSGi fragment | Status |
|----|---------------|---------------|--------|
| Docx 2 PDF via FOP | PDF | fr.opensagres.xdocreport.converter.fop.docx | Experimental |
| Docx 2 FO via XSL-FO | FO | fr.opensagres.xdocreport.converter.fop.docx | Experimental |
| Docx 2 XHTML via XSL | XHTML | fr.opensagres.xdocreport.converter.fop.docx | Experimental |
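
To give an idea of the XSL approach, here is a minimal sketch that uses the JDK's built-in javax.xml.transform API to turn a tiny document.xml-like string into XHTML-like markup. The stylesheet and the sample XML are toy assumptions written for this example; they are not the stylesheets shipped in fr.opensagres.xdocreport.converter.fop.docx, and styles.xml is ignored here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Toy sketch: the stylesheet and sample XML are assumptions, not XDocReport resources.
final class DocxXslSketch {

    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Tiny stand-in for word/document.xml: two paragraphs with one run each. */
    static final String DOCUMENT_XML =
            "<w:document xmlns:w=\"" + W_NS + "\"><w:body>"
          + "<w:p><w:r><w:t>Hello</w:t></w:r></w:p>"
          + "<w:p><w:r><w:t>World</w:t></w:r></w:p>"
          + "</w:body></w:document>";

    /** Toy stylesheet: maps each w:p to a p element holding the paragraph text. */
    static final String XSL =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
          + " xmlns:w=\"" + W_NS + "\" exclude-result-prefixes=\"w\">"
          + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
          + "<xsl:template match=\"/w:document\">"
          + "<html><body><xsl:apply-templates select=\"w:body/w:p\"/></body></html>"
          + "</xsl:template>"
          + "<xsl:template match=\"w:p\"><p><xsl:value-of select=\".\"/></p></xsl:template>"
          + "</xsl:stylesheet>";

    /** Applies the toy stylesheet to the given document.xml content. */
    static String toXhtml(String documentXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(
                    new StreamSource(new StringReader(documentXml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so that callers do not have to handle a checked exception.
            throw new IllegalStateException("XSL transformation failed", e);
        }
    }
}
```

DocxXslSketch.toXhtml(DocxXslSketch.DOCUMENT_XML) returns markup with one p element per w:p paragraph. Producing FO instead of XHTML would follow the same pattern with a different stylesheet, and the PDF converter then needs an additional FOP rendering step on top of it.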