Quan Nguyen edited this page Jan 10, 2017 · 10 revisions

The following code examples show common usage of the library. Make sure the *.dll files and the tessdata folder are in the same directory and on the search path, and that the .jar files are on the classpath.
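
A quick pre-flight check can catch a misplaced tessdata folder before any OCR call is made. The sketch below is not part of Tess4J; it uses only the JDK, and the class name `TessdataCheck`, the helper `hasLanguageData`, and the default folder name `tessdata` relative to the working directory are assumptions made for illustration. It only verifies that the folder exists and holds at least one `*.traineddata` language file, then prints the JVM's classpath and library path for orientation.

```java
import java.io.File;

public class TessdataCheck {

    /**
     * Returns true if dir is an existing directory that contains at least
     * one *.traineddata file (the extension used by Tesseract language packs).
     */
    public static boolean hasLanguageData(File dir) {
        if (dir == null || !dir.isDirectory()) {
            return false;
        }
        // The two-argument lambda selects the FilenameFilter overload.
        File[] packs = dir.listFiles((d, name) -> name.endsWith(".traineddata"));
        return packs != null && packs.length > 0;
    }

    public static void main(String[] args) {
        // Assumption: the folder is passed as the first argument,
        // otherwise "tessdata" in the working directory is checked.
        File tessdata = new File(args.length > 0 ? args[0] : "tessdata");
        System.out.println(tessdata.getAbsolutePath()
                + " usable: " + hasLanguageData(tessdata));

        // Printed for orientation only. JNA applies its own lookup rules for
        // native libraries (for example the jna.library.path system property).
        System.out.println("java.class.path   = " + System.getProperty("java.class.path"));
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
    }
}
```

A passing check does not guarantee that the native libraries load; it only rules out the most common cause of a missing-language-data error.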


Windows dependency: the Tesseract binaries shipped with Tess4J depend on the Visual C++ Redistributable for VS2013, which must be installed on a Windows machine before running Tess4J.


JNA Interface Mapping Example

import java.io.File;
import net.sourceforge.tess4j.*;
import net.sourceforge.tess4j.util.LoadLibs; // LoadLibs lives in the util subpackage

public class TesseractExample {

	public static void main(String[] args) {
		File imageFile = new File("eurotext.tif");
		/**
		 * JNA Interface Mapping
		**/
		ITesseract instance = new Tesseract();
		
		/**
		 * You either set your own tessdata folder with your custom language pack or
		 * use LoadLibs to load the default tessdata folder for you.
		 **/
		instance.setDatapath(LoadLibs.extractTessResources("tessdata").getParent());

		try {
			String result = instance.doOCR(imageFile);
			System.out.println(result);
		} catch (TesseractException e) {
			System.err.println(e.getMessage());
		}
	}
}

JNA Direct Mapping Example

import java.io.File;
import net.sourceforge.tess4j.*;
import net.sourceforge.tess4j.util.LoadLibs; // LoadLibs lives in the util subpackage

public class TesseractExample {

	public static void main(String[] args) {
		File imageFile = new File("eurotext.tif");
		
		/**
		 * JNA Direct Mapping
		 **/
		ITesseract instance = new Tesseract1();

		/**
		 * You either set your own tessdata folder with your custom language pack or
		 * use LoadLibs to load the default tessdata folder for you.
		 **/
		instance.setDatapath(LoadLibs.extractTessResources("tessdata").getParent());

		try {
			String result = instance.doOCR(imageFile);
			System.out.println(result);
		} catch (TesseractException e) {
			System.err.println(e.getMessage());
		}
	}
}

JUnit HOCR Example

@Test
public void testHOCRCreation() {

	String result = "";
	File imageFile = new File(this.testResourcesDataPath, "eurotext.tif");
	/**
	 * JNA Interface Mapping
	 **/
	Tesseract instance = new Tesseract();
	
	/**
	 * You either set your own tessdata folder with your custom language pack or
	 * use LoadLibs to load the default tessdata folder for you.
	 **/
	instance.setDatapath(LoadLibs.extractTessResources("tessdata").getParent());

	try {
		/**
		 * HOCR | Set the HOCR option in order to get the desired result from the doOCR method.
		 **/
		instance.setHocr(true);
		result = instance.doOCR(imageFile);
		System.out.println(result);
	} catch (TesseractException e) {
		System.err.println(e.getMessage());
	}

	assertTrue(result.contains("<div class='ocr_page'"));

}
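
Once hOCR output is enabled, the returned string is HTML-like markup, so the recognized words and their bounding boxes can be pulled out of it. The following is a minimal sketch using only the JDK, not a Tess4J API: the class `HocrWords` and its `parse` method are hypothetical, and the regular expression assumes that each word appears as a `span` whose `class` attribute comes first with the value `ocrx_word` and whose `title` attribute starts with `bbox x0 y0 x1 y1`. The exact markup can differ between Tesseract versions, so a real HTML parser is the safer choice for production use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HocrWords {

    /** One recognized word and its bounding box in image pixels. */
    public static final class Word {
        public final String text;
        public final int x0, y0, x1, y1;

        Word(String text, int x0, int y0, int x1, int y1) {
            this.text = text;
            this.x0 = x0;
            this.y0 = y0;
            this.x1 = x1;
            this.y1 = y1;
        }

        @Override
        public String toString() {
            return text + " [" + x0 + "," + y0 + " - " + x1 + "," + y1 + "]";
        }
    }

    // Assumed shape: <span class='ocrx_word' ... title='bbox x0 y0 x1 y1; ...'>text</span>
    private static final Pattern WORD = Pattern.compile(
            "<span class=['\"]ocrx_word['\"][^>]*title=['\"]bbox (\\d+) (\\d+) (\\d+) (\\d+)"
            + "[^'\"]*['\"][^>]*>(.*?)</span>");

    /** Extracts all word spans from an hOCR string, in document order. */
    public static List<Word> parse(String hocr) {
        List<Word> words = new ArrayList<>();
        Matcher m = WORD.matcher(hocr);
        while (m.find()) {
            // Drop nested markup such as <strong> or <em>; HTML entities are left undecoded.
            String text = m.group(5).replaceAll("<[^>]+>", "").trim();
            words.add(new Word(text,
                    Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)),
                    Integer.parseInt(m.group(3)), Integer.parseInt(m.group(4))));
        }
        return words;
    }

    public static void main(String[] args) {
        // Hand-written sample in the assumed shape, not real Tesseract output.
        String sample = "<div class='ocr_page' id='page_1' title='bbox 0 0 1024 800'>"
                + "<span class='ocrx_word' id='word_1_1' title='bbox 105 66 178 97; x_wconf 96'>The</span>"
                + "</div>";
        for (Word w : parse(sample)) {
            System.out.println(w);
        }
    }
}
```

In the test above, `HocrWords.parse(result)` could be called after `doOCR` to assert on individual words rather than on the presence of the page `div` alone.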

More examples can be found in the UnitTest package. See also the tutorial: Development with Tess4J in NetBeans, Eclipse, and Command-line.