Chapter 2. Getting started with Tika

 

Equipped with sufficient background on Apache Tika, you’re probably asking yourself: how do I start using Tika in my own application? Tika is a modern Java application, and its development has followed the same path as most Java applications: it began as a set of Java classes exported as an API, gained a basic command-line interface, and culminated in a graphical user interface (GUI) for the command-line neophyte (or anyone who prefers visual interfaces).

Running Tika is a separate step from building it from source code. Because Tika is an open source project at the Apache Software Foundation, provided under the Apache License version 2.0 (ALv2),[1] many of its users (you may be one of them) will be perfectly comfortable grabbing the Tika source code, building it, and integrating it into their applications. To do so, you’ll need some basic knowledge of Tika’s primary build tool, Apache Maven, and of JUnit tests, so you can verify that Tika will execute correctly in your environment.

2.1. Working with Tika source code

2.2. The Tika application

2.3. Tika as an embedded library

2.4. Summary
