Chapter 9. The big picture
It’s time to start thinking big. After looking at the details of how Tika works, you’re probably already thinking about how to integrate it with your applications. The purpose of this chapter is to give you ideas about where and how Tika best fits with different kinds of applications, architectures, and requirements.
We’ll do this in two parts. The first part focuses on functionality and looks at common information-processing systems: we’ll start with search engines and then turn to document management and text mining as further examples of systems where Tika comes in handy. The question there is what such systems can achieve with Tika and where Tika fits in their architecture. The second part of the chapter turns to nonfunctional features such as modularity and scalability, and asks how to use Tika to best meet such requirements.
Throughout this book we’ve mentioned search engines as a common place where Tika is used, so let’s take a closer look at where Tika fits into one. If you already know how search engines work, you can probably skip this section. If not, we’ll start with a quick reminder of what a search engine does and then discuss its components in more detail.