9 Natural Language Processing with TensorFlow: Sentiment Analysis

This chapter covers

Understanding the basic characteristics of a classification-based text dataset and cleaning the text inputs with a combination of Python libraries such as pandas and NLTK
Analyzing text-specific attributes such as vocabulary size and sequence length, and converting text to numerical representations to feed into the model
Creating a data pipeline to handle text sequences with TensorFlow
Implementing a recurrent deep learning model for analyzing sentiment in reviews and, in the process, understanding the underlying mechanics of deep sequential models such as LSTMs
Training the model on an imbalanced product review dataset (one with a different number of examples for each label)
Recognizing the role of word embeddings in NLP and implementing word embeddings to improve the deep learning model's performance

In the previous chapter, we looked at a computer vision application called image segmentation. Besides images, text is another prominent modality of data. For example, the World Wide Web is teeming with text, and we can reasonably assume that it is the most common modality of data available on the web. Natural language processing (NLP) has therefore been, and will remain, a deeply rooted topic. It enables us to harness the power of freely available text (e.g., language modeling) and to build machine learning products that leverage textual data to produce meaningful outcomes (e.g., sentiment analysis).

9.1 What the text? Explore and process text

9.2 Getting text ready for the model

9.2.1 Splitting training/validation and testing data

9.2.2 Analyzing the vocabulary

9.2.3 Analyzing the sequence length

9.2.4 Text to words and then to numbers with Keras

9.3 Defining an end-to-end NLP pipeline with TensorFlow

9.4 Happy reviews mean happy customers: Sentiment analysis

9.4.1 LSTM networks

9.4.2 Defining the final model

9.5 Training and evaluating the model

9.6 Injecting semantics with word vectors

9.6.1 Word embeddings

9.6.2 Defining the final model with word embeddings

9.6.3 Training and evaluating the model

9.7 Summary