9 Natural Language Processing with TensorFlow: Sentiment Analysis
This chapter covers
- Understanding the basic characteristics of a classification-based text dataset and cleaning the text inputs with a combination of Python libraries such as pandas and NLTK
- Analysing text-specific attributes, such as the vocabulary size and sequence length, and converting text to numerical representations to feed into the model
- Creating a data pipeline to handle text sequences with TensorFlow
- Implementing a recurrent deep learning model for analysing sentiments in reviews and understanding the underlying mechanics of deep sequential models like LSTMs in the process
- Training the model on an imbalanced product review dataset (one with a different number of examples for each label)
- Recognizing the role of word embeddings in NLP and implementing word embeddings to improve the deep learning model's performance
In the previous chapter, we looked at a computer vision application called image segmentation. Besides images, text is another prominent modality of data. For example, the World Wide Web is teeming with text, and it is arguably the most common modality of data available on the web. Natural language processing has therefore been, and will remain, an important topic. It enables us to harness the power of freely available text (e.g., language modelling) and to build machine learning products that leverage textual data to produce meaningful outcomes (e.g., sentiment analysis).