Table of Contents

 

Copyright

Brief Table of Contents

Table of Contents

Preface

Acknowledgments

About this Book

About the Authors

About the Cover Illustration

Chapter 1. Data science in a big data world

1.1. Benefits and uses of data science and big data

1.2. Facets of data

1.2.1. Structured data

1.2.2. Unstructured data

1.2.3. Natural language

1.2.4. Machine-generated data

1.2.5. Graph-based or network data

1.2.6. Audio, image, and video

1.2.7. Streaming data

1.3. The data science process

1.3.1. Setting the research goal

1.3.2. Retrieving data

1.3.3. Data preparation

1.3.4. Data exploration

1.3.5. Data modeling or model building

1.3.6. Presentation and automation

1.4. The big data ecosystem and data science

1.4.1. Distributed file systems

1.4.2. Distributed programming framework

1.4.3. Data integration framework

1.4.4. Machine learning frameworks

1.4.5. NoSQL databases

1.4.6. Scheduling tools

1.4.7. Benchmarking tools

1.4.8. System deployment

1.4.9. Service programming

1.4.10. Security

1.5. An introductory working example of Hadoop

1.6. Summary

Chapter 2. The data science process

2.1. Overview of the data science process

2.1.1. Don’t be a slave to the process

2.2. Step 1: Defining research goals and creating a project charter

2.2.1. Spend time understanding the goals and context of your research

2.2.2. Create a project charter