Chapter 3. Data All Around Us: The Virtual Wilderness

3.1. Data as the object of study

3.1.1. The users of computers and the internet became data generators

3.1.2. Data for its own sake

3.1.3. Data scientist as explorer

3.2. Where data might live, and how to interact with it

3.2.1. Flat files

3.2.2. HTML

3.2.3. XML

3.2.4. JSON

3.2.5. Relational databases

3.2.6. Non-relational databases

3.2.7. APIs

3.2.8. Common bad formats

3.2.9. Unusual formats

3.2.10. Deciding which format to use

3.3. Scouting for data

3.3.1. First step: Google search

3.3.2. Copyright and licensing

3.3.3. The data you have: is it enough?

3.3.4. Combining data sources

3.3.5. Web scraping

3.3.6. Measuring or collecting things yourself

3.4. Example: microRNA and gene expression

Exercises

Summary

What's inside