chapter three

3 Running Random Simulations in NumPy

This section covers

Basic usage of the NumPy library.
Simulating random observations using NumPy.
Visualizing simulated data.
Estimating unknown probabilities from simulated observations.

NumPy, which stands for Numerical Python, is the engine that powers Pythonic data science. Despite its many virtues, Python on its own is not well suited to large-scale numeric analysis. Hence, data scientists rely on the external NumPy library to store and manipulate numeric data efficiently. Because NumPy is such a powerful tool for processing large collections of raw numbers, many of Python’s external data-processing libraries are NumPy-compatible. One such library is Matplotlib, which we introduced in the previous section. Other NumPy-driven libraries are discussed in later portions of the book. This section focuses on randomized numerical simulations. We will leverage NumPy to analyze billions of random data points. These random observations will allow us to learn hidden probabilities.

3.1 Simulating Random Coin-Flips and Die-Rolls Using NumPy

NumPy should already be installed in your working environment as one of the Matplotlib requirements. Let’s import NumPy as np, following the common NumPy usage convention.

Note

NumPy can also be installed independently of Matplotlib by calling "pip install numpy" from the command-line terminal.

Listing 3.1. Importing NumPy

import numpy as np
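
With NumPy imported, a single die roll and a short run of coin flips can be simulated in a few lines. The following is a minimal sketch, not necessarily the approach taken in the listings that follow: it assumes NumPy's np.random.default_rng generator, an arbitrary seed of 0, and an encoding in which 1 stands for heads and 0 for tails. The variable names are illustrative only.

```python
import numpy as np

# Seeded generator so the sketch is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)

# Roll a fair six-sided die once: an integer drawn uniformly from 1 through 6.
# The upper bound passed to integers() is exclusive, hence 7.
die_roll = rng.integers(1, 7)

# Flip a fair coin 10 times. Assumed encoding: 1 is heads, 0 is tails.
coin_flips = rng.integers(0, 2, size=10)

# The mean of the 0/1 array equals the observed frequency of heads.
head_frequency = coin_flips.mean()

print(f"Die roll: {die_roll}")
print(f"Coin flips: {coin_flips}")
print(f"Frequency of heads: {head_frequency}")
```

Encoding heads as 1 means the frequency of heads is simply the array's mean, so no explicit counting loop is needed.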

3.1.1 Analyzing Biased Coin-Flips

3.2 Computing Confidence Intervals Using Histograms and NumPy Arrays

3.2.1 Binning Similar Points in Histogram Plots

3.2.2 Deriving Probabilities from Histograms

3.2.3 Shrinking the Range of a High Confidence Interval

3.2.4 Computing Histograms in NumPy

3.3 Leveraging Confidence Intervals to Analyze a Biased Deck of Cards

3.4 Using Permutations to Shuffle Cards

3.5 Summary