This section covers
- Basic usage of the NumPy library.
- Simulating random observations using NumPy.
- Visualizing simulated data.
- Estimating unknown probabilities from simulated observations.
NumPy, which stands for Numerical Python, is the engine that powers Pythonic data science. Python, despite its many virtues, is not well suited to large-scale numeric analysis on its own: its built-in data structures are slow and memory-hungry when they hold large collections of numbers. Hence, data scientists rely on the external NumPy library to efficiently store and manipulate numeric data. Because NumPy is such a powerful tool for processing large collections of raw numbers, many of Python’s external data-processing libraries are NumPy-compatible. One such library is Matplotlib, which we introduced in the previous section. Other NumPy-driven libraries will be discussed in later portions of the book.
This section focuses on randomized numerical simulations. We will leverage NumPy to analyze billions of random data points. These random observations will allow us to estimate hidden probabilities.
NumPy should already be installed in your working environment, since it is one of Matplotlib’s dependencies. Let’s import NumPy as np, following the common usage convention.
Note
NumPy can also be installed independently of Matplotlib by calling "pip install numpy" from the command-line terminal.
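As a quick check that the import works, here is a minimal sketch of the kind of simulation this section builds toward. The fixed seed, the sample size of 1,000, and the variable names are assumptions chosen for illustration, and the generator interface used here (np.random.default_rng) requires NumPy 1.17 or later.

```python
import numpy as np  # "np" is the conventional alias for NumPy

# Assumption for illustration: a fixed seed so the sketch is reproducible.
rng = np.random.default_rng(seed=0)

# Simulate 1,000 flips of a fair coin (1 = heads, 0 = tails).
# The upper bound of integers() is exclusive, so the values are 0 or 1.
flips = rng.integers(0, 2, size=1000)

# The fraction of heads is a rough estimate of the probability of heads.
heads_frequency = flips.mean()

print(f"NumPy version: {np.__version__}")
print(f"Observed frequency of heads: {heads_frequency:.3f}")
```

The observed frequency should land close to 0.5, and it tends to get closer as the number of simulated flips grows.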