This chapter covers
- Rediscovering NumPy from a performance perspective
- Leveraging NumPy views for computing efficiency and memory conservation
- Introducing array programming as a paradigm
- Configuring NumPy internals for efficiency
It is difficult to overstate the importance of NumPy [for doing] data analytics with Python. This book could might well be called "High Performance Python with NumPy." NumPy will be found somewhere in your stack: You use Pandas? NumPy. You use scikit-learn? NumPy. Dask? NumPy. SciPy? NumPy. Matplotlib? NumPy. TensorFlow? NumPy. If you are doing data analytics in Python, almost surely your answer includes NumPy.
NumPy provides N-dimensional array objects—[such as matrices, though there are many others]—along with functionality to manipulate them. The implementation is extremely efficient with its core written in Fortran and C. Almost all data analysis problems can be modeled at their core by N-dimensional arrays; this is why NumPy is pervasive in this field.
Given the core importance and pervasiveness of NumPy in Python’s data analysis science some topics related to it will be discussed in other chapters, notably: