NumPy is one of the fundamental packages for scientific computing, so
it's one that you must know and be able to use if you want to do data
science with Python. It offers a great alternative to Python lists:
NumPy arrays are more compact, allow faster reading and writing of
items, and are more convenient and more efficient overall.
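To make the compactness and convenience claims a bit more concrete, here is a minimal sketch that compares a plain Python list with a NumPy array holding the same integers. The variable names are purely illustrative, and the exact byte count for the list depends on your Python version and platform, so treat the numbers as rough indicators rather than benchmarks.

```python
import sys

import numpy as np

n = 1_000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# The array keeps its items in one contiguous buffer: n items * 8 bytes each.
array_bytes = np_array.nbytes

# A list stores references to separate int objects, so count the list
# container plus every int object it points to.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# Element-wise arithmetic: the list needs an explicit loop (here a
# comprehension), while the array applies the operation to all items at once.
doubled_list = [x * 2 for x in py_list]
doubled_array = np_array * 2

print(f"list:  about {list_bytes} bytes")
print(f"array: {array_bytes} bytes of item data")
```

On a typical CPython installation the list should come out several times larger than the array's data buffer, and the array version of the doubling step needs no loop at all.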
In addition, it's (partly) the foundation of other important packages
for data manipulation and machine learning that you might already
know, namely Pandas, Scikit-Learn and SciPy:
- The Pandas data manipulation library builds on NumPy, but instead
  of arrays, it makes use of two other fundamental data structures:
  Series and DataFrames,
- SciPy builds on NumPy to provide a large number of functions that operate on NumPy arrays, and
- The machine learning library Scikit-Learn builds not only on NumPy, but also on SciPy and Matplotlib.
You see, this Python library is a must-know: if you know how to work
with it, you'll also gain a better understanding of the other Python
data science tools that you'll undoubtedly be using.
It's a win-win situation, right?
Nevertheless, just like any other library, NumPy can come off as
quite overwhelming at first. What are the very basics that you need to
know in order to get started with this data analysis library?
This cheat sheet aims to give you a good overview of what this library has to offer.
Go and check it out for yourself!