
What is numpy?

  • NumPy (short for 'Numerical Python') is the fundamental (but not the only) package for high-performance scientific computing and data analysis in Python.
  • NumPy introduces and makes extensive use of the ndarray (N-dimensional array) type, a homogeneous multidimensional array: a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers.
  • It offers standard mathematical functions for fast operations on entire arrays of data without having to write loops ('array oriented computing')
  • An ndarray object can have any number of dimensions, but most often we deal with one-, two- or three-dimensional arrays (see figure below).
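
The properties listed above can be checked interactively. The following is a minimal sketch, assuming NumPy is installed and imported under the conventional alias np; the array contents are made up for illustration.

```python
import numpy as np

# A two-dimensional array: 2 rows, 3 columns, all elements of one type
a = np.array([[1, 2, 3],
              [4, 5, 6]])

print(a.ndim)    # number of dimensions: 2
print(a.shape)   # size along each dimension: (2, 3)
print(a.dtype)   # the common element type (a platform-dependent integer type)

# An element is addressed by a tuple of indices, one index per dimension
print(a[1, 2])    # 6 (second row, third column; indices start at 0)
print(a[(1, 2)])  # the same element, with the tuple written explicitly

# Mixing integers and floats still gives a homogeneous array:
# all values are converted to a common type
b = np.array([1, 2.5, 3])
print(b.dtype)   # float64
```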

ndarray vs. list

  • A common question is: why do we need the ndarray object if we can do the same operations with lists?
  • The typical answer has two parts:
    • (a) Speed of code execution: A list is a flexible collection that can hold objects of different kinds (integers, floats, strings, other objects, etc.). This flexibility has a cost: each element is a separate Python object, so building and managing the list takes extra memory and time. An array stores values of a single type in a compact block, which makes it less flexible but reduces both execution time and memory use.
    • (b) Vectorization: arrays come with many functions that support vectorization, which is the ability to apply operations (addition, multiplication, etc.) to entire arrays instead of single numbers (scalars). Thus, many operations that would require loops with lists can be expressed in a single line of code with NumPy arrays.
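
The vectorization point can be illustrated with a short comparison. This is a sketch only; the names prices and quantities and their values are hypothetical, chosen for the example.

```python
import numpy as np

prices = [10.0, 20.0, 30.0]
quantities = [1, 2, 3]

# With lists, element-wise multiplication needs an explicit loop
# (written here as a list comprehension)
totals_list = [p * q for p, q in zip(prices, quantities)]

# With arrays, the same operation applies to the entire array at once
p = np.array(prices)
q = np.array(quantities)
totals_array = p * q

print(totals_list)    # [10.0, 40.0, 90.0]
print(totals_array)   # [10. 40. 90.]

# The same operator can mean different things for the two types:
print(quantities * 2)  # list repetition: [1, 2, 3, 1, 2, 3]
print(q * 2)           # element-wise multiplication: [2 4 6]

# Mathematical functions also work on whole arrays
print(np.sqrt(p))
```

Timing differences between the two approaches depend on the array size and the machine, so none are quoted here; they can be measured with the standard timeit module.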

Read more on the ndarray vs. list issue

Getting numpy

  • You can get NumPy and any documentation you need from the NumPy homepage.
  • However, the easiest way to install NumPy is through one of the popular Python distributions that bundle several major Python packages.
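
Whichever installation route is used, a quick way to confirm that NumPy is available is to import it and print its version. A minimal check, assuming the conventional alias np:

```python
import numpy as np

# If the import succeeds, NumPy is installed in the current environment
print(np.__version__)
```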

numpy provides a basis

  • Although NumPy alone is sufficient for many data processing tasks, it is usually used as the foundation for higher-level, more user-friendly packages that automate much more of the data analysis work.
  • Such packages are:
