Array construction

  • Arrays can be constructed in many ways; you should be familiar with the most common ones, presented here

To import numpy in your code, always use:

In [ ]:
import numpy as np 

This will allow you to clearly distinguish the imported namespace using the 'np.' prefix

Construct a numpy array

  • array(): Call the 'array' function with an array_like object (e.g. a list) as its argument
In [1]:
import numpy as np

# Pass a list as the argument to the array() function
alist = [1,2,3,4,5]
ar = np.array(alist)
print(ar,'\n')

# or directly 
ar = np.array([1,2,3,4,5])
print(ar,'\n')

# Watch out for this frequent error
#ar = np.array(1,2,3,4,5)    # WRONG: the values must be passed as a single list, brackets included

# You may explicitly define the type of the array elements
b = np.array([1.2, 3.5, 5.1], dtype='float')
print(b,'\n') 

c = np.array([1,2,3,4,5], 'int')
print(c,'\n')

d = np.array([1,2,3,4,5])
print(d,'\n',type(d))
[1 2 3 4 5] 

[1 2 3 4 5] 

[ 1.2  3.5  5.1] 

[1 2 3 4 5] 

[1 2 3 4 5] 
 <class 'numpy.ndarray'>
  • You may define the type of the array member items. See the NumPy documentation on data types for more
  • Numpy also offers ways to specify more advanced data types and to access the data by field name (known as a "structured array"). See the NumPy documentation on structured arrays to get the idea
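
As a small illustration of the two points above, the sketch below checks the element type through the dtype attribute and then builds a structured array whose fields are accessed by name. The field names 'name' and 'age' and the sample records are made up for this example.

```python
import numpy as np

# The dtype attribute reports the element type chosen for the array
f = np.array([1, 2, 3], dtype='float')
print(f.dtype)

# A structured array: every element is a record with named fields
# (the field names 'name' and 'age' are hypothetical)
people = np.array([('Ann', 31), ('Bob', 25)],
                  dtype=[('name', 'U10'), ('age', 'i4')])

# Select a whole column of the records by its field name
print(people['name'])
print(people['age'])
```

Here 'U10' is a unicode string of up to 10 characters and 'i4' is a 4-byte integer.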

Array vs. List: measuring the performance

  • Arrays are optimized data structures that support significantly more efficient data processing than lists. Below we use the IPython 'magic' command %timeit to measure the time required to raise every member of a list and of an array to the power of 2
In [2]:
import numpy as np

alist = [i for i in range(10000)]
ar = np.array(alist) 

print('list performance using comprehension:')
%timeit [i**2 for i in alist]

print('\nndarray performance using comprehension:')
%timeit [i**2 for i in ar]

print('\nndarray performance using vectorized pow():')
%timeit pow(ar,2)
list performance using comprehension:
100 loops, best of 3: 4.09 ms per loop

ndarray performance using comprehension:
100 loops, best of 3: 2.11 ms per loop

ndarray performance using vectorized pow():
The slowest run took 7.44 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 8.84 µs per loop
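
%timeit works only inside IPython or a notebook. In a plain Python script, a similar comparison can be sketched with the standard timeit module. This is a rough sketch, not a rigorous benchmark: the repetition count of 100 is an arbitrary choice, and the timings depend on your machine.

```python
import timeit
import numpy as np

alist = list(range(10000))
ar = np.array(alist)

# Time 100 repetitions of each approach (the count is arbitrary)
t_list = timeit.timeit(lambda: [i**2 for i in alist], number=100)
t_vec = timeit.timeit(lambda: ar**2, number=100)

print('list comprehension:', t_list, 'seconds')
print('vectorized ar**2:  ', t_vec, 'seconds')

# Both approaches compute the same values
same = [i**2 for i in alist] == (ar**2).tolist()
print(same)
```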

Array dimensions

  • The array elements are always arranged in dimensions (or 'axes')
  • The number of array dimensions is the number of independent indices needed to identify a single member item in the array.
  • The number of dimensions is returned by the array's ndim attribute. A 1D array has ndim = 1
In [3]:
import numpy as np
ar = np.array([1,2,3,4,5])
print(ar.ndim)
1
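
To make the "number of independent indices" definition concrete, here is a minimal sketch: one index selects an item of a 1D array, while a 2D array (built from a list of lists) needs two. The variable names v and m are chosen only for this example.

```python
import numpy as np

v = np.array([10, 20, 30])             # 1D: one index selects an item
m = np.array([[1, 2, 3], [4, 5, 6]])   # 2D: two indices select an item

print(v.ndim, v[1])       # 1 20
print(m.ndim, m[1, 2])    # 2 6  (row 1, column 2)
```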

Constructing 2D, 3D, ... arrays

  • By passing a list of lists as the argument to array() we can construct ndarray objects with more than one dimension (2D, 3D, etc.)
In [4]:
import numpy as np

# 2D arrays
# Note the external brackets enclosing the two lists
br = np.array([[1,2,3,4],[5,6,7,8]])    
print(br,'\n', br.ndim,'\n')

cr = np.array([[1,2,3],[5,6,7],[8,9,10]])    
print(cr,'\n', cr.ndim,'\n')
[[1 2 3 4]
 [5 6 7 8]] 
 2 

[[ 1  2  3]
 [ 5  6  7]
 [ 8  9 10]] 
 2 

In [5]:
import numpy as np

# 3D arrays
# Note the external brackets enclosing the two lists of lists (2D arrays)  

ar3 = np.array([[[1,2,3,4],[5,6,7,8]], [[10,20,30,40],[50,60,70,80]]])  
print(ar3,'\n', ar3.ndim,'\n')
[[[ 1  2  3  4]
  [ 5  6  7  8]]

 [[10 20 30 40]
  [50 60 70 80]]] 
 3 
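
As a complement to ndim, the sketch below inspects the same 3D array through its shape attribute, which gives the length along each axis. The shape attribute is not covered in the text above and is shown here only as an aid. An item is then picked with three indices.

```python
import numpy as np

ar3 = np.array([[[1,2,3,4],[5,6,7,8]], [[10,20,30,40],[50,60,70,80]]])

# shape lists the size along each axis: 2 blocks, 2 rows, 4 columns
print(ar3.shape)        # (2, 2, 4)

# The length of the shape tuple equals ndim
print(len(ar3.shape))   # 3

# Three indices identify a single item: block 1, row 0, column 2
print(ar3[1, 0, 2])     # 30
```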

Examples

  • Construct a one-dimensional array with 10 pseudo-random integers in [1,100]. Print the array and its number of dimensions.
In [6]:
import numpy as np 

ar = np.array([np.random.randint(1,101) for i in range(10)])
print(ar)
print(ar.ndim)
[100  46  29  18  68  43  92  51  30  85]
1
  • Note that we called the numpy random.randint() function, which is part of the numpy.random module, not the Python random.randint(). The numpy version excludes the upper bound, hence the 101. You can read more in the numpy.random documentation
  • Get used to reading the numpy documentation: it is of key importance in learning how to make the most out of it.
  • Construct a two-dimensional array with 10 pseudo-random integers in [1,100] (5 in each of two rows). Print the array and its number of dimensions.
In [7]:
import numpy as np

ar = np.array([[np.random.randint(1,101) for i in range(5)],
               [np.random.randint(1,101) for i in range(5)]])

print(ar)
print(ar.ndim)
[[54 37 77  8 47]
 [90 75 29 65 38]]
2
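
The two examples above build the arrays from list comprehensions. As an alternative sketch, numpy's random.randint() also accepts a size argument, so it can return an array of the requested shape directly. The names a1 and a2 are chosen only for this example, and the values differ on every run.

```python
import numpy as np

# 10 pseudo-random integers in [1,100]; the upper bound 101 is excluded
a1 = np.random.randint(1, 101, size=10)
print(a1)
print(a1.ndim)

# Two rows of 5 pseudo-random integers in [1,100]
a2 = np.random.randint(1, 101, size=(2, 5))
print(a2)
print(a2.ndim)
```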
