
The 'Series' object

  • A Series object is a one-dimensional array (similar to a numpy ndarray) with one significant addition: it carries an index of data labels, even when no index is specified explicitly.

Example:

  • Suppose that your data are country populations and you want to index them by two-letter country code. The Series object representing these data holds:
    • Population data in one column
    • Label indexes (country code) in another
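  • A minimal sketch of such a Series, assuming made-up round population figures (in millions) that are placeholders for illustration only, not real statistics. The name `population` is hypothetical as well.

```python
import pandas as pd

# Placeholder values (millions) chosen only to illustrate the structure;
# they are NOT real population statistics.
population = pd.Series([10, 47, 59], index=['GR', 'ES', 'IT'])

print(population)         # values in one column, country codes as labels
print(population['GR'])   # look up a value by its country code
```

  • The `index` argument supplies the labels; the values keep their position in the list.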


Constructing a Series object

Constructing a Series from a list

  • Most of the time you will populate Series objects by reading data from external files. However, it is useful to know that a Series object can also be constructed by calling the Series() constructor and passing an array-like object as argument. The simplest case is to pass a list.
In [1]:
import pandas as pd

s = pd.Series([10,20,30,40,50], dtype='int')
print(s)
print(s.index, s.values)
0    10
1    20
2    30
3    40
4    50
dtype: int32
Int64Index([0, 1, 2, 3, 4], dtype='int64') [10 20 30 40 50]
  • In this example, 's' is constructed as a Series object by passing a list to the Series() constructor (the syntax is similar to the construction of a numpy array).
  • As we did not define an index, sequential integers starting from zero (0..4 in the example) are automatically used as the index of the constructed Series object.
  • Note also that the index and the values are available through the attributes s.index and s.values respectively (they are attributes, not method calls).
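  • For contrast with the default integer index, the sketch below passes explicit labels through the constructor's `index` argument. The labels 'a', 'b', 'c' are arbitrary choices for this illustration.

```python
import pandas as pd

# Same kind of data as above, but with labels supplied explicitly
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

print(s)
print(s.index)    # the labels we passed in
print(s.values)   # the underlying numpy array of values
print(s['b'])     # access a value through its label
```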

Constructing a Series from an array

  • Similar to the above example but with a numpy array instead of a list.
In [2]:
import numpy as np
import pandas as pd

s = pd.Series(np.array([chr(i) for i in range(65,70)]))
print(s)
print(s.index, s.values)
0    A
1    B
2    C
3    D
4    E
dtype: object
Int64Index([0, 1, 2, 3, 4], dtype='int64') ['A' 'B' 'C' 'D' 'E']

Constructing a Series from a dictionary

  • You can think of a Series as a dictionary, that is, an object mapping index labels to data values.
  • Accordingly, you can construct a Series object by passing a dictionary as argument to the Series() constructor.

  • When a dictionary is passed to the Series() constructor then:

    • dict keys become Series indexes
    • dict values become Series values
In [3]:
import numpy as np
import pandas as pd

data = {'a':1,
        'b':2,
        'c':3}
s = pd.Series(data)
print(s)

# and vice-versa: you get back your dictionary by passing a Series to the dict() constructor
d = dict(s)
d
a    1
b    2
c    3
dtype: int64
Out[3]:
{'a': 1, 'b': 2, 'c': 3}
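  • The dictionary analogy can be sketched with a few dictionary-style operations that a Series also supports. This is only an illustration of the analogy, not a full list of what the two types share.

```python
import pandas as pd

s = pd.Series({'a': 1, 'b': 2, 'c': 3})

# Membership tests look at the index labels, as they look at keys in a dict
print('a' in s)
print('z' in s)

# keys() returns the index; items() yields (label, value) pairs
print(list(s.keys()))
for label, value in s.items():
    print(label, value)
```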
  • Another example of constructing a Series from a dictionary
In [4]:
import numpy as np
import pandas as pd

# Data: Unemployment percentages in various countries
un_data = {'Greece':27,
           'Spain':21,
           'Italy':20}
uns = pd.Series(un_data)     # uns is constructed as a Series object based on un_data dictionary
print(uns, '\n')
Greece    27
Italy     20
Spain     21
dtype: int64 
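  • The order in which dictionary keys appear in the resulting Series may depend on the pandas and Python versions in use (the output above lists the keys alphabetically). A sketch of one way to fix the order yourself, assuming you pass the wanted labels through the `index` argument, which also lets you select a subset of keys:

```python
import pandas as pd

un_data = {'Greece': 27,
           'Spain': 21,
           'Italy': 20}

# Only the listed labels are taken from the dictionary, in the order given
uns = pd.Series(un_data, index=['Italy', 'Greece'])
print(uns)
```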

Accessing Series values

  • What you already know about accessing values with integer indexes (as in lists) or string keys (as in dictionaries) also applies to Series data, as the examples below demonstrate.
  • However, accessing data in a Series or DataFrame object is somewhat more involved, and we analyze it thoroughly in the next sections.
In [5]:
import numpy as np
import pandas as pd

un_data = {'Greece':27,
           'Spain':21,
           'Italy':20}
uns = pd.Series(un_data)     

print(uns['Greece'])
print(uns['Greece']+uns['Spain'])

uns['Greece'] = 30
print(uns['Greece'])
27
48
30
In [6]:
import numpy as np
import pandas as pd

s = pd.Series(np.array([chr(i) for i in range(65,70)]))
print(s)

print(s[0], s[3]+s[4])
s[0] = 'NEW'
s[1] = 'OLD'
print(s)
0    A
1    B
2    C
3    D
4    E
dtype: object
A DE
0    NEW
1    OLD
2      C
3      D
4      E
dtype: object
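  • When a Series has string labels, a plain integer inside the brackets can be ambiguous: is it a label or a position? A minimal sketch of the two explicit accessors, `.loc` (by label) and `.iloc` (by position), which are covered in more depth later:

```python
import pandas as pd

un_data = {'Greece': 27,
           'Spain': 21,
           'Italy': 20}
uns = pd.Series(un_data)

print(uns.loc['Greece'])   # access by label
print(uns.iloc[0])         # access by position (first element)
```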

Series supports vectorized operations

  • A Series, just like a numpy ndarray, supports vectorized operations: an operation is applied to all values at once, without writing an explicit loop.
In [7]:
import numpy as np
import pandas as pd

s = pd.Series([10,20,30,40,50])
print(s)

print(s[s>30])       # printing only Series values > 30 

print(s*2)           # multiplies all Series data by 2

print(np.sqrt(s))    # computes the square root of all Series data
0    10
1    20
2    30
3    40
4    50
dtype: int64
3    40
4    50
dtype: int64
0     20
1     40
2     60
3     80
4    100
dtype: int64
0    3.162278
1    4.472136
2    5.477226
3    6.324555
4    7.071068
dtype: float64
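  • To make the idea of vectorization concrete, the sketch below compares the vectorized `s*2` with an explicit Python loop, and shows the boolean Series that `s>30` produces before it is used as a filter. The names `doubled_loop`, `doubled_vec` and `mask` are introduced here for illustration.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50])

# Element-by-element loop versus a single vectorized expression
doubled_loop = pd.Series([x * 2 for x in s])
doubled_vec = s * 2
print(doubled_vec.tolist() == doubled_loop.tolist())

# A comparison is vectorized too: it returns one boolean per element
mask = s > 30
print(mask)

# Indexing with the boolean Series keeps only the rows where it is True
print(s[mask])
```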
