Prev: The 'DataFrame' object | Next: Modifying Columns in DataFrame

Data access in DataFrame¶

Accessing data in a DataFrame object can be done in many ways. Below we present the most important of them, that enable accessing data:
- 1) By column
- 2) By row (.ix, .iloc, .loc)
- 3) By scalar values (.at, .iat)

First make sure that you run the code below so that df is properly constructed for the following examples to run smoothly

import pandas as pd

df = pd.read_csv("../../data/sampledata.csv",sep=',',skiprows=3, header=0, index_col=0)
df

1. Accessing data by column¶

Selecting columns can be done in two ways:
- a) Using the column names as indexes
- b) Using the position (integer index) of the columns

a) Using column names¶

Just provide the column name as index, the way you would do it in a dictionary

c = df['Country']
print(type(c))
c

<class 'pandas.core.series.Series'>

Code
ESP       Spain
FRA      France
GRC      Greece
IRL     Ireland
ITA       Italy
MLT       Malta
PRT    Portugal
CYP      Cyprus
Name: Country, dtype: object

You may provide the column label as a list, in which case you get a new DataFrame

c = df[['Country']]
print(type(c))
c

<class 'pandas.core.frame.DataFrame'>

In case a column name is a single non-digit string, using the dot notation will work the same

df.Country
# but not this: df.2007

Code
ESP       Spain
FRA      France
GRC      Greece
IRL     Ireland
ITA       Italy
MLT       Malta
PRT    Portugal
CYP      Cyprus
Name: Country, dtype: object

For more columns privide a list of column names (note the double brackets)

df[['Country', '2007', '2015']]

b) Using column positions¶

Provide a list of column position. Note the list (not simply an indeger index)

df[[1]]

For more columns privide a list of column positions (note the double brackets)

df[[1, 3, 5]]

List comprehensions can also be used to define a list of integer indices

df[[i for i in range(3)]]

2. Accessing data by row¶

Selecting rows can be done in three ways:
- a) Using the .loc indexer for selection by label (see pandas docs on .loc)
- b) Using the .iloc indexer for selection by position (integer-based indexer) (see pandas docs on .iloc)
- c) Using the .ix indexer that is a handy combination of the above (see pandas docs on .ix)

a) Using the .loc indexer for selection by label¶

Provide a list with row labels

# Note that a row is now constructed as a new DataFrame
gr = df.loc[['GRC']]
print(type(gr))
gr

<class 'pandas.core.frame.DataFrame'>

You can provide a single label as an single index (not in a list). In this case the row is now constructed as a new Series object

gr = df.loc['GRC']
print(type(gr))
gr

<class 'pandas.core.series.Series'>

Country      Greece
2006       11020362
2007       11048473
2008       11077841
2009       11107017
2010       11121341
2011       11104899
2012       11045011
2013       10965211
2014       10892413
2015       10823732
Name: GRC, dtype: object

More rows: provide the list of labels

df.loc[['GRC','ESP','PRT']]

Slicing rows is also possible (note that single brackets are used)

df.loc['ESP':'GRC']
# You also get a new DataFrame

b) Using the .iloc indexer for selection by position¶

Provide the list of row positions

gr = df.iloc[[2]]
print(type(gr))
gr

<class 'pandas.core.frame.DataFrame'>

You can provide a single label as an single index (not in a list). In this case the row is now constructed as a new Series object

esp = df.iloc[0]
print(type(esp))
esp

<class 'pandas.core.series.Series'>

Country       Spain
2006       44397319
2007       45226803
2008       45954106
2009       46362946
2010       46576897
2011       46742697
2012       46773055
2013       46620045
2014       46480882
2015       46418269
Name: ESP, dtype: object

More rows: provide the list of integer position indexes

df.iloc[[2,4,6]]

Slicing rows is also possible (note that single brackets are used)

df.iloc[2:5]

c) Using the .ix indexer¶

.ix is the most general indexer and supports bot inputs in label and integer format. Before using it however it is advisable to read the pandas documentation on .ix (here)

# .ix with row label
df.ix[['PRT']]

# .ix with list of row positions
df.ix[[0,2,4]]

# Slicing with labels
df.ix['ESP':'GRC']

# Slicing with positions
df.ix[3:5]

Accessing scalars (single data items)¶

Selecting scalars can be done in two major ways:
- a) Using the [ ] notation to provide column & row indexes
- b) Using the .at and .iat indexers

Using the [ ] notation¶

ita = df['Country']['ITA']
ita

'Italy'

# Also works without brackets for non-digit string indices
esp = df.Country.ESP
esp

'Spain'

Using the .at indexer¶

Provide first the row label and then the column name (remember: .at is a non-integer indexer)

d = df.at['FRA','Country'] 
d

'France'

d = df.at['FRA','2015'] 
d

66808385

Using the .iat indexer¶

d = df.iat[0,0] 
d

'Spain'

d = df.iat[7,9] 
d

1153658

x = 0
for i in range(len(df)):
    for n in range(len(df.columns)):
        if i==n: 
            print(df.iat[i,n])

Spain
63621376
11048473
4489544
59095365
414508
10557560
1129303

Obviously you can use .ix (also: .loc & .iloc) indexers along with column indexing to select single data

m = df.ix['MLT']['Country']
m

'Malta'

m = df.ix['MLT'].Country
m

'Malta'

m = df.ix['MLT']['2010']
m

414508

Copyright¶

. Free learning material
. See full copyright and disclaimer notice

	Country	2006	2007	2008	2009	2010	2011	2012	2013	2014	2015
Code
ESP	Spain	44397319	45226803	45954106	46362946	46576897	46742697	46773055	46620045	46480882	46418269
FRA	France	63621376	64016229	64374990	64707044	65027512	65342776	65659790	65972097	66495940	66808385
GRC	Greece	11020362	11048473	11077841	11107017	11121341	11104899	11045011	10965211	10892413	10823732
IRL	Ireland	4273591	4398942	4489544	4535375	4560155	4576794	4586897	4598294	4617225	4640703
ITA	Italy	58143979	58438310	58826731	59095365	59277417	59379449	59539717	60233948	60789140	60802085
MLT	Malta	405308	406724	409379	412477	414508	416268	419455	423374	427364	431333
PRT	Portugal	10522288	10542964	10558177	10568247	10573100	10557560	10514844	10457295	10401062	10348648
CYP	Cyprus	1048293	1063040	1077010	1090486	1103685	1116644	1129303	1141652	1153658	1165300