Home      |       Contents       |       About

Prev: Combining dictionary and list       |       Next: Dictionary functions & methods

Dictionary vs. List

  • Dictionary and lists have similarities and also differences that make each appropriate for specific programming tasks. Let's explore this in more detail

Dict - List comparison

  • Similarities

    • Both are indexed: list is integer-based index while dict is key-based indexed (key should be of immutable type)
    • Both are mutable: changing their items can happen "in place" (in memory they occupy without constructing new object)
    • Both are iterable: they possess a __next__ method that returns items one by one (but remember that for a dict this happens without any particular order)
  • Differences

    • List is a sequence - Dict is a collection: however, len() function operates properly for both of them returning their lenght (number of items they contain; in a dictionary an item is a key:value pair)
    • List is ordered - Dict is not: a list is an ordered object (can be sorted). A dict is unordered, ordering is meaningless and items may returned in any order (careful when you are using a dict as a iterable)
    • List uses integer-based indexing - Dict can use any immutable type as index: this makes dict based code much more readable and understandable by humans; for example, you get more information about what the code does if you read di['client_name'] compared to li[1]
    • Dict is mapping - List is not: a dict always constructs some type of mapping between keys and values. A list is always some sequence of single items.

Dict - List use

  • Use a list:

    • ...when you need a flexible sequence for all kind of computations and when integer-based indexing does not impose significant restrictions on your data representation
    • ...also when ordering (sorting, etc.) of data is important
  • Use a dict:

    • ...when you need a 'key:value' representation of data, which essentialy is a "look-up" table: you look for a value using a key
    • ...also if you have to construct a sparse collection of items than a dense one

For example:

  • Using a sequence of numbers to represent data that will be ordered somehow is a typical task for using a list
In [1]:
import random
alist = [random.randint(1,100) for i in range(10)]
[7, 15, 24, 37, 41, 46, 63, 83, 91, 95]
  • Using a mapping sparse collection of data is a typical task for using a dict. Remember also that keys can be extracted and sorted, thus allowing "sorted" reference to dict pairs.
In [51]:
movies = {1975: 'Holy Grail',
          1979: 'Life of Brian',  
          1983: 'The Meaning  of Life'}
print(movies,'\n')                        # movies are printed without any specific order

sorted_years = sorted(movies.keys())      # sorted_years is now a sorted list of keys

for year in sorted_year:
    print(year, movies[year])
{1979: 'Life of Brian', 1983: 'The Meaning  of Life', 1975: 'Holy Grail'} 

[1975, 1979, 1983] 

1975 Holy Grail
1979 Life of Brian
1983 The Meaning  of Life
  • What if I need a dictionary that is already ordered?
    Then you needan OrderedDict object: a dictionray that 'remembers' the order of items as they were initially given. You need to import the collections module where the OrderedDict 'lives'.
    Documentation can be found here
In [67]:
# A simple OrderedDict example
import collections

country = [('Name','France'),('Capital city','Paris'),('Population',50),('Currency','Euro')]
dunord = dict(country)
dord = collections.OrderedDict(country)

for i in dunord:
    print(i, end=' ')
for i in dord:
    print(i, end=' ')    
Name Currency Population Capital city 

Name Capital city Population Currency 

Dictionary as a hash table

  • List is flexible but...:
    • ...although list is a very flexible structure this flexibility requires that several particular list-relevant processing is happenning 'under the hood'. Thus, the price for flexibility is reduced efficiency.
    • Reduced efficiency is not a problem when data size is relatively small. But, it really turns to be a restriction when dealing with big data. This is also the reason for introducing a more efficient object, namely 'ndarray' (learn more about ndarrays in numpy section).
  • Dict is efficient because...:

    • ...a dict is constructed as a hash table. In general, a hash table uses a hash function to implement a mapping between input keys and output values. The advantage is that the time required to perform a task in a dict (for example, access a certain item) is constant as compared to the time required in a list which increases as the size of the list increases.
      • Read more on hash table here
    • Technically, the above difference is reported using the big-O notation by stating that dict performance is of O(1) (meaning: standard) while list performance is of O(n) (increases as n list size increases)
  • You can read more on this issue and also see a graphical representation of dict-list efficiency here

Efficiency of the 'in' operator

  • In the following example, we check the efficiency of item look-up operation in the list and dict objects
  • The notebook magic command %timeit returns the average time required for an operation
  • See that he code constructs a 1million items list and dict and looks for the item '999999' in both of them
  • The times measured are (on average)
    • ~ 18.5 msec for the list
    • ~ 95.5 nsec for the dict (with 1 msec = 1000 nsec)
  • Thus, this operation in a dict is faster executed by a factor of ~200.
In [57]:
# timeit example 

alist = [i for i in range(1000000)]
adict = {alist[i] : alist[i] for i in alist}

k = 999999

%timeit k in alist 
%timeit k in adict
100 loops, best of 3: 18.7 ms per loop
10000000 loops, best of 3: 95.6 ns per loop

. Free learning material
. See full copyright and disclaimer notice