Lecture 8 – Lists and Strings

Data 94, Spring 2021

Lists

We use square brackets to create lists.

In [1]:
years = [1998, 2001, 2007]
years
In [2]:
schools = ['cal', 1868, 'cal poly', 1901, 'columbia', 1754]
schools
In [3]:
nums = [2 + 2, 5, 5 - 1]
nums

Types and comparisons

In [4]:
type([3, 1, 2])
Out[4]:
list
In [5]:
type([]) 
Out[5]:
list
In [6]:
[3, 1, 2] == [3, 1, 2]
Out[6]:
True
In [7]:
[3, 1, 2] == [3, 1, 2, -4]
Out[7]:
False

Working with lists

In [8]:
len([9, 2.5, 7])
Out[8]:
3
In [9]:
max([9, 2.5, 7])
Out[9]:
9
In [10]:
# For strings, min returns the one that comes first alphabetically
min(['hello', 'hi', 'abbey'])
Out[10]:
'abbey'
In [11]:
# TypeError!
min(['hello', 2.5, 'abbey'])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-1ba5dc693a10> in <module>
      1 # TypeError!
----> 2 min(['hello', 2.5, 'abbey'])

TypeError: '<' not supported between instances of 'float' and 'str'
In [12]:
sum([9, 2.5, 7])
Out[12]:
18.5
In [13]:
[1, 2] + [3, 4] * 2
Out[13]:
[1, 2, 3, 4, 3, 4]

Append

In [14]:
groceries = ['eggs', 'milk']
groceries
Out[14]:
['eggs', 'milk']
In [15]:
groceries.append('bread')
In [16]:
groceries
Out[16]:
['eggs', 'milk', 'bread']
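
One detail worth noting about append: it changes the existing list in place and returns None, so assigning its result back to a variable is a common mistake. A minimal sketch, where the list `snacks` is made up for illustration:

```python
# Hypothetical list, used only for illustration
snacks = ['chips', 'salsa']

# append modifies the list in place and returns None
result = snacks.append('guacamole')

print(snacks)  # ['chips', 'salsa', 'guacamole']
print(result)  # None
```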

Containment

In [17]:
3 in [3, 1, 'dog']
Out[17]:
True
In [18]:
10 not in [3, 1, 'dog']
Out[18]:
True
In [19]:
not 10 in [3, 1, 'dog']
Out[19]:
True
In [20]:
[3, 1] in [3, 1, 'dog']
Out[20]:
False
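
The last result may be surprising. The `in` operator asks whether the value on the left equals one of the elements of the list; it does not look for a run of consecutive elements. The sketch below contrasts a flat list with a nested one (the names `flat` and `nested` are illustrative):

```python
flat = [3, 1, 'dog']
nested = [[3, 1], 'dog']

# False: no single element of flat equals the list [3, 1]
in_flat = [3, 1] in flat

# True: the first element of nested is the list [3, 1]
in_nested = [3, 1] in nested

print(in_flat, in_nested)  # False True
```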

Quick Check 1

In [ ]:
 
In [ ]:
 
In [ ]:
 

Indexing

In [21]:
nums = [3, 1, 'dog', -9.5, 'berk']
In [22]:
nums[0]
Out[22]:
3
In [23]:
nums[3]
Out[23]:
-9.5
In [24]:
nums[5]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-24-523f62328233> in <module>
----> 1 nums[5]

IndexError: list index out of range

Slicing

In [25]:
nums = [3, 1, 'dog', -9.5, 'berk']
In [26]:
nums[1:3]
Out[26]:
[1, 'dog']
In [27]:
nums[0:4]
Out[27]:
[3, 1, 'dog', -9.5]
In [28]:
# If you don't include 'start',
# the slice starts at the
# beginning of the list
nums[:4]
Out[28]:
[3, 1, 'dog', -9.5]
In [29]:
# If you don't include 'stop',
# the slice goes until the
# end of the list
nums[2:]
Out[29]:
['dog', -9.5, 'berk']
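
A slice includes the element at 'start' but stops just before 'stop'. Unlike indexing, slicing past the end of a list does not raise an IndexError; the slice is simply cut short. A small sketch using the same list:

```python
nums = [3, 1, 'dog', -9.5, 'berk']

middle = nums[1:3]       # positions 1 and 2, but not 3
past_end = nums[3:100]   # stop is beyond the end, so the slice ends early
empty = nums[10:]        # start is beyond the end, so the slice is empty

# Splitting a list at any position and rejoining gives the original list
rejoined = nums[:2] + nums[2:]

print(middle)    # [1, 'dog']
print(past_end)  # [-9.5, 'berk']
print(empty)     # []
```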

Negative indexing

In [30]:
nums = [3, 1, 'dog', -9.5, 'berk']
In [31]:
nums[len(nums) - 1]
Out[31]:
'berk'
In [32]:
nums[-1]
Out[32]:
'berk'
In [33]:
nums[-3]
Out[33]:
'dog'
In [34]:
nums[-3:]
Out[34]:
['dog', -9.5, 'berk']

Quick Check 2

In [ ]:
 
In [ ]:
 

Example: While loops with lists

In [35]:
def square_all(vals):
    output = []
    i = 0
    while i < len(vals):
        val_squared = vals[i] ** 2
        output.append(val_squared)
        i += 1
    return output
In [36]:
square_all([1, 10, 3, 4])
Out[36]:
[1, 100, 9, 16]
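
The same pattern (start with an empty list, walk through positions with a counter, append as you go) also works when only some elements should be kept. The function `keep_positive` below is a hypothetical variation, not part of the lecture:

```python
def keep_positive(vals):
    """Return a new list with only the values greater than 0."""
    output = []
    i = 0
    while i < len(vals):
        # Only append values that pass the condition
        if vals[i] > 0:
            output.append(vals[i])
        i += 1
    return output

print(keep_positive([4, -2, 0, 7.5]))  # [4, 7.5]
```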

.index

.index returns the position of the first occurrence of an element in a list. If the element is not in the list, it raises a ValueError.

In [37]:
[9, 8, 14, -1].index(14)
Out[37]:
2
In [38]:
[9, 8, 14, -1].index(15)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-38-aa968c19edb0> in <module>
----> 1 [9, 8, 14, -1].index(15)

ValueError: 15 is not in list
In [39]:
# Two occurrences of 2
# Gives index of first one
[1, 2, 4, 2, 4].index(2)
Out[39]:
1
In [40]:
[1, 2, 4, 2, 4].count(2)
Out[40]:
2
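
Since .index raises an error for a missing element, one way to avoid the error is to check containment with `in` first. The helper `index_or_default` below is a hypothetical name, sketched for illustration:

```python
def index_or_default(values, target, default=-1):
    """Return the position of target in values, or default if it is absent."""
    if target in values:
        return values.index(target)
    return default

print(index_or_default([9, 8, 14, -1], 14))  # 2
print(index_or_default([9, 8, 14, -1], 15))  # -1
```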

Example: next day of the week

In [41]:
def next_day(day):
    week = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']
    curr = week.index(day)
    return week[(curr + 1) % 7]
In [42]:
next_day('Wednesday')
Out[42]:
'Thursday'
In [43]:
next_day('Saturday')
Out[43]:
'Sunday'
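
The `% 7` is what makes Saturday wrap around to Sunday: Saturday is at position 6, and (6 + 1) % 7 is 0. The same idea extends to moving any number of days. The function `days_later` below is a hypothetical generalization, sketched under the assumption that the week starts on Sunday as above:

```python
def days_later(day, n):
    """Return the day of the week that falls n days after day."""
    week = ['Sunday', 'Monday', 'Tuesday', 'Wednesday',
            'Thursday', 'Friday', 'Saturday']
    curr = week.index(day)
    # The remainder keeps the position between 0 and 6, even for negative n
    return week[(curr + n) % len(week)]

print(days_later('Friday', 3))   # Monday
print(days_later('Sunday', -1))  # Saturday
```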

Strings

In [44]:
university = 'uc berkeley'
In [45]:
list(university)
Out[45]:
['u', 'c', ' ', 'b', 'e', 'r', 'k', 'e', 'l', 'e', 'y']
In [46]:
university[3:7]
Out[46]:
'berk'
In [47]:
university[1]
Out[47]:
'c'
In [48]:
university[-8:]
Out[48]:
'berkeley'
In [49]:
# A slice with a step of -1 walks backward,
# which reverses a string or list
university[::-1]
Out[49]:
'yelekreb cu'

Finding characters

In [50]:
'alfalfa'.find('f')
Out[50]:
2
In [51]:
'alfalfa'.rfind('a')
Out[51]:
6
In [52]:
'alfalfa'.find('b')
Out[52]:
-1
In [53]:
'alfalfa'.index('b')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-53-60965c2ddfae> in <module>
----> 1 'alfalfa'.index('b')

ValueError: substring not found
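
One thing to watch for: the -1 that find returns for a missing character is also a valid index (the last position), so using it directly can give a misleading answer. A short sketch of the pitfall and an explicit check:

```python
word = 'alfalfa'

position = word.find('b')    # -1, because 'b' does not appear
misleading = word[position]  # 'a': -1 refers to the last character

# Check for -1 explicitly before using the result
if position == -1:
    message = "'b' is not in the word"
else:
    message = "'b' is at position " + str(position)

print(message)  # 'b' is not in the word
```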

Differences between strings and lists

In [54]:
test_list = [8, 0, 2, 4]
test_string = 'zebra'
In [55]:
test_list[1] = 99
In [56]:
test_list
Out[56]:
[8, 99, 2, 4]
In [57]:
test_string[1] = 'f'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-57-1fedbdd7d292> in <module>
----> 1 test_string[1] = 'f'

TypeError: 'str' object does not support item assignment
In [58]:
test_string[:1] + 'f' + test_string[2:]
Out[58]:
'zfbra'
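
The expression above builds a new string rather than changing test_string, which stays 'zebra'. The same slicing trick can be wrapped in a function; `replace_at` is a hypothetical helper written for illustration:

```python
def replace_at(text, i, new_char):
    """Return a copy of text with the character at position i replaced."""
    # Everything before i, then the new character, then everything after i
    return text[:i] + new_char + text[i + 1:]

original = 'zebra'
changed = replace_at(original, 1, 'f')

print(changed)   # zfbra
print(original)  # zebra (the original string is unchanged)
```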

Demo

Let's use data about the passengers of the Titanic, downloaded from here.

In [59]:
from datascience import *
table = Table.read_table('data/titanic.csv').select(['Name', 'Age', 'Sex', 'Fare', 'Survived'])
table
Out[59]:
Name Age Sex Fare Survived
Braund, Mr. Owen Harris 22 male 7.25 0
Cumings, Mrs. John Bradley (Florence Briggs Thayer) 38 female 71.2833 1
Heikkinen, Miss. Laina 26 female 7.925 1
Futrelle, Mrs. Jacques Heath (Lily May Peel) 35 female 53.1 1
Allen, Mr. William Henry 35 male 8.05 0
Moran, Mr. James nan male 8.4583 0
McCarthy, Mr. Timothy J 54 male 51.8625 0
Palsson, Master. Gosta Leonard 2 male 21.075 0
Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) 27 female 11.1333 1
Nasser, Mrs. Nicholas (Adele Achem) 14 female 30.0708 1

... (881 rows omitted)

Soon, we will learn how to load in data like the above and extract columns as lists. But for now, just run the following cell.

In [60]:
names = list(table.column('Name'))
ages = list(table.column('Age'))
survived = list(table.column('Survived'))
In [61]:
ages[:5]
Out[61]:
[22.0, 38.0, 26.0, 35.0, 35.0]
In [62]:
names[:5]
Out[62]:
['Braund, Mr. Owen Harris',
 'Cumings, Mrs. John Bradley (Florence Briggs Thayer)',
 'Heikkinen, Miss. Laina',
 'Futrelle, Mrs. Jacques Heath (Lily May Peel)',
 'Allen, Mr. William Henry']
In [63]:
# Percentage of passengers who survived
100 * sum(survived) / len(survived)
Out[63]:
38.38383838383838
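
This works because every value in survived is either 0 or 1, so the sum counts the passengers who survived. A small sketch using only the first five Survived values shown in the table above (the name `first_five` is illustrative):

```python
# Survived values for the first five rows of the table above
first_five = [0, 1, 1, 1, 0]

num_survived = sum(first_five)  # adding 0s and 1s counts the 1s
num_total = len(first_five)

percent = 100 * num_survived / num_total
print(percent)  # 60.0
```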