Lecture 3 – Variables and Types

Data 94, Spring 2021

Functions

In [1]:
# Evaluates to 3
max(2, 3)
Out[1]:
3
In [2]:
# Evaluates to 4
max(4, min(1, 9))
Out[2]:
4
In [3]:
# Evaluates to -5
-abs(max(4, 5, -1))
Out[3]:
-5
In [4]:
# After you run this cell, notice the lack of Out[N]: below
print(5)
5
In [5]:
# After you run this cell, notice the Out[N]
5
Out[5]:
5
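
The difference between the two cells above can be sketched as follows. This is a minimal illustration, not part of the original lecture; the names result and value are made up for the example. The idea is that print displays its argument but does not itself evaluate to that argument, while a bare expression has a value that can be stored.

```python
# print displays 5, but the call itself evaluates to None.
result = print(5)
print(result is None)  # displays True

# A plain expression evaluates to a value that can be stored in a variable.
value = 5
print(value + 1)       # displays 6
```

This is why the print(5) cell has no Out[N]: there is no value for the notebook to show.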

Quick Check 1

In [ ]:
 
In [6]:
max
Out[6]:
<function max>

Variables

In [7]:
x = 3
In [8]:
x
Out[8]:
3
In [9]:
3 = x
  File "<ipython-input-9-97d1cbcfdbdb>", line 1
    3 = x
    ^
SyntaxError: cannot assign to literal
In [10]:
complicated = 3 + 4**(min(1, 2, abs(14*18**2)))
In [11]:
complicated
Out[11]:
7
In [12]:
three = 3
four = 4
three
Out[12]:
3
In [13]:
three = three + four
four = four + 5
In [14]:
three
Out[14]:
7
In [15]:
four
Out[15]:
9
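
One way to read the reassignment cells above is to spell out the order in which Python does things. The sketch below repeats the same statements, with comments added to describe each step.

```python
three = 3
four = 4

# Step 1: the right-hand side is evaluated using the current values (3 + 4).
# Step 2: the result, 7, is bound to the name on the left.
three = three + four

# The line above did not change four; it only changes when it is reassigned.
four = four + 5

print(three)  # 7
print(four)   # 9
```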
In [16]:
turtle + 5
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-16-8503ace1c834> in <module>
----> 1 turtle + 5

NameError: name 'turtle' is not defined

Quick Check 2

In [ ]:
 

Types

In [17]:
x = -2 + 3
type(x)
Out[17]:
int
In [18]:
y = 15 - 14.0 / 2
type(y)
Out[18]:
float
In [19]:
name = 'junior rampure'
type(name)
Out[19]:
str
In [20]:
len('junior rampure')
Out[20]:
14
In [21]:
empty = ""
len(empty)
Out[21]:
0
In [22]:
'go ' + ' bears' + '!!!'
Out[22]:
'go  bears!!!'
In [23]:
name_1 = 'Carol'
name_2 = 'Jean'
print('Hi ' + name_1 + ', my name is ' \
      + name_2 + '. How are you doing today?')
Hi Carol, my name is Jean. How are you doing today?
In [24]:
# Notice how there are quotes in the output, but not above!
'Hi ' + name_1 + ', my name is ' \
      + name_2 + '. How are you doing today?'
Out[24]:
'Hi Carol, my name is Jean. How are you doing today?'

Typecasting

In [25]:
# If a float is passed, int cuts off the decimal part (it does not round)
int(3.92)
Out[25]:
3
In [26]:
int(-2.99)
Out[26]:
-2
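
To make the "cuts off" behavior concrete, here is a small comparison with two other ways of turning a float into a whole number. The use of math.floor and round is an addition for illustration and was not part of the cells above.

```python
import math

truncated = int(-2.99)        # drops the decimal part, moving toward zero
floored = math.floor(-2.99)   # always moves down to the next lower integer
rounded = round(-2.99)        # moves to the nearest integer

print(truncated)  # -2
print(floored)    # -3
print(rounded)    # -3
```

For positive numbers, int and math.floor agree: both give 3 for 3.92, while round gives 4.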
In [27]:
int("-5")       
Out[27]:
-5
In [28]:
# If a string is passed, it must contain an int
int("4.1")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-28-bc9758ad056b> in <module>
      1 # If a string is passed, it must contain an int
----> 2 int("4.1")

ValueError: invalid literal for int() with base 10: '4.1'
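
If you do need a whole number from a string like "4.1", one possible workaround (a sketch, with made-up variable names) is to cast to float first and then to int.

```python
# float can parse the decimal point, which int cannot do on a string.
as_float = float("4.1")
# Casting the float to int then cuts off the decimal part.
as_int = int(as_float)

print(as_float)  # 4.1
print(as_int)    # 4
```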
In [29]:
float(3)
Out[29]:
3.0
In [30]:
float("3.14159265")
Out[30]:
3.14159265
In [31]:
float(-19.0)
Out[31]:
-19.0
In [32]:
float(-14.0 + "3.0")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-32-c0fae37873e8> in <module>
----> 1 float(-14.0 + "3.0")

TypeError: unsupported operand type(s) for +: 'float' and 'str'
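
The error above happens because the addition inside the parentheses is evaluated before float is called, so Python tries to add a float and a string. A minimal fix, assuming the goal was to add the two numbers, is to cast the string on its own first. The name total is made up for this example.

```python
# Cast the string to a float first, then add.
total = -14.0 + float("3.0")

print(total)        # -11.0
print(type(total))  # <class 'float'>
```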
In [33]:
str(13 + 14 + 15/2)
Out[33]:
'34.5'
In [34]:
str("1")
Out[34]:
'1'

Quick Check 3

In [ ]:
 

Demo

You should ignore most of this code.

In [35]:
from datascience import *
import numpy as np

data = Table.read_table('data/countries.csv')
data = data.relabeled('Country(or dependent territory)', 'Country') \
           .relabeled('% of world', '%') \
           .relabeled('Source(official or UN)', 'Source')
data = data.with_columns(
    'Country', data.apply(lambda s: s[:s.index('[')] if '[' in s else s, 'Country'),
    'Population', data.apply(lambda i: int(i.replace(',', '')), 'Population'),
    '%', data.apply(lambda f: float(f.replace('%', '')), '%')
)

I've loaded in a table of information from Wikipedia containing the population of each country, both in absolute terms ("Population") and as a percentage of the total global population ("%").

In [36]:
data
Out[36]:
Rank Country Population % Date Source
1 China 1405936040 17.9 27 Dec 2020 National population clock[3]
2 India 1371366679 17.5 27 Dec 2020 National population clock[4]
3 United States 330888778 4.22 27 Dec 2020 National population clock[5]
4 Indonesia 269603400 3.44 1 Jul 2020 National annual projection[6]
5 Pakistan 220892331 2.82 1 Jul 2020 UN Projection[2]
6 Brazil 212523810 2.71 27 Dec 2020 National population clock[7]
7 Nigeria 206139587 2.63 1 Jul 2020 UN Projection[2]
8 Bangladesh 169885314 2.17 27 Dec 2020 National population clock[8]
9 Russia 146748590 1.87 1 Jan 2020 National annual estimate[9]
10 Mexico 127792286 1.63 1 Jul 2020 National annual projection[10]

... (232 rows omitted)

Unsurprisingly, values in the "Country" column are stored as strings.

In [37]:
data.column('Country').take(5)
Out[37]:
'Brazil'
In [38]:
# numpy.str_ is a fancy version of a string; for our purposes they are the same
type(data.column('Country').take(5))
Out[38]:
numpy.str_
In [39]:
print(data.column('Country').take(5))
Brazil

Values in the "Population" column are stored as integers.

In [40]:
data.column('Population').take(5)
Out[40]:
212523810
In [41]:
# Again, numpy.int64 is a fancy version of an integer
type(data.column('Population').take(5))
Out[41]:
numpy.int64

And values in the "%" column are stored as floats.

In [42]:
data.column('%').take(5)
Out[42]:
2.71
In [43]:
type(data.column('%').take(5))
Out[43]:
numpy.float64

For fun, change the value of the variable country to a country you like, and you'll see its population formatted nicely as a string.

In [44]:
country = 'Venezuela'

# Notice the meaningful variable names!
# You don't know how the code works, but given the variable names you know what it's doing.
pop = data.where('Country', are.equal_to(country)).column('Population').take(0)
percent = data.where('Country', are.equal_to(country)).column('%').take(0)

# Why must we write `str(pop)` instead of just pop?
output = country + " has a population of " + str(pop) + ", which is " + str(percent) + "% of the world's total population."
print(output)
Venezuela has a population of 28435943, which is 0.363% of the world's total population.
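
The question in the comment above can be explored without the table. The sketch below uses stand-in values copied from the printed output, assumed here to be a plain int and a plain float rather than the numpy types the table returns. It shows that adding a string and a number fails, and that casting with str fixes it.

```python
# Stand-in values copied from the output above (assumption: plain int and float).
country = 'Venezuela'
pop = 28435943
percent = 0.363

# Adding a str and an int is not defined, so this raises a TypeError.
try:
    broken = country + " has a population of " + pop
except TypeError as error:
    print("TypeError:", error)

# Casting each number to str first makes every operand a string.
output = country + " has a population of " + str(pop) + ", which is " + str(percent) + "% of the world's total population."
print(output)
```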

Practice

Sometimes, we will include extra practice problems at the end for you to work on after class. These are not required or graded in any way, but they're highly recommended.

To see the solution for a given problem, you can click the triangle next to "Answer".

Question 1

WITHOUT running any code, what does the following expression evaluate to?

max(min(14, 15), abs(16 - min(17, max(-5, 15), -9)))

You should write out your thought process on paper.

Answer 25
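
After working the problem on paper, you can check each step of your reasoning against the breakdown below. The intermediate variable names are made up for this sketch; the expression is evaluated from the innermost call outward.

```python
inner_max = max(-5, 15)              # 15
inner_min = min(17, inner_max, -9)   # -9
difference = abs(16 - inner_min)     # abs(16 - (-9)) = abs(25) = 25
left = min(14, 15)                   # 14
answer = max(left, difference)       # 25

print(answer)
```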

Question 2

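Recall that the Pythagorean theorem is $$c^2 = a^2 + b^2$$ where c is the length of the longest side of a right-angled triangle and a and b are the lengths of the other two sides.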

Using the Pythagorean theorem, set the variable q2 equal to the length of the longest side of a right-angled triangle whose other side lengths are side_1 and side_2. Your code should not use the numbers 8 or 13 directly; it should instead use the variables side_1 and side_2.

Hint: x**0.5 computes the square root of x.

In [ ]:
side_1 = 8
side_2 = 13

q2 = ... # YOUR CODE HERE
Answer q2 = (side_1**2 + side_2**2)**0.5

Question 3

Assign the variable q3 to a string that reads 'Lisa Jobs is 45 years old and lives in Canada.'

You MUST use the four variables defined below to create the string. This is similar to the example on slide 50 of the lecture.

Hint: You will need to cast the variable age to an integer so that you can change it from 42 to 45.

In [ ]:
first_name = 'Lisa'
last_name = 'Jobs'
age = "42"
location = 'Canada'

q3 = ... # YOUR CODE HERE
Answer q3 = first_name + ' ' + last_name + ' is ' + str(int(age) + 3) + ' years old and lives in ' + location + '.'