Lecture 25 – Visualizing Categorical Variables

Data 94, Spring 2021

In [1]:
from datascience import *
import numpy as np

# Draw charts with Plotly (interactive) rather than static Matplotlib figures
Table.interactive_plots()

Bar charts

In [2]:
schools = Table.read_table('data/r1_with_students.csv')
In [3]:
schools
Out[3]:
University                         | Number_students | Score_Result | Control              | City          | State
Auburn University                  | 26641           | 33.4         | Public               | Auburn        | AL
Boston College                     | 12904           | 45.9         | Private (non-profit) | Chestnut Hill | MA
Boston University                  | 25662           | 68.4         | Private (non-profit) | Boston        | MA
Brandeis University                | 5375            | 50.3         | Private (non-profit) | Waltham       | MA
Brown University                   | 9391            | 70           | Private (non-profit) | Providence    | RI
California Institute of Technology | 2240            | 94.5         | Private (non-profit) | Pasadena      | CA
Carnegie Mellon University         | 13430           | 81.3         | Private (non-profit) | Pittsburgh    | PA
Case Western Reserve University    | 10654           | 60           | Private (non-profit) | Cleveland     | OH
Clemson University                 | 21436           | 30.7         | Public               | Clemson       | SC
Columbia University                | 26586           | 87           | Private (non-profit) | New York      | NY

... (86 rows omitted)

In [4]:
schools.group('Control')
Out[4]:
Control              | count
Private (non-profit) | 36
Public               | 60
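
As a rough sketch of what `group('Control')` computes: it tallies how many rows share each distinct value in the column. The plain-Python version below uses `collections.Counter` on only the ten rows displayed in Out[3], not the full 96-row table, so its counts are smaller than those in Out[4]. The names `control_preview` and `control_counts` are illustrative and not part of the `datascience` library.

```python
from collections import Counter

# 'Control' values of the ten rows shown in Out[3], in display order.
control_preview = [
    'Public',                # Auburn University
    'Private (non-profit)',  # Boston College
    'Private (non-profit)',  # Boston University
    'Private (non-profit)',  # Brandeis University
    'Private (non-profit)',  # Brown University
    'Private (non-profit)',  # California Institute of Technology
    'Private (non-profit)',  # Carnegie Mellon University
    'Private (non-profit)',  # Case Western Reserve University
    'Public',                # Clemson University
    'Private (non-profit)',  # Columbia University
]

# Tally how many times each category appears.
control_counts = Counter(control_preview)

# Print one line per category, sorted by category name (the order seen in Out[4]).
for category in sorted(control_counts):
    print(category, control_counts[category])
```

Running the same tally over all 96 rows is what yields the 36 and 60 shown in Out[4].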
In [5]:
schools.group('Control').barh('Control')
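
The chart produced by this cell is not preserved in the text. As a hedged stand-in, the sketch below builds a text version of the same idea: one horizontal bar per category, with length proportional to the count from Out[4]. The helper `text_barh` and the scale of one `#` per 2 schools are assumptions made for illustration; `barh` itself draws a graphical chart.

```python
def text_barh(counts, per_mark=2):
    """Return one text bar per category; each '#' stands for `per_mark` items."""
    # Pad labels to a common width so the bars start in the same column.
    width = max(len(label) for label in counts)
    lines = []
    for label, count in counts.items():
        bar = '#' * (count // per_mark)
        lines.append(f'{label.ljust(width)} | {bar} {count}')
    return lines

# Counts copied from Out[4].
control_counts = {'Private (non-profit)': 36, 'Public': 60}

bar_lines = text_barh(control_counts)
print('\n'.join(bar_lines))
```

Because bar length encodes the count, the longer `Public` bar shows at a glance that public universities outnumber private ones in this table.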