In-Class Notebook, Mar 3, 2020

There are several ways to analyze the simulated housing application data from yesterday. We saw a test whose null hypothesis is that the application offer rate for black renters equals the overall application offer rate. Under that hypothesis we ran a binomial test; here is a slightly more filled-out version of it.
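To make the mechanics of the test concrete, here is a minimal sketch of an exact two-sided binomial test using only the standard library. The counts (20 offers among 40 testers) are made up for illustration and are not taken from the class data; only the null rate 0.625 comes from this notebook. The function names `binom_pmf` and `two_sided_binom_p` are hypothetical. The sketch uses one common convention for "two-sided": add up the probability, under the null, of every outcome that is no more likely than the observed one.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_sided_binom_p(k_obs, n, p_null):
    """Exact two-sided p-value: total null probability of every outcome
    that is no more likely than the observed one."""
    p_obs = binom_pmf(k_obs, n, p_null)
    # small relative tolerance so floating-point ties count as equally likely
    cutoff = p_obs * (1 + 1e-7)
    return sum(
        binom_pmf(k, n, p_null)
        for k in range(n + 1)
        if binom_pmf(k, n, p_null) <= cutoff
    )

# Hypothetical counts, NOT from the class data set
n_testers = 40
n_offered = 20
p_null = 0.625  # overall offer rate computed in this notebook

p_value = two_sided_binom_p(n_offered, n_testers, p_null)
print(p_value)
```

A small p-value says the observed count would be unusual if the group's offer rate really equaled the overall rate; a large one says the data are compatible with that null hypothesis.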

In [1]:
import pandas as pd
# note: binom_test was removed in SciPy 1.12; newer versions provide binomtest instead
from scipy.stats import binom_test
df = pd.read_csv("classdata/simulated_housing_test.csv")
In [5]:
# looking to refresh our memory of the names of variables and such
df.head()
Out[5]:
application race rent
0 1 white 526.0
1 1 white 514.0
2 1 white 512.0
3 1 white 485.0
4 1 white 505.0
In [6]:
# share of all testers who were offered an application (application == 1)
general_prob_app = df.application.value_counts()[1] / len(df)
In [7]:
general_prob_app
Out[7]:
0.625
In [8]:
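# successes: black testers who were offered an application
number_black_app_offered = df[df.race == 'black'].application.value_counts()[1]
# trials: all black testers
number_black_testers = len(df[df.race == 'black'])
# two-sided binomial test of the black offer rate against the overall offer rate
p = binom_test(number_black_app_offered, number_black_testers, general_prob_app)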
In [9]:
print(p)
0.11180948620251761
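On newer SciPy releases the same test is available as `binomtest`, which returns a result object instead of a bare float. The sketch below assumes a SciPy version that provides `binomtest` and uses made-up counts (20 offers among 40 testers), not the class data. The one-sided variant is included only as an illustration of the `alternative` parameter; it is not part of the analysis above.

```python
from scipy.stats import binomtest

# Hypothetical counts, NOT from simulated_housing_test.csv
n_offered = 20
n_testers = 40
p_null = 0.625  # overall offer rate from this notebook

# Two-sided test: is this group's offer rate different from the overall rate?
result = binomtest(n_offered, n_testers, p_null, alternative="two-sided")
p_two_sided = result.pvalue

# One-sided test: is this group's offer rate lower than the overall rate?
p_less = binomtest(n_offered, n_testers, p_null, alternative="less").pvalue

print(p_two_sided, p_less)
```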