Notebook

Using Pandas for Analyzing Data - Grouping and Aggregating

In [2]:
%matplotlib
import numpy as np
import pandas as pd
Using matplotlib backend: Qt4Agg

Read the csv file of your choice

In [4]:
ver=pd.read_csv("ver.csv")

Melt data

In [3]:
melt = pd.melt(ver, id_vars = 'loan_purpose_name')

Obtain the first five rows of melted data

In [4]:
melt.iloc[0:5,:]
Out[4]:
loan_purpose_name variable value
0 Refinancing action_taken 1
1 Refinancing action_taken 1
2 Home purchase action_taken 1
3 Home purchase action_taken 1
4 Refinancing action_taken 1
In [5]:
melt = pd.melt(ver, id_vars = 'county_name')
melt.iloc[0:5,:]
Out[5]:
county_name variable value
0 Grand Isle County action_taken 1
1 Chittenden County action_taken 1
2 Chittenden County action_taken 1
3 Chittenden County action_taken 1
4 Chittenden County action_taken 1

Return descriptive statistics of the dataset

In [6]:
ver.describe()
Out[6]:
action_taken agency_code applicant_ethnicity applicant_income_000s applicant_race_1 applicant_sex census_tract_number co_applicant_ethnicity co_applicant_race_1 co_applicant_sex purchaser_type hud_median_family_income loan_amount_000s number_of_1_to_4_family_units number_of_owner_occupied_units minority_population population tract_to_msamd_income logloanamt logincome
count 34573.000000 34573.000000 34573.000000 34573.000000 34573.000000 34573.000000 34573.000000 34573.000000 34573.000000 34573.000000 34573.000000 34573.000000 34573.000000 34573.000000 34573.000000 34573.000000 34573.000000 34573.000000 34573.000000 34573.000000
mean 2.062824 5.868597 2.139213 110.978394 5.084170 1.483846 5411.158321 3.246898 6.214965 3.093165 1.628843 69460.284037 178.622682 1824.831632 1249.137593 5.512357 4156.241489 105.873500 4.979917 4.413553
std 1.605966 2.694220 0.397004 159.169449 0.593289 0.697764 4742.848056 1.420300 1.482264 1.572967 2.378320 6401.564966 116.451796 617.210275 544.292733 3.237173 1737.671376 23.755471 0.722132 0.697454
min 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 0.000000 63800.000000 1.000000 140.000000 53.000000 2.110000 299.000000 35.049999 0.000000 0.000000
25% 1.000000 3.000000 2.000000 53.000000 5.000000 1.000000 33.010000 2.000000 5.000000 2.000000 0.000000 63800.000000 109.000000 1383.000000 853.000000 3.530000 2850.000000 92.940002 4.691348 3.970292
50% 1.000000 5.000000 2.000000 80.000000 5.000000 1.000000 9532.000000 2.000000 5.000000 2.000000 0.000000 63800.000000 160.000000 1718.000000 1204.000000 4.380000 4054.000000 104.110001 5.075174 4.382027
75% 3.000000 9.000000 2.000000 120.000000 5.000000 2.000000 9609.000000 5.000000 8.000000 5.000000 3.000000 76700.000000 226.000000 2350.000000 1532.000000 6.680000 5094.000000 118.150002 5.420535 4.787492
max 6.000000 9.000000 4.000000 9999.000000 7.000000 4.000000 9713.000000 5.000000 8.000000 5.000000 9.000000 76700.000000 2885.000000 3311.000000 2874.000000 23.920000 8698.000000 203.619995 7.967280 9.210240

8 rows × 28 columns

Crosstab of the data by specified columns

In [7]:
pd.crosstab(ver['county_name'],ver['action_taken_name'])
Out[7]:
action_taken_name Application approved but not accepted Application denied by financial institution Application withdrawn by applicant File closed for incompleteness Loan originated Loan purchased by the institution
county_name
Addison County 96 274 154 43 1319 102
Bennington County 59 325 129 70 836 147
Caledonia County 56 197 122 28 844 25
Chittenden County 457 1004 847 264 7934 1187
Essex County 6 49 17 5 130 11
Franklin County 119 430 206 84 1844 168
Grand Isle County 32 95 42 25 390 42
Lamoille County 61 237 151 41 992 112
Orange County 42 212 81 24 555 69
Orleans County 43 295 122 49 861 23
Rutland County 103 456 193 92 1317 137
Washington County 140 472 306 98 2290 287
Windham County 82 421 195 73 1154 138
Windsor County 125 427 194 80 1386 223

Return a subset of the data

In [6]:
incomesubset = ver[(ver['applicant_income_000s'] > 0 ) & (ver['applicant_income_000s'] < 1000)]
incomesubset
Out[6]:
action_taken action_taken_name agency_code agency_abbr agency_name applicant_ethnicity applicant_ethnicity_name applicant_income_000s applicant_race_1 applicant_race_name_1 purchaser_type_name hud_median_family_income loan_amount_000s number_of_1_to_4_family_units number_of_owner_occupied_units minority_population population tract_to_msamd_income logloanamt logincome
0 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 161 5 White Fannie Mae (FNMA) 76700 200 2681 1186 7.120000 3272 80.459999 5.298317 5.081404
1 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 92 2 Asian Loan was not originated or was not sold in cal… 76700 279 2818 2345 4.580000 7144 154.830002 5.631212 4.521789
2 1 Loan originated 5 NCUA National Credit Union Administration 2 Not Hispanic or Latino 106 5 White Fannie Mae (FNMA) 76700 244 2447 2136 3.420000 6227 124.199997 5.497168 4.663439
3 1 Loan originated 7 HUD Department of Housing and Urban Development 2 Not Hispanic or Latino 49 5 White Commercial bank, savings bank or savings assoc… 76700 196 1810 1572 3.710000 5009 126.970001 5.278115 3.891820
4 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 74 5 White Loan was not originated or was not sold in cal… 76700 70 3311 2874 5.590000 8698 132.919998 4.248495 4.304065
5 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 56 5 White Fannie Mae (FNMA) 76700 222 1994 1725 4.440000 5070 121.349998 5.402677 4.025352
6 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 82 5 White Loan was not originated or was not sold in cal… 76700 149 2818 2345 4.580000 7144 154.830002 5.003946 4.406719
7 1 Loan originated 5 NCUA National Credit Union Administration 2 Not Hispanic or Latino 93 5 White Fannie Mae (FNMA) 63800 150 1562 1257 4.010000 3870 108.949997 5.010635 4.532599
8 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 133 5 White Loan was not originated or was not sold in cal… 76700 149 858 254 7.500000 2839 117.690002 5.003946 4.890349
9 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 59 5 White Loan was not originated or was not sold in cal… 76700 102 3311 2874 5.590000 8698 132.919998 4.624973 4.077537
10 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 79 5 White Fannie Mae (FNMA) 76700 149 2818 2345 4.580000 7144 154.830002 5.003946 4.369448
11 1 Loan originated 5 NCUA National Credit Union Administration 2 Not Hispanic or Latino 58 5 White Loan was not originated or was not sold in cal… 76700 70 1994 1725 4.440000 5070 121.349998 4.248495 4.060443
12 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 53 5 White Other type of purchaser 76700 189 872 748 3.910000 1968 104.000000 5.241747 3.970292
13 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 42 5 White Loan was not originated or was not sold in cal… 76700 71 872 748 3.910000 1968 104.000000 4.262680 3.737670
14 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 63 5 White Fannie Mae (FNMA) 76700 136 1092 627 16.440001 4026 67.860001 4.912655 4.143135
15 1 Loan originated 5 NCUA National Credit Union Administration 2 Not Hispanic or Latino 72 5 White Commercial bank, savings bank or savings assoc… 76700 154 2236 1672 8.530000 5625 97.500000 5.036953 4.276666
16 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 41 5 White Loan was not originated or was not sold in cal… 76700 86 858 254 7.500000 2839 117.690002 4.454347 3.713572
17 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 133 5 White Loan was not originated or was not sold in cal… 76700 53 2818 2345 4.580000 7144 154.830002 3.970292 4.890349
18 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 62 5 White Loan was not originated or was not sold in cal… 76700 120 1594 1167 4.100000 3754 146.199997 4.787492 4.127134
19 1 Loan originated 3 FDIC Federal Deposit Insurance Corporation 2 Not Hispanic or Latino 103 5 White Loan was not originated or was not sold in cal… 63800 125 2230 1440 6.840000 6898 109.459999 4.828314 4.634729
20 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 282 5 White Fannie Mae (FNMA) 76700 400 1994 1725 4.440000 5070 121.349998 5.991465 5.641907
21 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 59 5 White Loan was not originated or was not sold in cal… 76700 135 1594 1167 4.100000 3754 146.199997 4.905275 4.077537
22 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 219 5 White Fannie Mae (FNMA) 76700 184 1660 1408 3.530000 4081 113.629997 5.214936 5.389072
23 1 Loan originated 9 CFPB Consumer Financial Protection Bureau 2 Not Hispanic or Latino 62 5 White Fannie Mae (FNMA) 76700 182 2818 2345 4.580000 7144 154.830002 5.204007 4.127134
24 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 32 5 White Loan was not originated or was not sold in cal… 76700 100 1613 1506 7.420000 5376 116.830002 4.605170 3.465736
25 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 504 5 White Loan was not originated or was not sold in cal… 76700 750 1994 1725 4.440000 5070 121.349998 6.620073 6.222576
26 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 40 5 White Fannie Mae (FNMA) 76700 185 1994 1725 4.440000 5070 121.349998 5.220356 3.688879
27 1 Loan originated 5 NCUA National Credit Union Administration 2 Not Hispanic or Latino 91 5 White Loan was not originated or was not sold in cal… 76700 70 1359 589 7.560000 3451 62.650002 4.248495 4.510860
28 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 41 5 White Loan was not originated or was not sold in cal… 76700 187 1838 1487 14.830000 4580 71.849998 5.231109 3.713572
29 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 46 5 White Loan was not originated or was not sold in cal… 76700 205 1611 1498 6.910000 4935 118.150002 5.323010 3.828641
34542 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 120 7 Not applicable Freddie Mac (FHLMC) 63800 121 2332 670 3.360000 1876 120.169998 4.795791 4.787492
34543 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 100 7 Not applicable Freddie Mac (FHLMC) 63800 72 754 508 4.200000 2264 57.830002 4.276666 4.605170
34544 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 260 7 Not applicable Fannie Mae (FNMA) 63800 412 2647 984 4.380000 4314 103.959999 6.021023 5.560682
34545 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 150 7 Not applicable Fannie Mae (FNMA) 63800 288 1653 1017 3.540000 3048 165.229996 5.662960 5.010635
34546 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 60 7 Not applicable Fannie Mae (FNMA) 63800 87 1290 1030 4.430000 2954 108.470001 4.465908 4.094345
34547 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 48 7 Not applicable Freddie Mac (FHLMC) 63800 382 1514 509 3.990000 1705 134.589996 5.945421 3.871201
34548 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 18 7 Not applicable Fannie Mae (FNMA) 63800 144 2398 1012 2.410000 3320 119.320000 4.969813 2.890372
34549 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 75 7 Not applicable Fannie Mae (FNMA) 63800 71 1255 741 6.150000 3446 66.279999 4.262680 4.317488
34550 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 227 7 Not applicable Fannie Mae (FNMA) 63800 256 2285 1721 3.650000 5094 120.339996 5.545177 5.424950
34551 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 43 7 Not applicable Freddie Mac (FHLMC) 63800 70 1949 1478 4.110000 5064 118.570000 4.248495 3.761200
34552 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 58 7 Not applicable Life insurance company, credit union, mortgage… 63800 212 1294 878 5.590000 2773 85.080002 5.356586 4.060443
34553 6 Loan purchased by the institution 7 HUD Department of Housing and Urban Development 4 Not applicable 54 7 Not applicable Life insurance company, credit union, mortgage… 63800 173 1112 626 4.710000 1806 80.190002 5.153292 3.988984
34554 6 Loan purchased by the institution 7 HUD Department of Housing and Urban Development 4 Not applicable 70 7 Not applicable Life insurance company, credit union, mortgage… 63800 237 1823 1010 3.240000 3084 104.620003 5.468060 4.248495
34555 6 Loan purchased by the institution 7 HUD Department of Housing and Urban Development 4 Not applicable 91 7 Not applicable Fannie Mae (FNMA) 63800 125 911 448 3.180000 1353 139.160004 4.828314 4.510860
34556 6 Loan purchased by the institution 7 HUD Department of Housing and Urban Development 4 Not applicable 73 7 Not applicable Freddie Mac (FHLMC) 63800 300 911 448 3.180000 1353 139.160004 5.703782 4.290459
34557 6 Loan purchased by the institution 7 HUD Department of Housing and Urban Development 4 Not applicable 38 7 Not applicable Fannie Mae (FNMA) 63800 91 2485 1490 2.990000 4187 101.750000 4.510860 3.637586
34558 6 Loan purchased by the institution 7 HUD Department of Housing and Urban Development 4 Not applicable 54 7 Not applicable Fannie Mae (FNMA) 63800 177 911 448 3.180000 1353 139.160004 5.176150 3.988984
34559 6 Loan purchased by the institution 7 HUD Department of Housing and Urban Development 4 Not applicable 368 7 Not applicable Fannie Mae (FNMA) 63800 300 1514 509 3.990000 1705 134.589996 5.703782 5.908083
34560 6 Loan purchased by the institution 7 HUD Department of Housing and Urban Development 4 Not applicable 115 7 Not applicable Fannie Mae (FNMA) 63800 159 2647 984 4.380000 4314 103.959999 5.068904 4.744932
34561 6 Loan purchased by the institution 7 HUD Department of Housing and Urban Development 4 Not applicable 129 7 Not applicable Fannie Mae (FNMA) 63800 372 2064 1541 4.320000 4264 105.809998 5.918894 4.859812
34562 6 Loan purchased by the institution 7 HUD Department of Housing and Urban Development 4 Not applicable 94 7 Not applicable Fannie Mae (FNMA) 63800 221 2373 1537 4.520000 5227 117.389999 5.398163 4.543295
34563 6 Loan purchased by the institution 7 HUD Department of Housing and Urban Development 4 Not applicable 49 7 Not applicable Freddie Mac (FHLMC) 63800 200 906 594 3.780000 1719 140.630005 5.298317 3.891820
34564 6 Loan purchased by the institution 7 HUD Department of Housing and Urban Development 4 Not applicable 78 7 Not applicable Fannie Mae (FNMA) 63800 132 1718 1015 6.680000 4522 86.610001 4.882802 4.356709
34565 6 Loan purchased by the institution 7 HUD Department of Housing and Urban Development 4 Not applicable 74 7 Not applicable Loan was not originated or was not sold in cal… 63800 137 1276 827 3.460000 2344 93.720001 4.919981 4.304065
34567 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 241 7 Not applicable Fannie Mae (FNMA) 63800 371 1566 1201 3.070000 3393 123.000000 5.916202 5.484797
34568 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 64 7 Not applicable Freddie Mac (FHLMC) 63800 168 1349 794 2.510000 2031 139.820007 5.123964 4.158883
34569 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 689 7 Not applicable Fannie Mae (FNMA) 63800 320 2371 399 4.530000 1302 104.230003 5.768321 6.535241
34570 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 39 7 Not applicable Fannie Mae (FNMA) 63800 180 1246 603 2.880000 1769 95.379997 5.192957 3.663562
34571 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 103 7 Not applicable Freddie Mac (FHLMC) 63800 225 2032 1157 3.790000 3302 100.690002 5.416100 4.634729
34572 6 Loan purchased by the institution 9 CFPB Consumer Financial Protection Bureau 4 Not applicable 109 7 Not applicable Loan was not originated or was not sold in cal… 63800 260 2533 1545 4.870000 4391 150.600006 5.560682 4.691348

34432 rows × 46 columns

Look at the shape of the dataset

In [11]:
incomesubset.shape
Out[11]:
(34432, 46)

Query the data

In [5]:
qry1 = ver.query('(applicant_income_000s > 0) & (applicant_income_000s < 1000)') 
qry1.head(10)
Out[5]:
action_taken action_taken_name agency_code agency_abbr agency_name applicant_ethnicity applicant_ethnicity_name applicant_income_000s applicant_race_1 applicant_race_name_1 purchaser_type_name hud_median_family_income loan_amount_000s number_of_1_to_4_family_units number_of_owner_occupied_units minority_population population tract_to_msamd_income logloanamt logincome
0 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 161 5 White Fannie Mae (FNMA) 76700 200 2681 1186 7.12 3272 80.459999 5.298317 5.081404
1 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 92 2 Asian Loan was not originated or was not sold in cal… 76700 279 2818 2345 4.58 7144 154.830002 5.631212 4.521789
2 1 Loan originated 5 NCUA National Credit Union Administration 2 Not Hispanic or Latino 106 5 White Fannie Mae (FNMA) 76700 244 2447 2136 3.42 6227 124.199997 5.497168 4.663439
3 1 Loan originated 7 HUD Department of Housing and Urban Development 2 Not Hispanic or Latino 49 5 White Commercial bank, savings bank or savings assoc… 76700 196 1810 1572 3.71 5009 126.970001 5.278115 3.891820
4 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 74 5 White Loan was not originated or was not sold in cal… 76700 70 3311 2874 5.59 8698 132.919998 4.248495 4.304065
5 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 56 5 White Fannie Mae (FNMA) 76700 222 1994 1725 4.44 5070 121.349998 5.402677 4.025352
6 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 82 5 White Loan was not originated or was not sold in cal… 76700 149 2818 2345 4.58 7144 154.830002 5.003946 4.406719
7 1 Loan originated 5 NCUA National Credit Union Administration 2 Not Hispanic or Latino 93 5 White Fannie Mae (FNMA) 63800 150 1562 1257 4.01 3870 108.949997 5.010635 4.532599
8 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 133 5 White Loan was not originated or was not sold in cal… 76700 149 858 254 7.50 2839 117.690002 5.003946 4.890349
9 1 Loan originated 1 OCC Office of the Comptroller of the Currency 2 Not Hispanic or Latino 59 5 White Loan was not originated or was not sold in cal… 76700 102 3311 2874 5.59 8698 132.919998 4.624973 4.077537

10 rows × 46 columns

Look at the shape of the data

In [16]:
qry1.shape
Out[16]:
(34432, 2)

Group data and obtain the mean

In [17]:
grouped1 = ver.groupby(['applicant_race_name_1','loan_purpose_name']).mean()
grouped1
Out[17]:
action_taken agency_code applicant_ethnicity applicant_income_000s applicant_race_1 applicant_sex census_tract_number co_applicant_ethnicity co_applicant_race_1 co_applicant_sex purchaser_type hud_median_family_income loan_amount_000s number_of_1_to_4_family_units number_of_owner_occupied_units minority_population population tract_to_msamd_income logloanamt logincome
applicant_race_name_1 loan_purpose_name
American Indian or Alaska Native Home improvement 1.692308 4.538462 2.000000 102.076923 1 1.461538 6661.923077 3.384615 6.461538 3.307692 0.461538 67769.230769 64.615385 1641.000000 1195.692308 4.487692 3862.846154 101.520000 3.511875 4.063642
Home purchase 2.578947 4.964912 2.017544 98.842105 1 1.385965 3227.267368 3.456140 5.701754 3.280702 2.421053 72400.000000 176.912281 1669.473684 1188.631579 7.057544 4095.192982 104.059122 5.010189 4.301923
Refinancing 2.382114 4.853659 1.991870 89.593496 1 1.308943 3465.027398 3.333333 5.154472 3.227642 1.577236 72085.365854 179.951220 1849.016260 1309.455285 7.006748 4550.991870 102.170162 5.075414 4.342989
Asian Home improvement 2.615385 5.461538 2.000000 88.769231 2 1.538462 3715.233077 3.153846 5.461538 2.923077 1.384615 71738.461538 106.000000 1728.076923 1268.538462 7.818462 4061.076923 108.128461 4.236656 4.369807
Home purchase 2.093458 5.663551 2.009346 109.822430 2 1.327103 3158.742710 3.196262 5.196262 2.962617 2.401869 72480.373832 191.598131 1629.392523 1174.971963 8.887570 4183.915888 107.478037 5.167938 4.458829
Refinancing 1.886792 5.919811 2.009434 100.179245 2 1.320755 2479.789198 3.367925 5.353774 3.198113 1.599057 73414.150943 190.235849 1754.419811 1281.566038 8.399057 4402.056604 107.227122 5.157283 4.408965
Black or African American Home improvement 2.133333 6.333333 2.000000 72.733333 3 1.266667 4513.604000 3.000000 5.000000 2.933333 0.933333 70680.000000 77.800000 1935.800000 1430.266667 6.383333 4678.733333 94.532000 3.770529 4.206888
Home purchase 2.405405 5.324324 1.918919 71.405405 3 1.594595 5467.111622 3.864865 6.648649 3.729730 2.027027 69378.378378 183.216216 1576.270270 1086.054054 8.095676 4112.756757 97.002702 5.164684 4.100749
Refinancing 2.240000 6.493333 2.000000 104.813333 3 1.346667 4760.790267 3.733333 6.453333 3.666667 1.306667 70336.000000 192.746667 1789.013333 1211.760000 7.225600 4385.773333 101.771866 5.093208 4.454963
Information not provided by applicant in mail, Internet, or telephone application Home improvement 1.993266 5.400673 2.956229 106.292929 6 2.494949 6323.345926 3.653199 6.693603 3.407407 0.508418 68230.303030 97.000000 1730.208754 1193.952862 4.868754 3877.255892 104.221818 3.938899 4.320571
Home purchase 2.006674 6.649611 2.921023 143.097887 6 2.483871 5290.150067 3.731924 6.753059 3.543938 1.961068 69625.806452 213.720801 1755.659622 1156.041157 5.911279 3925.969967 108.059610 5.188407 4.568488
Refinancing 2.061628 6.404154 2.922029 129.036772 6 2.426285 5229.722918 3.647600 6.674157 3.413347 1.324481 69707.558733 190.172966 1820.762683 1238.682329 5.542196 4098.827375 108.765121 5.056481 4.526939
Native Hawaiian or Other Pacific Islander Home improvement 2.750000 6.000000 1.750000 88.250000 4 1.500000 2494.760000 4.250000 7.000000 4.000000 0.000000 73475.000000 82.000000 2067.500000 1354.250000 6.192500 4558.000000 94.682499 3.765636 4.323032
Home purchase 2.000000 5.600000 2.000000 101.700000 4 1.300000 2931.300000 3.900000 6.800000 3.700000 2.900000 72830.000000 188.500000 1580.300000 853.700000 10.998000 3290.100000 87.122999 5.151190 4.378073
Refinancing 2.000000 6.578947 2.052632 101.052632 4 1.105263 3565.161579 3.105263 6.105263 3.052632 2.736842 71947.368421 188.894737 1618.578947 1124.473684 7.741053 4173.947368 102.384737 5.176827 4.525947
Not applicable Home improvement 6.000000 9.000000 4.000000 80.000000 7 4.000000 9604.000000 4.000000 7.000000 4.000000 8.000000 63800.000000 177.000000 2285.000000 1721.000000 3.650000 5094.000000 120.339996 5.176150 4.382027
Home purchase 6.000000 8.184211 4.000000 138.934211 7 4.000000 6578.648289 4.131579 7.105263 4.131579 2.460526 67873.684211 205.276316 1804.868421 1086.894737 5.513158 3552.131579 110.636710 5.227249 4.632007
Refinancing 5.941003 8.303835 4.000000 119.200590 7 4.000000 5036.917640 4.005900 7.000000 3.994100 2.162242 69964.601770 193.781711 1847.976401 1223.746313 5.534897 4092.135693 110.617080 5.139062 4.539565
White Home improvement 1.745717 5.116822 2.009346 86.008178 5 1.356308 6024.937099 3.136293 6.104361 2.961059 0.762461 68622.429907 106.154206 1799.087617 1239.919393 5.064965 4083.890966 103.346051 4.149336 4.230948
Home purchase 2.085156 5.544688 2.003750 105.491875 5 1.333438 5526.173983 3.414687 6.389531 3.276719 2.402031 69306.687500 186.630625 1827.103281 1242.711094 5.636495 4173.643906 103.814998 5.078404 4.346730
Refinancing 2.016838 5.919052 2.005547 112.099946 5 1.305778 5384.464213 3.101811 6.082077 2.957243 1.523293 69496.745373 182.811791 1835.042315 1260.212901 5.438464 4180.187423 106.312604 5.043062 4.434124

21 rows × 28 columns

View the number of rows and columns

In [12]:
isin.shape
Out[12]:
(332, 46)

Look at the unique columns

In [11]:
isin["action_taken_name"].unique()
Out[11]:
array(['Loan originated', 'Application approved but not accepted',
       'Application denied by financial institution',
       'Application withdrawn by applicant',
       'Loan purchased by the institution',
       'File closed for incompleteness'], dtype=object)

Check a boolean condition

In [19]:
(ver.ix[:,'applicant_income_000s'] > 9000).any()
Out[19]:
True

Get descriptive statistics for a specified column

In [21]:
ver.applicant_income_000s.describe()  
Out[21]:
count    34573.000000
mean       110.978394
std        159.169449
min          1.000000
25%         53.000000
50%         80.000000
75%        120.000000
max       9999.000000
Name: applicant_income_000s, dtype: float64

Group data and obtain the mean values

In [7]:
grpagg = ver.groupby('purchaser_type_name').aggregate(np.mean)
grpagg
Out[7]:
action_taken agency_code applicant_ethnicity applicant_income_000s applicant_race_1 applicant_sex census_tract_number co_applicant_ethnicity co_applicant_race_1 co_applicant_sex purchaser_type hud_median_family_income loan_amount_000s number_of_1_to_4_family_units number_of_owner_occupied_units minority_population population tract_to_msamd_income logloanamt logincome
purchaser_type_name
Affiliate institution 1.528017 5.737069 2.284483 105.068966 5.206897 1.633621 6321.220022 3.405172 6.359914 3.278017 8 68220.474138 173.433190 1737.933190 1130.023707 5.249526 3731.543103 101.554957 5.010802 4.402359
Commercial bank, savings bank or savings association 1.010060 5.676861 2.103421 109.758149 5.021730 1.437022 3850.269272 3.224145 6.184708 3.083300 6 71571.146881 206.836217 1889.682495 1354.028571 6.168286 4578.362173 108.301641 5.246646 4.457538
Fannie Mae (FNMA) 2.327684 6.989869 2.223456 119.940581 5.186830 1.598480 5720.001245 3.172803 6.157023 3.029807 1 69049.970777 192.947010 1829.317748 1223.547049 5.536172 4077.419248 109.000758 5.155562 4.541301
Freddie Mac (FHLMC) 1.365875 5.514688 2.124362 107.644151 5.063676 1.450132 3984.795703 3.093052 6.054529 2.930519 3 71378.891821 189.969921 1853.363061 1336.735972 5.970041 4451.449077 108.233497 5.149315 4.482605
Ginnie Mae (GNMA) 3.434002 8.241816 2.130940 71.964097 5.083421 1.459345 5587.681056 3.470961 6.460401 3.329461 2 69235.163675 189.784583 1832.951426 1299.232313 5.288870 4338.180570 101.326906 5.184177 4.182622
Life insurance company, credit union, mortgage bank, or finance company 1.089820 6.190120 2.037425 108.353293 5.007485 1.381737 4274.688323 3.103293 6.067365 2.958084 7 70983.832335 198.562874 1888.513473 1368.953593 5.875883 4594.164671 105.469865 5.152413 4.415373
Loan was not originated or was not sold in calendar year covered by register 2.377108 5.649931 2.127564 112.223058 5.070445 1.477744 6007.775863 3.303235 6.270335 3.145590 0 68655.421620 165.419795 1806.948742 1212.542660 5.282917 4016.822505 104.288070 4.813353 4.365391
Other type of purchaser 1.364277 3.180964 2.113984 95.171563 5.081081 1.365452 5186.463561 3.419506 6.364277 3.256169 9 69742.185664 181.618096 1776.894242 1208.082256 5.477838 4099.076381 103.976792 5.099453 4.268535
Private securitization 1.000000 8.117647 2.066667 143.796078 5.047059 1.298039 5307.249843 3.074510 6.039216 2.968627 5 69617.647059 204.490196 1855.874510 1238.372549 5.180627 4076.400000 111.218353 5.215828 4.646789

9 rows × 28 columns

Group data and get the sums

In [27]:
grpsum = ver.groupby('purchaser_type_name').aggregate(np.sum)
grpsum
Out[27]:
action_taken agency_code applicant_ethnicity applicant_income_000s applicant_race_1 applicant_sex census_tract_number co_applicant_ethnicity co_applicant_race_1 co_applicant_sex purchaser_type hud_median_family_income loan_amount_000s number_of_1_to_4_family_units number_of_owner_occupied_units minority_population population tract_to_msamd_income logloanamt logincome
purchaser_type_name
Affiliate institution 709 2662 1060 48752 2416 758 2.933046e+06 1580 2951 1521 3712 31654300 80473 806401 524331 2435.780010 1731436 47121.499916 2325.012159 2042.694717
Commercial bank, savings bank or savings association 2510 14107 5227 272749 12479 3571 9.567919e+06 8012 15369 7662 14910 177854300 513988 4695861 3364761 15328.190102 11377230 269129.579021 13037.915147 11076.982225
Fannie Mae (FNMA) 11948 35879 11413 615655 26624 8205 2.936077e+07 16286 31604 15552 5133 354433500 990397 9389888 6280467 28417.170130 20929393 559500.888683 26463.502065 23310.496198
Freddie Mac (FHLMC) 7765 31351 12077 611957 28787 8244 2.265356e+07 17584 34420 16660 17055 405789000 1079979 10536369 7599344 33939.680262 25306488 615307.427902 29273.854834 25483.610086
Ginnie Mae (GNMA) 3252 7805 2018 68150 4814 1382 5.291534e+06 3287 6118 3153 1894 65565700 179726 1735805 1230373 5008.560018 4108257 95956.579788 4909.415775 3960.943481
Life insurance company, credit union, mortgage bank, or finance company 728 4135 1361 72380 3345 923 2.855492e+06 2073 4053 1976 4676 47417200 132640 1261527 914461 3925.090032 3068902 70453.869778 3441.811703 2949.469388
Loan was not originated or was not sold in calendar year covered by register 42990 102179 38477 2029554 91699 26725 1.086506e+08 59739 113399 56888 0 1241633300 2991617 32678668 21928834 95541.550403 72644235 1886049.754768 87049.494275 78948.102214
Other type of purchaser 1161 2707 1799 80991 4324 1162 4.413680e+06 2910 5416 2771 7659 59350600 154557 1512137 1028078 4661.640025 3488314 88484.249763 4339.634709 3632.523179
Private securitization 255 2070 527 36668 1287 331 1.353349e+06 784 1540 757 1275 17752500 52145 473248 315785 1321.060006 1039482 28360.679974 1330.036030 1184.931078

9 rows × 28 columns

Return boolean values for a specified criteria

In [28]:
criterion = ver['applicant_race_name_1'].map(lambda x: x.startswith('W'))
criterion.head()
Out[28]:
0     True
1    False
2     True
3     True
4     True
Name: applicant_race_name_1, dtype: bool