Now we will slice and dice our dataset using grouping, aggregation functions, summary statistics, and various querying and indexing operations. To really understand the bike sharing data, we need to look at natural groupings such as starting landmark, subscription type, and the time of day of the start of a bicycle trip. The work done in this section will set up some beautiful and informative plots in the next section of the analysis. Most of the tables below have plotting representations in the visualization section.
Grouping and aggregating or reducing data and obtaining summary measures and statistics is the heart of data analysis and work together with the visualization process including those of an exploratory nature. The iterative process of data discovery at best leads to new questions and insights being discovered which inform and refine the data carpentry process. This is where the interactive repl excels. Think of it as a circular inquistive process of data munging, grouping and aggregating, and visualizing. We will explore using SQL inspired methods of grouping operations in Pandas and the use of mult-indexes and hierarchial indexing.
import numpy as np
import pandas as pd
from datetime import datetime, time
%matplotlib inline
import seaborn as sns
from ggplot import *
print pd.__version__
print np.__version__
0.14.1 1.9.0
%load_ext ipycache
It's time to explore various ways of grouping and aggregating data. Let's review some terminology. Dockcount refers to the number of docks installed, subscription type is either customer or subscriber, and landmark refers to an area covered by San Francisco, San Jose, Redwood City, or Palo Alto.
In the data munging notebook, we created a column indicating whether the duration of a segment of a bicycle trip was under thirty minutes or over thirty minutes as this is the alloted time before being charged overtime fees. So, what percentage of total trips were over thirty minutes?
dmerge4.thirtymin.value_counts()
in_thirty 134912 over_thirty 9103 dtype: int64
Only 6% of total rides had a duration over thirty minutes so it looks like people are good about returning bikes on time. Or, were they? When we look at the statistics based on landmark and subscriber type, we will get a better picture of who and where have longer duration times.
dmerge4.groupby(['thirtymin','timeofday'])[['duration_f']].mean()
duration_f | ||
---|---|---|
thirtymin | timeofday | |
in_thirty | evening | 9.736020 |
mid_day | 10.387775 | |
morning | 8.914918 | |
night | 9.641814 | |
wee_hours | 10.150168 | |
over_thirty | evening | 161.593282 |
mid_day | 164.873060 | |
morning | 218.527759 | |
night | 271.402983 | |
wee_hours | 345.829388 |
Here is another way to check on how many observations out of the total had trip durations over thirty minutes
len(dmerge4[dmerge4['thirtymin']=='over_thirty'])
9103
Only 6878 out 144015 observations or 4.8% were round trips so it would seem that people are really using the bicycles to get from point A to point B
len(dmerge4[dmerge4['startstation']== dmerge4['endstation']])
6878
Let's look at average duration by subscription type. Customers are defined has having either a twenty-four hour or three day pass while Subscribers are defined as having an annual pass. Average durations for subscribers is almost ten minutes while it is sixty minutes for customers. Perhaps customers are recreational riders who take longer trips while subscribers are shorter distance commuters. We'll learn more when we slice and dice the data later.
dmerge4.groupby(['subscriptiontype']).agg({'duration_f' : np.mean})
duration_f | |
---|---|
subscriptiontype | |
Customer | 60.491974 |
Subscriber | 9.832834 |
dmerge4.groupby('subscriptiontype').size()
subscriptiontype Customer 30368 Subscriber 113647 dtype: int64
dmerge4[dmerge4['subscriptiontype'] =="Subscriber"].landmark.value_counts()
San Francisco 102735 San Jose 7219 Mountain View 2157 Palo Alto 905 Redwood City 631 dtype: int64
So, we've learned that the majority of total subscribers are in San Francisco and take very short trips.
dmerge4.groupby('startstation').duration.sum().order(ascending=False)[:5]
startstation Harry Bridges Plaza (Ferry Building) 12579544 Embarcadero at Sansome 10294998 San Francisco Caltrain (Townsend at 4th) 8457338 Market at 4th 7706843 Powell Street BART 6100023 Name: duration, dtype: int64
The table below indicates that San Francisco had the highest total duration but keep in mind that the majority of total observations are also in San Francisco so of course the total would be higher. Using summary statistics will give a better idea of bicycling behavior than looking at total values. Although, for some purposes, it is good to know the total as well. For example, bicycle wear and tear based on total durations.
dmerge4.groupby("landmark")['duration_i','subscriptiontype'].aggregate(np.sum)
duration_i | |
---|---|
landmark | |
Mountain View | 121895 |
Palo Alto | 131810 |
Redwood City | 38924 |
San Francisco | 2454177 |
San Jose | 207658 |
%%cache station.pkl stationdata
stationdata
54% of all docks are located in San Francisco so this may partially explain why total duration and bicycle trips taken in San Francisco are higher and why durations are higher in Palo Alto where only 6% of the docks are located.
stationdata.groupby('landmark')[['dockcount']].sum()/stationdata[['dockcount']].sum()
dockcount | |
---|---|
landmark | |
Mountain View | 0.095823 |
Palo Alto | 0.061425 |
Redwood City | 0.094185 |
San Francisco | 0.544636 |
San Jose | 0.203931 |
stationdata.groupby(['landmark','name'])[['dockcount']].sum()
dockcount | ||
---|---|---|
landmark | name | |
Mountain View | Castro Street and El Camino Real | 11 |
Evelyn Park and Ride | 15 | |
Mountain View Caltrain Station | 23 | |
Mountain View City Hall | 15 | |
Rengstorff Avenue / California Street | 15 | |
San Antonio Caltrain Station | 23 | |
San Antonio Shopping Center | 15 | |
Palo Alto | California Ave Caltrain Station | 15 |
Cowper at University | 11 | |
Palo Alto Caltrain Station | 23 | |
Park at Olive | 15 | |
University and Emerson | 11 | |
Redwood City | Broadway at Main | 15 |
Franklin at Maple | 15 | |
Mezes Park | 15 | |
Redwood City Caltrain Station | 25 | |
Redwood City Medical Center | 15 | |
Redwood City Public Library | 15 | |
San Mateo County Center | 15 | |
San Francisco | 2nd at Folsom | 19 |
2nd at South Park | 15 | |
2nd at Townsend | 27 | |
5th at Howard | 15 | |
Beale at Market | 19 | |
Broadway St at Battery St | 15 | |
Civic Center BART (7th at Market) | 23 | |
Clay at Battery | 15 | |
Commercial at Montgomery | 15 | |
Davis at Jackson | 15 | |
Embarcadero at Bryant | 15 | |
… | … | |
Market at Sansome | 27 | |
Mechanics Plaza (Market at Battery) | 19 | |
Post at Kearney | 19 | |
Powell Street BART | 19 | |
Powell at Post (Union Square) | 19 | |
San Francisco Caltrain (Townsend at 4th) | 19 | |
San Francisco Caltrain 2 (330 Townsend) | 23 | |
San Francisco City Hall | 19 | |
South Van Ness at Market | 19 | |
Spear at Folsom | 19 | |
Steuart at Market | 23 | |
Temporary Transbay Terminal (Howard at Beale) | 23 | |
Townsend at 7th | 15 | |
Washington at Kearney | 15 | |
Yerba Buena Center of the Arts (3rd @ Howard) | 19 | |
San Jose | Adobe on Almaden | 19 |
Arena Green / SAP Center | 19 | |
Japantown | 15 | |
MLK Library | 19 | |
Paseo de San Antonio | 15 | |
SJSU - San Salvador at 9th | 15 | |
SJSU 4th at San Carlos | 19 | |
San Jose City Hall | 15 | |
San Jose Civic Center | 15 | |
San Jose Diridon Caltrain Station | 27 | |
San Jose Government Center | 15 | |
San Pedro Square | 15 | |
San Salvador at 1st | 15 | |
Santa Clara at Almaden | 11 | |
St James Park | 15 |
69 rows × 1 columns
When looking at number of bicycle trips based on landmark, we see that 129,853 or 90% of total observations in the dataset were in San Francisco. 6% of total observations were in San Jose while only 1% were in Palo Alto.
dmerge4.groupby("landmark")[['duration_i']].aggregate(np.size)
duration_i | |
---|---|
landmark | |
Mountain View | 2728 |
Palo Alto | 1706 |
Redwood City | 793 |
San Francisco | 129853 |
San Jose | 8935 |
Looking at average duration by landmark, San Francisco has the lowest duration followed by San Jose while Palo Alto had the highest duration. We can hypothesize that this may be partly due to fewer docks in Palo Alto or greater distances to travel as well as the subscription profile of Palo Alto bicycle users.
dmerge4.groupby("landmark")[['duration_i']].aggregate(np.mean)
duration_i | |
---|---|
landmark | |
Mountain View | 44.682918 |
Palo Alto | 77.262603 |
Redwood City | 49.084489 |
San Francisco | 18.899656 |
San Jose | 23.240963 |
timelandmark = dmerge4.groupby(['timeofday','landmark']).agg({'dockcount' : np.mean, 'duration_f': np.mean, }).reset_index()
timelandmark
timelandmark.dockcount.max()
timeofday | landmark | duration_f | dockcount | |
---|---|---|---|---|
0 | evening | Mountain View | 34.096655 | 18.079393 |
1 | evening | Palo Alto | 66.580671 | 14.565495 |
2 | evening | Redwood City | 50.191700 | 18.198381 |
3 | evening | San Francisco | 16.224655 | 19.349130 |
4 | evening | San Jose | 18.737437 | 17.385234 |
5 | mid_day | Mountain View | 82.030353 | 18.465995 |
6 | mid_day | Palo Alto | 109.684100 | 14.439331 |
7 | mid_day | Redwood City | 76.925088 | 17.982456 |
8 | mid_day | San Francisco | 27.520457 | 19.436506 |
9 | mid_day | San Jose | 23.002496 | 16.776488 |
10 | morning | Mountain View | 29.786291 | 19.715105 |
11 | morning | Palo Alto | 52.318829 | 16.978308 |
12 | morning | Redwood City | 22.378051 | 22.647059 |
13 | morning | San Francisco | 15.172821 | 19.948991 |
14 | morning | San Jose | 18.945759 | 19.782161 |
15 | night | Mountain View | 126.402177 | 20.061224 |
16 | night | Palo Alto | 83.482982 | 14.017544 |
17 | night | Redwood City | 65.530455 | 20.000000 |
18 | night | San Francisco | 19.055272 | 19.432370 |
19 | night | San Jose | 35.876195 | 18.286119 |
20 | wee_hours | Mountain View | 81.724706 | 21.117647 |
21 | wee_hours | Palo Alto | 150.501852 | 16.185185 |
22 | wee_hours | Redwood City | 13.585000 | 20.000000 |
23 | wee_hours | San Francisco | 47.574147 | 18.926380 |
24 | wee_hours | San Jose | 127.363475 | 17.474576 |
The highest dockcount was in Redwood City in the morning.
timelandmark.dockcount.max()
22.647058823529413
The highest average duration was in Palo Alto in the wee hours.
timelandmark.duration_f.order(ascending=False)[:5]
21 150.501852 24 127.363475 15 126.402177 6 109.684100 16 83.482982 Name: duration_f, dtype: float64
Pandas produces nice readable multi-index tables.
np.round(dmerge4.groupby(['timeofday','landmark']).agg({'dockcount' : np.mean, 'duration_i': np.mean, }))
dockcount | duration_i | ||
---|---|---|---|
timeofday | landmark | ||
evening | Mountain View | 18 | 34 |
Palo Alto | 15 | 67 | |
Redwood City | 18 | 50 | |
San Francisco | 19 | 16 | |
San Jose | 17 | 19 | |
mid_day | Mountain View | 18 | 82 |
Palo Alto | 14 | 110 | |
Redwood City | 18 | 77 | |
San Francisco | 19 | 28 | |
San Jose | 17 | 23 | |
morning | Mountain View | 20 | 30 |
Palo Alto | 17 | 52 | |
Redwood City | 23 | 22 | |
San Francisco | 20 | 15 | |
San Jose | 20 | 19 | |
night | Mountain View | 20 | 126 |
Palo Alto | 14 | 83 | |
Redwood City | 20 | 66 | |
San Francisco | 19 | 19 | |
San Jose | 18 | 36 | |
wee_hours | Mountain View | 21 | 82 |
Palo Alto | 16 | 151 | |
Redwood City | 20 | 14 | |
San Francisco | 19 | 48 | |
San Jose | 17 | 127 |
Let's obtain mean values by day of the month by creating a grouped object from the day component of the timestamp and applying an aggregation function. We want to know if there is higher demand or trip duration at certain days of the month.
day_means = dmerge4.groupby('day').aggregate(np.mean)
day_means
tripid | duration | startterminal | endterminal | bike | max_temperature_f | mean_temperature_f | min_temperaturef | max_dew_point_f | meandew_point_f | … | station_id | lat | long | dockcount | hour | month | hourdiff | durationminutes | duration_i | duration_f | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
day | |||||||||||||||||||||
1 | 73878.288738 | 1816.236080 | 56.575208 | 56.547493 | 432.526833 | 69.336105 | 60.620559 | 51.125976 | 51.413706 | 46.602923 | … | 56.575208 | 37.743179 | -122.356093 | 19.177879 | 13.309146 | 8.684051 | 0.312169 | 30.270601 | 30.271353 | 30.270668 |
2 | 80351.443674 | 1273.880168 | 57.369151 | 57.382025 | 433.812825 | 66.287447 | 58.723942 | 50.482298 | 51.430800 | 47.488240 | … | 57.369151 | 37.747079 | -122.360427 | 19.446150 | 13.231245 | 8.443179 | 0.175043 | 21.231336 | 21.227532 | 21.231300 |
3 | 98238.235528 | 1135.100522 | 57.273354 | 57.356113 | 436.655799 | 63.523302 | 56.361338 | 49.060397 | 47.898433 | 42.093626 | … | 57.273354 | 37.743680 | -122.357780 | 19.405016 | 13.084222 | 7.262696 | 0.238662 | 18.918342 | 18.918704 | 18.918305 |
4 | 94721.030403 | 1101.818939 | 57.161696 | 57.186871 | 437.853021 | 64.680093 | 56.829783 | 48.094694 | 44.157242 | 39.132843 | … | 57.161696 | 37.742673 | -122.356006 | 19.465918 | 13.127808 | 8.324942 | 0.201588 | 18.363649 | 18.365414 | 18.363654 |
5 | 95326.050161 | 1173.435091 | 56.902045 | 56.801292 | 432.594833 | 65.471259 | 56.496878 | 47.178902 | 44.220883 | 37.509580 | … | 56.902045 | 37.742166 | -122.354954 | 19.422390 | 13.254682 | 8.448439 | 0.231001 | 19.557252 | 19.561895 | 19.557175 |
6 | 94048.015109 | 1202.085258 | 57.029570 | 57.127131 | 439.644507 | 68.017052 | 58.334556 | 48.195985 | 49.679258 | 42.862724 | … | 57.029570 | 37.744666 | -122.357646 | 19.334988 | 13.356357 | 7.755666 | 0.209152 | 20.034754 | 20.031945 | 20.034757 |
7 | 89738.111635 | 1146.163488 | 57.201200 | 57.143561 | 438.932719 | 71.251125 | 62.068138 | 52.430684 | 52.893936 | 45.455967 | … | 57.201200 | 37.745178 | -122.358731 | 19.379687 | 13.105207 | 7.541033 | 0.214699 | 19.102725 | 19.097493 | 19.102704 |
8 | 84325.793095 | 1386.051549 | 57.085836 | 57.125325 | 436.400804 | 67.203358 | 59.201939 | 50.980137 | 51.354694 | 47.276425 | … | 57.085836 | 37.746047 | -122.359317 | 19.342871 | 13.099551 | 8.004493 | 0.278080 | 23.100859 | 23.105226 | 23.100865 |
9 | 87579.505991 | 1144.179954 | 57.035023 | 57.111521 | 439.260829 | 64.475576 | 56.658986 | 48.722811 | 47.882949 | 43.535714 | … | 57.035023 | 37.744060 | -122.357155 | 19.301843 | 13.283641 | 8.159217 | 0.232028 | 19.069666 | 19.067512 | 19.069680 |
10 | 100164.634270 | 941.555230 | 57.384660 | 57.602459 | 434.971507 | 63.406518 | 56.257026 | 48.694184 | 49.707650 | 46.169789 | … | 57.384660 | 37.746870 | -122.359709 | 19.375878 | 13.193404 | 7.282397 | 0.194770 | 15.692587 | 15.689500 | 15.692576 |
11 | 96293.107869 | 982.990743 | 57.493862 | 57.524452 | 436.820487 | 62.358221 | 56.467498 | 50.252767 | 48.957939 | 46.354397 | … | 57.493862 | 37.746363 | -122.359027 | 19.456027 | 13.155162 | 8.425639 | 0.219763 | 16.383179 | 16.385389 | 16.383139 |
12 | 101958.670921 | 1180.773218 | 57.565482 | 57.644021 | 441.782250 | 63.251522 | 56.464559 | 49.364422 | 50.882388 | 46.859415 | … | 57.565482 | 37.748906 | -122.361073 | 19.408404 | 13.290006 | 8.206362 | 0.233458 | 19.679554 | 19.679560 | 19.679543 |
13 | 108374.918449 | 1113.472481 | 57.505760 | 57.708173 | 439.695922 | 65.535564 | 57.216310 | 48.298775 | 50.977144 | 46.897605 | … | 57.505760 | 37.747396 | -122.361078 | 19.364600 | 13.081185 | 7.314683 | 0.224173 | 18.557875 | 18.563357 | 18.557855 |
14 | 107914.239321 | 1331.676583 | 57.442601 | 57.464912 | 435.837338 | 68.113272 | 59.215294 | 50.194889 | 49.682113 | 45.583715 | … | 57.442601 | 37.747588 | -122.360382 | 19.458810 | 13.210717 | 7.004195 | 0.222349 | 22.194610 | 22.194699 | 22.194594 |
15 | 100907.198866 | 1433.743385 | 57.403612 | 57.506510 | 440.304704 | 71.000420 | 61.042839 | 50.341033 | 47.715456 | 41.582108 | … | 57.403612 | 37.751215 | -122.364489 | 19.319194 | 13.206636 | 7.464301 | 0.265855 | 23.895723 | 23.896472 | 23.895792 |
16 | 101172.588106 | 1202.279385 | 58.006985 | 57.971463 | 437.107763 | 71.047895 | 60.637597 | 49.693474 | 49.862303 | 42.463381 | … | 58.006985 | 37.750837 | -122.364332 | 19.475354 | 13.297945 | 7.656955 | 0.217322 | 20.037990 | 20.040511 | 20.038014 |
17 | 99680.420043 | 1112.376430 | 57.745106 | 57.859081 | 438.638884 | 68.343090 | 59.417523 | 49.817794 | 49.477418 | 44.306842 | … | 57.745106 | 37.749572 | -122.362625 | 19.422563 | 13.311882 | 7.760225 | 0.187633 | 18.539607 | 18.532468 | 18.539643 |
18 | 103648.507761 | 1175.414987 | 57.896343 | 57.921677 | 439.719536 | 65.938805 | 57.212310 | 48.155219 | 50.161285 | 45.867440 | … | 57.896343 | 37.749978 | -122.363236 | 19.535593 | 13.156824 | 8.155754 | 0.191258 | 19.590250 | 19.589652 | 19.590268 |
19 | 109548.449822 | 1014.061177 | 57.204903 | 57.208045 | 434.562958 | 67.482296 | 59.006704 | 50.093861 | 51.146868 | 45.393882 | … | 57.204903 | 37.745790 | -122.358802 | 19.439556 | 12.929395 | 7.885397 | 0.254766 | 16.901020 | 16.898387 | 16.901001 |
20 | 112970.702050 | 1087.413808 | 56.822654 | 57.027832 | 436.416613 | 65.947573 | 56.888457 | 47.428479 | 50.418554 | 45.236893 | … | 56.822654 | 37.740981 | -122.353057 | 19.434520 | 13.388997 | 7.395901 | 0.189213 | 18.123563 | 18.130744 | 18.123556 |
21 | 121094.538382 | 954.694196 | 56.866029 | 56.913252 | 435.960266 | 65.016434 | 56.554816 | 47.660495 | 49.190347 | 43.927606 | … | 56.866029 | 37.740547 | -122.353294 | 19.523403 | 13.082172 | 6.843561 | 0.169128 | 15.911570 | 15.910547 | 15.911610 |
22 | 111480.550011 | 1387.006294 | 56.852102 | 57.083390 | 429.968982 | 66.873455 | 58.093504 | 48.799506 | 47.600135 | 42.593392 | … | 56.852102 | 37.743398 | -122.355859 | 19.332209 | 13.213082 | 7.225893 | 0.271297 | 23.116772 | 23.113059 | 23.116777 |
23 | 104470.365624 | 1258.963995 | 56.603943 | 56.477925 | 435.362195 | 67.403986 | 58.234248 | 48.505787 | 49.957565 | 44.531719 | … | 56.603943 | 37.737760 | -122.349702 | 19.325761 | 13.089156 | 7.512859 | 0.253751 | 20.982733 | 20.980283 | 20.982722 |
24 | 114953.246078 | 1067.051404 | 57.187242 | 57.069777 | 437.834228 | 65.759496 | 57.860652 | 49.440132 | 50.049752 | 46.008258 | … | 57.187242 | 37.741583 | -122.354357 | 19.540875 | 13.002064 | 6.421965 | 0.200041 | 17.784190 | 17.784476 | 17.784242 |
25 | 107939.261435 | 1299.720000 | 57.650335 | 57.680383 | 440.331866 | 64.618756 | 57.094163 | 48.879234 | 49.399043 | 45.445550 | … | 57.650335 | 37.750310 | -122.363495 | 19.331866 | 13.258947 | 7.657225 | 0.262201 | 21.662000 | 21.659522 | 21.661989 |
26 | 101270.468742 | 1384.907149 | 57.155632 | 57.270905 | 438.265102 | 66.128198 | 57.792931 | 49.027697 | 48.403060 | 44.431285 | … | 57.155632 | 37.744375 | -122.357659 | 19.326035 | 13.073068 | 8.448958 | 0.225007 | 23.081786 | 23.084674 | 23.081862 |
27 | 121824.804303 | 1052.402609 | 57.767910 | 57.717098 | 432.240787 | 65.005264 | 57.621881 | 50.223850 | 48.471275 | 45.037308 | … | 57.767910 | 37.747197 | -122.359888 | 19.470588 | 13.196384 | 6.535134 | 0.177386 | 17.540043 | 17.534218 | 17.540053 |
28 | 120628.722106 | 1437.819277 | 57.629387 | 57.609481 | 439.961498 | 65.838397 | 58.754584 | 51.148245 | 52.253798 | 47.578575 | … | 57.629387 | 37.748278 | -122.361829 | 19.413305 | 13.337873 | 6.576218 | 0.270299 | 23.963655 | 23.969094 | 23.963620 |
29 | 88695.445887 | 1615.017825 | 57.283422 | 57.482047 | 433.769544 | 67.336389 | 60.667176 | 53.387319 | 53.541380 | 49.878024 | … | 57.283422 | 37.746765 | -122.360245 | 19.294882 | 13.633308 | 7.565317 | 0.260250 | 26.916964 | 26.918513 | 26.916883 |
30 | 87779.856531 | 1533.849251 | 57.189507 | 57.323769 | 440.363812 | 66.780514 | 59.447323 | 51.803640 | 53.407709 | 49.461456 | … | 57.189507 | 37.745982 | -122.358819 | 19.537473 | 13.335974 | 7.986938 | 0.234904 | 25.564154 | 25.560385 | 25.564158 |
31 | 98845.505424 | 1693.046905 | 57.640715 | 57.788449 | 437.254627 | 66.005105 | 57.257498 | 48.310785 | 49.848117 | 44.868220 | … | 57.640715 | 37.748891 | -122.362839 | 19.306318 | 13.128909 | 7.504467 | 0.272176 | 28.217448 | 28.219847 | 28.217553 |
31 rows × 36 columns
We can also create a grouped object using Pandas grouper. We can actually specify a resampling and a grouping using the Grouper. So, in this case I want the frequency to be monthly and group by landmark and subscription type.
three_grouper = dmerge4.groupby([pd.Grouper(freq='M',key='startdate'),"landmark","subscriptiontype"])[['duration_f']].mean().reset_index()
three_grouper
startdate | landmark | subscriptiontype | duration_f | |
---|---|---|---|---|
0 | 2013-08-31 | Mountain View | Customer | 39.107407 |
1 | 2013-08-31 | Mountain View | Subscriber | 108.082692 |
2 | 2013-08-31 | Palo Alto | Customer | 43.618750 |
3 | 2013-08-31 | Palo Alto | Subscriber | 16.942083 |
4 | 2013-08-31 | Redwood City | Customer | 101.255000 |
5 | 2013-08-31 | Redwood City | Subscriber | 9.770526 |
6 | 2013-08-31 | San Francisco | Customer | 67.455091 |
7 | 2013-08-31 | San Francisco | Subscriber | 12.192543 |
8 | 2013-08-31 | San Jose | Customer | 56.080820 |
9 | 2013-08-31 | San Jose | Subscriber | 11.547216 |
10 | 2013-09-30 | Mountain View | Customer | 182.846638 |
11 | 2013-09-30 | Mountain View | Subscriber | 8.582500 |
12 | 2013-09-30 | Palo Alto | Customer | 160.404739 |
13 | 2013-09-30 | Palo Alto | Subscriber | 9.989910 |
14 | 2013-09-30 | Redwood City | Customer | 70.426667 |
15 | 2013-09-30 | Redwood City | Subscriber | 6.197643 |
16 | 2013-09-30 | San Francisco | Customer | 46.922383 |
17 | 2013-09-30 | San Francisco | Subscriber | 10.614401 |
18 | 2013-09-30 | San Jose | Customer | 72.983308 |
19 | 2013-09-30 | San Jose | Subscriber | 9.735506 |
20 | 2013-10-31 | Mountain View | Customer | 112.790000 |
21 | 2013-10-31 | Mountain View | Subscriber | 12.124853 |
22 | 2013-10-31 | Palo Alto | Customer | 83.578580 |
23 | 2013-10-31 | Palo Alto | Subscriber | 12.032308 |
24 | 2013-10-31 | Redwood City | Customer | 131.333600 |
25 | 2013-10-31 | Redwood City | Subscriber | 28.843174 |
26 | 2013-10-31 | San Francisco | Customer | 52.653610 |
27 | 2013-10-31 | San Francisco | Subscriber | 9.647517 |
28 | 2013-10-31 | San Jose | Customer | 104.471317 |
29 | 2013-10-31 | San Jose | Subscriber | 8.921913 |
… | … | … | … | … |
40 | 2013-12-31 | Mountain View | Customer | 240.204133 |
41 | 2013-12-31 | Mountain View | Subscriber | 11.058692 |
42 | 2013-12-31 | Palo Alto | Customer | 145.392346 |
43 | 2013-12-31 | Palo Alto | Subscriber | 11.520840 |
44 | 2013-12-31 | Redwood City | Customer | 323.863000 |
45 | 2013-12-31 | Redwood City | Subscriber | 3.947755 |
46 | 2013-12-31 | San Francisco | Customer | 57.743898 |
47 | 2013-12-31 | San Francisco | Subscriber | 9.461105 |
48 | 2013-12-31 | San Jose | Customer | 42.284125 |
49 | 2013-12-31 | San Jose | Subscriber | 20.031462 |
50 | 2014-01-31 | Mountain View | Customer | 290.265333 |
51 | 2014-01-31 | Mountain View | Subscriber | 6.763415 |
52 | 2014-01-31 | Palo Alto | Customer | 114.681616 |
53 | 2014-01-31 | Palo Alto | Subscriber | 31.803939 |
54 | 2014-01-31 | Redwood City | Customer | 297.106923 |
55 | 2014-01-31 | Redwood City | Subscriber | 7.313182 |
56 | 2014-01-31 | San Francisco | Customer | 54.000413 |
57 | 2014-01-31 | San Francisco | Subscriber | 9.358774 |
58 | 2014-01-31 | San Jose | Customer | 53.236957 |
59 | 2014-01-31 | San Jose | Subscriber | 8.794458 |
60 | 2014-02-28 | Mountain View | Customer | 152.230787 |
61 | 2014-02-28 | Mountain View | Subscriber | 6.197237 |
62 | 2014-02-28 | Palo Alto | Customer | 164.791429 |
63 | 2014-02-28 | Palo Alto | Subscriber | 10.650678 |
64 | 2014-02-28 | Redwood City | Customer | 342.915000 |
65 | 2014-02-28 | Redwood City | Subscriber | 5.328372 |
66 | 2014-02-28 | San Francisco | Customer | 59.656208 |
67 | 2014-02-28 | San Francisco | Subscriber | 9.561857 |
68 | 2014-02-28 | San Jose | Customer | 125.013058 |
69 | 2014-02-28 | San Jose | Subscriber | 8.391254 |
70 rows × 4 columns
By unstacking the data created in the grouper object, we now have duration by customer and subscriber by landmark. This is a nice way to present the data in tabular format.
three_grouper_unstack = dmerge4.groupby([pd.Grouper(freq='M',key='startdate'),"landmark","subscriptiontype"])[['duration']].mean().unstack().reset_index()
three_grouper_unstack.head()
startdate | landmark | duration | ||
---|---|---|---|---|
subscriptiontype | Customer | Subscriber | ||
0 | 2013-08-31 | Mountain View | 2346.444444 | 6485.000000 |
1 | 2013-08-31 | Palo Alto | 2617.075000 | 1016.541667 |
2 | 2013-08-31 | Redwood City | 6075.250000 | 586.157895 |
3 | 2013-08-31 | San Francisco | 4047.301633 | 731.557012 |
4 | 2013-08-31 | San Jose | 3364.852459 | 692.814433 |
three_grouper_unstack = dmerge4.groupby([pd.Grouper(freq='M',key='startdate'),"landmark","subscriptiontype"])[['duration']].mean().unstack()
three_grouper_unstack.head()
duration | |||
---|---|---|---|
subscriptiontype | Customer | Subscriber | |
startdate | landmark | ||
2013-08-31 | Mountain View | 2346.444444 | 6485.000000 |
Palo Alto | 2617.075000 | 1016.541667 | |
Redwood City | 6075.250000 | 586.157895 | |
San Francisco | 4047.301633 | 731.557012 | |
San Jose | 3364.852459 | 692.814433 |
Average duration by month faceted by landmark
grouperl = dmerge4.groupby([pd.Grouper(freq='M',key='startdate'),'landmark']).mean()[['duration_f']]
grouperl.head(10)
duration_f | ||
---|---|---|
startdate | landmark | |
2013-08-31 | Mountain View | 72.944340 |
Palo Alto | 33.615000 | |
Redwood City | 25.680870 | |
San Francisco | 44.081851 | |
San Jose | 28.740570 | |
2013-09-30 | Mountain View | 67.345988 |
Palo Alto | 108.553665 | |
Redwood City | 17.353211 | |
San Francisco | 24.351433 | |
San Jose | 29.502793 |
Grouping on a variable with many values is also possible. In this case, we can see summary statistics on every single duration as an example.
dmerge4.groupby([pd.Grouper(freq='M',key='startdate'),'duration_f']).sum()[:5]
tripid | duration | startterminal | endterminal | bike | max_temperature_f | mean_temperature_f | min_temperaturef | max_dew_point_f | meandew_point_f | … | station_id | lat | long | dockcount | day | hour | month | hourdiff | durationminutes | duration_i | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
startdate | duration_f | |||||||||||||||||||||
2013-08-31 | 1.03 | 7416 | 62 | 60 | 60 | 511 | 71 | 64 | 57 | 57 | 56 | … | 60 | 37.804770 | -122.403234 | 15 | 31 | 20 | 8 | 0 | 1.033333 | 1 |
1.05 | 4576 | 63 | 66 | 66 | 520 | 74 | 68 | 61 | 61 | 58 | … | 66 | 37.774814 | -122.418954 | 19 | 29 | 14 | 8 | 0 | 1.050000 | 1 | |
1.13 | 7250 | 68 | 35 | 35 | 100 | 76 | 69 | 62 | 61 | 59 | … | 35 | 37.444521 | -122.163093 | 11 | 31 | 16 | 8 | 0 | 1.133333 | 1 | |
1.15 | 13518 | 138 | 122 | 122 | 965 | 149 | 133 | 117 | 118 | 114 | … | 122 | 75.590069 | -244.799470 | 34 | 61 | 35 | 16 | 0 | 2.300000 | 2 | |
1.17 | 4607 | 70 | 10 | 10 | 661 | 81 | 72 | 63 | 62 | 61 | … | 10 | 37.337391 | -121.886995 | 15 | 29 | 14 | 8 | 0 | 1.166667 | 1 |
5 rows × 36 columns
We can be very specific and the level of granularity can be very detailed. We can pinpoint the average duration for a very specific profile or demographic which can be very powerful for a data-driven decision making including marketing or civic infrastructure goals for the bike sharing system.
This table gives an indication of monthly durations based on starting landmark and endstation. In other words, we get an idea of how long it takes to reach by bicycle and the differences between customers and subscribers all in one table. We can identify which customers are keeping the bicycles for longer periods of time.
np.round(dmerge4.groupby([pd.Grouper(freq='M',key='startdate'),"landmark","subscriptiontype","endstation"])[['duration_f']].mean())
duration_f | ||||
---|---|---|---|---|
startdate | landmark | subscriptiontype | endstation | |
2013-08-31 | Mountain View | Customer | Evelyn Park and Ride | 26 |
Mountain View Caltrain Station | 61 | |||
Mountain View City Hall | 11 | |||
Park at Olive | 34 | |||
Rengstorff Avenue / California Street | 16 | |||
San Antonio Caltrain Station | 52 | |||
Subscriber | Evelyn Park and Ride | 22 | ||
Mountain View Caltrain Station | 8 | |||
Mountain View City Hall | 215 | |||
Park at Olive | 453 | |||
Rengstorff Avenue / California Street | 10 | |||
San Antonio Caltrain Station | 15 | |||
Palo Alto | Customer | California Ave Caltrain Station | 19 | |
Cowper at University | 24 | |||
Palo Alto Caltrain Station | 105 | |||
Park at Olive | 17 | |||
Rengstorff Avenue / California Street | 25 | |||
San Antonio Caltrain Station | 17 | |||
University and Emerson | 62 | |||
Subscriber | California Ave Caltrain Station | 13 | ||
Cowper at University | 4 | |||
Palo Alto Caltrain Station | 25 | |||
Park at Olive | 15 | |||
San Antonio Caltrain Station | 30 | |||
University and Emerson | 17 | |||
Redwood City | Customer | Franklin at Maple | 31 | |
Palo Alto Caltrain Station | 279 | |||
Redwood City Caltrain Station | 64 | |||
Subscriber | Broadway at Main | 4 | ||
Franklin at Maple | 7 | |||
… | … | … | … | … |
2014-02-28 | San Francisco | Subscriber | Yerba Buena Center of the Arts (3rd @ Howard) | 8 |
San Jose | Customer | Adobe on Almaden | 28 | |
Arena Green / SAP Center | 73 | |||
Japantown | 19 | |||
MLK Library | 360 | |||
Paseo de San Antonio | 26 | |||
SJSU - San Salvador at 9th | 19 | |||
SJSU 4th at San Carlos | 236 | |||
San Jose City Hall | 56 | |||
San Jose Civic Center | 106 | |||
San Jose Diridon Caltrain Station | 177 | |||
San Pedro Square | 181 | |||
San Salvador at 1st | 72 | |||
Santa Clara at Almaden | 37 | |||
St James Park | 87 | |||
Subscriber | Adobe on Almaden | 5 | ||
Arena Green / SAP Center | 5 | |||
Japantown | 11 | |||
MLK Library | 18 | |||
Paseo de San Antonio | 8 | |||
SJSU - San Salvador at 9th | 12 | |||
SJSU 4th at San Carlos | 6 | |||
San Jose City Hall | 10 | |||
San Jose Civic Center | 8 | |||
San Jose Diridon Caltrain Station | 7 | |||
San Jose Government Center | 19 | |||
San Pedro Square | 7 | |||
San Salvador at 1st | 6 | |||
Santa Clara at Almaden | 5 | |||
St James Park | 9 |
1001 rows × 1 columns
Let's look at the breakdown of average duration by both landmark and subscription type. We can see some clear differences between customers and subscribers. In most cases, subscribers are taking short trips.
np.round(dmerge4.groupby([pd.Grouper(freq='M',key='startdate'),"landmark","subscriptiontype"])[['duration_f']].mean()).unstack()
duration_f | |||
---|---|---|---|
subscriptiontype | Customer | Subscriber | |
startdate | landmark | ||
2013-08-31 | Mountain View | 39 | 108 |
Palo Alto | 44 | 17 | |
Redwood City | 101 | 10 | |
San Francisco | 67 | 12 | |
San Jose | 56 | 12 | |
2013-09-30 | Mountain View | 183 | 9 |
Palo Alto | 160 | 10 | |
Redwood City | 70 | 6 | |
San Francisco | 47 | 11 | |
San Jose | 73 | 10 | |
2013-10-31 | Mountain View | 113 | 12 |
Palo Alto | 84 | 12 | |
Redwood City | 131 | 29 | |
San Francisco | 53 | 10 | |
San Jose | 104 | 9 | |
2013-11-30 | Mountain View | 135 | 6 |
Palo Alto | 249 | 10 | |
Redwood City | 175 | 6 | |
San Francisco | 59 | 10 | |
San Jose | 81 | 8 | |
2013-12-31 | Mountain View | 240 | 11 |
Palo Alto | 145 | 12 | |
Redwood City | 324 | 4 | |
San Francisco | 58 | 9 | |
San Jose | 42 | 20 | |
2014-01-31 | Mountain View | 290 | 7 |
Palo Alto | 115 | 32 | |
Redwood City | 297 | 7 | |
San Francisco | 54 | 9 | |
San Jose | 53 | 9 | |
2014-02-28 | Mountain View | 152 | 6 |
Palo Alto | 165 | 11 | |
Redwood City | 343 | 5 | |
San Francisco | 60 | 10 | |
San Jose | 125 | 8 |
Let's look at each level of duration on an hourly basis.
dmerge4.groupby([pd.Grouper(freq='H',key='startdate'),'duration_f']).mean().head()
tripid | duration | startterminal | endterminal | bike | max_temperature_f | mean_temperature_f | min_temperaturef | max_dew_point_f | meandew_point_f | … | station_id | lat | long | dockcount | day | hour | month | hourdiff | durationminutes | duration_i | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
startdate | duration_f | |||||||||||||||||||||
2013-08-29 09:00:00 | 2.90 | 4069 | 174 | 64 | 64 | 288 | 74 | 68 | 61 | 61 | 58 | … | 64 | 37.782259 | -122.392738 | 15 | 29 | 9 | 8 | 0 | 2.900000 | 3 |
2.97 | 4086 | 178 | 45 | 45 | 379 | 74 | 68 | 61 | 61 | 58 | … | 45 | 37.794231 | -122.402923 | 15 | 29 | 9 | 8 | 0 | 2.966667 | 3 | |
3.63 | 4081 | 218 | 27 | 27 | 150 | 80 | 70 | 64 | 65 | 61 | … | 27 | 37.389218 | -122.081896 | 15 | 29 | 9 | 8 | 0 | 3.633333 | 4 | |
4.78 | 4084 | 287 | 27 | 27 | 138 | 80 | 70 | 64 | 65 | 61 | … | 27 | 37.389218 | -122.081896 | 15 | 29 | 9 | 8 | 0 | 4.783333 | 5 | |
12.73 | 4080 | 764 | 66 | 69 | 315 | 74 | 68 | 61 | 61 | 58 | … | 66 | 37.774814 | -122.418954 | 19 | 29 | 9 | 8 | 0 | 12.733333 | 13 |
5 rows × 36 columns
In this table, we are looking at hourly data by landmark.
hourlytry = dmerge4.groupby([pd.Grouper(freq='H',key='startdate'),'landmark'],as_index=False)[['duration_f']].mean()
hourlytry
startdate | landmark | duration_f | |
---|---|---|---|
0 | 2013-08-29 09:00:00 | Mountain View | 4.205000 |
1 | 2013-08-29 09:00:00 | Redwood City | 63.820000 |
2 | 2013-08-29 09:00:00 | San Francisco | 14.695556 |
3 | 2013-08-29 10:00:00 | Mountain View | 6.325000 |
4 | 2013-08-29 10:00:00 | Palo Alto | 18.775000 |
5 | 2013-08-29 10:00:00 | San Francisco | 37.356250 |
6 | 2013-08-29 10:00:00 | San Jose | 21.225000 |
7 | 2013-08-29 11:00:00 | San Francisco | 40.284737 |
8 | 2013-08-29 11:00:00 | San Jose | 2.957500 |
9 | 2013-08-29 12:00:00 | Palo Alto | 11.658000 |
10 | 2013-08-29 12:00:00 | Redwood City | 8.353333 |
11 | 2013-08-29 12:00:00 | San Francisco | 16.223173 |
12 | 2013-08-29 12:00:00 | San Jose | 12.387500 |
13 | 2013-08-29 13:00:00 | Mountain View | 1628.550000 |
14 | 2013-08-29 13:00:00 | Palo Alto | 23.544286 |
15 | 2013-08-29 13:00:00 | San Francisco | 53.158529 |
16 | 2013-08-29 13:00:00 | San Jose | 26.937273 |
17 | 2013-08-29 14:00:00 | Mountain View | 15.880000 |
18 | 2013-08-29 14:00:00 | San Francisco | 24.205366 |
19 | 2013-08-29 14:00:00 | San Jose | 6.595000 |
20 | 2013-08-29 15:00:00 | Mountain View | 878.300000 |
21 | 2013-08-29 15:00:00 | Palo Alto | 11.620000 |
22 | 2013-08-29 15:00:00 | San Francisco | 23.209492 |
23 | 2013-08-29 15:00:00 | San Jose | 16.200000 |
24 | 2013-08-29 16:00:00 | Mountain View | 4.450000 |
25 | 2013-08-29 16:00:00 | San Francisco | 14.293684 |
26 | 2013-08-29 16:00:00 | San Jose | 5.440000 |
27 | 2013-08-29 17:00:00 | Mountain View | 12.970000 |
28 | 2013-08-29 17:00:00 | Palo Alto | 10.422000 |
29 | 2013-08-29 17:00:00 | Redwood City | 8.925000 |
… | … | … | … |
9781 | 2014-02-28 11:00:00 | San Jose | 5.270000 |
9782 | 2014-02-28 12:00:00 | Palo Alto | 3.320000 |
9783 | 2014-02-28 12:00:00 | San Francisco | 6.805385 |
9784 | 2014-02-28 12:00:00 | San Jose | 5.766667 |
9785 | 2014-02-28 13:00:00 | Palo Alto | 4.850000 |
9786 | 2014-02-28 13:00:00 | San Francisco | 7.604667 |
9787 | 2014-02-28 13:00:00 | San Jose | 43.920000 |
9788 | 2014-02-28 14:00:00 | San Francisco | 9.297692 |
9789 | 2014-02-28 14:00:00 | San Jose | 5.252000 |
9790 | 2014-02-28 15:00:00 | Mountain View | 4.330000 |
9791 | 2014-02-28 15:00:00 | San Francisco | 14.967143 |
9792 | 2014-02-28 15:00:00 | San Jose | 12.970000 |
9793 | 2014-02-28 16:00:00 | Mountain View | 4.470000 |
9794 | 2014-02-28 16:00:00 | San Francisco | 49.152857 |
9795 | 2014-02-28 16:00:00 | San Jose | 5.080000 |
9796 | 2014-02-28 17:00:00 | Mountain View | 5.625000 |
9797 | 2014-02-28 17:00:00 | San Francisco | 9.254571 |
9798 | 2014-02-28 17:00:00 | San Jose | 6.200000 |
9799 | 2014-02-28 18:00:00 | Mountain View | 4.740000 |
9800 | 2014-02-28 18:00:00 | Palo Alto | 13.415000 |
9801 | 2014-02-28 18:00:00 | San Francisco | 8.176364 |
9802 | 2014-02-28 19:00:00 | Mountain View | 9.435000 |
9803 | 2014-02-28 19:00:00 | San Francisco | 132.491364 |
9804 | 2014-02-28 19:00:00 | San Jose | 11.815000 |
9805 | 2014-02-28 20:00:00 | Mountain View | 4.620000 |
9806 | 2014-02-28 20:00:00 | Redwood City | 24.380000 |
9807 | 2014-02-28 20:00:00 | San Francisco | 7.406667 |
9808 | 2014-02-28 21:00:00 | San Francisco | 7.203333 |
9809 | 2014-02-28 22:00:00 | San Francisco | 8.572500 |
9810 | 2014-02-28 23:00:00 | San Francisco | 539.150000 |
9811 rows × 3 columns
Group operations on a datetime index
dmerge4.groupby(dmerge4.index.month).agg(['count','sum','mean','min','max'])[['duration_f']]
duration_f | |||||
---|---|---|---|---|---|
count | sum | mean | min | max | |
1 | 24428 | 412751.48 | 16.896655 | 1.00 | 9772.60 |
2 | 19024 | 332260.54 | 17.465335 | 1.02 | 3077.22 |
8 | 2102 | 90672.74 | 43.136413 | 1.03 | 4022.23 |
9 | 25243 | 663946.54 | 26.302204 | 1.00 | 9958.62 |
10 | 29105 | 578249.15 | 19.867691 | 1.00 | 7156.40 |
11 | 24219 | 500883.27 | 20.681418 | 1.00 | 12037.27 |
12 | 19894 | 375728.66 | 18.886532 | 1.00 | 10322.03 |
Let's make a table of average dockcount by each precipitation level and subscription type
i = pd.DataFrame(dmerge4.groupby(['precipitation_in','subscriptiontype']).dockcount.mean()).reset_index()
i
precipitation_in | subscriptiontype | dockcount | |
---|---|---|---|
0 | 0 | Customer | 18.940762 |
1 | 0 | Subscriber | 19.472218 |
2 | 0.01 | Customer | 19.660550 |
3 | 0.01 | Subscriber | 19.833333 |
4 | 0 | Customer | 18.942500 |
5 | 0 | Subscriber | 19.528072 |
6 | 0.01 | Customer | 18.397083 |
7 | 0.01 | Subscriber | 19.411516 |
8 | 0.02 | Customer | 19.000000 |
9 | 0.02 | Subscriber | 18.341772 |
10 | 0.03 | Customer | 12.000000 |
11 | 0.03 | Subscriber | 17.666667 |
12 | 0.04 | Customer | 17.166667 |
13 | 0.04 | Subscriber | 18.132075 |
14 | 0.05 | Subscriber | 17.545455 |
15 | 0.06 | Customer | 19.915663 |
16 | 0.06 | Subscriber | 18.818182 |
17 | 0.07 | Customer | 11.000000 |
18 | 0.07 | Subscriber | 15.000000 |
19 | 0.08 | Customer | 17.666667 |
20 | 0.08 | Subscriber | 19.000000 |
21 | 0.09 | Customer | 19.952381 |
22 | 0.09 | Subscriber | 19.588079 |
23 | 0.1 | Customer | 16.600000 |
24 | 0.1 | Subscriber | 19.102564 |
25 | 0.12 | Subscriber | 25.000000 |
26 | 0.14 | Customer | 17.000000 |
27 | 0.14 | Subscriber | 19.000000 |
28 | 0.16 | Subscriber | 15.000000 |
29 | 0.17 | Customer | 13.000000 |
… | … | … | … |
48 | 0.45 | Subscriber | 18.428571 |
49 | 0.47 | Customer | 21.000000 |
50 | 0.47 | Subscriber | 19.105263 |
51 | 0.49 | Customer | 11.000000 |
52 | 0.49 | Subscriber | 11.000000 |
53 | 0.53 | Subscriber | 17.000000 |
54 | 0.61 | Subscriber | 17.608696 |
55 | 0.63 | Customer | 18.714286 |
56 | 0.63 | Subscriber | 19.572519 |
57 | 0.64 | Customer | 20.161290 |
58 | 0.64 | Subscriber | 19.519608 |
59 | 0.66 | Customer | 16.756098 |
60 | 0.66 | Subscriber | 17.800000 |
61 | 0.68 | Subscriber | 18.407407 |
62 | 0.74 | Customer | 18.902439 |
63 | 0.74 | Subscriber | 19.868009 |
64 | 0.78 | Customer | 15.000000 |
65 | 0.78 | Subscriber | 18.200000 |
66 | 0.83 | Subscriber | 15.000000 |
67 | 0.85 | Customer | 19.355556 |
68 | 0.85 | Subscriber | 18.407407 |
69 | 0.88 | Customer | 15.000000 |
70 | 0.88 | Subscriber | 18.000000 |
71 | 0.97 | Customer | 19.000000 |
72 | 0.97 | Subscriber | 18.130435 |
73 | 1.06 | Subscriber | 13.285714 |
74 | 1.71 | Customer | 20.333333 |
75 | 1.71 | Subscriber | 17.000000 |
76 | T | Customer | 19.195219 |
77 | T | Subscriber | 19.671620 |
78 rows × 3 columns
Whate are the total number of observations in each of the initial six months. Other than first month of operation, the distribution of observations distribution is pretty evenly spread out.
dmerge4.groupby(dmerge4.index.month).size()
1 24428 2 19024 8 2102 9 25243 10 29105 11 24219 12 19894 dtype: int64
dmerge4.groupby(dmerge4.index.month).size().order(ascending=False)[:5]
10 29105 9 25243 1 24428 11 24219 12 19894 dtype: int64
Let's create a multi-level index without grouping because we'll often want to perform queries on the full dataset.
dmerge4twolevels = dmerge4.set_index(['landmark','subscriptiontype'])
dmerge4twolevels
lubstartdate | startstation | tripid | duration | startdate | startterminal | enddate | endstation | endterminal | bike | … | day | hour | month | timeofday | diff | hourdiff | durationminutes | duration_i | duration_f | thirtymin | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
landmark | subscriptiontype | |||||||||||||||||||||
San Francisco | Subscriber | 2013-08-29 | 2nd at Folsom | 4636 | 186 | 2013-08-29 15:11:00 | 62 | 2013-08-29 15:14:00 | 2nd at Townsend | 61 | 366 | … | 29 | 15 | 8 | evening | 00:03:00 | 0 | 3.100000 | 3 | 3.10 | in_thirty |
Subscriber | 2013-08-29 | 2nd at Folsom | 4820 | 813 | 2013-08-29 17:35:00 | 62 | 2013-08-29 17:48:00 | Townsend at 7th | 65 | 409 | … | 29 | 17 | 8 | evening | 00:13:00 | 0 | 13.550000 | 14 | 13.55 | in_thirty | |
Customer | 2013-08-29 | 2nd at Folsom | 5001 | 236 | 2013-08-29 20:00:00 | 62 | 2013-08-29 20:04:00 | Market at Sansome | 77 | 567 | … | 29 | 20 | 8 | night | 00:04:00 | 0 | 3.933333 | 4 | 3.93 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4812 | 186 | 2013-08-29 17:30:00 | 62 | 2013-08-29 17:33:00 | 2nd at Folsom | 62 | 409 | … | 29 | 17 | 8 | evening | 00:03:00 | 0 | 3.100000 | 3 | 3.10 | in_thirty | |
Customer | 2013-08-29 | 2nd at Folsom | 4946 | 827 | 2013-08-29 19:07:00 | 62 | 2013-08-29 19:21:00 | Embarcadero at Vallejo | 48 | 342 | … | 29 | 19 | 8 | evening | 00:14:00 | 0 | 13.783333 | 14 | 13.78 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4390 | 825 | 2013-08-29 12:33:00 | 62 | 2013-08-29 12:47:00 | 2nd at South Park | 64 | 331 | … | 29 | 12 | 8 | mid_day | 00:14:00 | 0 | 13.750000 | 14 | 13.75 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4939 | 816 | 2013-08-29 19:01:00 | 62 | 2013-08-29 19:14:00 | Harry Bridges Plaza (Ferry Building) | 50 | 539 | … | 29 | 19 | 8 | evening | 00:13:00 | 0 | 13.600000 | 14 | 13.60 | in_thirty | |
Customer | 2013-08-29 | 2nd at Folsom | 4949 | 1524 | 2013-08-29 19:11:00 | 62 | 2013-08-29 19:36:00 | Commercial at Montgomery | 45 | 400 | … | 29 | 19 | 8 | evening | 00:25:00 | 0 | 25.400000 | 25 | 25.40 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4582 | 335 | 2013-08-29 14:14:00 | 62 | 2013-08-29 14:20:00 | 2nd at Folsom | 62 | 342 | … | 29 | 14 | 8 | mid_day | 00:06:00 | 0 | 5.583333 | 6 | 5.58 | in_thirty | |
Customer | 2013-08-29 | 2nd at Folsom | 4951 | 1409 | 2013-08-29 19:13:00 | 62 | 2013-08-29 19:36:00 | Commercial at Montgomery | 45 | 380 | … | 29 | 19 | 8 | evening | 00:23:00 | 0 | 23.483333 | 23 | 23.48 | in_thirty | |
Customer | 2013-08-29 | 2nd at Folsom | 4945 | 797 | 2013-08-29 19:08:00 | 62 | 2013-08-29 19:21:00 | Embarcadero at Vallejo | 48 | 406 | … | 29 | 19 | 8 | evening | 00:13:00 | 0 | 13.283333 | 13 | 13.28 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4938 | 840 | 2013-08-29 19:01:00 | 62 | 2013-08-29 19:15:00 | Harry Bridges Plaza (Ferry Building) | 50 | 607 | … | 29 | 19 | 8 | evening | 00:14:00 | 0 | 14.000000 | 14 | 14.00 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4557 | 130 | 2013-08-29 13:57:00 | 64 | 2013-08-29 13:59:00 | 2nd at South Park | 64 | 371 | … | 29 | 13 | 8 | mid_day | 00:02:00 | 0 | 2.166667 | 2 | 2.17 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4069 | 174 | 2013-08-29 09:08:00 | 64 | 2013-08-29 09:11:00 | 2nd at South Park | 64 | 288 | … | 29 | 9 | 8 | morning | 00:03:00 | 0 | 2.900000 | 3 | 2.90 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4564 | 1693 | 2013-08-29 14:04:00 | 64 | 2013-08-29 14:32:00 | 2nd at South Park | 64 | 272 | … | 29 | 14 | 8 | mid_day | 00:28:00 | 0 | 28.216667 | 28 | 28.22 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4465 | 419 | 2013-08-29 13:11:00 | 64 | 2013-08-29 13:18:00 | Post at Kearney | 47 | 498 | … | 29 | 13 | 8 | mid_day | 00:07:00 | 0 | 6.983333 | 7 | 6.98 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4566 | 1699 | 2013-08-29 14:04:00 | 64 | 2013-08-29 14:33:00 | 2nd at South Park | 64 | 371 | … | 29 | 14 | 8 | mid_day | 00:29:00 | 0 | 28.316667 | 28 | 28.32 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 5083 | 755 | 2013-08-29 22:05:00 | 64 | 2013-08-29 22:18:00 | Harry Bridges Plaza (Ferry Building) | 50 | 270 | … | 29 | 22 | 8 | night | 00:13:00 | 0 | 12.583333 | 13 | 12.58 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4503 | 728 | 2013-08-29 13:29:00 | 64 | 2013-08-29 13:41:00 | Embarcadero at Vallejo | 48 | 331 | … | 29 | 13 | 8 | mid_day | 00:12:00 | 0 | 12.133333 | 12 | 12.13 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4645 | 850 | 2013-08-29 15:20:00 | 64 | 2013-08-29 15:35:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 288 | … | 29 | 15 | 8 | evening | 00:15:00 | 0 | 14.166667 | 14 | 14.17 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4426 | 982 | 2013-08-29 12:52:00 | 64 | 2013-08-29 13:08:00 | 2nd at South Park | 64 | 593 | … | 29 | 12 | 8 | mid_day | 00:16:00 | 1 | 16.366667 | 16 | 16.37 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4640 | 335 | 2013-08-29 15:19:00 | 64 | 2013-08-29 15:24:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 438 | … | 29 | 15 | 8 | evening | 00:05:00 | 0 | 5.583333 | 6 | 5.58 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4567 | 1684 | 2013-08-29 14:04:00 | 64 | 2013-08-29 14:33:00 | 2nd at South Park | 64 | 569 | … | 29 | 14 | 8 | mid_day | 00:29:00 | 0 | 28.066667 | 28 | 28.07 | in_thirty | |
Customer | 2013-08-29 | 2nd at Townsend | 4276 | 173 | 2013-08-29 11:45:00 | 61 | 2013-08-29 11:48:00 | 2nd at South Park | 64 | 280 | … | 29 | 11 | 8 | mid_day | 00:03:00 | 0 | 2.883333 | 3 | 2.88 | in_thirty | |
Customer | 2013-08-29 | 2nd at Townsend | 4876 | 309 | 2013-08-29 18:09:00 | 61 | 2013-08-29 18:14:00 | San Francisco Caltrain 2 (330 Townsend) | 69 | 565 | … | 29 | 18 | 8 | evening | 00:05:00 | 0 | 5.150000 | 5 | 5.15 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Townsend | 4411 | 223 | 2013-08-29 12:40:00 | 61 | 2013-08-29 12:43:00 | San Francisco Caltrain 2 (330 Townsend) | 69 | 259 | … | 29 | 12 | 8 | mid_day | 00:03:00 | 0 | 3.716667 | 4 | 3.72 | in_thirty | |
Customer | 2013-08-29 | 2nd at Townsend | 4397 | 950 | 2013-08-29 12:36:00 | 61 | 2013-08-29 12:52:00 | Steuart at Market | 74 | 312 | … | 29 | 12 | 8 | mid_day | 00:16:00 | 0 | 15.833333 | 16 | 15.83 | in_thirty | |
Customer | 2013-08-29 | 2nd at Townsend | 4888 | 1077 | 2013-08-29 18:16:00 | 61 | 2013-08-29 18:34:00 | Grant Avenue at Columbus Avenue | 73 | 309 | … | 29 | 18 | 8 | evening | 00:18:00 | 0 | 17.950000 | 18 | 17.95 | in_thirty | |
Customer | 2013-08-29 | 2nd at Townsend | 4402 | 1776 | 2013-08-29 12:39:00 | 61 | 2013-08-29 13:09:00 | Embarcadero at Folsom | 51 | 313 | … | 29 | 12 | 8 | mid_day | 00:30:00 | 1 | 29.600000 | 30 | 29.60 | in_thirty | |
Customer | 2013-08-29 | 2nd at Townsend | 4460 | 693 | 2013-08-29 13:08:00 | 61 | 2013-08-29 13:20:00 | Harry Bridges Plaza (Ferry Building) | 50 | 314 | … | 29 | 13 | 8 | mid_day | 00:12:00 | 0 | 11.550000 | 12 | 11.55 | in_thirty | |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | |
Subscriber | 2014-02-28 | Temporary Transbay Terminal (Howard at Beale) | 198483 | 839 | 2014-02-28 16:41:00 | 55 | 2014-02-28 16:55:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 281 | … | 28 | 16 | 2 | evening | 00:14:00 | 0 | 13.983333 | 14 | 13.98 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198770 | 850 | 2014-02-28 22:19:00 | 65 | 2014-02-28 22:34:00 | Post at Kearney | 47 | 479 | … | 28 | 22 | 2 | night | 00:15:00 | 0 | 14.166667 | 14 | 14.17 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198405 | 317 | 2014-02-28 14:08:00 | 65 | 2014-02-28 14:13:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 412 | … | 28 | 14 | 2 | mid_day | 00:05:00 | 0 | 5.283333 | 5 | 5.28 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198750 | 254 | 2014-02-28 20:22:00 | 65 | 2014-02-28 20:26:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 268 | … | 28 | 20 | 2 | night | 00:04:00 | 0 | 4.233333 | 4 | 4.23 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198468 | 267 | 2014-02-28 16:33:00 | 65 | 2014-02-28 16:38:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 400 | … | 28 | 16 | 2 | evening | 00:05:00 | 0 | 4.450000 | 4 | 4.45 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198213 | 854 | 2014-02-28 09:06:00 | 65 | 2014-02-28 09:20:00 | Market at Sansome | 77 | 409 | … | 28 | 9 | 2 | morning | 00:14:00 | 0 | 14.233333 | 14 | 14.23 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198098 | 558 | 2014-02-28 07:55:00 | 65 | 2014-02-28 08:04:00 | South Van Ness at Market | 66 | 515 | … | 28 | 7 | 2 | morning | 00:09:00 | 1 | 9.300000 | 9 | 9.30 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198587 | 610 | 2014-02-28 17:32:00 | 65 | 2014-02-28 17:42:00 | Harry Bridges Plaza (Ferry Building) | 50 | 435 | … | 28 | 17 | 2 | evening | 00:10:00 | 0 | 10.166667 | 10 | 10.17 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198648 | 408 | 2014-02-28 18:20:00 | 65 | 2014-02-28 18:27:00 | Civic Center BART (7th at Market) | 72 | 414 | … | 28 | 18 | 2 | evening | 00:07:00 | 0 | 6.800000 | 7 | 6.80 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198465 | 891 | 2014-02-28 16:32:00 | 65 | 2014-02-28 16:46:00 | Temporary Transbay Terminal (Howard at Beale) | 55 | 570 | … | 28 | 16 | 2 | evening | 00:14:00 | 0 | 14.850000 | 15 | 14.85 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198633 | 243 | 2014-02-28 18:08:00 | 65 | 2014-02-28 18:12:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 389 | … | 28 | 18 | 2 | evening | 00:04:00 | 0 | 4.050000 | 4 | 4.05 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198128 | 221 | 2014-02-28 08:12:00 | 65 | 2014-02-28 08:16:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 514 | … | 28 | 8 | 2 | morning | 00:04:00 | 0 | 3.683333 | 4 | 3.68 | in_thirty | |
Palo Alto | Subscriber | 2014-02-28 | University and Emerson | 198399 | 291 | 2014-02-28 13:49:00 | 35 | 2014-02-28 13:54:00 | University and Emerson | 35 | 165 | … | 28 | 13 | 2 | mid_day | 00:05:00 | 0 | 4.850000 | 5 | 4.85 | in_thirty |
Subscriber | 2014-02-28 | University and Emerson | 198625 | 233 | 2014-02-28 18:03:00 | 35 | 2014-02-28 18:07:00 | Cowper at University | 37 | 138 | … | 28 | 18 | 2 | evening | 00:04:00 | 0 | 3.883333 | 4 | 3.88 | in_thirty | |
San Francisco | Subscriber | 2014-02-28 | Washington at Kearney | 198275 | 570 | 2014-02-28 09:44:00 | 46 | 2014-02-28 09:54:00 | Powell Street BART | 53 | 384 | … | 28 | 9 | 2 | morning | 00:10:00 | 0 | 9.500000 | 10 | 9.50 | in_thirty |
Subscriber | 2014-02-28 | Washington at Kearney | 198277 | 918 | 2014-02-28 09:45:00 | 46 | 2014-02-28 10:00:00 | South Van Ness at Market | 66 | 399 | … | 28 | 9 | 2 | morning | 00:15:00 | 1 | 15.300000 | 15 | 15.30 | in_thirty | |
Subscriber | 2014-02-28 | Washington at Kearney | 198671 | 956 | 2014-02-28 18:34:00 | 46 | 2014-02-28 18:50:00 | Embarcadero at Bryant | 54 | 548 | … | 28 | 18 | 2 | evening | 00:16:00 | 0 | 15.933333 | 16 | 15.93 | in_thirty | |
Subscriber | 2014-02-28 | Washington at Kearney | 198145 | 555 | 2014-02-28 08:28:00 | 46 | 2014-02-28 08:37:00 | Howard at 2nd | 63 | 407 | … | 28 | 8 | 2 | morning | 00:09:00 | 0 | 9.250000 | 9 | 9.25 | in_thirty | |
Customer | 2014-02-28 | Washington at Kearney | 198381 | 727 | 2014-02-28 13:07:00 | 46 | 2014-02-28 13:19:00 | Market at 4th | 76 | 324 | … | 28 | 13 | 2 | mid_day | 00:12:00 | 0 | 12.116667 | 12 | 12.12 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198533 | 292 | 2014-02-28 17:03:00 | 68 | 2014-02-28 17:08:00 | San Francisco Caltrain 2 (330 Townsend) | 69 | 479 | … | 28 | 17 | 2 | evening | 00:05:00 | 0 | 4.866667 | 5 | 4.87 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198491 | 278 | 2014-02-28 16:44:00 | 68 | 2014-02-28 16:49:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 314 | … | 28 | 16 | 2 | evening | 00:05:00 | 0 | 4.633333 | 5 | 4.63 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198305 | 522 | 2014-02-28 10:05:00 | 68 | 2014-02-28 10:14:00 | Steuart at Market | 74 | 326 | … | 28 | 10 | 2 | morning | 00:09:00 | 0 | 8.700000 | 9 | 8.70 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198289 | 561 | 2014-02-28 09:50:00 | 68 | 2014-02-28 10:00:00 | Market at 10th | 67 | 518 | … | 28 | 9 | 2 | morning | 00:10:00 | 1 | 9.350000 | 9 | 9.35 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198453 | 282 | 2014-02-28 16:19:00 | 68 | 2014-02-28 16:24:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 437 | … | 28 | 16 | 2 | evening | 00:05:00 | 0 | 4.700000 | 5 | 4.70 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198710 | 323 | 2014-02-28 19:13:00 | 68 | 2014-02-28 19:19:00 | San Francisco Caltrain 2 (330 Townsend) | 69 | 339 | … | 28 | 19 | 2 | evening | 00:06:00 | 0 | 5.383333 | 5 | 5.38 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198685 | 271 | 2014-02-28 18:48:00 | 68 | 2014-02-28 18:52:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 454 | … | 28 | 18 | 2 | evening | 00:04:00 | 0 | 4.516667 | 5 | 4.52 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198087 | 159 | 2014-02-28 07:42:00 | 68 | 2014-02-28 07:44:00 | 5th at Howard | 57 | 435 | … | 28 | 7 | 2 | morning | 00:02:00 | 0 | 2.650000 | 3 | 2.65 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198639 | 342 | 2014-02-28 18:13:00 | 68 | 2014-02-28 18:19:00 | Civic Center BART (7th at Market) | 72 | 292 | … | 28 | 18 | 2 | evening | 00:06:00 | 0 | 5.700000 | 6 | 5.70 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198566 | 338 | 2014-02-28 17:20:00 | 68 | 2014-02-28 17:25:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 477 | … | 28 | 17 | 2 | evening | 00:05:00 | 0 | 5.633333 | 6 | 5.63 | in_thirty | |
Customer | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198386 | 394 | 2014-02-28 13:22:00 | 68 | 2014-02-28 13:28:00 | Steuart at Market | 74 | 442 | … | 28 | 13 | 2 | mid_day | 00:06:00 | 0 | 6.566667 | 7 | 6.57 | in_thirty |
144015 rows × 52 columns
Creates a groupby object dataframe on which aggregate actions can be performed and can group by the multiindexes
#creates a groupby object dataframe and can group by the multiindexes
grouplevel0 = dmerge4twolevels.groupby(level=0)
grouplevel0
<pandas.core.groupby.DataFrameGroupBy object at 0x7ffbc802c8d0>
grouplevel0.mean()
#same result as dmerge4.groupby('landmark').mean()
tripid | duration | startterminal | endterminal | bike | max_temperature_f | mean_temperature_f | min_temperaturef | max_dew_point_f | meandew_point_f | … | lat | long | dockcount | day | hour | month | hourdiff | durationminutes | duration_i | duration_f | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
landmark | |||||||||||||||||||||
Mountain View | 113506.415689 | 2681.418988 | 28.782991 | 28.980572 | 266.556085 | 67.732405 | 54.907991 | 44.681085 | 48.479106 | 42.678886 | … | 37.394312 | -122.083341 | 18.888563 | 15.521261 | 13.072947 | 6.898827 | 0.251833 | 44.690316 | 44.682918 | 44.690359 |
Palo Alto | 92832.010551 | 4635.689918 | 35.669988 | 35.193435 | 208.521688 | 69.908558 | 59.389215 | 48.392145 | 48.810082 | 44.058617 | … | 37.440460 | -122.156431 | 15.171161 | 15.759086 | 13.461313 | 7.947831 | 0.566237 | 77.261499 | 77.262603 | 77.261483 |
Redwood City | 82721.219420 | 2945.804540 | 23.134931 | 24.148802 | 237.013871 | 68.611602 | 59.397226 | 50.197982 | 51.931904 | 45.466583 | … | 37.486049 | -122.229363 | 19.766709 | 15.145019 | 12.639344 | 8.464061 | 0.161412 | 49.096742 | 49.084489 | 49.096608 |
San Francisco | 101092.908674 | 1133.965253 | 61.825657 | 61.868821 | 455.360123 | 66.115962 | 58.201636 | 49.776278 | 49.848767 | 45.052167 | … | 37.787144 | -122.400225 | 19.565963 | 15.746290 | 13.177293 | 7.668687 | 0.231462 | 18.899421 | 18.899656 | 18.899424 |
San Jose | 98496.783100 | 1394.662899 | 7.371684 | 7.563290 | 284.057862 | 69.322104 | 58.393397 | 46.986682 | 47.791046 | 42.897146 | … | 37.334148 | -121.892161 | 17.977952 | 15.714605 | 13.638500 | 7.585115 | 0.095467 | 23.244382 | 23.240963 | 23.244376 |
5 rows × 37 columns
grouplevel1 = dmerge4twolevels.groupby(level=1)
grouplevel1
<pandas.core.groupby.DataFrameGroupBy object at 0x7ffb96e0db90>
grouplevel1.mean()
tripid | duration | startterminal | endterminal | bike | max_temperature_f | mean_temperature_f | min_temperaturef | max_dew_point_f | meandew_point_f | … | lat | long | dockcount | day | hour | month | hourdiff | durationminutes | duration_i | duration_f | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
subscriptiontype | |||||||||||||||||||||
Customer | 79116.721714 | 3629.519659 | 56.400586 | 56.781909 | 433.907238 | 68.987487 | 60.303115 | 51.149137 | 51.473953 | 46.698992 | … | 37.745722 | -122.358957 | 18.944810 | 15.439970 | 13.707949 | 8.313192 | 0.580051 | 60.491994 | 60.490022 | 60.491974 |
Subscriber | 106806.906130 | 589.969564 | 57.538334 | 57.505486 | 437.863692 | 65.713868 | 57.602277 | 49.049971 | 49.218783 | 44.373692 | … | 37.745863 | -122.358849 | 19.526261 | 15.816247 | 13.069760 | 7.481157 | 0.132648 | 9.832826 | 9.833106 | 9.832834 |
2 rows × 37 columns
The level can also be specified by name
#can also specify level by name (same as above)
dmerge4twolevels.groupby(level='subscriptiontype').sum()
#dmerge4twolevels.sum(level='subscriptiontype') #same as above
tripid | duration | startterminal | endterminal | bike | max_temperature_f | mean_temperature_f | min_temperaturef | max_dew_point_f | meandew_point_f | … | lat | long | dockcount | day | hour | month | hourdiff | durationminutes | duration_i | duration_f | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
subscriptiontype | |||||||||||||||||||||
Customer | 2402616605 | 110221253 | 1712773 | 1724353 | 13176895 | 2095012 | 1831285 | 1553297 | 1563161 | 1418155 | … | 1146262.097908 | -3715796.806334 | 575316 | 468881 | 416283 | 252455 | 17615 | 1837020.883333 | 1836961 | 1837020.26 |
Subscriber | 12138284461 | 67048271 | 6539059 | 6535326 | 49761895 | 7468184 | 6546326 | 5574382 | 5593567 | 5042937 | … | 4289704.103845 | -13905716.142004 | 2219101 | 1797469 | 1485339 | 850211 | 15075 | 1117471.183333 | 1117503 | 1117472.12 |
2 rows × 37 columns
Create the level without dropping it as a column
#don't drop the level as a column; another way to select the level of a multi-level index
dmerge4twolevels.xs('Subscriber', level='subscriptiontype',drop_level=False)
lubstartdate | startstation | tripid | duration | startdate | startterminal | enddate | endstation | endterminal | bike | … | day | hour | month | timeofday | diff | hourdiff | durationminutes | duration_i | duration_f | thirtymin | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
landmark | subscriptiontype | |||||||||||||||||||||
San Francisco | Subscriber | 2013-08-29 | 2nd at Folsom | 4636 | 186 | 2013-08-29 15:11:00 | 62 | 2013-08-29 15:14:00 | 2nd at Townsend | 61 | 366 | … | 29 | 15 | 8 | evening | 00:03:00 | 0 | 3.100000 | 3 | 3.10 | in_thirty |
Subscriber | 2013-08-29 | 2nd at Folsom | 4820 | 813 | 2013-08-29 17:35:00 | 62 | 2013-08-29 17:48:00 | Townsend at 7th | 65 | 409 | … | 29 | 17 | 8 | evening | 00:13:00 | 0 | 13.550000 | 14 | 13.55 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4812 | 186 | 2013-08-29 17:30:00 | 62 | 2013-08-29 17:33:00 | 2nd at Folsom | 62 | 409 | … | 29 | 17 | 8 | evening | 00:03:00 | 0 | 3.100000 | 3 | 3.10 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4390 | 825 | 2013-08-29 12:33:00 | 62 | 2013-08-29 12:47:00 | 2nd at South Park | 64 | 331 | … | 29 | 12 | 8 | mid_day | 00:14:00 | 0 | 13.750000 | 14 | 13.75 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4939 | 816 | 2013-08-29 19:01:00 | 62 | 2013-08-29 19:14:00 | Harry Bridges Plaza (Ferry Building) | 50 | 539 | … | 29 | 19 | 8 | evening | 00:13:00 | 0 | 13.600000 | 14 | 13.60 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4582 | 335 | 2013-08-29 14:14:00 | 62 | 2013-08-29 14:20:00 | 2nd at Folsom | 62 | 342 | … | 29 | 14 | 8 | mid_day | 00:06:00 | 0 | 5.583333 | 6 | 5.58 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4938 | 840 | 2013-08-29 19:01:00 | 62 | 2013-08-29 19:15:00 | Harry Bridges Plaza (Ferry Building) | 50 | 607 | … | 29 | 19 | 8 | evening | 00:14:00 | 0 | 14.000000 | 14 | 14.00 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4557 | 130 | 2013-08-29 13:57:00 | 64 | 2013-08-29 13:59:00 | 2nd at South Park | 64 | 371 | … | 29 | 13 | 8 | mid_day | 00:02:00 | 0 | 2.166667 | 2 | 2.17 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4069 | 174 | 2013-08-29 09:08:00 | 64 | 2013-08-29 09:11:00 | 2nd at South Park | 64 | 288 | … | 29 | 9 | 8 | morning | 00:03:00 | 0 | 2.900000 | 3 | 2.90 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4564 | 1693 | 2013-08-29 14:04:00 | 64 | 2013-08-29 14:32:00 | 2nd at South Park | 64 | 272 | … | 29 | 14 | 8 | mid_day | 00:28:00 | 0 | 28.216667 | 28 | 28.22 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4465 | 419 | 2013-08-29 13:11:00 | 64 | 2013-08-29 13:18:00 | Post at Kearney | 47 | 498 | … | 29 | 13 | 8 | mid_day | 00:07:00 | 0 | 6.983333 | 7 | 6.98 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4566 | 1699 | 2013-08-29 14:04:00 | 64 | 2013-08-29 14:33:00 | 2nd at South Park | 64 | 371 | … | 29 | 14 | 8 | mid_day | 00:29:00 | 0 | 28.316667 | 28 | 28.32 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 5083 | 755 | 2013-08-29 22:05:00 | 64 | 2013-08-29 22:18:00 | Harry Bridges Plaza (Ferry Building) | 50 | 270 | … | 29 | 22 | 8 | night | 00:13:00 | 0 | 12.583333 | 13 | 12.58 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4503 | 728 | 2013-08-29 13:29:00 | 64 | 2013-08-29 13:41:00 | Embarcadero at Vallejo | 48 | 331 | … | 29 | 13 | 8 | mid_day | 00:12:00 | 0 | 12.133333 | 12 | 12.13 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4645 | 850 | 2013-08-29 15:20:00 | 64 | 2013-08-29 15:35:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 288 | … | 29 | 15 | 8 | evening | 00:15:00 | 0 | 14.166667 | 14 | 14.17 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4426 | 982 | 2013-08-29 12:52:00 | 64 | 2013-08-29 13:08:00 | 2nd at South Park | 64 | 593 | … | 29 | 12 | 8 | mid_day | 00:16:00 | 1 | 16.366667 | 16 | 16.37 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4640 | 335 | 2013-08-29 15:19:00 | 64 | 2013-08-29 15:24:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 438 | … | 29 | 15 | 8 | evening | 00:05:00 | 0 | 5.583333 | 6 | 5.58 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4567 | 1684 | 2013-08-29 14:04:00 | 64 | 2013-08-29 14:33:00 | 2nd at South Park | 64 | 569 | … | 29 | 14 | 8 | mid_day | 00:29:00 | 0 | 28.066667 | 28 | 28.07 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Townsend | 4411 | 223 | 2013-08-29 12:40:00 | 61 | 2013-08-29 12:43:00 | San Francisco Caltrain 2 (330 Townsend) | 69 | 259 | … | 29 | 12 | 8 | mid_day | 00:03:00 | 0 | 3.716667 | 4 | 3.72 | in_thirty | |
Subscriber | 2013-08-29 | 5th at Howard | 4909 | 1220 | 2013-08-29 18:36:00 | 57 | 2013-08-29 18:57:00 | Embarcadero at Vallejo | 48 | 404 | … | 29 | 18 | 8 | evening | 00:21:00 | 0 | 20.333333 | 20 | 20.33 | in_thirty | |
Subscriber | 2013-08-29 | 5th at Howard | 4970 | 404 | 2013-08-29 19:36:00 | 57 | 2013-08-29 19:43:00 | Powell at Post (Union Square) | 71 | 320 | … | 29 | 19 | 8 | evening | 00:07:00 | 0 | 6.733333 | 7 | 6.73 | in_thirty | |
Subscriber | 2013-08-29 | 5th at Howard | 4657 | 298 | 2013-08-29 15:30:00 | 57 | 2013-08-29 15:35:00 | 5th at Howard | 57 | 631 | … | 29 | 15 | 8 | evening | 00:05:00 | 0 | 4.966667 | 5 | 4.97 | in_thirty | |
Subscriber | 2013-08-29 | 5th at Howard | 4892 | 1492 | 2013-08-29 18:21:00 | 57 | 2013-08-29 18:46:00 | Embarcadero at Vallejo | 48 | 577 | … | 29 | 18 | 8 | evening | 00:25:00 | 0 | 24.866667 | 25 | 24.87 | in_thirty | |
Subscriber | 2013-08-29 | 5th at Howard | 4950 | 1424 | 2013-08-29 19:12:00 | 57 | 2013-08-29 19:35:00 | Harry Bridges Plaza (Ferry Building) | 50 | 508 | … | 29 | 19 | 8 | evening | 00:23:00 | 0 | 23.733333 | 24 | 23.73 | in_thirty | |
San Jose | Subscriber | 2013-08-29 | Adobe on Almaden | 4419 | 556 | 2013-08-29 12:45:00 | 5 | 2013-08-29 12:55:00 | San Jose Civic Center | 3 | 671 | … | 29 | 12 | 8 | mid_day | 00:10:00 | 0 | 9.266667 | 9 | 9.27 | in_thirty |
Subscriber | 2013-08-29 | Adobe on Almaden | 4741 | 412 | 2013-08-29 16:45:00 | 5 | 2013-08-29 16:52:00 | San Jose City Hall | 10 | 679 | … | 29 | 16 | 8 | evening | 00:07:00 | 0 | 6.866667 | 7 | 6.87 | in_thirty | |
Subscriber | 2013-08-29 | Adobe on Almaden | 4502 | 210 | 2013-08-29 13:28:00 | 5 | 2013-08-29 13:31:00 | San Jose Civic Center | 3 | 664 | … | 29 | 13 | 8 | mid_day | 00:03:00 | 0 | 3.500000 | 4 | 3.50 | in_thirty | |
Subscriber | 2013-08-29 | Arena Green / SAP Center | 4365 | 1094 | 2013-08-29 12:24:00 | 14 | 2013-08-29 12:42:00 | Arena Green / SAP Center | 14 | 647 | … | 29 | 12 | 8 | mid_day | 00:18:00 | 0 | 18.233333 | 18 | 18.23 | in_thirty | |
San Francisco | Subscriber | 2013-08-29 | Beale at Market | 4787 | 479 | 2013-08-29 17:12:00 | 56 | 2013-08-29 17:20:00 | Market at 4th | 76 | 546 | … | 29 | 17 | 8 | evening | 00:08:00 | 0 | 7.983333 | 8 | 7.98 | in_thirty |
Subscriber | 2013-08-29 | Beale at Market | 4374 | 563 | 2013-08-29 12:23:00 | 56 | 2013-08-29 12:32:00 | Commercial at Montgomery | 45 | 410 | … | 29 | 12 | 8 | mid_day | 00:09:00 | 0 | 9.383333 | 9 | 9.38 | in_thirty | |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | |
Subscriber | 2014-02-28 | Temporary Transbay Terminal (Howard at Beale) | 198170 | 888 | 2014-02-28 08:47:00 | 55 | 2014-02-28 09:02:00 | Grant Avenue at Columbus Avenue | 73 | 319 | … | 28 | 8 | 2 | morning | 00:15:00 | 1 | 14.800000 | 15 | 14.80 | in_thirty | |
Subscriber | 2014-02-28 | Temporary Transbay Terminal (Howard at Beale) | 198105 | 277 | 2014-02-28 07:59:00 | 55 | 2014-02-28 08:03:00 | Harry Bridges Plaza (Ferry Building) | 50 | 420 | … | 28 | 7 | 2 | morning | 00:04:00 | 1 | 4.616667 | 5 | 4.62 | in_thirty | |
Subscriber | 2014-02-28 | Temporary Transbay Terminal (Howard at Beale) | 198483 | 839 | 2014-02-28 16:41:00 | 55 | 2014-02-28 16:55:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 281 | … | 28 | 16 | 2 | evening | 00:14:00 | 0 | 13.983333 | 14 | 13.98 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198770 | 850 | 2014-02-28 22:19:00 | 65 | 2014-02-28 22:34:00 | Post at Kearney | 47 | 479 | … | 28 | 22 | 2 | night | 00:15:00 | 0 | 14.166667 | 14 | 14.17 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198405 | 317 | 2014-02-28 14:08:00 | 65 | 2014-02-28 14:13:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 412 | … | 28 | 14 | 2 | mid_day | 00:05:00 | 0 | 5.283333 | 5 | 5.28 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198750 | 254 | 2014-02-28 20:22:00 | 65 | 2014-02-28 20:26:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 268 | … | 28 | 20 | 2 | night | 00:04:00 | 0 | 4.233333 | 4 | 4.23 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198468 | 267 | 2014-02-28 16:33:00 | 65 | 2014-02-28 16:38:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 400 | … | 28 | 16 | 2 | evening | 00:05:00 | 0 | 4.450000 | 4 | 4.45 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198213 | 854 | 2014-02-28 09:06:00 | 65 | 2014-02-28 09:20:00 | Market at Sansome | 77 | 409 | … | 28 | 9 | 2 | morning | 00:14:00 | 0 | 14.233333 | 14 | 14.23 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198098 | 558 | 2014-02-28 07:55:00 | 65 | 2014-02-28 08:04:00 | South Van Ness at Market | 66 | 515 | … | 28 | 7 | 2 | morning | 00:09:00 | 1 | 9.300000 | 9 | 9.30 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198587 | 610 | 2014-02-28 17:32:00 | 65 | 2014-02-28 17:42:00 | Harry Bridges Plaza (Ferry Building) | 50 | 435 | … | 28 | 17 | 2 | evening | 00:10:00 | 0 | 10.166667 | 10 | 10.17 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198648 | 408 | 2014-02-28 18:20:00 | 65 | 2014-02-28 18:27:00 | Civic Center BART (7th at Market) | 72 | 414 | … | 28 | 18 | 2 | evening | 00:07:00 | 0 | 6.800000 | 7 | 6.80 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198465 | 891 | 2014-02-28 16:32:00 | 65 | 2014-02-28 16:46:00 | Temporary Transbay Terminal (Howard at Beale) | 55 | 570 | … | 28 | 16 | 2 | evening | 00:14:00 | 0 | 14.850000 | 15 | 14.85 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198633 | 243 | 2014-02-28 18:08:00 | 65 | 2014-02-28 18:12:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 389 | … | 28 | 18 | 2 | evening | 00:04:00 | 0 | 4.050000 | 4 | 4.05 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198128 | 221 | 2014-02-28 08:12:00 | 65 | 2014-02-28 08:16:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 514 | … | 28 | 8 | 2 | morning | 00:04:00 | 0 | 3.683333 | 4 | 3.68 | in_thirty | |
Palo Alto | Subscriber | 2014-02-28 | University and Emerson | 198399 | 291 | 2014-02-28 13:49:00 | 35 | 2014-02-28 13:54:00 | University and Emerson | 35 | 165 | … | 28 | 13 | 2 | mid_day | 00:05:00 | 0 | 4.850000 | 5 | 4.85 | in_thirty |
Subscriber | 2014-02-28 | University and Emerson | 198625 | 233 | 2014-02-28 18:03:00 | 35 | 2014-02-28 18:07:00 | Cowper at University | 37 | 138 | … | 28 | 18 | 2 | evening | 00:04:00 | 0 | 3.883333 | 4 | 3.88 | in_thirty | |
San Francisco | Subscriber | 2014-02-28 | Washington at Kearney | 198275 | 570 | 2014-02-28 09:44:00 | 46 | 2014-02-28 09:54:00 | Powell Street BART | 53 | 384 | … | 28 | 9 | 2 | morning | 00:10:00 | 0 | 9.500000 | 10 | 9.50 | in_thirty |
Subscriber | 2014-02-28 | Washington at Kearney | 198277 | 918 | 2014-02-28 09:45:00 | 46 | 2014-02-28 10:00:00 | South Van Ness at Market | 66 | 399 | … | 28 | 9 | 2 | morning | 00:15:00 | 1 | 15.300000 | 15 | 15.30 | in_thirty | |
Subscriber | 2014-02-28 | Washington at Kearney | 198671 | 956 | 2014-02-28 18:34:00 | 46 | 2014-02-28 18:50:00 | Embarcadero at Bryant | 54 | 548 | … | 28 | 18 | 2 | evening | 00:16:00 | 0 | 15.933333 | 16 | 15.93 | in_thirty | |
Subscriber | 2014-02-28 | Washington at Kearney | 198145 | 555 | 2014-02-28 08:28:00 | 46 | 2014-02-28 08:37:00 | Howard at 2nd | 63 | 407 | … | 28 | 8 | 2 | morning | 00:09:00 | 0 | 9.250000 | 9 | 9.25 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198533 | 292 | 2014-02-28 17:03:00 | 68 | 2014-02-28 17:08:00 | San Francisco Caltrain 2 (330 Townsend) | 69 | 479 | … | 28 | 17 | 2 | evening | 00:05:00 | 0 | 4.866667 | 5 | 4.87 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198491 | 278 | 2014-02-28 16:44:00 | 68 | 2014-02-28 16:49:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 314 | … | 28 | 16 | 2 | evening | 00:05:00 | 0 | 4.633333 | 5 | 4.63 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198305 | 522 | 2014-02-28 10:05:00 | 68 | 2014-02-28 10:14:00 | Steuart at Market | 74 | 326 | … | 28 | 10 | 2 | morning | 00:09:00 | 0 | 8.700000 | 9 | 8.70 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198289 | 561 | 2014-02-28 09:50:00 | 68 | 2014-02-28 10:00:00 | Market at 10th | 67 | 518 | … | 28 | 9 | 2 | morning | 00:10:00 | 1 | 9.350000 | 9 | 9.35 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198453 | 282 | 2014-02-28 16:19:00 | 68 | 2014-02-28 16:24:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 437 | … | 28 | 16 | 2 | evening | 00:05:00 | 0 | 4.700000 | 5 | 4.70 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198710 | 323 | 2014-02-28 19:13:00 | 68 | 2014-02-28 19:19:00 | San Francisco Caltrain 2 (330 Townsend) | 69 | 339 | … | 28 | 19 | 2 | evening | 00:06:00 | 0 | 5.383333 | 5 | 5.38 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198685 | 271 | 2014-02-28 18:48:00 | 68 | 2014-02-28 18:52:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 454 | … | 28 | 18 | 2 | evening | 00:04:00 | 0 | 4.516667 | 5 | 4.52 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198087 | 159 | 2014-02-28 07:42:00 | 68 | 2014-02-28 07:44:00 | 5th at Howard | 57 | 435 | … | 28 | 7 | 2 | morning | 00:02:00 | 0 | 2.650000 | 3 | 2.65 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198639 | 342 | 2014-02-28 18:13:00 | 68 | 2014-02-28 18:19:00 | Civic Center BART (7th at Market) | 72 | 292 | … | 28 | 18 | 2 | evening | 00:06:00 | 0 | 5.700000 | 6 | 5.70 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198566 | 338 | 2014-02-28 17:20:00 | 68 | 2014-02-28 17:25:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 477 | … | 28 | 17 | 2 | evening | 00:05:00 | 0 | 5.633333 | 6 | 5.63 | in_thirty |
113647 rows × 52 columns
By setting up the initial query, We can then change the inputs for interactive reporting of results. We can answer very specific questions by querying.
Let's created a grouped object, apply aggregate functions, and then query the data. In this table, we have the average duration for subscription type of customer only.
grouped1 = dmerge4.groupby(['landmark','subscriptiontype'])
group2 = grouped1[['duration']].agg([np.mean, np.std])
group2
query1 = group2.query('subscriptiontype == "Customer"')
query1
duration | |||
---|---|---|---|
mean | std | ||
landmark | subscriptiontype | ||
Mountain View | Customer | 10679.844133 | 35553.063019 |
Palo Alto | Customer | 8856.419476 | 39184.301869 |
Redwood City | Customer | 11584.882716 | 23257.019736 |
San Francisco | Customer | 3212.968434 | 9861.482292 |
San Jose | Customer | 4675.431818 | 17693.747427 |
Return a dataset from a multi-index dataframe that contains only startterminal number 62.
#subseting and indexing a non-grouped multi index:
dmerge4twolevels.query('startterminal == 62')
lubstartdate | startstation | tripid | duration | startdate | startterminal | enddate | endstation | endterminal | bike | … | day | hour | month | timeofday | diff | hourdiff | durationminutes | duration_i | duration_f | thirtymin | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
landmark | subscriptiontype | |||||||||||||||||||||
San Francisco | Subscriber | 2013-08-29 | 2nd at Folsom | 4636 | 186 | 2013-08-29 15:11:00 | 62 | 2013-08-29 15:14:00 | 2nd at Townsend | 61 | 366 | … | 29 | 15 | 8 | evening | 00:03:00 | 0 | 3.100000 | 3 | 3.10 | in_thirty |
Subscriber | 2013-08-29 | 2nd at Folsom | 4820 | 813 | 2013-08-29 17:35:00 | 62 | 2013-08-29 17:48:00 | Townsend at 7th | 65 | 409 | … | 29 | 17 | 8 | evening | 00:13:00 | 0 | 13.550000 | 14 | 13.55 | in_thirty | |
Customer | 2013-08-29 | 2nd at Folsom | 5001 | 236 | 2013-08-29 20:00:00 | 62 | 2013-08-29 20:04:00 | Market at Sansome | 77 | 567 | … | 29 | 20 | 8 | night | 00:04:00 | 0 | 3.933333 | 4 | 3.93 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4812 | 186 | 2013-08-29 17:30:00 | 62 | 2013-08-29 17:33:00 | 2nd at Folsom | 62 | 409 | … | 29 | 17 | 8 | evening | 00:03:00 | 0 | 3.100000 | 3 | 3.10 | in_thirty | |
Customer | 2013-08-29 | 2nd at Folsom | 4946 | 827 | 2013-08-29 19:07:00 | 62 | 2013-08-29 19:21:00 | Embarcadero at Vallejo | 48 | 342 | … | 29 | 19 | 8 | evening | 00:14:00 | 0 | 13.783333 | 14 | 13.78 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4390 | 825 | 2013-08-29 12:33:00 | 62 | 2013-08-29 12:47:00 | 2nd at South Park | 64 | 331 | … | 29 | 12 | 8 | mid_day | 00:14:00 | 0 | 13.750000 | 14 | 13.75 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4939 | 816 | 2013-08-29 19:01:00 | 62 | 2013-08-29 19:14:00 | Harry Bridges Plaza (Ferry Building) | 50 | 539 | … | 29 | 19 | 8 | evening | 00:13:00 | 0 | 13.600000 | 14 | 13.60 | in_thirty | |
Customer | 2013-08-29 | 2nd at Folsom | 4949 | 1524 | 2013-08-29 19:11:00 | 62 | 2013-08-29 19:36:00 | Commercial at Montgomery | 45 | 400 | … | 29 | 19 | 8 | evening | 00:25:00 | 0 | 25.400000 | 25 | 25.40 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4582 | 335 | 2013-08-29 14:14:00 | 62 | 2013-08-29 14:20:00 | 2nd at Folsom | 62 | 342 | … | 29 | 14 | 8 | mid_day | 00:06:00 | 0 | 5.583333 | 6 | 5.58 | in_thirty | |
Customer | 2013-08-29 | 2nd at Folsom | 4951 | 1409 | 2013-08-29 19:13:00 | 62 | 2013-08-29 19:36:00 | Commercial at Montgomery | 45 | 380 | … | 29 | 19 | 8 | evening | 00:23:00 | 0 | 23.483333 | 23 | 23.48 | in_thirty | |
Customer | 2013-08-29 | 2nd at Folsom | 4945 | 797 | 2013-08-29 19:08:00 | 62 | 2013-08-29 19:21:00 | Embarcadero at Vallejo | 48 | 406 | … | 29 | 19 | 8 | evening | 00:13:00 | 0 | 13.283333 | 13 | 13.28 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4938 | 840 | 2013-08-29 19:01:00 | 62 | 2013-08-29 19:15:00 | Harry Bridges Plaza (Ferry Building) | 50 | 607 | … | 29 | 19 | 8 | evening | 00:14:00 | 0 | 14.000000 | 14 | 14.00 | in_thirty | |
Subscriber | 2013-08-30 | 2nd at Folsom | 6115 | 69 | 2013-08-30 16:30:00 | 62 | 2013-08-30 16:31:00 | 2nd at Folsom | 62 | 633 | … | 30 | 16 | 8 | evening | 00:01:00 | 0 | 1.150000 | 1 | 1.15 | in_thirty | |
Customer | 2013-08-30 | 2nd at Folsom | 5370 | 1074 | 2013-08-30 11:03:00 | 62 | 2013-08-30 11:21:00 | Harry Bridges Plaza (Ferry Building) | 50 | 559 | … | 30 | 11 | 8 | mid_day | 00:18:00 | 0 | 17.900000 | 18 | 17.90 | in_thirty | |
Customer | 2013-08-30 | 2nd at Folsom | 5210 | 493 | 2013-08-30 09:36:00 | 62 | 2013-08-30 09:44:00 | Clay at Battery | 41 | 360 | … | 30 | 9 | 8 | morning | 00:08:00 | 0 | 8.216667 | 8 | 8.22 | in_thirty | |
Customer | 2013-08-30 | 2nd at Folsom | 5371 | 1052 | 2013-08-30 11:03:00 | 62 | 2013-08-30 11:21:00 | Harry Bridges Plaza (Ferry Building) | 50 | 385 | … | 30 | 11 | 8 | mid_day | 00:18:00 | 0 | 17.533333 | 18 | 17.53 | in_thirty | |
Subscriber | 2013-08-30 | 2nd at Folsom | 6125 | 393 | 2013-08-30 16:35:00 | 62 | 2013-08-30 16:42:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 633 | … | 30 | 16 | 8 | evening | 00:07:00 | 0 | 6.550000 | 7 | 6.55 | in_thirty | |
Customer | 2013-08-30 | 2nd at Folsom | 6292 | 229 | 2013-08-30 18:15:00 | 62 | 2013-08-30 18:18:00 | Market at Sansome | 77 | 272 | … | 30 | 18 | 8 | evening | 00:03:00 | 0 | 3.816667 | 4 | 3.82 | in_thirty | |
Customer | 2013-08-30 | 2nd at Folsom | 5222 | 27103 | 2013-08-30 09:55:00 | 62 | 2013-08-30 17:27:00 | Townsend at 7th | 65 | 396 | … | 30 | 9 | 8 | morning | 07:32:00 | 8 | 451.716667 | 452 | 451.72 | over_thirty | |
Subscriber | 2013-08-30 | 2nd at Folsom | 6310 | 655 | 2013-08-30 18:22:00 | 62 | 2013-08-30 18:33:00 | Civic Center BART (7th at Market) | 72 | 341 | … | 30 | 18 | 8 | evening | 00:11:00 | 0 | 10.916667 | 11 | 10.92 | in_thirty | |
Subscriber | 2013-08-30 | 2nd at Folsom | 6343 | 689 | 2013-08-30 18:49:00 | 62 | 2013-08-30 19:00:00 | Embarcadero at Sansome | 60 | 324 | … | 30 | 18 | 8 | evening | 00:11:00 | 1 | 11.483333 | 11 | 11.48 | in_thirty | |
Subscriber | 2013-08-30 | 2nd at Folsom | 5702 | 170 | 2013-08-30 13:43:00 | 62 | 2013-08-30 13:46:00 | 2nd at South Park | 64 | 442 | … | 30 | 13 | 8 | mid_day | 00:03:00 | 0 | 2.833333 | 3 | 2.83 | in_thirty | |
Subscriber | 2013-08-30 | 2nd at Folsom | 6117 | 213 | 2013-08-30 16:31:00 | 62 | 2013-08-30 16:35:00 | 2nd at Folsom | 62 | 633 | … | 30 | 16 | 8 | evening | 00:04:00 | 0 | 3.550000 | 4 | 3.55 | in_thirty | |
Customer | 2013-08-31 | 2nd at Folsom | 6916 | 1455 | 2013-08-31 13:36:00 | 62 | 2013-08-31 14:00:00 | Harry Bridges Plaza (Ferry Building) | 50 | 291 | … | 31 | 13 | 8 | mid_day | 00:24:00 | 1 | 24.250000 | 24 | 24.25 | in_thirty | |
Customer | 2013-08-31 | 2nd at Folsom | 6614 | 382 | 2013-08-31 10:52:00 | 62 | 2013-08-31 10:59:00 | Post at Kearney | 47 | 505 | … | 31 | 10 | 8 | morning | 00:07:00 | 0 | 6.366667 | 6 | 6.37 | in_thirty | |
Customer | 2013-08-31 | 2nd at Folsom | 7019 | 671 | 2013-08-31 14:30:00 | 62 | 2013-08-31 14:42:00 | Civic Center BART (7th at Market) | 72 | 425 | … | 31 | 14 | 8 | mid_day | 00:12:00 | 0 | 11.183333 | 11 | 11.18 | in_thirty | |
Customer | 2013-08-31 | 2nd at Folsom | 7041 | 5061 | 2013-08-31 14:43:00 | 62 | 2013-08-31 16:07:00 | Powell at Post (Union Square) | 71 | 588 | … | 31 | 14 | 8 | mid_day | 01:24:00 | 2 | 84.350000 | 84 | 84.35 | over_thirty | |
Customer | 2013-08-31 | 2nd at Folsom | 7372 | 262 | 2013-08-31 18:42:00 | 62 | 2013-08-31 18:47:00 | Market at Sansome | 77 | 584 | … | 31 | 18 | 8 | evening | 00:05:00 | 0 | 4.366667 | 4 | 4.37 | in_thirty | |
Customer | 2013-08-31 | 2nd at Folsom | 6835 | 679 | 2013-08-31 13:04:00 | 62 | 2013-08-31 13:15:00 | 2nd at Folsom | 62 | 267 | … | 31 | 13 | 8 | mid_day | 00:11:00 | 0 | 11.316667 | 11 | 11.32 | in_thirty | |
Customer | 2013-08-31 | 2nd at Folsom | 7042 | 5067 | 2013-08-31 14:43:00 | 62 | 2013-08-31 16:08:00 | Powell at Post (Union Square) | 71 | 439 | … | 31 | 14 | 8 | mid_day | 01:25:00 | 2 | 84.450000 | 84 | 84.45 | over_thirty | |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | |
Subscriber | 2014-02-27 | 2nd at Folsom | 197102 | 403 | 2014-02-27 08:59:00 | 62 | 2014-02-27 09:05:00 | 5th at Howard | 57 | 558 | … | 27 | 8 | 2 | morning | 00:06:00 | 1 | 6.716667 | 7 | 6.72 | in_thirty | |
Subscriber | 2014-02-27 | 2nd at Folsom | 196871 | 301 | 2014-02-27 06:44:00 | 62 | 2014-02-27 06:49:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 386 | … | 27 | 6 | 2 | morning | 00:05:00 | 0 | 5.016667 | 5 | 5.02 | in_thirty | |
Subscriber | 2014-02-27 | 2nd at Folsom | 197328 | 158 | 2014-02-27 12:21:00 | 62 | 2014-02-27 12:23:00 | 2nd at South Park | 64 | 549 | … | 27 | 12 | 2 | mid_day | 00:02:00 | 0 | 2.633333 | 3 | 2.63 | in_thirty | |
Subscriber | 2014-02-27 | 2nd at Folsom | 197822 | 484 | 2014-02-27 18:33:00 | 62 | 2014-02-27 18:41:00 | Davis at Jackson | 42 | 397 | … | 27 | 18 | 2 | evening | 00:08:00 | 0 | 8.066667 | 8 | 8.07 | in_thirty | |
Subscriber | 2014-02-27 | 2nd at Folsom | 197942 | 1190 | 2014-02-27 20:01:00 | 62 | 2014-02-27 20:20:00 | Golden Gate at Polk | 59 | 576 | … | 27 | 20 | 2 | night | 00:19:00 | 0 | 19.833333 | 20 | 19.83 | in_thirty | |
Subscriber | 2014-02-27 | 2nd at Folsom | 197166 | 659 | 2014-02-27 09:31:00 | 62 | 2014-02-27 09:42:00 | 2nd at Townsend | 61 | 502 | … | 27 | 9 | 2 | morning | 00:11:00 | 0 | 10.983333 | 11 | 10.98 | in_thirty | |
Subscriber | 2014-02-27 | 2nd at Folsom | 197330 | 403 | 2014-02-27 12:24:00 | 62 | 2014-02-27 12:31:00 | 5th at Howard | 57 | 560 | … | 27 | 12 | 2 | mid_day | 00:07:00 | 0 | 6.716667 | 7 | 6.72 | in_thirty | |
Subscriber | 2014-02-27 | 2nd at Folsom | 197353 | 349 | 2014-02-27 12:48:00 | 62 | 2014-02-27 12:54:00 | Embarcadero at Folsom | 51 | 335 | … | 27 | 12 | 2 | mid_day | 00:06:00 | 0 | 5.816667 | 6 | 5.82 | in_thirty | |
Subscriber | 2014-02-27 | 2nd at Folsom | 197641 | 446 | 2014-02-27 17:14:00 | 62 | 2014-02-27 17:22:00 | Powell Street BART | 53 | 449 | … | 27 | 17 | 2 | evening | 00:08:00 | 0 | 7.433333 | 7 | 7.43 | in_thirty | |
Subscriber | 2014-02-27 | 2nd at Folsom | 197573 | 203 | 2014-02-27 16:32:00 | 62 | 2014-02-27 16:36:00 | Market at Sansome | 77 | 289 | … | 27 | 16 | 2 | evening | 00:04:00 | 0 | 3.383333 | 3 | 3.38 | in_thirty | |
Subscriber | 2014-02-27 | 2nd at Folsom | 196847 | 175 | 2014-02-27 05:53:00 | 62 | 2014-02-27 05:56:00 | Market at Sansome | 77 | 472 | … | 27 | 5 | 2 | morning | 00:03:00 | 0 | 2.916667 | 3 | 2.92 | in_thirty | |
Subscriber | 2014-02-27 | 2nd at Folsom | 197043 | 506 | 2014-02-27 08:43:00 | 62 | 2014-02-27 08:52:00 | Washington at Kearney | 46 | 399 | … | 27 | 8 | 2 | morning | 00:09:00 | 0 | 8.433333 | 8 | 8.43 | in_thirty | |
Subscriber | 2014-02-27 | 2nd at Folsom | 197091 | 771 | 2014-02-27 08:57:00 | 62 | 2014-02-27 09:10:00 | Townsend at 7th | 65 | 542 | … | 27 | 8 | 2 | morning | 00:13:00 | 1 | 12.850000 | 13 | 12.85 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198421 | 1729 | 2014-02-28 14:54:00 | 62 | 2014-02-28 15:23:00 | Spear at Folsom | 49 | 379 | … | 28 | 14 | 2 | mid_day | 00:29:00 | 1 | 28.816667 | 29 | 28.82 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198718 | 76 | 2014-02-28 19:23:00 | 62 | 2014-02-28 19:24:00 | Howard at 2nd | 63 | 563 | … | 28 | 19 | 2 | evening | 00:01:00 | 0 | 1.266667 | 1 | 1.27 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198168 | 764 | 2014-02-28 08:46:00 | 62 | 2014-02-28 08:59:00 | Grant Avenue at Columbus Avenue | 73 | 222 | … | 28 | 8 | 2 | morning | 00:13:00 | 0 | 12.733333 | 13 | 12.73 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198579 | 288 | 2014-02-28 17:28:00 | 62 | 2014-02-28 17:33:00 | Spear at Folsom | 49 | 391 | … | 28 | 17 | 2 | evening | 00:05:00 | 0 | 4.800000 | 5 | 4.80 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198451 | 494 | 2014-02-28 16:16:00 | 62 | 2014-02-28 16:24:00 | Powell at Post (Union Square) | 71 | 463 | … | 28 | 16 | 2 | evening | 00:08:00 | 0 | 8.233333 | 8 | 8.23 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198222 | 450 | 2014-02-28 09:12:00 | 62 | 2014-02-28 09:20:00 | Beale at Market | 56 | 583 | … | 28 | 9 | 2 | morning | 00:08:00 | 0 | 7.500000 | 8 | 7.50 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198546 | 464 | 2014-02-28 17:10:00 | 62 | 2014-02-28 17:18:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 259 | … | 28 | 17 | 2 | evening | 00:08:00 | 0 | 7.733333 | 8 | 7.73 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198349 | 513 | 2014-02-28 11:42:00 | 62 | 2014-02-28 11:50:00 | Powell Street BART | 53 | 408 | … | 28 | 11 | 2 | mid_day | 00:08:00 | 0 | 8.550000 | 9 | 8.55 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198174 | 720 | 2014-02-28 08:50:00 | 62 | 2014-02-28 09:02:00 | Townsend at 7th | 65 | 389 | … | 28 | 8 | 2 | morning | 00:12:00 | 1 | 12.000000 | 12 | 12.00 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198369 | 252 | 2014-02-28 12:44:00 | 62 | 2014-02-28 12:49:00 | Spear at Folsom | 49 | 589 | … | 28 | 12 | 2 | mid_day | 00:05:00 | 0 | 4.200000 | 4 | 4.20 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198610 | 378 | 2014-02-28 17:57:00 | 62 | 2014-02-28 18:03:00 | 5th at Howard | 57 | 487 | … | 28 | 17 | 2 | evening | 00:06:00 | 1 | 6.300000 | 6 | 6.30 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198489 | 216 | 2014-02-28 16:44:00 | 62 | 2014-02-28 16:47:00 | 2nd at Townsend | 61 | 557 | … | 28 | 16 | 2 | evening | 00:03:00 | 0 | 3.600000 | 4 | 3.60 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198768 | 225 | 2014-02-28 21:54:00 | 62 | 2014-02-28 21:57:00 | Yerba Buena Center of the Arts (3rd @ Howard) | 68 | 635 | … | 28 | 21 | 2 | night | 00:03:00 | 0 | 3.750000 | 4 | 3.75 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198573 | 413 | 2014-02-28 17:27:00 | 62 | 2014-02-28 17:34:00 | Powell Street BART | 53 | 373 | … | 28 | 17 | 2 | evening | 00:07:00 | 0 | 6.883333 | 7 | 6.88 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198172 | 491 | 2014-02-28 08:48:00 | 62 | 2014-02-28 08:57:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 518 | … | 28 | 8 | 2 | morning | 00:09:00 | 0 | 8.183333 | 8 | 8.18 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198717 | 615 | 2014-02-28 19:22:00 | 62 | 2014-02-28 19:33:00 | Civic Center BART (7th at Market) | 72 | 603 | … | 28 | 19 | 2 | evening | 00:11:00 | 0 | 10.250000 | 10 | 10.25 | in_thirty | |
Subscriber | 2014-02-28 | 2nd at Folsom | 198606 | 663 | 2014-02-28 17:50:00 | 62 | 2014-02-28 18:01:00 | Powell at Post (Union Square) | 71 | 604 | … | 28 | 17 | 2 | evening | 00:11:00 | 1 | 11.050000 | 11 | 11.05 | in_thirty |
3776 rows × 52 columns
Query a multi-index dataframe for subscriptiontype of subscriber.
dmerge4twolevels.query('subscriptiontype == "Subscriber"')
lubstartdate | startstation | tripid | duration | startdate | startterminal | enddate | endstation | endterminal | bike | … | day | hour | month | timeofday | diff | hourdiff | durationminutes | duration_i | duration_f | thirtymin | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
landmark | subscriptiontype | |||||||||||||||||||||
San Francisco | Subscriber | 2013-08-29 | 2nd at Folsom | 4636 | 186 | 2013-08-29 15:11:00 | 62 | 2013-08-29 15:14:00 | 2nd at Townsend | 61 | 366 | … | 29 | 15 | 8 | evening | 00:03:00 | 0 | 3.100000 | 3 | 3.10 | in_thirty |
Subscriber | 2013-08-29 | 2nd at Folsom | 4820 | 813 | 2013-08-29 17:35:00 | 62 | 2013-08-29 17:48:00 | Townsend at 7th | 65 | 409 | … | 29 | 17 | 8 | evening | 00:13:00 | 0 | 13.550000 | 14 | 13.55 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4812 | 186 | 2013-08-29 17:30:00 | 62 | 2013-08-29 17:33:00 | 2nd at Folsom | 62 | 409 | … | 29 | 17 | 8 | evening | 00:03:00 | 0 | 3.100000 | 3 | 3.10 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4390 | 825 | 2013-08-29 12:33:00 | 62 | 2013-08-29 12:47:00 | 2nd at South Park | 64 | 331 | … | 29 | 12 | 8 | mid_day | 00:14:00 | 0 | 13.750000 | 14 | 13.75 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4939 | 816 | 2013-08-29 19:01:00 | 62 | 2013-08-29 19:14:00 | Harry Bridges Plaza (Ferry Building) | 50 | 539 | … | 29 | 19 | 8 | evening | 00:13:00 | 0 | 13.600000 | 14 | 13.60 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4582 | 335 | 2013-08-29 14:14:00 | 62 | 2013-08-29 14:20:00 | 2nd at Folsom | 62 | 342 | … | 29 | 14 | 8 | mid_day | 00:06:00 | 0 | 5.583333 | 6 | 5.58 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Folsom | 4938 | 840 | 2013-08-29 19:01:00 | 62 | 2013-08-29 19:15:00 | Harry Bridges Plaza (Ferry Building) | 50 | 607 | … | 29 | 19 | 8 | evening | 00:14:00 | 0 | 14.000000 | 14 | 14.00 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4557 | 130 | 2013-08-29 13:57:00 | 64 | 2013-08-29 13:59:00 | 2nd at South Park | 64 | 371 | … | 29 | 13 | 8 | mid_day | 00:02:00 | 0 | 2.166667 | 2 | 2.17 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4069 | 174 | 2013-08-29 09:08:00 | 64 | 2013-08-29 09:11:00 | 2nd at South Park | 64 | 288 | … | 29 | 9 | 8 | morning | 00:03:00 | 0 | 2.900000 | 3 | 2.90 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4564 | 1693 | 2013-08-29 14:04:00 | 64 | 2013-08-29 14:32:00 | 2nd at South Park | 64 | 272 | … | 29 | 14 | 8 | mid_day | 00:28:00 | 0 | 28.216667 | 28 | 28.22 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4465 | 419 | 2013-08-29 13:11:00 | 64 | 2013-08-29 13:18:00 | Post at Kearney | 47 | 498 | … | 29 | 13 | 8 | mid_day | 00:07:00 | 0 | 6.983333 | 7 | 6.98 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4566 | 1699 | 2013-08-29 14:04:00 | 64 | 2013-08-29 14:33:00 | 2nd at South Park | 64 | 371 | … | 29 | 14 | 8 | mid_day | 00:29:00 | 0 | 28.316667 | 28 | 28.32 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 5083 | 755 | 2013-08-29 22:05:00 | 64 | 2013-08-29 22:18:00 | Harry Bridges Plaza (Ferry Building) | 50 | 270 | … | 29 | 22 | 8 | night | 00:13:00 | 0 | 12.583333 | 13 | 12.58 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4503 | 728 | 2013-08-29 13:29:00 | 64 | 2013-08-29 13:41:00 | Embarcadero at Vallejo | 48 | 331 | … | 29 | 13 | 8 | mid_day | 00:12:00 | 0 | 12.133333 | 12 | 12.13 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4645 | 850 | 2013-08-29 15:20:00 | 64 | 2013-08-29 15:35:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 288 | … | 29 | 15 | 8 | evening | 00:15:00 | 0 | 14.166667 | 14 | 14.17 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4426 | 982 | 2013-08-29 12:52:00 | 64 | 2013-08-29 13:08:00 | 2nd at South Park | 64 | 593 | … | 29 | 12 | 8 | mid_day | 00:16:00 | 1 | 16.366667 | 16 | 16.37 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4640 | 335 | 2013-08-29 15:19:00 | 64 | 2013-08-29 15:24:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 438 | … | 29 | 15 | 8 | evening | 00:05:00 | 0 | 5.583333 | 6 | 5.58 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at South Park | 4567 | 1684 | 2013-08-29 14:04:00 | 64 | 2013-08-29 14:33:00 | 2nd at South Park | 64 | 569 | … | 29 | 14 | 8 | mid_day | 00:29:00 | 0 | 28.066667 | 28 | 28.07 | in_thirty | |
Subscriber | 2013-08-29 | 2nd at Townsend | 4411 | 223 | 2013-08-29 12:40:00 | 61 | 2013-08-29 12:43:00 | San Francisco Caltrain 2 (330 Townsend) | 69 | 259 | … | 29 | 12 | 8 | mid_day | 00:03:00 | 0 | 3.716667 | 4 | 3.72 | in_thirty | |
Subscriber | 2013-08-29 | 5th at Howard | 4909 | 1220 | 2013-08-29 18:36:00 | 57 | 2013-08-29 18:57:00 | Embarcadero at Vallejo | 48 | 404 | … | 29 | 18 | 8 | evening | 00:21:00 | 0 | 20.333333 | 20 | 20.33 | in_thirty | |
Subscriber | 2013-08-29 | 5th at Howard | 4970 | 404 | 2013-08-29 19:36:00 | 57 | 2013-08-29 19:43:00 | Powell at Post (Union Square) | 71 | 320 | … | 29 | 19 | 8 | evening | 00:07:00 | 0 | 6.733333 | 7 | 6.73 | in_thirty | |
Subscriber | 2013-08-29 | 5th at Howard | 4657 | 298 | 2013-08-29 15:30:00 | 57 | 2013-08-29 15:35:00 | 5th at Howard | 57 | 631 | … | 29 | 15 | 8 | evening | 00:05:00 | 0 | 4.966667 | 5 | 4.97 | in_thirty | |
Subscriber | 2013-08-29 | 5th at Howard | 4892 | 1492 | 2013-08-29 18:21:00 | 57 | 2013-08-29 18:46:00 | Embarcadero at Vallejo | 48 | 577 | … | 29 | 18 | 8 | evening | 00:25:00 | 0 | 24.866667 | 25 | 24.87 | in_thirty | |
Subscriber | 2013-08-29 | 5th at Howard | 4950 | 1424 | 2013-08-29 19:12:00 | 57 | 2013-08-29 19:35:00 | Harry Bridges Plaza (Ferry Building) | 50 | 508 | … | 29 | 19 | 8 | evening | 00:23:00 | 0 | 23.733333 | 24 | 23.73 | in_thirty | |
San Jose | Subscriber | 2013-08-29 | Adobe on Almaden | 4419 | 556 | 2013-08-29 12:45:00 | 5 | 2013-08-29 12:55:00 | San Jose Civic Center | 3 | 671 | … | 29 | 12 | 8 | mid_day | 00:10:00 | 0 | 9.266667 | 9 | 9.27 | in_thirty |
Subscriber | 2013-08-29 | Adobe on Almaden | 4741 | 412 | 2013-08-29 16:45:00 | 5 | 2013-08-29 16:52:00 | San Jose City Hall | 10 | 679 | … | 29 | 16 | 8 | evening | 00:07:00 | 0 | 6.866667 | 7 | 6.87 | in_thirty | |
Subscriber | 2013-08-29 | Adobe on Almaden | 4502 | 210 | 2013-08-29 13:28:00 | 5 | 2013-08-29 13:31:00 | San Jose Civic Center | 3 | 664 | … | 29 | 13 | 8 | mid_day | 00:03:00 | 0 | 3.500000 | 4 | 3.50 | in_thirty | |
Subscriber | 2013-08-29 | Arena Green / SAP Center | 4365 | 1094 | 2013-08-29 12:24:00 | 14 | 2013-08-29 12:42:00 | Arena Green / SAP Center | 14 | 647 | … | 29 | 12 | 8 | mid_day | 00:18:00 | 0 | 18.233333 | 18 | 18.23 | in_thirty | |
San Francisco | Subscriber | 2013-08-29 | Beale at Market | 4787 | 479 | 2013-08-29 17:12:00 | 56 | 2013-08-29 17:20:00 | Market at 4th | 76 | 546 | … | 29 | 17 | 8 | evening | 00:08:00 | 0 | 7.983333 | 8 | 7.98 | in_thirty |
Subscriber | 2013-08-29 | Beale at Market | 4374 | 563 | 2013-08-29 12:23:00 | 56 | 2013-08-29 12:32:00 | Commercial at Montgomery | 45 | 410 | … | 29 | 12 | 8 | mid_day | 00:09:00 | 0 | 9.383333 | 9 | 9.38 | in_thirty | |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | |
Subscriber | 2014-02-28 | Temporary Transbay Terminal (Howard at Beale) | 198170 | 888 | 2014-02-28 08:47:00 | 55 | 2014-02-28 09:02:00 | Grant Avenue at Columbus Avenue | 73 | 319 | … | 28 | 8 | 2 | morning | 00:15:00 | 1 | 14.800000 | 15 | 14.80 | in_thirty | |
Subscriber | 2014-02-28 | Temporary Transbay Terminal (Howard at Beale) | 198105 | 277 | 2014-02-28 07:59:00 | 55 | 2014-02-28 08:03:00 | Harry Bridges Plaza (Ferry Building) | 50 | 420 | … | 28 | 7 | 2 | morning | 00:04:00 | 1 | 4.616667 | 5 | 4.62 | in_thirty | |
Subscriber | 2014-02-28 | Temporary Transbay Terminal (Howard at Beale) | 198483 | 839 | 2014-02-28 16:41:00 | 55 | 2014-02-28 16:55:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 281 | … | 28 | 16 | 2 | evening | 00:14:00 | 0 | 13.983333 | 14 | 13.98 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198770 | 850 | 2014-02-28 22:19:00 | 65 | 2014-02-28 22:34:00 | Post at Kearney | 47 | 479 | … | 28 | 22 | 2 | night | 00:15:00 | 0 | 14.166667 | 14 | 14.17 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198405 | 317 | 2014-02-28 14:08:00 | 65 | 2014-02-28 14:13:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 412 | … | 28 | 14 | 2 | mid_day | 00:05:00 | 0 | 5.283333 | 5 | 5.28 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198750 | 254 | 2014-02-28 20:22:00 | 65 | 2014-02-28 20:26:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 268 | … | 28 | 20 | 2 | night | 00:04:00 | 0 | 4.233333 | 4 | 4.23 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198468 | 267 | 2014-02-28 16:33:00 | 65 | 2014-02-28 16:38:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 400 | … | 28 | 16 | 2 | evening | 00:05:00 | 0 | 4.450000 | 4 | 4.45 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198213 | 854 | 2014-02-28 09:06:00 | 65 | 2014-02-28 09:20:00 | Market at Sansome | 77 | 409 | … | 28 | 9 | 2 | morning | 00:14:00 | 0 | 14.233333 | 14 | 14.23 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198098 | 558 | 2014-02-28 07:55:00 | 65 | 2014-02-28 08:04:00 | South Van Ness at Market | 66 | 515 | … | 28 | 7 | 2 | morning | 00:09:00 | 1 | 9.300000 | 9 | 9.30 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198587 | 610 | 2014-02-28 17:32:00 | 65 | 2014-02-28 17:42:00 | Harry Bridges Plaza (Ferry Building) | 50 | 435 | … | 28 | 17 | 2 | evening | 00:10:00 | 0 | 10.166667 | 10 | 10.17 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198648 | 408 | 2014-02-28 18:20:00 | 65 | 2014-02-28 18:27:00 | Civic Center BART (7th at Market) | 72 | 414 | … | 28 | 18 | 2 | evening | 00:07:00 | 0 | 6.800000 | 7 | 6.80 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198465 | 891 | 2014-02-28 16:32:00 | 65 | 2014-02-28 16:46:00 | Temporary Transbay Terminal (Howard at Beale) | 55 | 570 | … | 28 | 16 | 2 | evening | 00:14:00 | 0 | 14.850000 | 15 | 14.85 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198633 | 243 | 2014-02-28 18:08:00 | 65 | 2014-02-28 18:12:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 389 | … | 28 | 18 | 2 | evening | 00:04:00 | 0 | 4.050000 | 4 | 4.05 | in_thirty | |
Subscriber | 2014-02-28 | Townsend at 7th | 198128 | 221 | 2014-02-28 08:12:00 | 65 | 2014-02-28 08:16:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 514 | … | 28 | 8 | 2 | morning | 00:04:00 | 0 | 3.683333 | 4 | 3.68 | in_thirty | |
Palo Alto | Subscriber | 2014-02-28 | University and Emerson | 198399 | 291 | 2014-02-28 13:49:00 | 35 | 2014-02-28 13:54:00 | University and Emerson | 35 | 165 | … | 28 | 13 | 2 | mid_day | 00:05:00 | 0 | 4.850000 | 5 | 4.85 | in_thirty |
Subscriber | 2014-02-28 | University and Emerson | 198625 | 233 | 2014-02-28 18:03:00 | 35 | 2014-02-28 18:07:00 | Cowper at University | 37 | 138 | … | 28 | 18 | 2 | evening | 00:04:00 | 0 | 3.883333 | 4 | 3.88 | in_thirty | |
San Francisco | Subscriber | 2014-02-28 | Washington at Kearney | 198275 | 570 | 2014-02-28 09:44:00 | 46 | 2014-02-28 09:54:00 | Powell Street BART | 53 | 384 | … | 28 | 9 | 2 | morning | 00:10:00 | 0 | 9.500000 | 10 | 9.50 | in_thirty |
Subscriber | 2014-02-28 | Washington at Kearney | 198277 | 918 | 2014-02-28 09:45:00 | 46 | 2014-02-28 10:00:00 | South Van Ness at Market | 66 | 399 | … | 28 | 9 | 2 | morning | 00:15:00 | 1 | 15.300000 | 15 | 15.30 | in_thirty | |
Subscriber | 2014-02-28 | Washington at Kearney | 198671 | 956 | 2014-02-28 18:34:00 | 46 | 2014-02-28 18:50:00 | Embarcadero at Bryant | 54 | 548 | … | 28 | 18 | 2 | evening | 00:16:00 | 0 | 15.933333 | 16 | 15.93 | in_thirty | |
Subscriber | 2014-02-28 | Washington at Kearney | 198145 | 555 | 2014-02-28 08:28:00 | 46 | 2014-02-28 08:37:00 | Howard at 2nd | 63 | 407 | … | 28 | 8 | 2 | morning | 00:09:00 | 0 | 9.250000 | 9 | 9.25 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198533 | 292 | 2014-02-28 17:03:00 | 68 | 2014-02-28 17:08:00 | San Francisco Caltrain 2 (330 Townsend) | 69 | 479 | … | 28 | 17 | 2 | evening | 00:05:00 | 0 | 4.866667 | 5 | 4.87 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198491 | 278 | 2014-02-28 16:44:00 | 68 | 2014-02-28 16:49:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 314 | … | 28 | 16 | 2 | evening | 00:05:00 | 0 | 4.633333 | 5 | 4.63 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198305 | 522 | 2014-02-28 10:05:00 | 68 | 2014-02-28 10:14:00 | Steuart at Market | 74 | 326 | … | 28 | 10 | 2 | morning | 00:09:00 | 0 | 8.700000 | 9 | 8.70 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198289 | 561 | 2014-02-28 09:50:00 | 68 | 2014-02-28 10:00:00 | Market at 10th | 67 | 518 | … | 28 | 9 | 2 | morning | 00:10:00 | 1 | 9.350000 | 9 | 9.35 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198453 | 282 | 2014-02-28 16:19:00 | 68 | 2014-02-28 16:24:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 437 | … | 28 | 16 | 2 | evening | 00:05:00 | 0 | 4.700000 | 5 | 4.70 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198710 | 323 | 2014-02-28 19:13:00 | 68 | 2014-02-28 19:19:00 | San Francisco Caltrain 2 (330 Townsend) | 69 | 339 | … | 28 | 19 | 2 | evening | 00:06:00 | 0 | 5.383333 | 5 | 5.38 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198685 | 271 | 2014-02-28 18:48:00 | 68 | 2014-02-28 18:52:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 454 | … | 28 | 18 | 2 | evening | 00:04:00 | 0 | 4.516667 | 5 | 4.52 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198087 | 159 | 2014-02-28 07:42:00 | 68 | 2014-02-28 07:44:00 | 5th at Howard | 57 | 435 | … | 28 | 7 | 2 | morning | 00:02:00 | 0 | 2.650000 | 3 | 2.65 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198639 | 342 | 2014-02-28 18:13:00 | 68 | 2014-02-28 18:19:00 | Civic Center BART (7th at Market) | 72 | 292 | … | 28 | 18 | 2 | evening | 00:06:00 | 0 | 5.700000 | 6 | 5.70 | in_thirty | |
Subscriber | 2014-02-28 | Yerba Buena Center of the Arts (3rd @ Howard) | 198566 | 338 | 2014-02-28 17:20:00 | 68 | 2014-02-28 17:25:00 | San Francisco Caltrain (Townsend at 4th) | 70 | 477 | … | 28 | 17 | 2 | evening | 00:05:00 | 0 | 5.633333 | 6 | 5.63 | in_thirty |
113647 rows × 52 columns
We can get this type of granular data by querying. By setting up the initial query, we can then change the inputs for interactive reporting of results.
duration62 = dmerge4twolevels.query('startterminal == 62').groupby('endstation')['duration_f'].mean().order(ascending=False)
duration62
endstation 2nd at Folsom 84.510405 Beale at Market 28.968750 Civic Center BART (7th at Market) 25.788571 Embarcadero at Vallejo 22.056000 Powell at Post (Union Square) 21.702200 Embarcadero at Sansome 20.633906 Mechanics Plaza (Market at Battery) 18.325526 Golden Gate at Polk 18.093333 San Francisco Caltrain (Townsend at 4th) 17.642372 South Van Ness at Market 16.725909 Commercial at Montgomery 15.988511 Market at 10th 15.887500 Townsend at 7th 15.233402 Grant Avenue at Columbus Avenue 15.224000 San Francisco City Hall 14.389286 Davis at Jackson 11.523784 Broadway St at Battery St 11.013333 Washington at Kearney 10.850811 Harry Bridges Plaza (Ferry Building) 10.394581 5th at Howard 9.919462 Powell Street BART 9.627881 Clay at Battery 9.399346 Post at Kearney 8.780957 Steuart at Market 8.627549 Embarcadero at Bryant 8.034000 Yerba Buena Center of the Arts (3rd @ Howard) 7.567119 Market at 4th 7.431879 San Francisco Caltrain 2 (330 Townsend) 7.380276 Embarcadero at Folsom 5.832000 Temporary Transbay Terminal (Howard at Beale) 5.819636 Spear at Folsom 5.556141 2nd at South Park 5.551429 2nd at Townsend 4.662283 Market at Sansome 4.334730 Howard at 2nd 3.326883 Name: duration_f, dtype: float64
Let's concatenate the morning monthly and evening monthly created in the time series section into a dataframe for the plotting notebook.
concatenated = pd.concat([morning_monthly,evening_monthly], keys=['morning', 'evening'],axis=1)
concatenated
morning | evening | |
---|---|---|
duration_f | duration_f | |
startdate | ||
2013-08-31 | 37.648913 | 37.240278 |
2013-09-30 | 17.973624 | 24.621531 |
2013-10-31 | 13.302877 | 16.752767 |
2013-11-30 | 13.209249 | 17.277542 |
2013-12-31 | 15.384694 | 14.980330 |
2014-01-31 | 11.679421 | 14.962542 |
2014-02-28 | 11.441120 | 15.718395 |
Merge same named column from two datasets but keep both columns but rename them with a suffix
merge = pd.merge(morning_monthly,evening_monthly,how='outer',left_index=True, right_index=True,suffixes=['_morning', '_evening'])
merge
duration_f_morning | duration_f_evening | |
---|---|---|
startdate | ||
2013-08-31 | 37.648913 | 37.240278 |
2013-09-30 | 17.973624 | 24.621531 |
2013-10-31 | 13.302877 | 16.752767 |
2013-11-30 | 13.209249 | 17.277542 |
2013-12-31 | 15.384694 | 14.980330 |
2014-01-31 | 11.679421 | 14.962542 |
2014-02-28 | 11.441120 | 15.718395 |
Use a lambda function to get summary statistics of average duration by landmark. The maximum average duration was in San Jose, the highest average duration was in Palo Alto and the lowest was in San Francisco.
dmerge4.groupby("landmark")[['duration_f']].apply(lambda x: x.describe()).unstack()
duration_f | ||||||||
---|---|---|---|---|---|---|---|---|
count | mean | std | min | 25% | 50% | 75% | max | |
landmark | ||||||||
Mountain View | 2728 | 44.690359 | 283.766147 | 1.03 | 4.100 | 5.18 | 13.080 | 9772.60 |
Palo Alto | 1706 | 77.261483 | 459.154620 | 1.10 | 6.055 | 15.57 | 34.865 | 12037.27 |
Redwood City | 793 | 49.096608 | 233.252426 | 1.00 | 3.800 | 4.73 | 7.830 | 3831.90 |
San Francisco | 129853 | 18.899424 | 81.849843 | 1.00 | 6.000 | 8.98 | 13.280 | 7156.40 |
San Jose | 8935 | 23.244376 | 173.987471 | 1.03 | 4.770 | 7.42 | 11.320 | 10322.03 |
Make a grouped object without making it the index.
Another use of the lambda function is to map a function. For example sake, to check the length of a string.
#apply a lambda function
f = lambda x: len(str(x))
dmerge4[['landmark']].applymap(f)[:3]
landmark | |
---|---|
startdate | |
2013-08-29 15:11:00 | 13 |
2013-08-29 17:35:00 | 13 |
2013-08-29 20:00:00 | 13 |
Here is another string operation by way of mapping a lambda function. Very useful.
dmerge4['endstation'].map(lambda x: x.startswith('Grant')).head()
startdate 2013-08-29 15:11:00 False 2013-08-29 17:35:00 False 2013-08-29 20:00:00 False 2013-08-29 17:30:00 False 2013-08-29 19:07:00 False Name: endstation, dtype: bool
Make a grouped object without making it the index.
dmerge4.groupby('landmark', as_index=False).mean()
landmark | tripid | duration | startterminal | endterminal | bike | max_temperature_f | mean_temperature_f | min_temperaturef | max_dew_point_f | … | lat | long | dockcount | day | hour | month | hourdiff | durationminutes | duration_i | duration_f | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Mountain View | 113506.415689 | 2681.418988 | 28.782991 | 28.980572 | 266.556085 | 67.732405 | 54.907991 | 44.681085 | 48.479106 | … | 37.394312 | -122.083341 | 18.888563 | 15.521261 | 13.072947 | 6.898827 | 0.251833 | 44.690316 | 44.682918 | 44.690359 |
1 | Palo Alto | 92832.010551 | 4635.689918 | 35.669988 | 35.193435 | 208.521688 | 69.908558 | 59.389215 | 48.392145 | 48.810082 | … | 37.440460 | -122.156431 | 15.171161 | 15.759086 | 13.461313 | 7.947831 | 0.566237 | 77.261499 | 77.262603 | 77.261483 |
2 | Redwood City | 82721.219420 | 2945.804540 | 23.134931 | 24.148802 | 237.013871 | 68.611602 | 59.397226 | 50.197982 | 51.931904 | … | 37.486049 | -122.229363 | 19.766709 | 15.145019 | 12.639344 | 8.464061 | 0.161412 | 49.096742 | 49.084489 | 49.096608 |
3 | San Francisco | 101092.908674 | 1133.965253 | 61.825657 | 61.868821 | 455.360123 | 66.115962 | 58.201636 | 49.776278 | 49.848767 | … | 37.787144 | -122.400225 | 19.565963 | 15.746290 | 13.177293 | 7.668687 | 0.231462 | 18.899421 | 18.899656 | 18.899424 |
4 | San Jose | 98496.783100 | 1394.662899 | 7.371684 | 7.563290 | 284.057862 | 69.322104 | 58.393397 | 46.986682 | 47.791046 | … | 37.334148 | -121.892161 | 17.977952 | 15.714605 | 13.638500 | 7.585115 | 0.095467 | 23.244382 | 23.240963 | 23.244376 |
5 rows × 38 columns