Chicago ridership data
Source
Kuhn and Johnson (2020), Feature Engineering and Selection, Chapman and Hall/CRC. https://bookdown.org/max/FES/ and https://github.com/topepo/FES
Details
These data are from Kuhn and Johnson (2020) and contain an abbreviated training set for modeling the number of people (in thousands) who enter the Clark and Lake L station.
The date column corresponds to the current date. The columns with station names (Austin through California) are a sample of the columns used in the original analysis; the full set was reduced for file size reasons. These are 14-day lag variables (i.e., date - 14 days). There are also columns related to weather and sports team schedules.
The station at 35th and Archer is contained in the column Archer_35th to make it a valid R column name (a syntactic R name cannot begin with a digit).
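As a rough illustration of what a 14-day lag variable means, the sketch below builds one on a small made-up data frame in base R. The object and column names (toy, riders, riders_lag14) and the values are hypothetical and are not part of the Chicago data; this is not necessarily how the lagged columns in the original analysis were constructed.

```r
# Toy data: 20 consecutive days of made-up values (NOT the real Chicago data)
toy <- data.frame(
  date   = as.Date("2001-01-01") + 0:19,
  riders = seq(1, 20) / 10
)

# 14-day lag: for each row, look up the value recorded at (date - 14 days).
# match() returns NA when that earlier date is not in the data.
lag_days <- 14
toy$riders_lag14 <- toy$riders[match(toy$date - lag_days, toy$date)]

# The first 14 rows have no earlier observation, so their lag is NA
head(toy[!is.na(toy$riders_lag14), ], 3)
```

Matching on the date, rather than shifting rows by position, keeps the lag correct even if some days are missing from the data.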
Examples
data(Chicago)
str(Chicago)
#> tibble [5,698 × 50] (S3: tbl_df/tbl/data.frame)
#> $ ridership : num [1:5698] 15.7 15.8 15.9 15.9 15.4 ...
#> $ Austin : num [1:5698] 1.46 1.5 1.52 1.49 1.5 ...
#> $ Quincy_Wells : num [1:5698] 8.37 8.35 8.36 7.85 7.62 ...
#> $ Belmont : num [1:5698] 4.6 4.72 4.68 4.77 4.72 ...
#> $ Archer_35th : num [1:5698] 2.01 2.09 2.11 2.17 2.06 ...
#> $ Oak_Park : num [1:5698] 1.42 1.43 1.49 1.45 1.42 ...
#> $ Western : num [1:5698] 3.32 3.34 3.36 3.36 3.27 ...
#> $ Clark_Lake : num [1:5698] 15.6 15.7 15.6 15.7 15.6 ...
#> $ Clinton : num [1:5698] 2.4 2.4 2.37 2.42 2.42 ...
#> $ Merchandise_Mart: num [1:5698] 6.48 6.48 6.41 6.49 5.8 ...
#> $ Irving_Park : num [1:5698] 3.74 3.85 3.86 3.84 3.88 ...
#> $ Washington_Wells: num [1:5698] 7.56 7.58 7.62 7.36 7.09 ...
#> $ Harlem : num [1:5698] 2.65 2.76 2.79 2.81 2.73 ...
#> $ Monroe : num [1:5698] 5.67 6.01 5.79 5.96 5.77 ...
#> $ Polk : num [1:5698] 2.48 2.44 2.53 2.45 2.57 ...
#> $ Ashland : num [1:5698] 1.32 1.31 1.32 1.35 1.35 ...
#> $ Kedzie : num [1:5698] 3.01 3.02 2.98 3.01 3.08 ...
#> $ Addison : num [1:5698] 2.5 2.57 2.59 2.53 2.56 ...
#> $ Jefferson_Park : num [1:5698] 6.59 6.75 6.97 7.01 6.92 ...
#> $ Montrose : num [1:5698] 1.84 1.92 1.98 1.98 1.95 ...
#> $ California : num [1:5698] 0.756 0.781 0.812 0.776 0.789 0.37 0.274 0.473 0.844 0.835 ...
#> $ temp_min : num [1:5698] 15.1 25 19 15.1 21 19 15.1 26.6 34 33.1 ...
#> $ temp : num [1:5698] 19.4 30.4 25 22.4 27 ...
#> $ temp_max : num [1:5698] 30 36 28.9 27 32 30 28.9 41 43 36 ...
#> $ temp_change : num [1:5698] 14.9 11 9.9 11.9 11 11 13.8 14.4 9 2.9 ...
#> $ dew : num [1:5698] 13.4 25 18 10.9 21.9 ...
#> $ humidity : num [1:5698] 78 79 81 66.5 84 71 74 93 93 89 ...
#> $ pressure : num [1:5698] 30.4 30.2 30.2 30.4 29.9 ...
#> $ pressure_change : num [1:5698] 0.12 0.18 0.23 0.16 0.65 ...
#> $ wind : num [1:5698] 5.2 8.1 10.4 9.8 12.7 12.7 8.1 8.1 9.2 11.5 ...
#> $ wind_max : num [1:5698] 10.4 11.5 19.6 16.1 19.6 17.3 13.8 17.3 23 16.1 ...
#> $ gust : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#> $ gust_max : num [1:5698] 0 0 0 0 25.3 26.5 0 26.5 31.1 0 ...
#> $ percip : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#> $ percip_max : num [1:5698] 0 0 0 0 0 0 0 0.07 0.11 0.01 ...
#> $ weather_rain : num [1:5698] 0 0 0 0 0 ...
#> $ weather_snow : num [1:5698] 0 0 0.214 0 0.516 ...
#> $ weather_cloud : num [1:5698] 0.708 1 0.357 0.292 0.452 ...
#> $ weather_storm : num [1:5698] 0 0.2083 0.0714 0.0417 0.4516 ...
#> $ Blackhawks_Away : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#> $ Blackhawks_Home : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#> $ Bulls_Away : num [1:5698] 0 0 1 0 0 0 0 0 1 0 ...
#> $ Bulls_Home : num [1:5698] 0 1 0 0 0 1 0 0 0 0 ...
#> $ Bears_Away : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#> $ Bears_Home : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#> $ WhiteSox_Away : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#> $ WhiteSox_Home : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#> $ Cubs_Away : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#> $ Cubs_Home : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#> $ date : Date[1:5698], format: "2001-01-22" "2001-01-23" ...
stations
#> [1] "Austin" "Quincy_Wells" "Belmont"
#> [4] "Archer_35th" "Oak_Park" "Western"
#> [7] "Clark_Lake" "Clinton" "Merchandise_Mart"
#> [10] "Irving_Park" "Washington_Wells" "Harlem"
#> [13] "Monroe" "Polk" "Ashland"
#> [16] "Kedzie" "Addison" "Jefferson_Park"
#> [19] "Montrose" "California"