Chicago ridership data
Source
Kuhn and Johnson (2020), Feature Engineering and Selection, Chapman and Hall/CRC. https://bookdown.org/max/FES/ and https://github.com/topepo/FES
Details
These data are from Kuhn and Johnson (2020) and contain an abbreviated training set for modeling the number of people (in thousands) who enter the Clark and Lake L station.
The date column gives the current (unlagged) date of each row. The columns
with station names (Austin through California) are a sample of the columns
used in the original analysis, reduced for file size reasons. These are
14-day lag variables (i.e., the values from date - 14 days). There are also
columns related to weather and to sports team schedules.
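To make the meaning of a 14-day lag concrete, here is a minimal base R sketch on a toy data frame. The column names entries and entries_lag14 and the values are hypothetical, and this is not necessarily how the original authors constructed the station columns; it only illustrates pairing each date with the value from date - 14 days.

```r
# Toy data: one hypothetical station with 20 consecutive days of entries
toy <- data.frame(
  date    = as.Date("2001-01-01") + 0:19,
  entries = 1 + (0:19) / 10
)

# For each row, look up the value recorded 14 days earlier.
# Rows whose lagged date is not in the data get NA.
toy$entries_lag14 <- toy$entries[match(toy$date - 14, toy$date)]

head(toy, 16)
```

The first 14 rows have no value to look back to, so their lag is NA. Matching on the date rather than shifting by row position keeps the lag correct even if some days are missing from the data.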
The station at 35th and Archer is stored in the column Archer_35th, because
a name that begins with a digit is not a valid R column name.
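The naming constraint can be checked with base R's make.names(), which converts a string into a syntactically valid name. This is only an illustration; the source does not say that make.names() was used to build the data set.

```r
# A name starting with a digit is not syntactically valid, so R prefixes it with "X"
make.names("35th_Archer")
#> [1] "X35th_Archer"

# Putting the street name first gives a name that is already valid and is left unchanged
make.names("Archer_35th")
#> [1] "Archer_35th"
```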
Examples
data(Chicago)
str(Chicago)
#> tibble [5,698 × 50] (S3: tbl_df/tbl/data.frame)
#>  $ ridership       : num [1:5698] 15.7 15.8 15.9 15.9 15.4 ...
#>  $ Austin          : num [1:5698] 1.46 1.5 1.52 1.49 1.5 ...
#>  $ Quincy_Wells    : num [1:5698] 8.37 8.35 8.36 7.85 7.62 ...
#>  $ Belmont         : num [1:5698] 4.6 4.72 4.68 4.77 4.72 ...
#>  $ Archer_35th     : num [1:5698] 2.01 2.09 2.11 2.17 2.06 ...
#>  $ Oak_Park        : num [1:5698] 1.42 1.43 1.49 1.45 1.42 ...
#>  $ Western         : num [1:5698] 3.32 3.34 3.36 3.36 3.27 ...
#>  $ Clark_Lake      : num [1:5698] 15.6 15.7 15.6 15.7 15.6 ...
#>  $ Clinton         : num [1:5698] 2.4 2.4 2.37 2.42 2.42 ...
#>  $ Merchandise_Mart: num [1:5698] 6.48 6.48 6.41 6.49 5.8 ...
#>  $ Irving_Park     : num [1:5698] 3.74 3.85 3.86 3.84 3.88 ...
#>  $ Washington_Wells: num [1:5698] 7.56 7.58 7.62 7.36 7.09 ...
#>  $ Harlem          : num [1:5698] 2.65 2.76 2.79 2.81 2.73 ...
#>  $ Monroe          : num [1:5698] 5.67 6.01 5.79 5.96 5.77 ...
#>  $ Polk            : num [1:5698] 2.48 2.44 2.53 2.45 2.57 ...
#>  $ Ashland         : num [1:5698] 1.32 1.31 1.32 1.35 1.35 ...
#>  $ Kedzie          : num [1:5698] 3.01 3.02 2.98 3.01 3.08 ...
#>  $ Addison         : num [1:5698] 2.5 2.57 2.59 2.53 2.56 ...
#>  $ Jefferson_Park  : num [1:5698] 6.59 6.75 6.97 7.01 6.92 ...
#>  $ Montrose        : num [1:5698] 1.84 1.92 1.98 1.98 1.95 ...
#>  $ California      : num [1:5698] 0.756 0.781 0.812 0.776 0.789 0.37 0.274 0.473 0.844 0.835 ...
#>  $ temp_min        : num [1:5698] 15.1 25 19 15.1 21 19 15.1 26.6 34 33.1 ...
#>  $ temp            : num [1:5698] 19.4 30.4 25 22.4 27 ...
#>  $ temp_max        : num [1:5698] 30 36 28.9 27 32 30 28.9 41 43 36 ...
#>  $ temp_change     : num [1:5698] 14.9 11 9.9 11.9 11 11 13.8 14.4 9 2.9 ...
#>  $ dew             : num [1:5698] 13.4 25 18 10.9 21.9 ...
#>  $ humidity        : num [1:5698] 78 79 81 66.5 84 71 74 93 93 89 ...
#>  $ pressure        : num [1:5698] 30.4 30.2 30.2 30.4 29.9 ...
#>  $ pressure_change : num [1:5698] 0.12 0.18 0.23 0.16 0.65 ...
#>  $ wind            : num [1:5698] 5.2 8.1 10.4 9.8 12.7 12.7 8.1 8.1 9.2 11.5 ...
#>  $ wind_max        : num [1:5698] 10.4 11.5 19.6 16.1 19.6 17.3 13.8 17.3 23 16.1 ...
#>  $ gust            : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ gust_max        : num [1:5698] 0 0 0 0 25.3 26.5 0 26.5 31.1 0 ...
#>  $ percip          : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ percip_max      : num [1:5698] 0 0 0 0 0 0 0 0.07 0.11 0.01 ...
#>  $ weather_rain    : num [1:5698] 0 0 0 0 0 ...
#>  $ weather_snow    : num [1:5698] 0 0 0.214 0 0.516 ...
#>  $ weather_cloud   : num [1:5698] 0.708 1 0.357 0.292 0.452 ...
#>  $ weather_storm   : num [1:5698] 0 0.2083 0.0714 0.0417 0.4516 ...
#>  $ Blackhawks_Away : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ Blackhawks_Home : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ Bulls_Away      : num [1:5698] 0 0 1 0 0 0 0 0 1 0 ...
#>  $ Bulls_Home      : num [1:5698] 0 1 0 0 0 1 0 0 0 0 ...
#>  $ Bears_Away      : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ Bears_Home      : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ WhiteSox_Away   : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ WhiteSox_Home   : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ Cubs_Away       : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ Cubs_Home       : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ date            : Date[1:5698], format: "2001-01-22" ...
stations
#>  [1] "Austin"           "Quincy_Wells"     "Belmont"         
#>  [4] "Archer_35th"      "Oak_Park"         "Western"         
#>  [7] "Clark_Lake"       "Clinton"          "Merchandise_Mart"
#> [10] "Irving_Park"      "Washington_Wells" "Harlem"          
#> [13] "Monroe"           "Polk"             "Ashland"         
#> [16] "Kedzie"           "Addison"          "Jefferson_Park"  
#> [19] "Montrose"         "California"