Chicago ridership data

Source

Kuhn and Johnson (2020), Feature Engineering and Selection, Chapman and Hall/CRC. https://bookdown.org/max/FES/ and https://github.com/topepo/FES

Value

Chicago

a tibble

stations

a vector of station names

Details

These data are from Kuhn and Johnson (2020) and contain an abbreviated training set for modeling the number of people (in thousands) who enter the Clark and Lake L station.

The date column corresponds to the current date. The columns with station names (Austin through California) are a sample of the columns used in the original analysis, reduced to limit file size. These are 14-day lag variables (i.e., the value at date - 14 days). There are also columns related to weather and sports team schedules.
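To make the lag definition concrete, here is a minimal base R sketch of a 14-day lag on a toy daily series. The data frame toy, its columns entries and entries_lag14, and the values are hypothetical and made up for illustration; this is not the preprocessing code used to build Chicago.

```r
# Toy daily series; values are invented purely for illustration.
toy <- data.frame(
  date    = seq(as.Date("2001-01-01"), by = "day", length.out = 20),
  entries = 1:20
)

# For each date, look up the value recorded exactly 14 days earlier.
# Dates with no observation 14 days back get NA.
toy$entries_lag14 <- toy$entries[match(toy$date - 14, toy$date)]

# The first 14 rows have no earlier observation to draw on.
toy[13:16, ]
```

Matching on the date, rather than shifting rows by 14 positions, keeps the lag correct even if some days were missing from the series.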

The station at 35th and Archer is contained in the column Archer_35th to make it a valid R column name.
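As a small illustration of why the name is reordered, the sketch below uses base R's make.names() on a name that begins with a digit. The string "35th_Archer" is an assumed example of an unmodified name; the source does not say what the original column was called.

```r
# A name that starts with a digit is not a syntactically valid R name,
# so make.names() prefixes it with "X".
raw_name   <- "35th_Archer"   # hypothetical unmodified name
fixed_name <- make.names(raw_name)
fixed_name

# The reordered name used in the data set is already valid and is
# returned unchanged.
kept_name <- make.names("Archer_35th")
kept_name
```

Putting the street name first avoids both the automatic "X" prefix and the need to wrap the column name in backticks.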

Examples

data(Chicago)
str(Chicago)
#> tibble [5,698 × 50] (S3: tbl_df/tbl/data.frame)
#>  $ ridership       : num [1:5698] 15.7 15.8 15.9 15.9 15.4 ...
#>  $ Austin          : num [1:5698] 1.46 1.5 1.52 1.49 1.5 ...
#>  $ Quincy_Wells    : num [1:5698] 8.37 8.35 8.36 7.85 7.62 ...
#>  $ Belmont         : num [1:5698] 4.6 4.72 4.68 4.77 4.72 ...
#>  $ Archer_35th     : num [1:5698] 2.01 2.09 2.11 2.17 2.06 ...
#>  $ Oak_Park        : num [1:5698] 1.42 1.43 1.49 1.45 1.42 ...
#>  $ Western         : num [1:5698] 3.32 3.34 3.36 3.36 3.27 ...
#>  $ Clark_Lake      : num [1:5698] 15.6 15.7 15.6 15.7 15.6 ...
#>  $ Clinton         : num [1:5698] 2.4 2.4 2.37 2.42 2.42 ...
#>  $ Merchandise_Mart: num [1:5698] 6.48 6.48 6.41 6.49 5.8 ...
#>  $ Irving_Park     : num [1:5698] 3.74 3.85 3.86 3.84 3.88 ...
#>  $ Washington_Wells: num [1:5698] 7.56 7.58 7.62 7.36 7.09 ...
#>  $ Harlem          : num [1:5698] 2.65 2.76 2.79 2.81 2.73 ...
#>  $ Monroe          : num [1:5698] 5.67 6.01 5.79 5.96 5.77 ...
#>  $ Polk            : num [1:5698] 2.48 2.44 2.53 2.45 2.57 ...
#>  $ Ashland         : num [1:5698] 1.32 1.31 1.32 1.35 1.35 ...
#>  $ Kedzie          : num [1:5698] 3.01 3.02 2.98 3.01 3.08 ...
#>  $ Addison         : num [1:5698] 2.5 2.57 2.59 2.53 2.56 ...
#>  $ Jefferson_Park  : num [1:5698] 6.59 6.75 6.97 7.01 6.92 ...
#>  $ Montrose        : num [1:5698] 1.84 1.92 1.98 1.98 1.95 ...
#>  $ California      : num [1:5698] 0.756 0.781 0.812 0.776 0.789 0.37 0.274 0.473 0.844 0.835 ...
#>  $ temp_min        : num [1:5698] 15.1 25 19 15.1 21 19 15.1 26.6 34 33.1 ...
#>  $ temp            : num [1:5698] 19.4 30.4 25 22.4 27 ...
#>  $ temp_max        : num [1:5698] 30 36 28.9 27 32 30 28.9 41 43 36 ...
#>  $ temp_change     : num [1:5698] 14.9 11 9.9 11.9 11 11 13.8 14.4 9 2.9 ...
#>  $ dew             : num [1:5698] 13.4 25 18 10.9 21.9 ...
#>  $ humidity        : num [1:5698] 78 79 81 66.5 84 71 74 93 93 89 ...
#>  $ pressure        : num [1:5698] 30.4 30.2 30.2 30.4 29.9 ...
#>  $ pressure_change : num [1:5698] 0.12 0.18 0.23 0.16 0.65 ...
#>  $ wind            : num [1:5698] 5.2 8.1 10.4 9.8 12.7 12.7 8.1 8.1 9.2 11.5 ...
#>  $ wind_max        : num [1:5698] 10.4 11.5 19.6 16.1 19.6 17.3 13.8 17.3 23 16.1 ...
#>  $ gust            : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ gust_max        : num [1:5698] 0 0 0 0 25.3 26.5 0 26.5 31.1 0 ...
#>  $ percip          : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ percip_max      : num [1:5698] 0 0 0 0 0 0 0 0.07 0.11 0.01 ...
#>  $ weather_rain    : num [1:5698] 0 0 0 0 0 ...
#>  $ weather_snow    : num [1:5698] 0 0 0.214 0 0.516 ...
#>  $ weather_cloud   : num [1:5698] 0.708 1 0.357 0.292 0.452 ...
#>  $ weather_storm   : num [1:5698] 0 0.2083 0.0714 0.0417 0.4516 ...
#>  $ Blackhawks_Away : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ Blackhawks_Home : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ Bulls_Away      : num [1:5698] 0 0 1 0 0 0 0 0 1 0 ...
#>  $ Bulls_Home      : num [1:5698] 0 1 0 0 0 1 0 0 0 0 ...
#>  $ Bears_Away      : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ Bears_Home      : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ WhiteSox_Away   : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ WhiteSox_Home   : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ Cubs_Away       : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ Cubs_Home       : num [1:5698] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ date            : Date[1:5698], format: "2001-01-22" "2001-01-23" ...
stations
#>  [1] "Austin"           "Quincy_Wells"     "Belmont"         
#>  [4] "Archer_35th"      "Oak_Park"         "Western"         
#>  [7] "Clark_Lake"       "Clinton"          "Merchandise_Mart"
#> [10] "Irving_Park"      "Washington_Wells" "Harlem"          
#> [13] "Monroe"           "Polk"             "Ashland"         
#> [16] "Kedzie"           "Addison"          "Jefferson_Park"  
#> [19] "Montrose"         "California"