A data set from the MLC++ machine learning software for modeling customer
churn. There are 19 predictors, mostly numeric: state (categorical),
account_length, area_code (categorical), international_plan (yes/no),
voice_mail_plan (yes/no), number_vmail_messages, total_day_minutes,
total_day_calls, total_day_charge, total_eve_minutes, total_eve_calls,
total_eve_charge, total_night_minutes, total_night_calls,
total_night_charge, total_intl_minutes, total_intl_calls,
total_intl_charge, and number_customer_service_calls.
Details
The outcome is contained in a column called churn (also yes/no).
A note in one of the source files states that the data are "artificial based
on claims similar to real world".
Examples
data(mlc_churn)
str(mlc_churn)
#> tibble [5,000 × 20] (S3: tbl_df/tbl/data.frame)
#> $ state : Factor w/ 51 levels "AK","AL","AR",..: 17 36 32 36 37 2 20 25 19 50 ...
#> $ account_length : int [1:5000] 128 107 137 84 75 118 121 147 117 141 ...
#> $ area_code : Factor w/ 3 levels "area_code_408",..: 2 2 2 1 2 3 3 2 1 2 ...
#> $ international_plan : Factor w/ 2 levels "no","yes": 1 1 1 2 2 2 1 2 1 2 ...
#> $ voice_mail_plan : Factor w/ 2 levels "no","yes": 2 2 1 1 1 1 2 1 1 2 ...
#> $ number_vmail_messages : int [1:5000] 25 26 0 0 0 0 24 0 0 37 ...
#> $ total_day_minutes : num [1:5000] 265 162 243 299 167 ...
#> $ total_day_calls : int [1:5000] 110 123 114 71 113 98 88 79 97 84 ...
#> $ total_day_charge : num [1:5000] 45.1 27.5 41.4 50.9 28.3 ...
#> $ total_eve_minutes : num [1:5000] 197.4 195.5 121.2 61.9 148.3 ...
#> $ total_eve_calls : int [1:5000] 99 103 110 88 122 101 108 94 80 111 ...
#> $ total_eve_charge : num [1:5000] 16.78 16.62 10.3 5.26 12.61 ...
#> $ total_night_minutes : num [1:5000] 245 254 163 197 187 ...
#> $ total_night_calls : int [1:5000] 91 103 104 89 121 118 118 96 90 97 ...
#> $ total_night_charge : num [1:5000] 11.01 11.45 7.32 8.86 8.41 ...
#> $ total_intl_minutes : num [1:5000] 10 13.7 12.2 6.6 10.1 6.3 7.5 7.1 8.7 11.2 ...
#> $ total_intl_calls : int [1:5000] 3 3 5 7 3 6 7 6 4 5 ...
#> $ total_intl_charge : num [1:5000] 2.7 3.7 3.29 1.78 2.73 1.7 2.03 1.92 2.35 3.02 ...
#> $ number_customer_service_calls: int [1:5000] 1 1 0 2 3 0 3 0 1 0 ...
#> $ churn : Factor w/ 2 levels "yes","no": 2 2 2 2 2 2 2 2 2 2 ...
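
As a brief follow-up to the structure shown above, the sketch below inspects the outcome column and tallies the predictor types. It assumes the data are loaded from the modeldata package (an assumption; the page itself only calls data(mlc_churn)), and the object names churn_counts and predictors are illustrative.

```r
# Assumption: mlc_churn is provided by the modeldata package.
data(mlc_churn, package = "modeldata")

# Level order of the outcome. "yes" comes first, and some tools treat the
# first factor level as the event of interest by default.
levels(mlc_churn$churn)

# Class counts and proportions for the outcome.
churn_counts <- table(mlc_churn$churn)
churn_counts
prop.table(churn_counts)

# Tally predictor types (every column except the outcome).
predictors <- mlc_churn[, setdiff(names(mlc_churn), "churn")]
table(vapply(predictors, function(x) class(x)[1], character(1)))
```

Checking the class proportions before modeling is useful because churn outcomes are often imbalanced, which affects how metrics such as accuracy should be read.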