tidymodels and shrinkage
Cornell College
STA 362 Spring 2024 Block 8
https://www.youtube.com/watch?v=Q81RR3yKn30&t=7s
https://www.youtube.com/watch?v=NGf0voTMlcs
On your own:
Ridge Vs Lasso https://www.youtube.com/watch?v=Xm2C_gTAl8c
Elastic Net https://www.youtube.com/watch?v=1dKRdX9bfIo
Elastic net!
\(RSS + \lambda_1\sum_{j=1}^p\beta^2_j+\lambda_2\sum_{j=1}^p|\beta_j|\)
What is the \(\ell_1\) part of the penalty?
What is the \(\ell_2\) part of the penalty?
\[RSS + \lambda_1\sum_{j=1}^p\beta^2_j+\lambda_2\sum_{j=1}^p|\beta_j|\]
When will this be equivalent to Ridge Regression?
\[RSS + \lambda_1\sum_{j=1}^p\beta^2_j+\lambda_2\sum_{j=1}^p|\beta_j|\]
When will this be equivalent to Lasso?
\[RSS + \lambda_1\sum_{j=1}^p\beta^2_j+\lambda_2\sum_{j=1}^p|\beta_j|\]
Linear Regression Model Specification (regression)
Computational engine: lm
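The specification printed above is a parsnip model spec. Assuming the standard tidymodels setup, it could have been created like this (a sketch, not necessarily the exact code from the slides):

```r
library(tidymodels)

# declare a linear regression model, fit with the classic lm engine
lm_spec <- linear_reg() |>
  set_mode("regression") |>
  set_engine("lm")

lm_spec
```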
last_fit()
Instead of fitting to the training data and predicting on the test data by hand, you can use last_fit() and specify the split: the model is fit on the train data from the split, and (as before) you can just use collect_metrics() and it will automatically calculate the metrics (e.g. rmse) on the test data from the split.
set.seed(100000)
Auto_split <- initial_split(Auto, prop = 0.5)
lm_fit <- last_fit(lm_spec,
mpg ~ horsepower,
split = Auto_split)
lm_fit |>
collect_metrics()
# A tibble: 2 × 4
.metric .estimator .estimate .config
<chr> <chr> <dbl> <chr>
1 rmse standard 4.77 Preprocessor1_Model1
2 rsq standard 0.634 Preprocessor1_Model1
recipe()
Using the recipe() function along with step_*() functions, we can specify preprocessing steps and R will automagically apply them to each fold appropriately. The recipe gets plugged into the fit_resamples() function.
Auto_cv <- vfold_cv(Auto, v = 5)
rec <- recipe(mpg ~ horsepower, data = Auto) |>
step_scale(horsepower)
results <- fit_resamples(lm_spec,
preprocessor = rec,
resamples = Auto_cv)
results |>
collect_metrics()
# A tibble: 2 × 6
.metric .estimator mean n std_err .config
<chr> <chr> <dbl> <int> <dbl> <chr>
1 rmse standard 4.92 5 0.0744 Preprocessor1_Model1
2 rsq standard 0.611 5 0.0158 Preprocessor1_Model1
all_predictors()
You can scale all of the predictors at once using the all_predictors() shorthand.
rec <- recipe(mpg ~ horsepower + displacement + weight, data = Auto) |>
step_scale(all_predictors())
results <- fit_resamples(lm_spec,
preprocessor = rec,
resamples = Auto_cv)
results |>
collect_metrics()
# A tibble: 2 × 6
.metric .estimator mean n std_err .config
<chr> <chr> <dbl> <int> <dbl> <chr>
1 rmse standard 4.23 5 0.157 Preprocessor1_Model1
2 rsq standard 0.704 5 0.0253 Preprocessor1_Model1
Application Exercise
Examine the Hitters dataset by running ?Hitters in the Console. We want to predict Salary from all of the other 19 variables in this dataset. Create a visualization of Salary.
Application Exercise
When specifying your model, you can indicate whether you would like to use ridge, lasso, or elastic net. We can write a general equation to minimize:
\[RSS + \lambda\left((1-\alpha)\sum_{j=1}^p\beta_j^2+\alpha\sum_{j=1}^p|\beta_j|\right)\]
glmnet
When using the glmnet engine, the linear_reg() function has two additional parameters, penalty and mixture:
penalty is \(\lambda\) from our equation.
mixture is a number between 0 and 1 representing \(\alpha\).
\[RSS + \lambda\left((1-\alpha)\sum_{j=1}^p\beta_j^2+\alpha\sum_{j=1}^p|\beta_j|\right)\]
What would we set mixture
to in order to perform Ridge regression?
Application Exercise
set.seed(1). Create a cross validation object and recipe for the Hitters dataset.
\[RSS + \lambda\left((1-\alpha)\sum_{j=1}^p\beta_j^2+\alpha\sum_{j=1}^p|\beta_j|\right)\]
tune()
Use tune() for the penalty and the mixture. Those are the things we want to vary! Instead of fit_resamples(), we are going to use tune_grid().
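The later code refers to penalty_spec and grid without showing their definitions. A sketch of what they, together with a Hitters cross validation object and recipe, might look like (the recipe steps, the particular grid values, and the name hitters_cv are assumptions chosen to match the 110-row output below; Hitters comes from the ISLR package and the glmnet package must be installed):

```r
library(tidymodels)
library(ISLR)   # Hitters data

set.seed(1)
hitters <- na.omit(Hitters)              # Salary has missing values
hitters_cv <- vfold_cv(hitters, v = 5)   # hypothetical name for the CV object

rec <- recipe(Salary ~ ., data = hitters) |>
  step_dummy(all_nominal_predictors()) |> # glmnet needs numeric inputs
  step_scale(all_predictors())

# elastic net spec with both tuning parameters marked with tune()
penalty_spec <- linear_reg(penalty = tune(), mixture = tune()) |>
  set_engine("glmnet")

# candidate values to try: 11 penalties x 5 mixtures = 55 combinations
grid <- expand_grid(penalty = seq(0, 100, by = 10),
                    mixture = seq(0, 1, by = 0.25))

results <- tune_grid(penalty_spec,
                     rec,
                     grid = grid,
                     resamples = hitters_cv)
```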
# A tibble: 110 × 8
penalty mixture .metric .estimator mean n std_err .config
<dbl> <dbl> <chr> <chr> <dbl> <int> <dbl> <chr>
1 0 0 rmse standard 4.30 1 NA Preprocessor1_Model01
2 0 0 rsq standard 0.699 1 NA Preprocessor1_Model01
3 10 0 rmse standard 4.86 1 NA Preprocessor1_Model02
4 10 0 rsq standard 0.692 1 NA Preprocessor1_Model02
5 20 0 rmse standard 5.41 1 NA Preprocessor1_Model03
6 20 0 rsq standard 0.691 1 NA Preprocessor1_Model03
7 30 0 rmse standard 5.81 1 NA Preprocessor1_Model04
8 30 0 rsq standard 0.691 1 NA Preprocessor1_Model04
9 40 0 rmse standard 6.10 1 NA Preprocessor1_Model05
10 40 0 rsq standard 0.691 1 NA Preprocessor1_Model05
# ℹ 100 more rows
# A tibble: 55 × 8
penalty mixture .metric .estimator mean n std_err .config
<dbl> <dbl> <chr> <chr> <dbl> <int> <dbl> <chr>
1 0 0.25 rmse standard 4.26 1 NA Preprocessor1_Model12
2 0 0.5 rmse standard 4.26 1 NA Preprocessor1_Model23
3 0 1 rmse standard 4.26 1 NA Preprocessor1_Model45
4 0 0.75 rmse standard 4.27 1 NA Preprocessor1_Model34
5 0 0 rmse standard 4.30 1 NA Preprocessor1_Model01
6 10 0 rmse standard 4.86 1 NA Preprocessor1_Model02
7 20 0 rmse standard 5.41 1 NA Preprocessor1_Model03
8 10 0.25 rmse standard 5.69 1 NA Preprocessor1_Model13
9 30 0 rmse standard 5.81 1 NA Preprocessor1_Model04
10 40 0 rmse standard 6.10 1 NA Preprocessor1_Model05
# ℹ 45 more rows
Which would you choose?
results |>
collect_metrics() |>
filter(.metric == "rmse") |>
ggplot(aes(penalty, mean, color = factor(mixture), group = factor(mixture))) +
geom_line() +
geom_point() +
labs(y = "RMSE")
Application Exercise
Using the Hitters cross validation object and recipe created in the previous exercise, use tune_grid to pick the optimal penalty and mixture values: pass your grid of candidate values to the tune_grid function. Then use collect_metrics and filter to only include the RMSE estimates. Finally, run last_fit() with the selected parameters, specifying the split data so that it is evaluated on the left-out test sample.
auto_split <- initial_split(Auto, prop = 0.5)
auto_train <- training(auto_split)
auto_cv <- vfold_cv(auto_train, v = 5)
rec <- recipe(mpg ~ horsepower + displacement + weight, data = auto_train) |>
step_scale(all_predictors())
tuning <- tune_grid(penalty_spec,
rec,
grid = grid,
resamples = auto_cv)
tuning |>
collect_metrics() |>
filter(.metric == "rmse") |>
arrange(mean)
# A tibble: 66 × 8
penalty mixture .metric .estimator mean n std_err .config
<dbl> <dbl> <chr> <chr> <dbl> <int> <dbl> <chr>
1 0 1 rmse standard 3.48 1 NA Preprocessor1_Model56
2 0 0.8 rmse standard 3.48 1 NA Preprocessor1_Model45
3 0 0.6 rmse standard 3.49 1 NA Preprocessor1_Model34
4 0 0.4 rmse standard 3.49 1 NA Preprocessor1_Model23
5 0 0.2 rmse standard 3.49 1 NA Preprocessor1_Model12
6 0 0 rmse standard 3.63 1 NA Preprocessor1_Model01
7 10 0 rmse standard 4.42 1 NA Preprocessor1_Model02
8 20 0 rmse standard 5.02 1 NA Preprocessor1_Model03
9 10 0.2 rmse standard 5.10 1 NA Preprocessor1_Model13
10 30 0 rmse standard 5.44 1 NA Preprocessor1_Model04
# ℹ 56 more rows
final_spec <- linear_reg(penalty = 0, mixture = 0) |>
set_engine("glmnet")
fit <- last_fit(final_spec,
rec,
split = auto_split)
fit |>
collect_metrics()
# A tibble: 2 × 4
.metric .estimator .estimate .config
<chr> <chr> <dbl> <chr>
1 rmse standard 4.45 Preprocessor1_Model1
2 rsq standard 0.691 Preprocessor1_Model1
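Rather than reading the best penalty and mixture off the sorted table, you could pull them out programmatically with select_best() from the tune package. A self-contained sketch that rebuilds the Auto tuning above (the grid values are an assumption matching the 66-row output; Auto comes from ISLR and glmnet must be installed):

```r
library(tidymodels)
library(ISLR)   # Auto data

set.seed(100000)
auto_split <- initial_split(Auto, prop = 0.5)
auto_train <- training(auto_split)
auto_cv <- vfold_cv(auto_train, v = 5)

rec <- recipe(mpg ~ horsepower + displacement + weight, data = auto_train) |>
  step_scale(all_predictors())

penalty_spec <- linear_reg(penalty = tune(), mixture = tune()) |>
  set_engine("glmnet")

# 11 penalties x 6 mixtures = 66 combinations, as in the table above
grid <- expand_grid(penalty = seq(0, 100, by = 10),
                    mixture = seq(0, 1, by = 0.2))

tuning <- tune_grid(penalty_spec, rec, grid = grid, resamples = auto_cv)

# pick the combination with the lowest cross validated RMSE
best <- tuning |>
  select_best(metric = "rmse")

final_spec <- linear_reg(penalty = best$penalty,
                         mixture = best$mixture) |>
  set_engine("glmnet")
```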
workflow()
We can use workflow() to combine the recipe and the model specification to pass to a fit object.
Application Exercise
Combine your recipe and model specification from the previous exercises into a workflow.
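A minimal sketch of the workflow() pattern described above, bundling the recipe and final model specification from the earlier slides into one object (object names follow the earlier code; Auto comes from ISLR and glmnet must be installed):

```r
library(tidymodels)
library(ISLR)   # Auto data

set.seed(100000)
auto_split <- initial_split(Auto, prop = 0.5)

rec <- recipe(mpg ~ horsepower + displacement + weight,
              data = training(auto_split)) |>
  step_scale(all_predictors())

final_spec <- linear_reg(penalty = 0, mixture = 0) |>
  set_engine("glmnet")

# bundle preprocessing and model into a single object
wf <- workflow() |>
  add_recipe(rec) |>
  add_model(final_spec)

# the whole workflow can be passed to last_fit()
fit <- last_fit(wf, split = auto_split)
fit |> collect_metrics()
```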