How to choose the value of the regularisation parameter (λ)?
Choosing the regularisation parameter is a delicate business. If λ is very large, the penalty dominates and shrinks the regression coefficients β towards zero, so the model underfits (high bias, low variance). Conversely, if λ is 0 or very small, the penalty has almost no effect and the model tends to overfit the training data (low bias, high variance).
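To make the shrinkage concrete, here is a minimal sketch assuming the simplest possible case: ridge regression with one feature and no intercept, where the estimate has the closed form β = Σxy / (Σx² + λ). The function name `ridge_coefficient` and the synthetic data (true slope 3 plus noise) are illustrative assumptions, not part of the discussion above.

```python
import random

def ridge_coefficient(xs, ys, lam):
    """One-feature ridge estimate without an intercept:
    beta = sum(x*y) / (sum(x^2) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

# Synthetic data (an assumption for illustration): y = 3x + noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(50)]
ys = [3.0 * x + rng.gauss(0, 0.5) for x in xs]

lambdas = [0.0, 1.0, 10.0, 100.0, 1000.0]
betas = [ridge_coefficient(xs, ys, lam) for lam in lambdas]

# The coefficient shrinks towards zero as lambda grows.
for lam, beta in zip(lambdas, betas):
    print(f"lambda={lam:>7}: beta={beta:.4f}")
```

With λ = 0 the estimate is ordinary least squares and sits close to the true slope. As λ grows the denominator grows, so the coefficient is pulled towards zero, which is the underfitting regime described above.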
There is no single formula that gives the right value of λ. What we can do is take sub-samples of the data and run the algorithm several times on different sub-samples, trying a range of candidate values of λ. Here, the analyst has to decide how much variance can be tolerated. Once the variance is acceptable, that value of λ can be used for the full dataset.
One more thing to note is that a value of λ selected this way is optimal for the sub-samples it was tuned on, and not necessarily for the whole training data.
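The sub-sampling procedure described above could be sketched roughly as follows. This is only an illustration under stated assumptions: it reuses a one-feature ridge estimate without an intercept, the helper names `ridge_coefficient` and `subsample_study` are hypothetical, the data are synthetic, and the held-out error is an extra diagnostic that the text above does not mention.

```python
import random
import statistics

def ridge_coefficient(xs, ys, lam):
    """One-feature ridge estimate without an intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def subsample_study(xs, ys, lambdas, n_runs=30, frac=0.5, seed=0):
    """For each lambda, fit on several random sub-samples and report
    (variance of the coefficient across runs, mean held-out squared error)."""
    rng = random.Random(seed)
    n = len(xs)
    k = int(n * frac)

    # Draw the sub-samples once so every lambda sees the same splits.
    splits = []
    for _ in range(n_runs):
        idx = list(range(n))
        rng.shuffle(idx)
        splits.append((idx[:k], idx[k:]))

    results = {}
    for lam in lambdas:
        betas, errors = [], []
        for train, held_out in splits:
            beta = ridge_coefficient(
                [xs[i] for i in train], [ys[i] for i in train], lam
            )
            sq_errs = [(ys[i] - beta * xs[i]) ** 2 for i in held_out]
            betas.append(beta)
            errors.append(statistics.fmean(sq_errs))
        results[lam] = (statistics.pvariance(betas), statistics.fmean(errors))
    return results

# Synthetic data (an assumption for illustration): y = 3x + noise.
data_rng = random.Random(1)
xs = [data_rng.uniform(-1, 1) for _ in range(200)]
ys = [3.0 * x + data_rng.gauss(0, 0.5) for x in xs]

lambdas = [0.0, 1.0, 10.0, 100.0, 1000.0]
results = subsample_study(xs, ys, lambdas)
for lam in lambdas:
    var_beta, mse = results[lam]
    print(f"lambda={lam:>7}: var(beta)={var_beta:.6f}  held-out MSE={mse:.4f}")
```

Reading the table is where the judgement comes in: larger λ gives a more stable coefficient across sub-samples, but at some point the held-out error starts to climb. Whatever value looks acceptable here is still tuned to these sub-samples, so it is worth re-checking once the model is refitted on the full training data.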