How to train a multiple linear regression model to find the best combination of variables?
The glmulti package for R does this. From the package description: "Automated model selection and model-averaging. Provides a wrapper for glm and other functions, automatically generating all possible models (under constraints set by the user) with the specified response and explanatory variables, and finding the best models in terms of some Information Criterion (AIC, AICc or BIC). Can handle very large numbers of candidate models. Features a Genetic Algorithm to find the best models when an exhaustive screening of the candidates is not feasible."
But keep the following caveats in mind:
This type of data-driven model selection will almost always destroy your ability to make reliable inferences (compute p-values, confidence intervals, etc.), because the usual formulas assume the model was chosen before looking at the data.
It may overfit your data (although using the information criteria listed in the package description will help with this).
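
To make "exhaustive screening" concrete, here is a minimal sketch of the idea in plain Python (standard library only) rather than R. It is not how the package is implemented, just an illustration: fit ordinary least squares for every subset of candidate variables and rank the fits by AIC. The function names (`ols_rss`, `aic`, `best_subsets`), the variable names `x1` to `x3`, and the synthetic data are all made up for this example, and the AIC is the Gaussian form up to an additive constant.

```python
import itertools
import math
import random


def ols_rss(X, y):
    """Residual sum of squares of the least-squares fit of y on the columns of X."""
    n, p = len(X), len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
               for i in range(n))


def aic(rss, n, k):
    """Gaussian AIC up to an additive constant; k = number of estimated parameters."""
    return n * math.log(rss / n) + 2 * k


def best_subsets(columns, y):
    """Fit every subset of the candidate columns; return (AIC, subset) pairs, best first."""
    n = len(y)
    names = list(columns)
    results = []
    for size in range(len(names) + 1):
        for subset in itertools.combinations(names, size):
            # Each model has an intercept plus the chosen variables.
            X = [[1.0] + [columns[name][i] for name in subset] for i in range(n)]
            rss = ols_rss(X, y)
            # Parameters: intercept + slopes + error variance.
            results.append((aic(rss, n, len(subset) + 2), subset))
    return sorted(results)


# Synthetic data (an assumption for illustration): y depends on x1 and x2 only.
random.seed(0)
n_obs = 200
columns = {name: [random.gauss(0, 1) for _ in range(n_obs)]
           for name in ("x1", "x2", "x3")}
y = [2 + 3 * columns["x1"][i] - 2 * columns["x2"][i] + random.gauss(0, 1)
     for i in range(n_obs)]

ranked = best_subsets(columns, y)
for score, subset in ranked[:3]:
    print(round(score, 2), subset)
```

With p candidate variables there are 2^p subsets to fit, which is why exhaustive screening stops being feasible quickly and the package falls back on a genetic algorithm. Also note that `x3` is pure noise here, yet a model containing it can still rank near the top, which is the overfitting risk mentioned above.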