Why is log(odds) considered and not probability in logistic regression?
Logistic regression models the logit of the dependent variable (the natural log of the odds that the event occurs) as a linear function of the independent variables, and estimates the coefficients of that function by maximum likelihood. In this way, logistic regression estimates the probability of a certain event occurring.
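In the following equation, the log of the odds changes linearly as a function of the explanatory variables:
log(p / (1 - p)) = β0 + β1·x1 + β2·x2 + ... + βk·xk
Here p is the probability that the event occurs, x1, ..., xk are the explanatory variables, and β0, ..., βk are the coefficients to be estimated.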
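To make "linear in log(odds)" concrete, here is a minimal sketch using only the Python standard library. The coefficients b0 and b1 and the function names are hypothetical, chosen purely for illustration, and the model has a single explanatory variable x. Each unit step in x adds the same amount to log(odds), while the implied probability moves by a different amount each time.

```python
import math

def logit(p):
    """Log of the odds p / (1 - p), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inverse_logit(z):
    """Map a log(odds) value back to a probability (the logistic function)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients for a one-variable model: log(odds) = b0 + b1 * x
b0, b1 = -1.0, 0.5

xs = [0, 1, 2, 3, 4]
log_odds = [b0 + b1 * x for x in xs]             # linear in x
probs = [inverse_logit(z) for z in log_odds]     # S-shaped in x

for x, z, p in zip(xs, log_odds, probs):
    print(f"x = {x}  log(odds) = {z:+.2f}  probability = {p:.3f}")
```

Applying logit to any of the printed probabilities recovers the corresponding log(odds) value, which is why the two views of the model are interchangeable.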
The reason log(odds) is used rather than probability is as follows:
Converting probability to log(odds) expands the range from (0, 1) to (-∞, +∞). If we modeled the probability directly as a linear combination of variables, we would face a restricted range problem: a linear combination can take any real value, while a probability must stay between 0 and 1. The log(odds) transformation removes that restriction and absorbs the non-linear relationship between the probability and the explanatory variables, so we can simply fit a linear combination of variables.
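The range argument can be checked with a short sketch, again assuming nothing beyond the Python standard library. The probabilities listed and the straight-line coefficients a0 and a1 are hypothetical values picked for illustration. The first loop shows log(odds) growing without bound as the probability approaches 0 or 1; the second shows a straight line fitted directly to probability leaving the valid range.

```python
import math

def logit(p):
    """Log of the odds p / (1 - p), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Probabilities squeezed between 0 and 1 spread out over the real line.
probabilities = [0.001, 0.01, 0.1, 0.5, 0.9, 0.99, 0.999]
for p in probabilities:
    print(f"p = {p:<6} log(odds) = {logit(p):+.3f}")

# Hypothetical straight line fitted directly to probability: p = a0 + a1 * x
a0, a1 = 0.1, 0.3
for x in [0, 2, 4]:
    linear_p = a0 + a1 * x
    valid = 0.0 <= linear_p <= 1.0
    print(f"x = {x}  linear 'probability' = {linear_p:.2f}  valid = {valid}")
```

A probability of 0.5 corresponds to log(odds) of 0, and probabilities that are equally far from 0.5 give log(odds) of equal size and opposite sign.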