Using algebra to show why the maximum likelihood estimates are undefined when a response group only occurs in certain predictor groups
We consider a binary outcome (positive/negative) with $$n_{1}$$ positives and $$n_{2}$$ negatives, where the probability of a positive outcome is p. Suppose we have a binary group predictor x for which a positive outcome occurs only when x=0 and a negative outcome only when x=1. This circumstance is known as complete separation. The log-likelihood may be written as
$$\ln L = n_{1} \ln p + n_{2} \ln (1-p)$$
In a binary logistic regression
$$p = \frac{e^{a+bx}}{1+ e^{a+bx}}$$ where x is the group predictor taking values 0 and 1.
Substituting for p, and noting that $$\ln (1-p) = -\ln(1+e^{a+bx})$$, gives
$$\ln L = n_{1}(a+bx) - n_{1} \ln(1 + e^{a+bx}) - n_{2} \ln(1+e^{a+bx})$$
$$\frac{d \ln L}{db} = n_{1}x - n_{1}x \frac{e^{a+bx}}{1+e^{a+bx}} - n_{2}x \frac{e^{a+bx}}{1+e^{a+bx}}$$
Since all the x=0 scores are in the 'positive' group (which we denote as group 1) and all the x=1 scores are in the 'negative' group (which we denote as group 2), the two $$n_{1}$$ terms vanish and we have
$$\frac{d \ln L}{db} = -n_{2}\frac{e^{a+b}}{1+e^{a+b}}$$.
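$$\frac{d \ln L}{db} = 0$$ at the maximum likelihood estimates, but $$\frac{d \ln L}{db}$$ is nonzero for every finite a and b (it approaches zero only as a+b tends to minus infinity), so it can only equal zero with infinite estimates of a and b.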
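The behaviour of this derivative can be checked numerically. The following is a minimal sketch rather than a definitive implementation: the group sizes `n1` and `n2`, the fixed intercept `a` and the grid of slopes `b_values` are all hypothetical values chosen for illustration. It evaluates the log-likelihood and its derivative with respect to b under complete separation, with the derivative taken as $$-n_{2} e^{a+b}/(1+e^{a+b})$$.

```python
import math


def log_lik(a, b, n1, n2):
    """Log-likelihood when all n1 positives have x=0 and all n2 negatives have x=1."""
    # Positives (x=0): ln p = a - ln(1 + e^a)
    # Negatives (x=1): ln(1 - p) = -ln(1 + e^(a+b))
    return n1 * (a - math.log1p(math.exp(a))) - n2 * math.log1p(math.exp(a + b))


def score_b(a, b, n2):
    """Derivative of the log-likelihood with respect to b under complete separation."""
    return -n2 * math.exp(a + b) / (1.0 + math.exp(a + b))


# Hypothetical values, chosen only for illustration.
n1, n2 = 10, 12
a = 2.0
b_values = [0.0, -5.0, -10.0, -20.0]

scores = [score_b(a, b, n2) for b in b_values]
logliks = [log_lik(a, b, n1, n2) for b in b_values]

for b, s, ll in zip(b_values, scores, logliks):
    print(f"b = {b:6.1f}   d lnL / db = {s:.3e}   ln L = {ll:.6f}")
```

For each slope on this grid the derivative stays negative while shrinking towards zero, and the log-likelihood keeps increasing as b decreases, so no finite b maximises it.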
This argument can be extended to continuous predictors where an outcome group only occurs for values below or above certain thresholds of the predictor.
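The continuous case can be illustrated in the same way. This sketch assumes a small made-up data set in which every positive outcome has a predictor value below a hypothetical threshold `c` and every negative outcome has a value above it. The coefficients are written as a = kc and b = -k, so that increasing the scale factor `k` steepens the fitted curve around the threshold; this parametrisation is an assumption made for the example, not part of the argument above.

```python
import math

# Made-up data: positives (y=1) all lie below the threshold, negatives (y=0) above it.
xs = [0.5, 1.0, 1.5, 2.5, 3.0, 3.5]
ys = [1, 1, 1, 0, 0, 0]
c = 2.0  # hypothetical threshold separating the two outcome groups


def log_lik(a, b):
    """Logistic log-likelihood for the data above with linear predictor a + b*x."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = a + b * x
        if y == 1:
            total += -math.log1p(math.exp(-z))  # ln p
        else:
            total += -math.log1p(math.exp(z))   # ln(1 - p)
    return total


k_values = [1.0, 5.0, 10.0, 50.0]
# a = k*c and b = -k give the linear predictor k*(c - x).
logliks = [log_lik(k * c, -k) for k in k_values]

for k, ll in zip(k_values, logliks):
    print(f"k = {k:5.1f}   a = {k * c:6.1f}   b = {-k:6.1f}   ln L = {ll:.3e}")
```

On this grid the log-likelihood rises towards its upper bound of zero as k grows without reaching it, which is the same behaviour as in the binary predictor case: the estimates of a and b diverge.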
