Diff for "FAQ/rt1d" - CBU statistics Wiki
Differences between revisions 5 and 6
Revision 5 as of 2010-05-20 09:33:45
Size: 2605
Editor: PeterWatson
Comment:
Revision 6 as of 2010-05-20 09:36:41
Size: 2654
Editor: PeterWatson
Comment:

How do I correlate change score with baseline?

To correlate change between two time points with baseline (time 1, T1), we correlate the score at T1 with the difference in scores between time 2 (T2) and T1. This is not the same as correlating T1 score with T2 score, or correlating T1 score with the residuals from a regression of T2 score on T1 score.

To see the former, consider the following example:

|| T1 || T2 ||
|| 1 || 2 ||
|| 2 || 3 ||
|| 3 || 4 ||

The change over time is the same (a difference of 1 unit) regardless of baseline, so it has no relationship with the baseline score. The correlation between the scores at T1 and T2, on the other hand, is 1, because the score at T2 is always exactly 1 higher than the score at T1 and so is perfectly predicted by the T1 score.
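The example can be checked numerically. The following is a minimal sketch using NumPy (the variable names are illustrative only). Note that, strictly speaking, the Pearson correlation between T1 and the change score cannot be computed here, because the change score has zero variance; the sketch therefore reports that variance rather than a correlation.

```python
import numpy as np

# Scores from the example above
t1 = np.array([1.0, 2.0, 3.0])
t2 = np.array([2.0, 3.0, 4.0])

# Difference (change) score for each individual
change = t2 - t1

# Pearson correlation between T1 score and T2 score
r_t1_t2 = np.corrcoef(t1, t2)[0, 1]

print("change scores:", change)
print("corr(T1, T2):", r_t1_t2)
# Zero variance: the change does not vary with baseline at all
print("variance of change:", change.var())
```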

The difference score, T2 score - T1 score, is also different to the residual of a regression using T1 score to predict T2 score. To see this, let's consider the regression equation from which this residual is obtained:

T2 score = B T1 score + random error

Here the regression coefficient, B, is approximated by T2 score/T1 score, the expected ratio of an individual's score at time 2 to their score at time 1. B T1 score is therefore an estimate of what an individual should score at time 2 given their time 1 score. So the residual, T2 score - B T1 score, obtained from the regression of T2 score on T1 score, indicates whether an individual's T2 score is 'above' or 'below' the 'average' score expected from their T1 score, where that expectation is estimated using all (T1 score, T2 score) pairs.
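As a rough illustration, the residuals can be computed as below. This sketch assumes ordinary least squares with no intercept term, matching the equation as written above, in which case the estimate of B is sum(T1 x T2) / sum(T1 squared). The function name `fit_through_origin` and the scores are hypothetical, chosen only so that B comes out at 1.5.

```python
import numpy as np

def fit_through_origin(t1, t2):
    """Least-squares slope B for T2 = B * T1 + error (no intercept), plus residuals."""
    t1 = np.asarray(t1, dtype=float)
    t2 = np.asarray(t2, dtype=float)
    b = np.sum(t1 * t2) / np.sum(t1 ** 2)
    residuals = t2 - b * t1  # positive: above the expected T2 score; negative: below
    return b, residuals

# Hypothetical scores, for illustration only
t1 = np.array([2.0, 4.0, 6.0, 8.0])
t2 = np.array([4.0, 5.5, 9.0, 12.0])

b, residuals = fit_through_origin(t1, t2)
differences = t2 - t1

print("B:", b)
print("residuals:  ", residuals)
print("differences:", differences)
```

The residuals and the difference scores are clearly different quantities: the difference score only subtracts the individual's own T1 score, whereas the residual subtracts a prediction that depends on B and therefore on everyone's scores.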

The PDF plot given [attachment:davyplot.pdf here] shows that the larger difference score, shown in red (a score rising from around 11 at time 1 to 16 at time 2), has a negative residual, whereas the smaller difference score, shown in blue (a score rising from around 5 at time 1 to 8 at time 2), has a positive residual. Although the individual scoring 11 at time 1 increases by a larger amount (around 5 units) than the one scoring 5 at time 1 (about 3 units), the former goes up by less than would be expected under a constant overall increase in scores, represented by a constant T2/T1 ratio (in this case T2/T1 is close to 1.5).
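The arithmetic behind the red and blue points can be sketched as follows. The scores are the approximate values quoted in the text (not exact values read from the plot), and B is taken to be exactly 1.5 as an assumption.

```python
B = 1.5  # assumed value of the T2/T1 ratio ("close to 1.5" in the text)

# Approximate (T1, T2) scores quoted in the text for the two highlighted points
points = {"red": (11.0, 16.0), "blue": (5.0, 8.0)}

results = {}
for name, (t1, t2) in points.items():
    difference = t2 - t1      # change score
    residual = t2 - B * t1    # observed T2 minus the T2 expected from T1
    results[name] = (difference, residual)
    print(f"{name}: difference = {difference:+.1f}, residual = {residual:+.1f}")

# prints:
# red: difference = +5.0, residual = -0.5
# blue: difference = +3.0, residual = +0.5
```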

Aside: The illustrative data in the plot were generated so that T2 = 1.5 T1 + Normal error (mean = 0, variance = 0.25). How closely the linear regression model fits depends on the variance of the (normally distributed) random error, which is assumed constant across all scores: the larger the error variance, the worse the fit of the linear regression model.
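Data of this kind can be simulated to compare the three correlations discussed above. This is only a sketch: the sample size, the range of T1 scores, the seed and the function name `simulate` are assumptions and are not taken from the original plot. Only the slope of 1.5 and the error variance of 0.25 come from the text.

```python
import numpy as np

def simulate(error_variance, n=200, seed=0):
    """Simulate T2 = 1.5 * T1 + Normal(0, error_variance) and return summary statistics."""
    rng = np.random.default_rng(seed)
    t1 = rng.uniform(1.0, 15.0, size=n)  # assumed range of baseline scores
    t2 = 1.5 * t1 + rng.normal(0.0, np.sqrt(error_variance), size=n)

    b = np.sum(t1 * t2) / np.sum(t1 ** 2)  # no-intercept least-squares slope
    change = t2 - t1
    residual = t2 - b * t1

    return {
        "B": b,
        "corr(T1, T2)": np.corrcoef(t1, t2)[0, 1],
        "corr(T1, change)": np.corrcoef(t1, change)[0, 1],
        "corr(T1, residual)": np.corrcoef(t1, residual)[0, 1],
    }

small = simulate(error_variance=0.25)  # variance quoted in the text
large = simulate(error_variance=25.0)  # much larger variance, for comparison

for label, stats in (("variance 0.25", small), ("variance 25", large)):
    print(label)
    for name, value in stats.items():
        print(f"  {name}: {value:.3f}")
```

In a run like this one would expect the T1 score to correlate strongly with both the T2 score and the change score, but only weakly with the residual, and the correlation between T1 and T2 to drop as the error variance increases.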

None: FAQ/rt1d (last edited 2021-09-09 07:52:39 by PeterWatson)