Answer:
The residuals represent the distance from the Y’s to the fitted Y’s in either case. If there
is no violation of the assumptions, there should be no apparent pattern when plotting residuals
against the fitted Y values, which makes this plot the more meaningful one for residual analysis.
As for plotting residuals against the observed Y’s, the plot will always show a positive relation
between the two, so it is not a sensible diagnostic. You can explain this from two perspectives. One
is by intuition: large Y’s are more likely to lie above the regression
line (especially if the line is fairly flat and the data are widely spread out), giving positive
residuals, while small Y’s are more likely to lie below it, giving
negative residuals. Another way to explain the positive correlation comes from the decomposition of Y:
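$\hat{e} = Y - \hat{Y}$, so we have $Y = \hat{e} + \hat{Y}$. Since $\hat{Y}$ and $\hat{e}$ are uncorrelated under least squares with an intercept, we have
$\mathrm{Cov}(\hat{e}, Y) = \mathrm{Cov}(\hat{e}, \hat{e}) + \mathrm{Cov}(\hat{e}, \hat{Y}) = \mathrm{Var}(\hat{e}) > 0$,
so the covariance between $\hat{e}$ and $Y$ is always positive, even when no assumption is violated.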
In this sense, the residual plot against the fitted Y is more meaningful. It is also the most
classic residual plot, and the one we usually use.
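
The covariance argument above can be checked numerically. The following is a minimal sketch, not part of the original answer: the simulated model (intercept 1.0, slope 0.5, noise standard deviation 2.0), the sample size, and the seed are all arbitrary assumptions chosen for illustration. It fits a simple least-squares line and compares the correlation of the residuals with the fitted values against their correlation with the observed values.

```python
import numpy as np

# Simulated data; the model and its parameters are hypothetical,
# chosen only so that all regression assumptions hold.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x + rng.normal(0, 2.0, size=n)

# Ordinary least squares with an intercept.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
resid = y - fitted

# Residuals vs. fitted values: uncorrelated by construction.
corr_fitted = np.corrcoef(resid, fitted)[0, 1]

# Residuals vs. observed values: positively correlated.
corr_observed = np.corrcoef(resid, y)[0, 1]

# Sample covariance of residuals with Y, and sample variance of residuals.
cov_resid_y = np.cov(resid, y)[0, 1]
var_resid = resid.var(ddof=1)

# R^2 of the fit, for comparison with corr_observed.
r_squared = 1.0 - resid.var() / y.var()

print("corr(resid, fitted):  ", corr_fitted)
print("corr(resid, observed):", corr_observed)
print("cov(resid, Y) vs var(resid):", cov_resid_y, var_resid)
print("sqrt(1 - R^2):", np.sqrt(1.0 - r_squared))
```

In this setup the first correlation should be zero up to rounding error, while the second is positive even though no assumption is violated. The sample covariance between the residuals and Y matches the sample variance of the residuals, and the second correlation equals the square root of 1 − R², so a flatter, noisier fit produces a stronger spurious trend in the plot against observed Y.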