- A good statistical model is parsimonious
- We assume that the random errors ε are normally distributed so that we can carry out statistical hypothesis tests based on the F and t distributions
- For a simple linear regression, we choose the equation that minimizes the SSE (Sum of Squared Errors); the fitted equation can then be used to predict the dependent variable in the future
- When we have determined that a linear relationship exists between the variables, we need to measure how strong it is. The MSE (Mean Square Error) is not a relative measure; for that we use the square of the estimated correlation coefficient r, also called the coefficient of determination
- r squared can take any value between 0 and 1: around 0.5 indicates a poor fit, 0.6 may be satisfactory, and 0.8 or higher is good to very good
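The quantities above can be sketched numerically. This is a minimal illustration with numpy, using made-up sample data (hours studied vs. exam score are hypothetical, not from the notes): the slope and intercept are the least-squares estimates, and r squared is computed as 1 - SSE/SST.

```python
import numpy as np

# Hypothetical sample data: predictor x (e.g. hours studied) and outcome y (e.g. exam score)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([52.0, 55.0, 61.0, 60.0, 68.0, 73.0])

n = len(x)
# Least-squares estimates of slope b1 and intercept b0 (these minimize the SSE)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

y_hat = b0 + b1 * x                  # predicted values from the fitted equation
sse = np.sum((y - y_hat) ** 2)       # SSE: Sum of Squared Errors
mse = sse / (n - 2)                  # MSE: n - 2 degrees of freedom in simple regression
sst = np.sum((y - y.mean()) ** 2)    # total sum of squares
r_squared = 1 - sse / sst            # coefficient of determination
print(round(r_squared, 3))
```

Note that the MSE depends on the units of y, which is why it is not a relative measure, while r squared is unit-free and always lies between 0 and 1.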
SPSS prints out a model summary:
- R indicates how good predictions made with the regression equation will be: it is the correlation between actual scores on the dependent variable and the scores predicted by the regression equation
- R squared represents the proportion of variance in the dependent variable that is predictable from the regression equation
- The ANOVA table ...