Many studies examine the effects of several variables on a dependent variable. A commonly used procedure for analysing such effects is regression analysis (or ANOVA in the case of factors), often combined with an elimination procedure such as stepwise backward elimination of variables. In this way, different regression models are compared and the 'best' one is selected, namely the one containing the 'important', i.e. the 'significant', variables. The selection of regression models is based on the calculated error probabilities. This procedure is problematic in several respects. It may not only lead to wrong conclusions, it also lacks a theoretical foundation. It is based on hypothesis tests, which only permit statements about the probability of occurrence of a certain event under the assumption that a certain hypothesis (the null hypothesis) is valid. No statements about the hypotheses themselves are possible. Additionally, hypothesis tests in classical statistics are theoretically justified only for experimental procedures with control and treatment, including randomisation and replication. Regression analyses are often applied in studies that deal with many variables acting in a complex way, where it is not possible to isolate variables experimentally. In such studies, as well as in the analysis of empirical data, testing null hypotheses is often not appropriate. Instead, comparing different models is the proper procedure. This is generally not possible with classical statistics. However, procedures are available that make it possible to compare models quantitatively. In many research areas they are still rarely used, probably because many researchers are unaware of the possibilities. If these notes contribute to making such procedures better known, they have served their main purpose.
In order to select the most suitable model from a set of models ('the best approximating model'), we need a quantitative measure that allows the quality of the different models to be compared. As noted above, error probabilities are not suitable. The coefficient of determination R2 (in its adjusted form, adj. R2) is adequate in some simple cases only. However, different methods of model selection have been developed which provide suitable measures. Among others, some are based on Bayesian statistics, others on mathematical information theory. The criteria derived from Bayesian statistics include CAIC, BIC, SIC, WIC and HQ; those derived from information theory include AIC, AICc, QAICc and TIC.
The information-theoretic approach using Akaike's information criterion (AIC) and related criteria seems to be a very accessible and 'user-friendly' one, not least because of its excellent presentation by Kenneth P. Burnham and David R. Anderson in their book Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach.
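For orientation, the two criteria used most often in this approach can be written as follows. The notation is an assumption made for this sketch and does not appear in the text above: L is the maximised likelihood of the model, K the number of estimated parameters, and n the sample size.

```latex
\begin{align}
\mathrm{AIC}  &= -2 \ln L(\hat{\theta} \mid \text{data}) + 2K \\
\mathrm{AICc} &= \mathrm{AIC} + \frac{2K(K+1)}{n - K - 1}
\end{align}
```

For least-squares fits with normally distributed errors, the first term reduces (up to an additive constant that is the same for all models) to n ln(RSS/n), where RSS is the residual sum of squares and K then also counts the residual variance. The model with the smallest value is the best approximating model within the set considered.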
Model selection procedures require precisely formulated models. It is therefore necessary to deal carefully with the objects of research and the questions to be answered. The models are derived from the research hypotheses, which should help to explain how the different measured variables act; each model should express a postulated relationship. Such models often become complex and differ in their structure, resulting in 'non-nested' models, e.g. y = a + b*log(x) and y = a(x/(b + x)). Traditional procedures like regression analysis can be used only to compare models derived from each other (nested models, such as y = ax1 + bx2 + c and y = ax1 + c). For the quantitative comparison of models of different structure we need model selection procedures like the above-mentioned ones.
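As an illustration, the two non-nested example models can be compared with AICc in a few lines of Python. This is only a minimal sketch under stated assumptions. The data are invented for the example. Both models are fitted by ordinary least squares with normal errors assumed, so that AICc = n ln(RSS/n) + 2K + 2K(K+1)/(n-K-1), with K counting the residual variance as one parameter. The saturation model is fitted by a coarse grid search over b, which stands in for a proper nonlinear optimiser. All function and variable names are hypothetical.

```python
import math

# Invented illustrative data, not taken from any study.
x = [0.5, 1, 2, 3, 4, 6, 8, 10, 15, 20]
y = [2.1, 3.18, 5.2, 5.9, 6.72, 7.3, 8.15, 8.28, 8.92, 8.99]
n = len(x)


def rss(pred):
    """Residual sum of squares between the observations y and predictions."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, pred))


def fit_log():
    """Least-squares fit of y = a + b*log(x); closed form, linear in a and b."""
    u = [math.log(xi) for xi in x]
    u_mean, y_mean = sum(u) / n, sum(y) / n
    sxy = sum((ui - u_mean) * (yi - y_mean) for ui, yi in zip(u, y))
    sxx = sum((ui - u_mean) ** 2 for ui in u)
    b = sxy / sxx
    a = y_mean - b * u_mean
    return rss([a + b * ui for ui in u])


def fit_saturation():
    """Least-squares fit of y = a*x/(b + x).

    For a fixed b the model is linear in a, so a has a closed form;
    b is found by a coarse grid search (an assumption of this sketch).
    """
    best = math.inf
    for k in range(1, 2001):
        b = 0.01 * k  # candidate values 0.01, 0.02, ..., 20.0
        g = [xi / (b + xi) for xi in x]
        a = sum(gi * yi for gi, yi in zip(g, y)) / sum(gi * gi for gi in g)
        best = min(best, rss([a * gi for gi in g]))
    return best


def aicc(rss_value, n_coef):
    """Small-sample AIC for a least-squares fit with n_coef coefficients."""
    k = n_coef + 1  # +1 for the estimated residual variance
    return n * math.log(rss_value / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)


scores = {"log": aicc(fit_log(), 2), "saturation": aicc(fit_saturation(), 2)}

# Differences to the best model and Akaike weights (relative support).
best_score = min(scores.values())
deltas = {m: s - best_score for m, s in scores.items()}
raw = {m: math.exp(-0.5 * d) for m, d in deltas.items()}
total = sum(raw.values())
weights = {m: r / total for m, r in raw.items()}

for m in scores:
    print(f"{m:10s} AICc={scores[m]:8.3f} delta={deltas[m]:7.3f} weight={weights[m]:.3f}")
```

The absolute AICc values carry no meaning on their own; only the differences between models fitted to the same data do. The Akaike weights computed at the end are an addition taken from the information-theoretic literature and can be read as the relative support for each model within the set.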