I've been stumped for quite a while trying to decide what the criteria really are for when one should use AIC vs BIC. Burnham and Anderson (B&A) talk about it quite a bit, but they are such staunch AIC partisans that it took me a while to come around to their point of view. The main reason I would have preferred BIC is that, if you look at the derivations, BIC approximates the log of the marginal likelihood for a large dataset with an uninformative prior, while AIC approximates the same thing but with a very strong prior (see pp. 212-213 of the book, or Kass and Raftery 1995). From this point of view, BIC seems more sensible.
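For concreteness, here is a minimal sketch of the two criteria in their usual textbook forms, AIC = -2 log L + 2k and BIC = -2 log L + k log(n). The function names and the example numbers are mine, purely for illustration; k is the number of estimated parameters and n the number of observations.

```python
import math


def aic(log_lik, k):
    """Akaike Information Criterion: -2 log L + 2k."""
    return -2.0 * log_lik + 2.0 * k


def bic(log_lik, k, n):
    """Bayesian (Schwarz) Information Criterion: -2 log L + k log(n)."""
    return -2.0 * log_lik + k * math.log(n)


# Hypothetical fit: maximized log-likelihood of -100 with 3 parameters
# estimated from 50 observations (made-up numbers, not from any dataset).
log_lik, k, n = -100.0, 3, 50
print("AIC:", aic(log_lik, k))
print("BIC:", bic(log_lik, k, n))
```

The only difference is the penalty per parameter: 2 for AIC versus log(n) for BIC. Since log(n) exceeds 2 once n is 8 or more, BIC penalizes extra parameters more heavily than AIC for all but the tiniest datasets.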

On the other hand, B&A make a compelling argument that BIC was developed to identify the "dimension" or "true number of parameters" of a model, and that this is rarely sensible in ecological modeling contexts because of what B&A call tapering effects. That is, if you have a large number of predictor variables, some of which have non-zero effect sizes (e.g. regression coefficients) and some of which have zero coefficients, BIC is trying to tell you how many are non-zero. B&A point out that it is more common for there to be a few coefficients with large magnitude, more with smaller magnitude, even more with tiny magnitudes, and so on, so that all of the predictor variables really have some effect on the response, albeit a very small one. What we should be trying to do, they say, is identify how many parameters are useful for prediction rather than how many are non-zero. (This also agrees in general with the Bayesian argument against point null hypotheses, i.e. that parameters are never exactly zero. That is somewhat ironic, since it suggests that Bayesians would prefer AIC over the "Bayesian" Information Criterion.)
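The tapering-effects situation can be sketched with a toy simulation. This is my own simplified setup, not anything from B&A: I assume orthonormalized predictors and a known noise variance of 1, so each estimated coefficient reduces to a single noisy number z_j, and adding predictor j to a nested model lowers the deviance by z_j squared. The true coefficients halve at each step and none is exactly zero. The function name select_size and all numbers are hypothetical.

```python
import math
import random


def select_size(z, penalty_per_param):
    """Return the nested-model size k minimizing deviance + penalty * k.

    z[j] is the standardized estimate for predictor j; the deviance of the
    model with the first k predictors is the sum of z[j]**2 over j >= k.
    """
    total = sum(v * v for v in z)
    best_k, best_score = 0, total  # k = 0 is the intercept-free null model
    explained = 0.0
    for k, v in enumerate(z, start=1):
        explained += v * v
        score = (total - explained) + penalty_per_param * k
        if score < best_score:
            best_k, best_score = k, score
    return best_k


random.seed(1)
n = 200
# Tapering true effects: 0.8, 0.4, 0.2, ... (assumed values, none exactly zero).
betas = [0.8 / 2 ** j for j in range(10)]
# Standardized estimates: signal sqrt(n) * beta plus unit-variance noise.
z = [math.sqrt(n) * b + random.gauss(0.0, 1.0) for b in betas]

k_aic = select_size(z, 2.0)          # AIC charges 2 per parameter
k_bic = select_size(z, math.log(n))  # BIC charges log(n) per parameter
print("predictors kept by AIC:", k_aic)
print("predictors kept by BIC:", k_bic)
```

Because BIC's per-parameter penalty is larger whenever n is at least 8, it can never keep more predictors than AIC in this setup. Neither answer is the "true" number of non-zero coefficients, which here is all ten; the question is only how many are worth estimating at this sample size.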

The bottom line: I would say that AIC, rather than BIC, is generally the right choice for ecological questions, unless you're really trying to identify a specific number of components. (People do this in time-series analysis, to try to identify the number of time lags or interacting species, although I think they probably shouldn't, because the "tapering effects" argument applies just as much there.)

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License