Recent Forum Posts

Sorry, didn't see your post for a while.

  • I agree with your first paragraph completely: BIC will never choose a more complex model than AIC if n>7. Did I get something backwards in my explanation and say otherwise?
  • I agree that stepwise approaches are an occasionally necessary evil, but shouldn't be used wherever something better (exhaustive model comparison for relatively small model spaces, some smarter form of sampling of model space for larger model spaces) is possible.
  • DIC can account for explicitly stated informative priors (a parameter with a strong prior counts as <1 effective parameter)
  • Yes.
response to Radisich by bbolker, 05 Jan 2012 22:14

(restoring accidentally deleted post)

I discuss notched boxplots above; ggplot2 doesn't have such an option at present (I think). I also discuss violin plots — I don't know of something specifically called a "violin boxplot". Can you say more specifically what you were looking for?

notches/violins by bbolker, 04 Aug 2011 17:07
Bruce Lee (guest) 06 Jul 2011 13:12
in discussion Hidden / Per page discussions » Dynamite plots: unmitigated evil?

What you discuss is the basic boxplot. Do you think it would be better to add some words on the notched boxplot and the violin boxplot?

by Bruce Lee (guest), 06 Jul 2011 13:12
Peter Radisich (guest) 13 May 2011 15:46
in discussion Hidden / Per page discussions » AIC vs BIC

I find it a bit difficult to see how BIC could find a greater number of "tiny" effects without AIC also finding these "tiny" effects (and more). If the sample size is greater than 7 (which is always the case: who would use a "non-informative" measure like AIC or BIC with only 7 or fewer data points?), then BIC will always favor a model at least as simple as the one AIC favors. BIC can never find a "tiny" effect that AIC does not also find, if the search is done over the same set of models. That is, if you calculate AIC and BIC for a fixed set of models, AIC can never prefer a model of lower dimensionality than BIC unless the sample size is below 8, simply because the dimension penalty (2 per parameter for AIC, ln(n) per parameter for BIC, and ln(n) > 2 once n ≥ 8) is the only difference between how they rank models.
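The penalty argument above can be checked numerically. The following is a minimal sketch, not part of the original post: the log-likelihoods, parameter counts, and sample size are made-up illustrative values, and the function names are hypothetical.

```python
import math

def aic(log_lik, k):
    """AIC = -2 log L + 2k."""
    return -2.0 * log_lik + 2.0 * k

def bic(log_lik, k, n):
    """BIC = -2 log L + k ln(n)."""
    return -2.0 * log_lik + k * math.log(n)

# Smallest integer sample size at which ln(n) > 2, i.e. at which
# BIC charges more per parameter than AIC does.
threshold = next(n for n in range(1, 100) if math.log(n) > 2.0)

# Hypothetical nested models: (number of parameters, maximized log-likelihood).
models = [(1, -120.0), (2, -112.0), (3, -110.5), (4, -110.0)]
n = 50  # hypothetical sample size

# Pick the parameter count of the model that minimizes each criterion.
best_aic_k = min(models, key=lambda m: aic(m[1], m[0]))[0]
best_bic_k = min(models, key=lambda m: bic(m[1], m[0], n))[0]

print(threshold, best_aic_k, best_bic_k)
```

With these illustrative numbers the two criteria disagree, and the disagreement goes in the direction described above: over a fixed model set, the BIC choice is never more complex than the AIC choice once n ≥ 8.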

However, if you use AIC "stepwise" or "forward" or any other path-dependent model selection routine, then your final model may be different from the one reached by a BIC "stepwise" approach. I think this says more about stepwise and other path-dependent model selection routines than it does about AIC or BIC. If you do a "stepwise" AIC, then getting BIC for the same models searched requires only a simple adjustment of the sequence of AIC values: add k(ln(n) - 2) to the AIC of each model with k parameters (and similarly, in reverse, for a BIC stepwise search).
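The adjustment mentioned above follows from the two definitions: both criteria share the -2 log L term, so BIC = AIC + k(ln(n) - 2). A minimal sketch, assuming a hypothetical stepwise path recorded as (parameter count, AIC) pairs; the values and names are illustrative only.

```python
import math

def aic_to_bic(aic_value, k, n):
    """Convert an AIC value to the BIC of the same fitted model.

    Both criteria share the -2 log L term, so only the penalty changes:
    BIC = AIC - 2k + k ln(n) = AIC + k (ln(n) - 2).
    """
    return aic_value + k * (math.log(n) - 2.0)

# Hypothetical sequence of models visited by a stepwise AIC search:
# (number of parameters, AIC value).
aic_path = [(1, 242.0), (2, 228.0), (3, 227.0)]
n = 50  # hypothetical sample size

# BIC for the same models, with no refitting required.
bic_path = [(k, aic_to_bic(a, k, n)) for k, a in aic_path]

for k, b in bic_path:
    print(k, round(b, 3))
```

Note that this only re-scores the models the AIC search happened to visit; a stepwise search driven by BIC could follow a different path.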

One other thing to note is that neither BIC nor AIC is particularly good when you have prior information about parameters within a model, such as knowing that variable X has a large effect, variable Y has a small effect, variable Z has a tiny effect, and so on. Both methods basically assume that you don't know anything about the effect sizes and require them to be estimated from the data (if they did use such prior knowledge, there would be some place where you could put in that information, and there isn't).

Additionally, they are both approximate tools, so they will necessarily break down in particular problems. For BIC this will happen when you run into "identifiability" problems, when MLEs of parameters lie on or close to boundaries of the parameter space, or when likelihoods are multi-modal (this could quite easily happen in structural equation modeling because of the many latent variables, which perhaps explains BIC's behavior there). These conditions make the Laplace integral approximation perform poorly, and so BIC will also perform poorly here.

by Peter Radisich (guest), 13 May 2011 15:46

So nice to get a non-spam comment!

A box-percentile plot uses the empirical cumulative distribution function, while the violin plot uses a kernel density estimate (so it is basically a smoother version of the box-percentile plot).
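To make the distinction concrete, here is a rough sketch of how the half-width of each plot at a given height could be computed. It assumes the usual box-percentile construction (width proportional to the percentile below the median and to one minus the percentile above it) and a plain Gaussian kernel for the violin; the sample data, bandwidth, and function names are illustrative assumptions, not taken from any particular package.

```python
import math

def ecdf(data, x):
    """Empirical CDF: fraction of observations less than or equal to x."""
    return sum(1 for v in data if v <= x) / len(data)

def box_percentile_width(data, x):
    """Relative half-width of a box-percentile plot at height x.

    Assumed construction: width grows with the percentile up to the
    median, then shrinks as one minus the percentile.
    """
    p = ecdf(data, x)
    return min(p, 1.0 - p)

def violin_width(data, x, bandwidth):
    """Relative half-width of a violin plot at height x (Gaussian KDE)."""
    norm = len(data) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in data) / norm

sample = [1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 6.0]  # made-up data
for x in (1.0, 3.0, 6.0):
    print(x, box_percentile_width(sample, x), violin_width(sample, x, bandwidth=0.8))
```

The ECDF-based width changes in steps at each data point, whereas the kernel estimate varies smoothly with x, which is the sense in which the violin plot looks smoother.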

by bbolker, 05 May 2011 15:54
Raphael (guest) 05 May 2011 15:49
in discussion Hidden / Per page discussions » Dynamite plots: unmitigated evil?

I'm not sure how a box-percentile plot is different from a violin plot. Am I missing something?

by Raphael (guest), 05 May 2011 15:49
Trevor (guest) 22 Apr 2009 23:20
in discussion Hidden / Per page discussions » Dynamite plots: unmitigated evil?


Excellent book. Thank you for publishing it.

An alternative to the above is the beanplot, which I have found useful for comparing posteriors from Bayesian analyses. Beanplots are similar to the raindrop plots described by Barrowman and Myers (2001) but are available in a nice R package. I prefer them to violin or box plots, since bimodality can be hidden with those approaches.

I hope you find this useful to the debate.

Barrowman, N.J. and R.A. Myers. A new graphical display for meta-analysis: the Raindrop Plot. Am. Stat.

Beanplots by Trevor (guest), 22 Apr 2009 23:20

I've been working with structural equations a lot recently, and have quite often noted large discrepancies between the AIC and BIC values. Interestingly, the BIC often selects models that do not actually fit the data but happen to be simpler, particularly when the effect sizes of many coefficients are small. This has made me wary of its usage. The tapering effects argument makes a lot of sense, particularly when applied to ecological data.

notes on aic v. bic in sem by jebyrnes (guest), 04 Dec 2008 01:35
AIC revisited
bbolker 17 Oct 2008 20:30
in discussion Hidden / Per page discussions » AIC vs BIC

The BIC is consistent and was designed to identify the "true" dimensionality of an underlying model. The AIC is not consistent but has lower prediction error: "if the number of models of the same dimension does not grow very fast in dimension, the average squared error of the model selected by AIC is asymptotically equivalent to the minimum offered by the candidate models … There has been a debate between AIC and BIC in the literature, centring on the issue of whether the true model is finite-dimensional or infinite-dimensional. There seems to be a consensus that, for the former case, BIC should be preferred, and AIC should be chosen for the latter" (Yang 2005). Furthermore, Yang 2005 shows (apparently: I haven't tried to follow the technical details) that you can't have your cake and eat it too: you have to make a decision between consistency and minimizing prediction errors.

Given that consensus, I would say it usually makes more sense in ecology to treat infinite-dimensional models (or at least models of much higher dimension than the most complex of the models we try to fit) as the default case, and therefore to prefer AIC.

Yang, Yuhong. 2005. Can the strengths of AIC and BIC be shared? A conflict between model identification and regression estimation. Biometrika 92, no. 4 (December 1): 937-950. doi:10.1093/biomet/92.4.937.

AIC revisited by bbolker, 17 Oct 2008 20:30

The argument against the BIC above certainly makes sense, but I don't know of any theory that tells us that the AIC is any better in that respect. It tries to find the number of nonzero effects as well, though in a more liberal and upward-biased way… I don't know whether that works well as a justification for using the AIC…

If you want a theory that, instead of estimating an unknown and essentially unobservable underlying truth, gives you what you find "interesting" or "useful", you have to formalise your concept of "usefulness" as a new criterion. Particularly, don't expect to get anything objective or standard from it, because usefulness generally is not objective.

Christian Hennig

And the AIC? by Anonymous (128.40.24.x), 16 Sep 2008 14:30
Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License