- p. 7: I would now add Hanna Kokko's new book, "Modelling for Field Biologists and Other Interesting People" (CUP 2007, ISBN 0521538564), to my list of useful modeling/theory books
- pp. 11-12: Aaron Ellison wrote to ask:
if I set up a 2x2 table that looks like Table 1.2:
     26  25
    184 706
and run fisher.test() on it, I get a P-value of 5.26 x 10^-6 (which you
do too). But there's a bit of ambiguity in your text. On p. 12, you
emphasize the importance of not fixating on p-values, and go on to say
that fisher.test yields 95% CIs for the difference between rates (with a
footnote about odds ratios). But fisher.test yields a 95% CI of (2.2,
7.4), not (2.1, 6.1) as you report. That's because, as you note, R outputs
odds ratios (and presumably a 95% CI on the odds ratio), and you turn this
into a probability ratio by applying your OR.to.PR function to the 95% CI
that fisher.test yields.
OR.to.PR <- function(OR, p1) {
    O1 <- p1/(1-p1)   ## baseline odds from baseline probability p1
    O2 <- O1*OR       ## apply the odds ratio
    p2 <- O2/(1+O2)   ## convert odds back to a probability
    p2/p1             ## probability (risk) ratio
}
pr.conf <- OR.to.PR(F1$conf.int, p.mle[2])  ## F1 is the fisher.test result
Why not just stick with the 95% ci on the odds ratio? What gain comes from
transforming it into a probability ratio?
My answer: I had basically decided I didn't want to get into explaining odds ratios at that point (I was attempting a minimal, stripped-down example), and figured that probability ratios would be more intuitive. However, probability ratios don't have nearly the same set of nice statistical properties that odds ratios do (such as symmetry); that's why statisticians traditionally use odds ratios. I mostly wasn't thinking at that point of people trying to follow along in detail; I did have to jump through a number of hoops (documented sufficiently in the .Rnw file, I hope) in order to get the probability-ratio CIs …
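For readers who do want to follow along, here is a minimal sketch of the whole calculation, plus a quick check of the symmetry point. The orientation of the table (26 and 25 successes out of 210 and 731 trials) is my reading of the 2x2 table above, and p.mle here is just the raw success proportions; both are assumptions on my part, not the book's exact code.

```r
## 2x2 table as quoted above (assumed orientation: columns are groups,
## row 1 = successes, row 2 = failures)
tab <- matrix(c(26, 184, 25, 706), nrow = 2)
F1 <- fisher.test(tab)
F1$p.value    ## about 5.3e-6
F1$conf.int   ## 95% CI on the odds ratio, roughly (2.2, 7.4)

## convert an odds ratio (or a vector of them, e.g. CI endpoints)
## to a probability ratio, given the baseline probability p1
OR.to.PR <- function(OR, p1) {
    O1 <- p1/(1-p1)   ## baseline odds
    O2 <- O1*OR       ## apply the odds ratio
    p2 <- O2/(1+O2)   ## back to a probability
    p2/p1             ## probability ratio
}

## raw success proportions; group 2 taken as the baseline
p.mle <- c(26/(26+184), 25/(25+706))
pr.conf <- OR.to.PR(F1$conf.int, p.mle[2])  ## roughly (2.1, 6.1)

## symmetry check: relabeling success <-> failure exactly inverts
## the odds ratio, but does NOT simply invert the probability ratio
p1 <- p.mle[2]; p2 <- p.mle[1]
OR <- (p2/(1-p2))/(p1/(1-p1))
PR <- p2/p1
OR.swap <- ((1-p2)/p2)/((1-p1)/p1)
PR.swap <- (1-p2)/(1-p1)
isTRUE(all.equal(OR.swap, 1/OR))  ## TRUE
isTRUE(all.equal(PR.swap, 1/PR))  ## FALSE
```

This reproduces both intervals quoted in the email, and the last two lines illustrate why odds ratios are the statistically better-behaved quantity.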
- Another thought: I actually probably shouldn't have used Fisher's exact probability test for this anyway (see [http://www.stat.columbia.edu/~cook/movabletype/archives/2009/05/i_hate_the_so-c.html] section 3.3 — more on this later)
- p. 23: it is always helpful to stay with the latest version of R (2.7.0 as I write this). The people on the R list will complain if you're even a few months out of date. On the other hand, if you are running critical analyses (e.g. finishing up your thesis), you may be safer sticking with an old version. In any case, always record the version of R you used for a particular analysis (use sessionInfo() to find out the precise version), and list it in your citations, along with the versions of any packages you use.
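- A quick sketch of what recording the session looks like (the output will of course vary by machine, date, and loaded packages):

```r
## Record exactly which software versions were used for an analysis
sessionInfo()      ## R version, platform, attached packages and their versions
R.version.string   ## just the R version as a single string
citation()         ## the citation R itself asks you to use
citation("MASS")   ## citation for a specific package (MASS as an example)
```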
- other helpful hints: Tom Short's reference card, Firefox toolbar (need to add links for these/move to "software")
On to notes on chapter 2