1. In Chapter 2, I did a test on a gene list (the predicted substrates of Mec1). I did not correct for multiple testing. Why was this justified? What was the difference between that and the example gene list in this chapter?
2. I taught a graduate course where both professors had the same last initial: Prof. M and Prof. M. Were students in our class witness to something highly unlikely? (Hint: Assume that last initials are uniformly distributed and correct for multiple testing.) In fact, last names are not uniformly distributed. What effect is this expected to have on the probability?
3. A classic probability problem is the “birthday problem” where students in a class are surprised to learn that two of them either have the same birthday, or birthdays within a few days. Explain how the “surprise” in the birthday problem is really due to a failure to account for multiple testing.
4. Assume that you were trying to do an eQTL study using expression of all the genes in the ImmGen dataset on all the 1.5 million SNPs in the phase 3 HapMap. What would be the multiple testing correction?
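As a starting point for exercise 3, the arithmetic can be checked numerically. The following is a minimal sketch, assuming 365 equally likely birthdays and exact-match birthdays only; the function names `n_pairs` and `p_shared_birthday` are illustrative and not taken from the chapter. It counts the pairwise comparisons a class implicitly makes and computes the probability that at least one pair matches.

```python
from math import comb


def n_pairs(n_people: int) -> int:
    """Number of distinct pairs of people, i.e. implicit comparisons."""
    return comb(n_people, 2)


def p_shared_birthday(n_people: int, n_days: int = 365) -> float:
    """Probability that at least two people share a birthday.

    Assumes birthdays are independent and uniform over n_days
    (an assumption; real birthdays are not exactly uniform).
    """
    p_all_distinct = 1.0
    for i in range(n_people):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (n_days - i) / n_days
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    for n in (10, 23, 40):
        print(n, n_pairs(n), round(p_shared_birthday(n), 3))
```

For a class of 23, any single pair matches with probability 1/365, but there are 253 pairs, and the probability of at least one match is just over one half. The same counting applies to exercise 2 with 26 initials in place of 365 days.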
REFERENCES AND FURTHER READING
Benjamini Y, Hochberg Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Royal Stat. Soc. Ser. B 57:289-300.
Brem RB, Kruglyak L. (2005). The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc. Natl. Acad. Sci. USA 102(5):1572-1577.
Storey J. (2002). A direct approach to false discovery rates. J. Royal Stat. Soc. Ser. B 64:479-498.