1. What is the biological interpretation of the parameter b0 in the gene expression transcription factor binding site regression model as formulated here?
2. Assume that you wanted to test all DNA patterns of length w = 6 for their association with gene expression. Explain how you could set up the FDR correction for this problem.
3. How many parameters are estimated in the linear regression model with 216 transcription factors?
4. It took 2 minutes for my laptop to fit (using R) all 216 multiple regression models I showed in the figures. Assuming each model takes about the same amount of time to fit, how long would it take to find the best possible model in this example?
5. PBMs (protein binding microarrays) measure the binding affinity of DNA-binding proteins to every possible DNA sequence (up to length 10). Formulate the prediction of DNA-binding affinity as a multiple regression problem (what is Y? What is X?).
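As a starting point for exercise 2, the following is a minimal sketch of the bookkeeping involved, not a worked answer. It assumes that the FDR correction meant here is the Benjamini-Hochberg step-up procedure and that each strand-specific pattern counts as a separate test (reverse complements are not merged). The function names `all_kmers` and `benjamini_hochberg` are hypothetical helpers introduced for illustration, and the p-values in the usage line are placeholders rather than results from any data set.

```python
from itertools import product


def all_kmers(w, alphabet="ACGT"):
    """Enumerate every DNA pattern of length w (4**w patterns in total)."""
    return ["".join(p) for p in product(alphabet, repeat=w)]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the indices of the tests rejected at FDR level alpha.

    Assumption: the Benjamini-Hochberg step-up rule. Sort the m p-values,
    find the largest rank k with p_(k) <= (k / m) * alpha, and reject the
    k smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            k_max = rank
    return sorted(order[:k_max])


kmers = all_kmers(6)
n_tests = len(kmers)  # one association test per pattern

# Placeholder p-values, for illustration only (not real results).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(example_p, alpha=0.05)
```

In a real analysis, the list passed to `benjamini_hochberg` would hold one p-value per pattern in `kmers`, so the number of tests m in the correction equals `n_tests`.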
REFERENCES AND FURTHER READING
Brem RB, Kruglyak L. (2005). The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc. Natl. Acad. Sci. USA 102(5):1572-1577.
Bussemaker HJ, Li H, Siggia ED. (2001). Regulatory element detection using correlation with expression. Nat. Genet. 27(2):167-171.
Coghlan A, Wolfe KH. (2000). Relationship of codon bias to mRNA concentration and protein length in Saccharomyces cerevisiae. Yeast 16(12):1131-1145.
De Boer CG, Hughes TR. (2012). YeTFaSCo: A database of evaluated yeast transcription factor sequence specificities. Nucleic Acids Res. 40(Database issue):D169-D179.
Drummond DA, Raval A, Wilke CO. (2006). A single determinant dominates the rate of yeast protein evolution. Mol. Biol. Evol. 23(2):327-337.
Ogawa N, DeRisi J, Brown PO. (2000). New components of a system for phosphate accumulation and polyphosphate metabolism in Saccharomyces cerevisiae revealed by genomic expression analysis. Mol. Biol. Cell 11(12):4309-4321.
Wall DP, Hirsh AE, Fraser HB, Kumm J, Giaever G, Eisen MB, Feldman MW. (2005). Functional genomic analysis of the rates of protein evolution. Proc. Natl. Acad. Sci. USA 102(15):5483-5488.