cn.FARMS
cn.FARMS is a latent variable model for detecting copy number variations (CNVs) in microarray data. Previous CNV detection methods for microarrays overestimate both the number and the size of CNV regions and consequently suffer from a high false discovery rate (FDR). A high FDR means that many detected CNVs are false and therefore not associated with the disease, yet they still enter the correction for multiple testing and thereby decrease the study's discovery power. To control the FDR, we propose a probabilistic latent variable model, cn.FARMS, which is optimized by a Bayesian maximum a posteriori approach. cn.FARMS controls the FDR through the information gain of the posterior over the prior. The prior represents the null hypothesis of copy number 2 for all samples, from which the posterior can deviate only when the data contain strong and consistent signals. On HapMap data, cn.FARMS clearly outperformed the two most prevalent methods with respect to sensitivity and FDR.
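As a rough illustration of the information-gain idea, the following Python sketch computes the Kullback-Leibler divergence of a Gaussian posterior from a standard normal prior. It assumes a one-dimensional latent variable with prior N(0, 1) and posterior N(mu, sigma2), and it reads "information gain" as the KL divergence; the exact definition used for the I/NI call in the paper may differ. The function name and the example numbers are hypothetical and are not part of the cn.farms package.

```python
import math


def gaussian_information_gain(mu: float, sigma2: float) -> float:
    """KL divergence D(N(mu, sigma2) || N(0, 1)) in nats.

    Illustrative helper only (assumed 1-D Gaussian prior and posterior).
    """
    if sigma2 <= 0:
        raise ValueError("sigma2 must be positive")
    return 0.5 * (sigma2 + mu * mu - 1.0 - math.log(sigma2))


# Posterior identical to the prior: the data carried no signal.
no_signal = gaussian_information_gain(mu=0.0, sigma2=1.0)

# Posterior shifted and sharpened by a consistent signal (made-up numbers).
strong_signal = gaussian_information_gain(mu=1.5, sigma2=0.2)

print(no_signal, strong_signal)
```

Under these assumptions, a posterior that stays at the prior yields zero gain, while a posterior that moves away from the prior mean or becomes much narrower yields a larger gain. Thresholding such a quantity is one way to keep only regions where the data clearly contradict the null hypothesis.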
Our FARMS algorithm for summarizing gene expression array data can be found here.
Djork-Arné Clevert, Andreas Mitterecker, Andreas Mayr, Marianne Tuefferd, An De Bondt, Willem Talloen, Hinrich W.H. Göhlmann, and Sepp Hochreiter. cn.FARMS: a latent variable model to detect copy number variations in microarray data with a low false discovery rate. Nucleic Acids Research, 2011. doi:10.1093/nar/gkr197
Install the R package cn.farms directly from Bioconductor.
Paper, supplement and manual:
- cn.FARMS paper: Nucl. Acids Res. Advance Access (2.5 MB)
- proof_ini.pdf: Mathematical properties (theorems and proofs) of the informative/non-informative call (I/NI call) (15-12-2010, 180 kB)
- cn.farms.pdf: Software manual
Data used in our experiments:
- SNP6_CEU.zip: SNP6.0 cel-files of 60 HapMap CEU-founders (1.8 GB)
- 250K_CEU.zip: 250K_NSP cel-files of 60 HapMap CEU-founders (1.8 GB)
- Regions_Conrad.csv: CNV regions in 60 HapMap CEU-founders that were detected and verified by different biotechnologies (Conrad et al. 2010) (5 kB)
D. F. Conrad, D. Pinto, R. Redon, L. Feuk, O. Gokcumen, Y. Zhang, et al. Origins and functional impact of copy number variation in the human genome. Nature, 464(7289), 704-712 (2010).