DSL News and Highlights
- Dr. Hua Xu awarded two research grants. (1) An In-silico Method for Epidemiological Studies Using Electronic Medical Records; NCI/NIH; 1R01CA141307, Role: PI, from 09/03/2009 to 07/31/2013; The specific aim of this study is to develop an automated informatics approach to extract both fine-grained cancer findings and general clinical information from electronic medical records and use them to conduct cancer-related epidemiological studies. (2) An Informatics-based Approach to Pharmacogenetic Studies of Warfarin, 5 UL1 RR024975-KL2 Scholar Award, Funded by Vanderbilt CTSA; Role: PI, from 07/01/2009 to 06/30/2012; This project aims to develop informatics approaches to extract phenotypic data for pharmacogenomics research from EHRs, using natural language processing and machine learning technologies.
- Open post-doc position in DSL. The Discovery Systems Lab has an open post-doc position in the area of machine learning and knowledge discovery. The lab's focus is on learning from data: (1) Development and refinement of algorithms for knowledge discovery (including discovering cause and effect relationships) and (2) Application of knowledge discovery methods to various types of biomedical data including coded data, high-throughput data, text and images. Candidates with an earned PhD in computer science, information science, biomedical informatics or a closely related area with research interests in machine learning and knowledge discovery are encouraged to contact Subramani Mani by email (subramani.mani at vanderbilt.edu). The DSL is part of the Department of Biomedical Informatics at Vanderbilt University, located in Nashville, TN, USA.
- JMLR Special Topic on Causality. Pre-announcement. JMLR will host in 2007 a special topic with papers dedicated to Causality. Greg Cooper, Andre Elisseeff, Peter Spirtes, Isabelle Guyon, and Constantin Aliferis are guest editors.
- Invited Talk at University of Pennsylvania, Center for Bioinformatics (3-14-2007): "Methodological rigor of 'omics' data analysis: efficiency, stability, guidelines, and sensationalism" (by Constantin Aliferis). In this talk we will provide a cautionary account of deficiencies in data analysis methods used in the broad field of biomedical research with high-throughput molecular data and outline paths toward improved methodological rigor in this nascent but extraordinarily important field.
- Causal Discovery Challenge. Isabelle Guyon, Andre Elisseeff, Greg Cooper, Peter Spirtes, and Constantin Aliferis, along with several distinguished international advisors, are co-organizing this Causal Discovery Challenge. The Challenge will involve several causal discovery tasks and datasets from a variety of real-life problem domains and will take place in 2007-2008. The PASCAL network is sponsoring this challenge, and an NSF sponsorship application covering the challenge and a Causal Discovery Workbench is pending. Visit us again for more news about this exciting challenge.
- Forthcoming chapter detailing the link between causation and feature selection: "Causal Feature Selection". I. Guyon, C.F. Aliferis, A. Elisseeff. In: Computational Methods of Feature Selection, H. Liu and H. Motoda (Eds). Chapman and Hall (to appear). An extended technical report can be found under this link. This chapter complements the paper "Towards Principled Feature Selection: Relevance, Filters, and Wrappers" by Ioannis Tsamardinos and Constantin Aliferis (in AI & Stats 2003) and is written to be much more accessible to the non-specialist.
- NIPS 2006 Workshop on Causality and Feature Selection. This workshop, the first of its kind, explored the intersection of causal discovery and feature selection. The quality and volume of the presented work, along with high attendance and high-quality discussion, made this an interesting and productive meeting.
- NIPS 2006 Causality and Feature Selection Workshop paper: "Using SVM Weight-Based Methods to Identify Causally Relevant and Non-Causally Relevant Variables" by A. Statnikov, D. Hardin, and C.F. Aliferis. In this paper we show that SVM weight-based feature selection does not reveal the causally important variables (although of course highly predictive features are indeed identified).
- Research Highlight (forthcoming paper): "Understanding the role of environment, genetics and data analysis pitfalls in genome-wide association studies: An esophageal cancer case-study". A. Statnikov, C. Li, and C.F. Aliferis. Where we analyze a case study in which SNP arrays may not convey as much predictive information as originally claimed, due to data analysis limitations.
- Research Highlight (forthcoming paper): "Local regulatory-network inducing algorithms for biomarker discovery from mass-throughput datasets". C.F. Aliferis, A. Statnikov et al. Where we provide the most comprehensive analysis of biomarker selection algorithms for genomics and proteomics. We show how novel local regulatory network induction algorithms can outperform biomarker discovery methods that are strictly geared toward predictivity. This study, 4 years in the making, strives to clarify many thorny issues in translational bioinformatics. More details will follow soon. Preliminary results were presented in the following Invited Talk: "Pathway Induction and High-Fidelity Simulation for Molecular Signature and Biomarker Discovery in Lung Cancer Using Microarray Gene Expression Data" by C.F. Aliferis at APS Conference: Physiological Genomics and Proteomics of Lung Disease, 2006.
- Research Highlight (forthcoming paper): "Gene Expression Microarrays Do Predict Clinical Outcomes". C.F. Aliferis, A. Statnikov, I. Tsamardinos, J. Schildcrout, B. Shepherd, F. Harrell Jr. Where we thoroughly refute the claims by Michiels et al. and Ioannidis in The Lancet 2005 that microarray datasets do not have statistically significant signal for clinical outcome prediction.
- Research Highlight (forthcoming paper): "The Problem of Statistical Gene Instability in Microarray Studies: External Reproducibility and Biological Importance of Unstable Genes and their Molecular Signatures". C.F. Aliferis, A. Statnikov, S. Pratap, E. Kokkotou. Where we show that instability does not preclude good generalization performance and should not be trusted as a criterion for finding biologically important genes.
- New paper to appear in MedInfo 2007: "Text Categorization Models for Identifying Unproven Cancer Treatments on the Web" by Y. Aphinyanaphongs and C.F. Aliferis. In this exciting work we present the first validated computer model that identifies web sites promoting unproven and dangerous cancer treatments.
- New paper to appear in MedInfo 2007: "A Comparison of Impact Factor, Clinical Query Filters, and Pattern Recognition Query Filters in Terms of Sensitivity to Topic" by L. Fu, L. Wang, Y. Aphinyanaphongs, and C.F. Aliferis.
- New paper to appear in MedInfo 2007: "Learning Causal and Predictive Clinical Practice Guidelines from Data". S. Mani, C.F. Aliferis, S. Krishnaswami, T. Kotchen.
- New paper to appear in MedInfo 2007: "Comparing Decision Support Methodologies for Identifying Asthma Exacerbations" by J.W. Dexheimer, L. Brown et al.
- Omics data analysis paper in Cancer Informatics 2006: "Challenges in the Analysis of Mass-Throughput Data: A Technical Commentary from the Statistical Machine Learning Perspective" by C.F. Aliferis, A. Statnikov, and I. Tsamardinos. Where we summarize several vexing problems in analysis of omics data and propose methodologies for overcoming them.
- Weekly Digest: Latest papers from major journals related to Biomedical Informatics. This automatically updated digest may save you a lot of time and effort in your reading of the literature.