| EHR-based Pharmacogenomics | | | |
| This line of research develops informatics approaches for extracting phenotypic data (drug exposure and drug response) from EHRs for pharmacogenomics research.
It involves natural language processing, machine learning, and data mining technologies. We are currently working on extracting medication information from clinical notes and
modeling patients' drug exposure status from longitudinal EHR data.
We are collaborating with clinical teams to investigate the pharmacogenomics of multiple drugs, including warfarin, irinotecan, and tacrolimus. This work is funded by PGRN and VESPA grants. |
| Cancer epidemiologic studies | | | |
| The specific aim of this funded study is to develop an automated informatics approach that extracts both fine-grained cancer findings and general clinical information from electronic medical records and uses them to conduct
cancer-related epidemiological studies. |
|
This funded project is to develop a framework that can 1) recognize abbreviations in clinical text; 2) build sense inventories of clinical abbreviations;
3) disambiguate abbreviations based on context; and 4) encode abbreviations in real time to remove ambiguity at the time of entry.
|
| Basic Methods of NLP and Text Mining |
|
| |
|
We are interested in developing new algorithms and systems in the following NLP and text mining areas: grammar induction from clinical text, statistical parsing, and topic modeling using Latent Dirichlet Allocation.
|
| Literature mining of nutrition studies | | | |
| Nutrition plays an important role in disease prevention and treatment. This project extracts gene/nutrition/disease knowledge from PubMed articles in order to facilitate personalized nutrition. |