ICIBM 2016

Ken Chen, PhD

The University of Texas MD Anderson Cancer Center
Friday, December 9, 2016
11:50 am - 12:10 pm


Dr. Chen is an Associate Professor in the Department of Bioinformatics and Computational Biology and the Director of Bioinformatics of the Khalifa Institute of Personalized Cancer Therapy at the University of Texas MD Anderson Cancer Center. He received his B.E. from Tsinghua University (Beijing) and his Ph.D. from the University of Illinois at Urbana-Champaign, and completed postdoctoral training at the University of California at San Diego. From 2005 to 2011, he worked at Washington University School of Medicine in St. Louis as a senior scientist and research faculty member. With a background in machine learning, statistical signal processing, and cancer genomics, he aims to develop computational algorithms that analyze and interpret human genomic and clinical data towards the realization of genomic medicine. Dr. Chen has designed, developed, and co-developed a set of computational tools, such as BreakDancer, TIGRA, CREST, PolyScan, SomaticSniper, and VarScan, that have been widely applied to characterize individual and population genomes in large-scale next-generation sequencing projects such as The Cancer Genome Atlas (TCGA) and the 1000 Genomes Project. He is particularly interested in comprehensively and accurately reconstructing the genomes and transcriptomes of cancer cell populations in order to understand the heterogeneity and evolution of cancer as a consequence of genetics and environment. He is also interested in developing integrative approaches to identify biomarkers that are useful for diagnosis and prognosis. More information about his research group is available here.

Tumor phylogeny inference from single-cell DNA sequencing data

Tumor phylogenies can greatly benefit personalized cancer diagnostics and therapy by providing valuable insights into intra-tumor heterogeneity. Although single-cell DNA sequencing (SCS) provides high-resolution data for reconstructing tumor phylogenies, existing methods are limited in two ways: 1) they do not properly account for technological artifacts specific to single-cell DNA sequencing data, and 2) they operate under the infinite-sites model, which is often violated in cancer due to chromosomal rearrangements. We set out to tackle these computational challenges by developing a set of new algorithms. First, we developed Monovar, a novel statistical algorithm for discovering and genotyping SNVs from SCS data. Monovar uses a population-based approach that accounts for nonuniform coverage and technical artifacts in SCS data such as allele drop-out, deamination, and other amplification errors. It significantly improves SNV calling results compared with standard approaches such as GATK. Second, we developed SiFit, a likelihood-based stochastic search algorithm that reconstructs tumor phylogenies under a finite-sites model of evolution from noisy single-cell mutational profiles and estimates the error rates of SCS experiments. Simulation studies demonstrate SiFit’s superior performance over competing methods. Application of SiFit to three experimental SCS datasets of human tumors shows improved inference of clonal lineages and broad applicability to a variety of data types. Finally, we developed novoBreak, a novel k-mer targeted breakpoint assembly algorithm that is likely to be sensitive in detecting chromosomal rearrangement breakpoints and small indels in SCS data. Taken together, the suite of new algorithms that we have developed establishes a foundation for applying single-cell DNA sequencing technology in cancer research and medicine.
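
As a rough illustration of how a likelihood-based method can score noisy single-cell mutational profiles, the sketch below computes the log-likelihood of an observed binary mutation matrix given a candidate matrix of true genotypes. It assumes independent errors with a single false positive rate and a single false negative rate (the latter standing in for artifacts such as allele drop-out). The function name, parameter names, and example values are hypothetical, and this is not SiFit's actual implementation.

```python
import math


def log_likelihood(observed, true_genotypes, fp_rate, fn_rate):
    """Log-likelihood of observed calls given candidate true genotypes.

    observed, true_genotypes: lists of rows (cells) of 0/1 values (sites).
    Entries of `observed` may be None to mark missing data, which is skipped.
    fp_rate: assumed probability of observing 1 when the true genotype is 0.
    fn_rate: assumed probability of observing 0 when the true genotype is 1.
    """
    total = 0.0
    for obs_row, true_row in zip(observed, true_genotypes):
        for call, genotype in zip(obs_row, true_row):
            if call is None:
                continue  # missing entry contributes nothing
            if genotype == 0:
                prob = fp_rate if call == 1 else 1.0 - fp_rate
            else:
                prob = fn_rate if call == 0 else 1.0 - fn_rate
            total += math.log(prob)
    return total


# Hypothetical toy data: 2 cells x 2 sites, one missing call.
observed = [[1, 0], [0, None]]
candidate = [[1, 0], [1, 1]]
score = log_likelihood(observed, candidate, fp_rate=0.01, fn_rate=0.2)
```

In a search over candidate phylogenies, a score of this kind would be recomputed for the genotypes implied by each proposed tree and set of error rates, with higher values indicating a better fit to the observed calls.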