Jingyi Jessica Li

Professor, Statistics and Data Science, University of California Los Angeles

Professor, Human Genetics, University of California Los Angeles

Professor, Computational Medicine, University of California Los Angeles

Professor, Biostatistics, University of California Los Angeles

Jingyi Jessica Li, Professor of Statistics and Data Science (also affiliated with Biostatistics, Computational Medicine, and Human Genetics), leads the Junction of Statistics and Biology research group at UCLA. With a Ph.D. from UC Berkeley and a B.S. from Tsinghua University, Dr. Li focuses on developing interpretable statistical methods for biomedical data. Her research centers on quantifying the central dogma, extracting hidden information from transcriptomics data, and ensuring statistical rigor in data analysis by employing synthetic negative controls. A recipient of multiple awards, including the NSF CAREER Award, Sloan Research Fellowship, ISCB Overton Prize, and COPSS Emerging Leader Award, she has been recognized for her contributions to computational biology and statistics.


Transcriptomics, Data Science, Bioinformatics, Statistics, Genomics

Education and Training

University of California, Berkeley; Ph.D.; 05/2013; Biostatistics
Tsinghua University; B.S.; 07/2007; Biological Sciences

Awards and Honors

  • Emerging Leader Award, Committee of Presidents of Statistical Societies (COPSS), 2023.
  • Best Presentation Award, Cold Spring Harbor Asia Conferences, 2011.
  • Research Starter Award in Informatics, PhRMA Foundation, 2017.
  • Radcliffe Fellowship, Radcliffe Institute for Advanced Study at Harvard University, 2022-2023.
  • Faculty Research Grant / Trans-disciplinary Seed Grant, UCLA, 2016.
  • Junior Faculty Award, UCLA David Geffen School of Medicine W.M. Keck Foundation, 2019.
  • Faculty Career Development Award, UCLA, 2015.
  • Junior Researcher Paper Award, International Chinese Statistical Association (ICSA) China Conference, 2018.
  • Hellman Fellow, Hellman Foundation, 2015.
  • Sloan Research Fellowship, Alfred P. Sloan Foundation, 2018.
  • Distinguished Graduate of Class 2007, Tsinghua University, 2007.
  • International Dissertation Field Work Grant, Institute of International Studies, UC Berkeley, 2012.
  • Outstanding Graduate Student Instructor Award, UC Berkeley, 2010.
  • Overton Prize, International Society for Computational Biology (ISCB), 2023.
  • MIT Technology Review 35 Innovators Under 35 China, MIT Technology Review, 2020.
  • CAREER Award, National Science Foundation, 2019.
  • Women in STEM2D Math Scholar Award, Johnson & Johnson, 2018.


  1. Wang W, Cen Y, Lu Z, Xu Y, Sun T, Xiao Y, Liu W, Li JJ, Wang C. scCDC: a computational method for gene-specific contamination detection and correction in single-cell and single-nucleus RNA-seq data. Genome biology, 2024.
  2. Cui Y, Ye W, Li JS, Li JJ, Vilain E, Sallam T, Li W. A genome-wide spectrum of tandem repeat expansions in 338,963 humans. Cell, 2024.
  3. Xia L, Lee C, Li JJ. Statistical method scDEED for detecting dubious 2D single-cell embeddings and optimizing t-SNE and UMAP hyperparameters. Nature communications, 2024.
  4. Yan G, Song D, Li JJ. scReadSim: a single-cell RNA-seq and ATAC-seq read simulator. Nature communications, 2023.
  5. Zhang C, Zhang S, Li JJ. A Python Package itca for Information-Theoretic Classification Accuracy: A Criterion That Guides Data-Driven Combination of Ambiguous Outcome Labels in Multiclass Classification. Journal of computational biology : a journal of computational molecular cell biology, 2023.
  6. Patowary A, Zhang P, Jops C, Vuong CK, Ge X, Hou K, Kim M, Gong N, Margolis M, Vo D, Wang X, Liu C, Pasaniuc B, Li JJ, Gandal MJ, de la Torre-Ubieta L. Developmental isoform diversity in the human neocortex informs neuropsychiatric risk mechanisms. bioRxiv : the preprint server for biology, 2023.
  7. Li JJ. How the Monty Hall problem is similar to the false discovery rate in high-throughput data analysis. Nature biotechnology, 2023.
  8. Song D, Wang Q, Yan G, Liu T, Sun T, Li JJ. scDesign3 generates realistic in silico data for multimodal single-cell and spatial omics. Nature biotechnology, 2023.
  9. Yang L, Chen X, Lee C, Shi J, Lawrence EB, Zhang L, Li Y, Gao N, Jung SY, Creighton CJ, Li JJ, Cui Y, Arimura S, Lei Y, Li W, Shen L. Functional characterization of age-dependent p16 epimutation reveals biological drivers and therapeutic targets for colorectal cancer. Journal of experimental & clinical cancer research : CR, 2023.
  10. Wu Y, Jin M, Fernandez M, Hart KL, Liao A, Ge X, Fernandes SM, McDonald T, Chen Z, Röth D, Ghoda LY, Marcucci G, Kalkum M, Pillai RK, Danilov AV, Li JJ, Chen J, Brown JR, Rosen ST, Siddiqi T, Wang L. METTL3-Mediated m6A Modification Controls Splicing Factor Abundance and Contributes to Aggressive CLL. Blood cancer discovery, 2023.
  11. Zong W, Rahman T, Zhu L, Zeng X, Zhang Y, Zou J, Liu S, Ren Z, Li JJ, Sibille E, Lee AV, Oesterreich S, Ma T, Tseng GC. Transcriptomic congruence analysis for evaluating model organisms. Proceedings of the National Academy of Sciences of the United States of America, 2023.
  12. Zhou HJ, Li L, Li Y, Li W, Li JJ. PCA outperforms popular hidden variable inference methods for molecular QTL mapping. Genome biology, 2022.
  13. Say I, Chen YE, Sun MZ, Li JJ, Lu DC. Machine learning predicts improvement of functional outcomes in traumatic brain injury patients after inpatient rehabilitation. Frontiers in rehabilitation sciences, 2022.
  14. Cui EH, Song D, Wong WK, Li JJ. Single-cell generalized trend model (scGTM): a flexible and interpretable model of gene expression trend along cell pseudotime. Bioinformatics (Oxford, England), 2022.
  15. Song D, Xi NM, Li JJ, Wang L. scSampler: fast diversity-preserving subsampling of large-scale single-cell transcriptomic data. Bioinformatics (Oxford, England), 2022.
  16. Li Y, Ge X, Peng F, Li W, Li JJ. Exaggerated false positives by popular differential expression methods when analyzing human population samples. Genome biology, 2022.
  17. Eisen TJ, Li JJ, Bartel DP. The interplay between translational efficiency, poly(A) tails, microRNAs, and neuronal activation. RNA (New York, N.Y.), 2022.
  18. Jiang R, Sun T, Song D, Li JJ. Statistics or biology: the zero-inflation controversy about scRNA-seq data. Genome biology, 2022.
  19. Sun T, Song D, Li WV, Li JJ. Simulating Single-Cell Gene Expression Count Data with Preserved Gene Correlations by scDesign2. Journal of computational biology : a journal of computational molecular cell biology, 2022.
  20. Ge X, Chen YE, Song D, McDermott M, Woyshner K, Manousopoulou A, Wang N, Li W, Wang LD, Li JJ. Clipper: p-value-free FDR control on high-throughput data from two conditions. Genome biology, 2021.
  21. Guo Y, Xue Z, Yuan R, Li JJ, Pastor WA, Liu W. RAD: a web application to identify region associated differentially expressed genes. Bioinformatics (Oxford, England), 2021.
  22. Shi J, Xu J, Chen YE, Li JS, Cui Y, Shen L, Li JJ, Li W. The concurrence of DNA methylation and demethylation is associated with transcription regulation. Nature communications, 2021.
  23. Xi NM, Li JJ. Protocol for executing and benchmarking eight computational doublet-detection methods in single-cell RNA sequencing data analysis. STAR protocols, 2021.
  24. Song D, Li K, Hemminger Z, Wollman R, Li JJ. scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling. Bioinformatics (Oxford, England), 2021.
  25. Jiang R, Li WV, Li JJ. mbImpute: an accurate and robust imputation method for microbiome data. Genome biology, 2021.
  26. Wang N, Lefaudeux D, Mazumder A, Li JJ, Hoffmann A. Identifying the combinatorial control of signal-dependent transcription factors. PLoS computational biology, 2021.
  27. Sun MZ, Babayan D, Chen JS, Wang MM, Naik PK, Reitz K, Li JJ, Pouratian N, Kim W. Postoperative Admission of Adult Craniotomy Patients to the Neuroscience Ward Reduces Length of Stay and Cost. Neurosurgery, 2021.
  28. Sun YE, Zhou HJ, Li JJ. Bipartite tight spectral clustering (BiTSC) algorithm for identifying conserved gene co-clusters in two species. Bioinformatics (Oxford, England), 2021.
  29. Sun T, Song D, Li WV, Li JJ. scDesign2: a transparent simulator that generates high-fidelity single-cell gene expression count data with gene correlations captured. Genome biology, 2021.
  30. Li JJ, Chen YE, Tong X. A flexible model-free prediction-based framework for feature ranking. Journal of machine learning research : JMLR, 2021.
  31. Song D, Li JJ. PseudotimeDE: inference of differential gene expression along cell pseudotime with well-calibrated p-values from single-cell RNA sequencing data. Genome biology, 2021.
  32. Li JJ. A new bioinformatics tool to recover missing gene expression in single-cell RNA sequencing data. Journal of molecular cell biology, 2021.
  33. Wang YXR, Li L, Li JJ, Huang H. Network Modeling in Biology: Statistical Methods for Gene and Brain Networks. Statistical science : a review journal of the Institute of Mathematical Statistics, 2021.
  34. Xu J, Shi J, Cui X, Cui Y, Li JJ, Goel A, Chen X, Issa JP, Su J, Li W. Cellular Heterogeneity-Adjusted cLonal Methylation (CHALM) improves prediction of gene expression. Nature communications, 2021.
  35. Xi NM, Li JJ. Benchmarking Computational Doublet-Detection Methods for Single-Cell RNA Sequencing Data. Cell systems, 2020.
  36. Lyu J, Li JJ, Su J, Peng F, Chen YE, Ge X, Li W. DORGE: Discovery of Oncogenes and tumoR suppressor genes using Genetic and Epigenetic features. Science advances, 2020.
  37. Yu C, Zhang M, Song J, Zheng X, Xu G, Bao Y, Lan J, Luo D, Hu J, Li JJ, Shi H. Integrin-Src-YAP1 signaling mediates the melanoma acquired resistance to MAPK and PI3K/mTOR dual targeted therapy. Molecular biomedicine, 2020.
  38. Li JJ, Tong X. Statistical Hypothesis Testing versus Machine Learning Binary Classification: Distinctions and Guidelines. Patterns (New York, N.Y.), 2020.
  39. Li WV, Li S, Tong X, Deng L, Shi H, Li JJ. AIDE: annotation-assisted isoform discovery with high precision. Genome research, 2019.
  40. Li JJ, Chew GL, Biggin MD. Quantitative principles of cis-translational control by general mRNA sequence features in eukaryotes. Genome biology, 2019.
  41. Ge X, Zhang H, Xie L, Li WV, Kwon SB, Li JJ. EpiAlign: an alignment-based bioinformatic tool for comparing chromatin state sequences. Nucleic acids research, 2019.
  42. Li WV, Li JJ. A statistical simulator scDesign for rational scRNA-seq experimental design. Bioinformatics (Oxford, England), 2019.
  43. Duong D, Ahmad WU, Eskin E, Chang KW, Li JJ. Word and Sentence Embedding Tools to Measure Semantic Similarity of Gene Ontology Terms by Their Definitions. Journal of computational biology : a journal of computational molecular cell biology, 2018.
  44. Li WV, Li JJ. Modeling and analysis of RNA-seq data: a review from a statistical perspective. Quantitative biology (Beijing, China), 2018.
  45. Burke JE, Longhurst AD, Merkurjev D, Sales-Lee J, Rao B, Moresco JJ, Yates JR, Li JJ, Madhani HD. Spliceosome Profiling Visualizes Operations of a Dynamic RNP at Nucleotide Resolution. Cell, 2018.
  47. Li WV, Li JJ. An accurate and robust imputation method scImpute for single-cell RNA-seq data. Nature communications, 2018.
  48. Tong X, Feng Y, Li JJ. Neyman-Pearson classification algorithms and NP receiver operating characteristics. Science advances, 2018.
  49. Zhang Y, Harris CJ, Liu Q, Liu W, Ausin I, Long Y, Xiao L, Feng L, Chen X, Xie Y, Chen X, Zhan L, Feng S, Li JJ, Wang H, Zhai J, Jacobsen SE. Large-scale comparative epigenomics reveals hierarchical regulation of non-CG methylation in Arabidopsis. Proceedings of the National Academy of Sciences of the United States of America, 2018.
  50. Li JJ, Chew GL, Biggin MD. Quantitating translational control: mRNA abundance-dependent and independent contributions and the mRNA sequences that specify them. Nucleic acids research, 2017.
  51. Clifton SM, Kang C, Li JJ, Long Q, Shah N, Abrams DM. Hybrid Statistical and Mechanistic Mathematical Model Guides Mobile Health Intervention for Chronic Pain. Journal of computational biology : a journal of computational molecular cell biology, 2017.
  52. Gao R, Li JJ. Correspondence of D. melanogaster and C. elegans developmental stages revealed by alternative splicing characteristics of conserved exons. BMC genomics, 2017.
  53. Yang Y, Yang YT, Yuan J, Lu ZJ, Li JJ. Large-scale mapping of mammalian transcriptomes identifies conserved genes associated with different cell states. Nucleic acids research, 2017.
  54. Li WV, Chen Y, Li JJ. TROM: A Testing-Based Method for Finding Transcriptomic Similarity of Biological Samples. Statistics in biosciences, 2016.
  55. Li WV, Razaee ZS, Li JJ. Epigenome overlap measure (EPOM) for comparing tissue/cell types based on chromatin states. BMC genomics, 2016.
  56. Ye Y, Li JJ. NMFP: a non-negative matrix factorization based preselection method to increase accuracy of identifying mRNA isoforms from RNA-seq data. BMC genomics, 2016.
  57. Liu Z, Dai S, Bones J, Ray S, Cha S, Karger BL, Li JJ, Wilson L, Hinckle G, Rossomando A. A quantitative proteomic analysis of cellular responses to high glucose media in Chinese hamster ovary cells. Biotechnology progress, 2015.
  58. Li JJ, Biggin MD. Gene expression. Statistics requantitates the central dogma. Science (New York, N.Y.), 2015.
  59. Boyle AP, Araya CL, Brdlik C, Cayting P, Cheng C, Cheng Y, Gardner K, Hillier LW, Janette J, Jiang L, Kasper D, Kawli T, Kheradpour P, Kundaje A, Li JJ, Ma L, Niu W, Rehm EJ, Rozowsky J, Slattery M, Spokony R, Terrell R, Vafeados D, Wang D, Weisdepp P, Wu YC, Xie D, Yan KK, Feingold EA, Good PJ, Pazin MJ, Huang H, Bickel PJ, Brenner SE, Reinke V, Waterston RH, Gerstein M, White KP, Kellis M, Snyder M. Comparative analysis of regulatory information and circuits across distant species. Nature, 2014.
  60. Gerstein MB, Rozowsky J, Yan KK, Wang D, Cheng C, Brown JB, Davis CA, Hillier L, Sisu C, Li JJ, Pei B, Harmanci AO, Duff MO, Djebali S, Alexander RP, Alver BH, Auerbach R, Bell K, Bickel PJ, Boeck ME, Boley NP, Booth BW, Cherbas L, Cherbas P, Di C, Dobin A, Drenkow J, Ewing B, Fang G, Fastuca M, Feingold EA, Frankish A, Gao G, Good PJ, Guigó R, Hammonds A, Harrow J, Hoskins RA, Howald C, Hu L, Huang H, Hubbard TJ, Huynh C, Jha S, Kasper D, Kato M, Kaufman TC, Kitchen RR, Ladewig E, Lagarde J, Lai E, Leng J, Lu Z, MacCoss M, May G, McWhirter R, Merrihew G, Miller DM, Mortazavi A, Murad R, Oliver B, Olson S, Park PJ, Pazin MJ, Perrimon N, Pervouchine D, Reinke V, Reymond A, Robinson G, Samsonova A, Saunders GI, Schlesinger F, Sethi A, Slack FJ, Spencer WC, Stoiber MH, Strasbourger P, Tanzer A, Thompson OA, Wan KH, Wang G, Wang H, Watkins KL, Wen J, Wen K, Xue C, Yang L, Yip K, Zaleski C, Zhang Y, Zheng H, Brenner SE, Graveley BR, Celniker SE, Gingeras TR, Waterston R. Comparative analysis of the transcriptome across distant species. Nature, 2014.
  61. Li JJ, Huang H, Bickel PJ, Brenner SE. Comparison of D. melanogaster and C. elegans developmental stages, tissues, and cells by modENCODE RNA-seq data. Genome research, 2014.
  62. Li JJ, Bickel PJ, Biggin MD. System wide analyses have underestimated protein abundances and the importance of transcription in mammals. PeerJ, 2014.
  63. Fisher WW, Li JJ, Hammonds AS, Brown JB, Pfeiffer BD, Weiszmann R, MacArthur S, Thomas S, Stamatoyannopoulos JA, Eisen MB, Bickel PJ, Biggin MD, Celniker SE. DNA regions bound at low occupancy by transcription factors do not drive patterned reporter gene expression in Drosophila. Proceedings of the National Academy of Sciences of the United States of America, 2012.
  64. Gao Q, Ho C, Jia Y, Li JJ, Huang H. Biclustering of linear patterns in gene expression data. Journal of computational biology : a journal of computational molecular cell biology, 2012.
  65. Li JJ, Jiang CR, Brown JB, Huang H, Bickel PJ. Sparse linear modeling of next-generation mRNA sequencing (RNA-Seq) data for isoform discovery and abundance estimation. Proceedings of the National Academy of Sciences of the United States of America, 2011.