Nonlinear Embedding and Integration of Omics Data: A Fast and Tuning-Free Approach
Speaker(s): Tianwei Yu (CUHK-Shenzhen)
Time: 15:00-16:00, April 11, 2025
Venue: Room 77201, Jingchunyuan 78, BICMR
Abstract:
The rapid progress of single-cell technology has facilitated cost-effective acquisition of diverse omics data, allowing biologists to unravel the complexities of cell populations, disease states, and more. Additionally, single-cell multi-omics technologies have opened new avenues for studying biological interactions. However, the high dimensionality and sparsity of omics data present significant analytical challenges. Dimension reduction (DR) techniques are therefore essential for analyzing such complex data, yet many existing methods have inherent limitations. Linear methods like PCA struggle to capture intricate associations within data. In response, nonlinear techniques have emerged, but they may face scalability issues, be restricted to single-omics data, or prioritize visualization over generating informative embeddings. Here, we introduce DCOL (Dissimilarity based on Conditional Ordered List) correlation, a novel measure for quantifying nonlinear relationships between variables. Based on this measure, we propose DCOL-PCA and DCOL-CCA for dimension reduction and integration of single- and multi-omics data. In simulations, our methods outperformed eight DR methods and four joint dimension reduction methods, demonstrating stable performance across various settings. We also validated these methods on real datasets, where they detected intricate signals within and between omics data and generated lower-dimensional embeddings that preserve the essential information and latent structures.
Biography:
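Tianwei Yu is a Professor in the School of Data Science at The Chinese University of Hong Kong, Shenzhen. He graduated from the Department of Biology at Tsinghua University in 1997, received a master's degree in biochemistry and molecular biology from Tsinghua University in 2000, and received a Ph.D. in statistics from the University of California, Los Angeles in 2005. Before joining The Chinese University of Hong Kong, Shenzhen, Professor Yu was a tenured professor in the Department of Biostatistics and Bioinformatics at Emory University. His research focuses on bioinformatics, applied statistics, and applied machine learning; his research interests also include applications in metabolomics, pharmacogenomics, and systems biology. In his collaborative research, he works on environmental health, virology/vaccinology, nutrition, and cancer research.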