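David R. Kelley and Johannes Linder of the US company Calico LLC, working with colleagues, have proposed predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation. The study was published online in Nature Genetics on January 8, 2025.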
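According to the authors, sequence-based machine-learning models trained on genomics data improve the interpretation of genetic variants by providing functional predictions that describe the variants' impact on the cis-regulatory code. However, because of modeling challenges, current tools cannot predict RNA-seq expression profiles.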
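The researchers introduce Borzoi, a model that learns to predict cell-type-specific and tissue-specific RNA-seq coverage from DNA sequence. Using statistics derived from Borzoi's predicted coverage, they isolate and accurately score the effects of DNA variants across multiple layers of regulation, including transcription, splicing and polyadenylation. When evaluated on quantitative trait loci, Borzoi is competitive with, and often outperforms, state-of-the-art models trained on individual regulatory functions.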
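To make the idea of scoring a variant from predicted coverage concrete, here is a minimal sketch. It is not Borzoi's actual scoring code: the function name, the toy coverage tracks, the choice of exon bins and the pseudocount are all assumptions made for illustration. It compares the summed coverage over a gene's exon bins between the reference and the alternate allele and reports a log2 fold change.

```python
import math


def expression_log_fold_change(ref_cov, alt_cov, exon_bins, pseudocount=1.0):
    """Hypothetical expression statistic: log2 ratio of summed predicted
    coverage over the given exon bins (alternate allele vs. reference)."""
    ref_sum = sum(ref_cov[i] for i in exon_bins)
    alt_sum = sum(alt_cov[i] for i in exon_bins)
    # The pseudocount keeps the ratio finite when coverage is zero.
    return math.log2((alt_sum + pseudocount) / (ref_sum + pseudocount))


# Toy coverage tracks (made-up numbers, one value per bin).
ref_cov = [0.0, 2.0, 8.0, 8.0, 2.0, 0.0]
alt_cov = [0.0, 1.0, 4.0, 4.0, 1.0, 0.0]
exon_bins = [2, 3]  # assumed bins overlapping the gene's exons

score = expression_log_fold_change(ref_cov, alt_cov, exon_bins)
print(f"log2 fold change (alt vs. ref): {score:.3f}")
```

A negative score means the alternate allele is predicted to lower coverage over the exons. Analogous statistics for splicing or polyadenylation would compare coverage at different positions, such as around a splice junction or a polyadenylation site; the details here are left as assumptions.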
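By applying attribution methods to the derived statistics, the researchers extracted cis-regulatory motifs that drive RNA expression and post-transcriptional regulation in normal tissues. RNA-seq data are widely available across species, conditions and assays that profile specific aspects of regulation, which underscores the potential of this approach to decipher the mapping from DNA sequence to regulatory function.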
Appendix: original English text
Title: Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation
Authors: Linder, Johannes; Srivastava, Divyanshi; Yuan, Han; Agarwal, Vikram; Kelley, David R.
Issue&Volume: 2025-01-08
Abstract: Sequence-based machine-learning models trained on genomics data improve genetic variant interpretation by providing functional predictions describing their impact on the cis-regulatory code. However, current tools do not predict RNA-seq expression profiles because of modeling challenges. Here, we introduce Borzoi, a model that learns to predict cell-type-specific and tissue-specific RNA-seq coverage from DNA sequence. Using statistics derived from Borzoi’s predicted coverage, we isolate and accurately score DNA variant effects across multiple layers of regulation, including transcription, splicing and polyadenylation. Evaluated on quantitative trait loci, Borzoi is competitive with and often outperforms state-of-the-art models trained on individual regulatory functions. By applying attribution methods to the derived statistics, we extract cis-regulatory motifs driving RNA expression and post-transcriptional regulation in normal tissues. The wide availability of RNA-seq data across species, conditions and assays profiling specific aspects of regulation emphasizes the potential of this approach to decipher the mapping from DNA sequence to regulatory function.
DOI: 10.1038/s41588-024-02053-6
Source: https://www.nature.com/articles/s41588-024-02053-6
Nature Genetics: founded in 1992 and published by Springer Nature. Latest impact factor: 41.307
Official website: https://www.nature.com/ng/
Submission link: https://mts-ng.nature.com/cgi-bin/main.plex