New Haven, Conn. — Each person has about 4 million sequence differences in their genome relative to the reference human genome. These differences are known as variants. A central goal in precision medicine is understanding which of these variants contribute to disease in a particular patient. Therefore, much of the human genome annotation effort is devoted to developing resources to help interpret the relative contribution of human variants to different observable phenotypes – i.e., determining variant impact.
Recently, Yale School of Medicine led a large NIH-sponsored study in which multiple institutions and international collaborators came together to address this challenge. The study generated a large, organized dataset from four individual donors, using high-quality genome sequencing to identify all of their variants and many different assays to determine the variants' effects on molecular phenotypes in 25 different tissues. Known as EN-TEx, the resource is an important step toward the future of personalized care. The team published its findings in Cell on March 30.
“Our work helps provide a better annotation of the genome and a better understanding of variant impact,” says Mark Gerstein, PhD, Albert Williams Professor of Biomedical Informatics and member of the new Yale Section of Biomedical Informatics & Data Science. He also is affiliated at Yale with molecular biophysics & biochemistry, computer science, and statistics & data science. “An average person’s personal genome has variants in 4 million places. We’re trying to figure out which of these lead to meaningful differences.”
“This work represents the type of innovative large-scale data mining and teamwork that Yale is well-poised to create, coordinate, or participate in,” says Lucila Ohno-Machado, MD, MBA, PhD, Waldemar von Zedtwitz Professor of Medicine and of Biomedical Informatics & Data Science, and chair of the new section. “As our new academic unit grows, we expect to see more and more of this type of exemplary biomedical data science work originate from here.”
In their latest project, the team used long-read sequencing technologies to determine the diploid genomes of four donors with high accuracy. Everyone has a diploid genome: we carry two copies of each of the 22 autosomes plus a pair of sex chromosomes, with one set inherited from our mother and one from our father. “Now, for each position on the genome, we can look for the differences between mom and dad in many different functional assays in a perfectly balanced way, allowing us to accurately ascertain variant effect in many tissues,” says Gerstein.
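To make the idea of comparing the two parental copies concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the study's actual pipeline: it assumes that reads from an assay have already been assigned to the maternal or paternal copy at one variant position, and it uses a standard exact binomial test to ask whether the split departs from the 50/50 expected when the variant has no effect. The function name and the read counts are hypothetical.

```python
from math import comb


def allelic_imbalance_pvalue(maternal_reads: int, paternal_reads: int) -> float:
    """Two-sided exact binomial test against a 50/50 split.

    Assumption for illustration: each read is an independent draw that is
    equally likely to come from either parental copy when the variant has
    no effect on the assay signal.
    """
    n = maternal_reads + paternal_reads
    if n == 0:
        return 1.0  # no reads, so no evidence of imbalance

    # With p = 0.5, every outcome k has probability comb(n, k) / 2**n,
    # so outcomes can be ranked by comb(n, k) using exact integers.
    observed = comb(n, maternal_reads)
    as_or_less_likely = sum(
        comb(n, k) for k in range(n + 1) if comb(n, k) <= observed
    )
    return as_or_less_likely / 2 ** n


# Hypothetical counts at one heterozygous position in one tissue.
balanced = allelic_imbalance_pvalue(5, 5)   # even split
skewed = allelic_imbalance_pvalue(18, 2)    # strongly favors one copy
print(balanced, skewed)
```

A small p-value suggests that one parental copy is more active than the other at that position. A real analysis would also need to handle mapping bias, sequencing errors, and correction for testing many positions, none of which this sketch attempts.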
The team developed a variety of statistical and deep learning approaches to leverage the dataset for practical applications. In particular, they built statistical models that identify subsets of regulatory regions in the human genome highly associated with disease variants. They also found many new linkages between variants and changes in nearby gene expression, connecting impactful but uncharacterized variants to genes with known function. These linkages considerably expand previously determined catalogs, especially in many hard-to-assay tissues.
More fundamentally, the team developed a deep learning model that was able to predict whether a variant would disrupt a binding site for a regulatory factor—a protein that binds to specific sequences in the genome to turn nearby genes on or off. Interestingly, they found that to accurately predict this, they needed to look beyond just the binding site itself and consider a large genomic region around the site. The key to whether a binding site would be impacted was the presence of nearby binding sequences for other regulatory factors. “Think of regulatory factors as the legs of the Lunar Module,” says Gerstein. “If it has four legs and one leg doesn’t work, the three other legs can anchor the defective leg.” Similarly, the anchoring of other regulatory factors might stabilize the disrupted binding site and make it less sensitive to variants.
One limitation of the resource is that only four people of European descent are profiled. The team would like to eventually enlarge their study to encompass hundreds of individuals with more diverse ancestries.
Overall, these advances will allow researchers and clinicians to better interpret potential disease-causing variants in an individual, connecting them to regulatory sites, nearby genes, and their tissue of action. “We’ve provided a consistent, beautiful data set and annotation resource for making these interpretations,” says Gerstein.
The global team was assembled by the National Human Genome Research Institute (NHGRI) within NIH, as part of NHGRI's ENCODE consortium, which aims to functionally annotate the genome. The team included collaborators from Baylor College of Medicine; the Broad Institute of MIT and Harvard; California Institute of Technology; the Centre for Genomic Regulation; Cold Spring Harbor Laboratory; the Dana-Farber Cancer Institute; the European Bioinformatics Institute; HudsonAlpha Institute for Biotechnology; Johns Hopkins University; New York Institute of Technology; Stanford University; University of California, Irvine; University of California, San Diego; University of Hong Kong; University of Massachusetts Medical School; University of Toronto (Canada); and University of Washington, Seattle.
Journal: Cell
Article publication date: 30-Mar-2023