A Combined Approach to Pinpoint Key Genes Driving Complex Traits

Jim Crocker
31st May, 2025

By integrating GWAS meta-analysis peaks with significant gene expression associations, this analysis prioritizes DGAT1 (a) and SERINC5 (b) as candidate causal genes regulating milk polar lipid concentrations.

Image adapted from: Ghoreishifar et al. / CC BY

Key Findings

  • In Australia, researchers analyzed nearly 14 million genetic markers in 336 dairy cows to uncover links between DNA and milk polar lipid levels
  • They combined gene activity data from blood and mammary tissues to identify over 3,500 genes associated with milk fat composition
  • Regions with strong genetic signals were more likely to include fat-regulating genes, reinforcing their role in determining milk quality

Researchers from Agriculture Victoria Research, La Trobe University, Livestock Improvement Corporation, Massey University, University of Melbourne, and Wageningen University & Research[1] set out to address a common challenge in genetic studies. Genome-wide association studies (GWAS) have helped locate regions of the genome associated with complex traits, yet many of these regions fall in non-coding areas of the genome. This makes it difficult to pinpoint the specific genes or genetic variants responsible for the observed traits. In this study, the focus was on the concentration of polar lipids in cow milk, a measurable trait that matters for both dairy quality and animal biology.

The study investigated three types of evidence that can be used to identify the genes responsible for quantitative trait loci (QTL): how close a gene lies to the most significant GWAS variant; whether the gene's expression level correlates with the trait; and whether the gene plays a known physiological role related to the trait. By using these three criteria together, the researchers hoped to improve confidence in selecting candidate causal genes, while recognizing some inherent limitations in the methods.

The researchers first performed single-trait GWAS on approximately 14 million imputed genetic variants in 336 cows, examining 56 specific milk polar lipid (PL) phenotypes. A multi-trait meta-analysis of these GWAS results identified more than 10,000 significant single nucleotide polymorphisms (SNPs) that met a false discovery rate threshold. This initial step indicated that there were strong genetic signals associated with milk PL concentration.

To understand which genes may be responsible, the team gathered transcriptome data (comprehensive measurements of gene expression) from two tissues: blood (from 143 cows) and mammary tissue (from 169 cows).
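
The meta-analysis step above keeps only SNPs that pass a false discovery rate threshold. The article does not say which procedure or cutoff was used, so the sketch below assumes the widely used Benjamini-Hochberg procedure with a 5% cutoff. The function name and the p-values are illustrative only and are not taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which tests pass BH at level alpha.

    Assumption: the study's FDR control is approximated here by the
    Benjamini-Hochberg step-up procedure; the article does not name it.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Every test ranked at or below max_rank is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


# Made-up p-values standing in for per-SNP meta-analysis results.
toy_p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
toy_flags = benjamini_hochberg(toy_p_values, alpha=0.05)
```

With millions of variants a vectorized implementation would be preferable; the plain-Python loop is used here only to keep each step of the procedure visible.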

Using a method called genetic score omics regression (GSOR), the researchers linked observed gene expression levels to the predicted genetic influence on the PL phenotypes. In simple terms, GSOR evaluates whether a gene's expression level, when driven by the animal's genetics, is associated with a trait. The GSOR analysis identified over 2,100 genes in blood and 1,400 genes in mammary tissue that were significantly associated with at least one polar lipid trait.

To further validate these findings, the genome was divided into non-overlapping 100 kb windows. The researchers then tested whether the windows containing significant GSOR genes overlapped with the windows highlighted by the GWAS signals, and found a significant overlap. Specifically, regions with significant GWAS signals were 1.47 times more likely to contain genes flagged by GSOR than regions without such signals. This overlap supports the view that the two methods, when combined, provide insight into the genetic regulation of milk polar lipids.

An additional layer of analysis examined the functions of these candidate genes using gene ontology (GO), which categorizes genes into groups based on shared biological roles. When comparing all expressed genes with those located in the significant windows, researchers found that the significant genes were enriched for functions related to lipid metabolism. For example, seven genes in the key regions were involved in lipid metabolism in mammary tissue, and five in blood. Among the candidate genes were DGAT1, ACSM5, SERINC5, ABHD3, CYP2U1, PIGL, ARV1, SMPD5, and NPC2. The overlap between the GWAS, GSOR, and GO analyses suggests that using multiple layers of evidence increases the likelihood of identifying genes that mediate the effects of QTL variants on traits.
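
The window-overlap comparison described above can be sketched as a 2x2 table of windows. This is a minimal illustration rather than the authors' pipeline: the coordinates are made up, the function names are hypothetical, and it assumes an odds ratio as the measure, since the article does not state exactly which statistic lies behind the 1.47 figure.

```python
WINDOW_SIZE = 100_000  # 100 kb non-overlapping windows, as in the article


def window_id(chrom, pos, size=WINDOW_SIZE):
    """Map a chromosome and base-pair position to its window."""
    return (chrom, pos // size)


def overlap_table(all_windows, gwas_windows, gsor_windows):
    """Count windows by GWAS and GSOR status.

    Returns (both, gwas_only, gsor_only, neither).
    """
    both = gwas_only = gsor_only = neither = 0
    for w in all_windows:
        in_gwas = w in gwas_windows
        in_gsor = w in gsor_windows
        if in_gwas and in_gsor:
            both += 1
        elif in_gwas:
            gwas_only += 1
        elif in_gsor:
            gsor_only += 1
        else:
            neither += 1
    return both, gwas_only, gsor_only, neither


def odds_ratio(both, gwas_only, gsor_only, neither):
    """Odds of a GSOR gene in GWAS windows relative to non-GWAS windows."""
    if gwas_only == 0 or gsor_only == 0:
        return float("inf")  # undefined without a continuity correction
    return (both * neither) / (gwas_only * gsor_only)


# Toy data (not from the study): ten windows on a hypothetical chromosome "1".
all_windows = {("1", i) for i in range(10)}
gwas_snp_positions = [50_000, 120_000, 150_000, 430_000]
gsor_gene_positions = [80_000, 410_000, 760_000, 990_000]

gwas_windows = {window_id("1", p) for p in gwas_snp_positions}
gsor_windows = {window_id("1", p) for p in gsor_gene_positions}

table = overlap_table(all_windows, gwas_windows, gsor_windows)
toy_or = odds_ratio(*table)
```

In practice the significance of such a table would be assessed with a test such as Fisher's exact test or a chi-square test; which test the study used is not stated in this article.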

Earlier studies have similarly combined genetic association data with measures of gene expression to elucidate complex trait biology. One such study[2] introduced a gene-based association method known as PrediXcan, which estimates genetically regulated gene expression and correlates it with phenotype. This approach reduced the burden of multiple testing and provided a framework for designing subsequent experiments. Other investigations[3] have explored transcriptome-wide association studies (TWAS) to prioritize causal genes at GWAS loci, highlighting both the advantages and the limitations of using expression quantitative trait loci (eQTL) across different tissues. The current study echoes these findings: integrating gene expression with GWAS data can improve causal gene discovery, although the strength of the evidence is moderate and subject to limitations such as modest odds ratios. A complementary body of work[4] integrates genotype, gene expression, and phenotype data to better understand the genetic basis of complex traits, demonstrating how these methods can clarify biological mechanisms. Another study[5] estimated local genetic correlations between gene expression and traits, adding to the evidence that combining different types of data can uncover hidden relationships between genotype and phenotype.

The main study advances the field by testing the effectiveness of these methods on a clearly defined system: cow milk polar lipids. By comparing the proximity of candidate genes to GWAS signals, their expression correlations with the trait, and their known biological roles, the researchers created a framework for candidate gene identification.

Larger sample sizes and further methodological refinements are still needed to overcome challenges such as linkage disequilibrium (the tendency of nearby genetic variants to be inherited together, which makes it hard to tell which variant or gene in a region is causal). Even so, these findings mark a step forward in resolving which genes actually drive complex traits. The integration of GWAS and transcriptomics in this study builds on past research and demonstrates the value of combining multiple lines of evidence. Although the results are limited in statistical power, they point to an important direction for future research.

Genetics, Animal Science

References

Main Study

1) An integrative approach to prioritize candidate causal genes for complex traits in cattle

Published 30th May, 2025

https://doi.org/10.1371/journal.pgen.1011492


Related Studies

2) A gene-based association method for mapping traits using reference transcriptome data.

https://doi.org/10.1038/ng.3367


3) Opportunities and challenges for transcriptome-wide association studies.

https://doi.org/10.1038/s41588-019-0385-z


4) Integrative approaches for large-scale transcriptome-wide association studies.

https://doi.org/10.1038/ng.3506


5) Integrating Gene Expression with Summary Association Statistics to Identify Genes Associated with 30 Complex Traits.

https://doi.org/10.1016/j.ajhg.2017.01.031


