AI Predicts Disease Links In Our Genes

Greg Howard
20th June, 2025

The MVHGCN workflow predicts circRNA-disease associations by integrating diverse data sources into a heterogeneous graph, from which multiple relational views are extracted via meta-paths and aggregated using a graph convolutional network.

Image adapted from: Miao et al. / CC BY

Key Findings

  • Researchers from Chinese universities developed MVHGCN, a new AI tool, to accurately predict links between circular RNAs and diseases, overcoming data sparsity issues
  • This method builds a comprehensive network of biological data and uses advanced AI to find hidden, indirect connections between circRNAs and diseases
  • MVHGCN significantly outperforms previous methods, offering a powerful new way to accelerate disease research and develop new treatments

Circular RNAs, or circRNAs, are a fascinating class of genetic molecules found within cells. Unlike the more common linear RNAs, which have distinct start and end points, circRNAs form a closed loop. For a long time their exact roles were unclear, and many researchers considered them mere byproducts of gene activity. Research has increasingly shown, however, that circRNAs are not cellular debris but active participants in biological processes. Early studies revealed that circRNAs can act as important regulators in cells. Some, like CDR1as, have many binding sites for tiny molecules called microRNAs (miRNAs), effectively "sponging" them up and preventing them from acting elsewhere. This suggested that circRNAs form a large class of post-transcriptional regulators, influencing gene expression after the initial genetic code has been read[2].

The understanding of circRNAs has expanded considerably since these initial discoveries, and they are now known to play significant roles in human health and disease. Given their universality, specificity, and stability, circRNAs are becoming an ideal class of potential biomarkers for diagnosing disease, guiding treatment, and predicting how a disease might progress. Identifying which specific circRNAs are linked to particular diseases is therefore crucial for developing new diagnostic tools and therapeutic strategies.

However, traditional laboratory experiments to uncover these circRNA-disease associations are often inefficient, costly, and time-consuming, which has led to a growing reliance on computational models as an alternative. Some models predict related RNA interactions, such as those between circRNAs and miRNAs, using techniques like deep graph collaboration learning to capture complex relationships[3]. Others predict circRNA-disease associations directly[4]. All of them still face significant challenges. One is data sparsity, meaning there isn't enough comprehensive data to train models effectively. Another is the difficulty of confirming "negative samples," which are instances where a circRNA is definitively not associated with a disease. These limitations can hinder the accuracy of predictions.

To address these persistent challenges, a novel computational method called MVHGCN has been proposed by researchers at Northeast Forestry University, Harbin Institute of Technology, Harbin Medical University, and University of Electronic Science and Technology[1]. The method aims to predict potential associations between circRNAs and diseases with greater accuracy and efficiency.

MVHGCN tackles the problem by first constructing what is known as a "heterogeneous graph." In this context, a graph is a network of interconnected points, or "nodes." A heterogeneous graph contains different types of nodes (in this case, circRNAs and diseases) and different types of connections, or "edges," between them. To build this network, MVHGCN integrates information from multiple existing biological databases, generating detailed "feature descriptors" for both circRNAs and diseases. This integration matters because the biological functions of circRNAs can be complex, involving interactions with various other molecules. For example, the ability of circRNAs to bind to RNA-binding proteins (RBPs), which is influenced by their "secondary structures," is a key aspect of their function that specialized computational tools can predict[5]. By incorporating such diverse information, MVHGCN can build a richer picture of potential interactions.

A key innovation of MVHGCN lies in its use of "meta-paths" to extract different "connection views" between circRNAs and diseases. A meta-path describes a sequence of relationships between different types of nodes in the heterogeneous graph. For instance, a meta-path might describe a connection from a circRNA to a miRNA, then from that miRNA to a disease. By exploring these various meta-paths, MVHGCN uses all of the known association information, not just direct links, which allows it to uncover more subtle and indirect relationships. This is a significant step beyond simpler models that might only consider direct connections, and it builds upon the concept of capturing "deep collaborative features" that has proven successful in predicting other types of RNA interactions[3].

After constructing the network and extracting the connection views, MVHGCN employs "graph convolutional networks" (GCNs). A GCN is a type of neural network designed to process data that is structured as a graph. Each GCN layer updates a node by aggregating information from its immediate neighbors, and stacking layers lets information arrive from its neighbors' neighbors and beyond. This allows the GCN to learn "deep feature information," meaning complex patterns and relationships that might not be obvious from simple inspection. Finally, a Multi-Layer Perceptron (MLP), another type of neural network, processes these learned features and predicts an association score for each circRNA-disease pair. A higher score indicates a stronger predicted association.

The experimental results demonstrate that MVHGCN significantly outperforms existing methods on benchmark datasets under 5-fold cross-validation. In this procedure the data are split into five parts, and each part is held out once for testing while the model is trained on the other four. Across these splits, the model consistently delivered superior prediction accuracy compared to previous approaches, including KATZHCDA[4]. By alleviating the problem of data sparsity and accurately identifying potential associations, this research provides a powerful new approach to studying the complex relationships between circRNAs and diseases. It represents an important step forward in leveraging computational power to accelerate our understanding of disease mechanisms and potentially lead to new diagnostic and therapeutic breakthroughs.
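
For readers who want to see the meta-path and graph-convolution ideas in concrete form, here is a minimal sketch in plain Python. It is not the authors' implementation. The association matrices, the weight values, and every function name are made up for illustration, and the normalization shown is the one commonly used in standard GCN layers, which is assumed here rather than confirmed from the paper. The sketch builds one circRNA-disease view from the meta-path circRNA → miRNA → disease, then runs a single neighbor-aggregation step over it.

```python
# Toy illustration only. All matrices, sizes, and function names here are
# hypothetical; none are taken from the MVHGCN paper or its released code.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def meta_path_view(circ_mirna, mirna_disease):
    """circRNA-disease links reachable via circRNA -> miRNA -> disease."""
    counts = matmul(circ_mirna, mirna_disease)  # number of connecting miRNAs
    return [[1 if c > 0 else 0 for c in row] for row in counts]


def bipartite_adjacency(view):
    """Place circRNA and disease nodes in one symmetric adjacency matrix."""
    n_circ, n_dis = len(view), len(view[0])
    size = n_circ + n_dis
    adj = [[0] * size for _ in range(size)]
    for i in range(n_circ):
        for j in range(n_dis):
            if view[i][j]:
                adj[i][n_circ + j] = adj[n_circ + j][i] = 1
    return adj


def normalize_with_self_loops(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 (assumed, standard GCN form)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / (deg[i] * deg[j]) ** 0.5 for j in range(n)] for i in range(n)]


def gcn_layer(adj_norm, features, weights):
    """One propagation step: ReLU(A_norm @ H @ W)."""
    hidden = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


# Three circRNAs, two miRNAs, two diseases (made-up associations).
circ_mirna = [[1, 0], [0, 1], [0, 0]]
mirna_disease = [[1, 0], [0, 1]]

view = meta_path_view(circ_mirna, mirna_disease)  # [[1, 0], [0, 1], [0, 0]]
adj_norm = normalize_with_self_loops(bipartite_adjacency(view))

# One-hot input features and a fixed, arbitrary 5x2 weight matrix.
features = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]
weights = [[0.5, -0.2], [0.1, 0.4], [0.3, 0.3], [-0.1, 0.6], [0.2, 0.1]]

embeddings = gcn_layer(adj_norm, features, weights)  # one 2-d vector per node
```

In this toy example the third circRNA has no miRNA partner, so the meta-path view gives it no disease link and its embedding depends only on its own features. A full model would repeat the view construction for several meta-paths, learn the weights during training instead of fixing them, and pass the resulting features to an MLP to produce association scores.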

Medicine, Biotech, Genetics

References

Main Study

1) MVHGCN: Predicting circRNA-disease associations with multi-view heterogeneous graph convolutional neural networks

Published 19th June, 2025

https://doi.org/10.1371/journal.pcbi.1013225


Related Studies

2) Circular RNAs are a large class of animal RNAs with regulatory potency.

https://doi.org/10.1038/nature11928


3) DGCLCMI: a deep graph collaboration learning method to predict circRNA-miRNA interactions.

https://doi.org/10.1186/s12915-025-02197-9


4) Prediction of CircRNA-Disease Associations Using KATZ Model Based on Heterogeneous Networks.

https://doi.org/10.7150/ijbs.28260


5) CRBPSA: CircRNA-RBP interaction sites identification using sequence structural attention model.

https://doi.org/10.1186/s12915-024-02055-0


