AI Uncovers Ocean Reef Ancestry From Physical Shapes

Jenn Hoskins
21st June, 2025

This figure illustrates the specific morphological measurements taken from the macroscopic colony structure, or corallum (A-B), and the microscopic corallites (C-D), which provided the quantitative data for the machine learning models used to predict the genetic lineage of Porites spp. corals.

Image adapted from: Mitushasi et al. / CC BY

Key Findings

  • Researchers at the University of Tsukuba developed a new machine learning method to accurately identify coral species, overcoming challenges where corals look alike or vary widely
  • This method combines genetic data with detailed physical measurements of coral colonies and their tiny structures, outperforming traditional visual identification
  • The new approach is reproducible, cost-effective, and accessible, significantly aiding global coral conservation and understanding hidden diversity

Coral reefs, vital ecosystems teeming with life, are facing unprecedented threats from climate change. Understanding and accurately identifying the species that build these reefs, known as scleractinian or stony corals, is fundamental for effective conservation and for comprehending their evolution and ecology. However, this task is surprisingly complex. Coral species often display significant phenotypic plasticity, meaning their physical appearance can vary widely depending on environmental conditions, even within the same species. Conversely, different species can look almost identical, a phenomenon known as morphological crypsis; such look-alikes are called cryptic species. This variability makes traditional identification based solely on visual traits unreliable.

Previous research has repeatedly highlighted these challenges. For instance, a study on Hawaiian Montipora corals revealed a species complex where genetically related corals exhibited very different colony growth forms, demonstrating pervasive phenotypic plasticity[2]. This complexity, further compounded by processes like incomplete lineage sorting (when genetic lineages don't separate cleanly during speciation) or introgression (gene flow between different species), means that even extensive genomic data, such as the more than 60,000 genetic markers (SNPs) used in that study, might not fully resolve species relationships based on current taxonomic frameworks[2]. Similarly, investigations into widespread Indo-Pacific corals like Pachyseris speciosa have uncovered multiple lineages that are morphologically indistinguishable but ecologically distinct, suggesting that our current understanding of coral diversity may be superficial[3]. The extensive Tara Pacific Expedition, which sampled corals across the Pacific Ocean, further underscored this issue, revealing numerous cryptic species within widely distributed corals such as Pocillopora meandrina and Porites lobata. These hidden species exhibited distinct evolutionary patterns despite inhabiting similar environments, emphasizing the need for multi-species investigations for conservation[4].

To address these persistent challenges in coral identification, researchers at the University of Tsukuba have developed a novel approach[1]. Their study integrates multiple lines of evidence, combining molecular (genetic) and morphological (physical) data and using machine learning to improve coral species identification. The study used samples of two common coral genera, Porites and Pocillopora, including material from the Tara Pacific Expedition. These samples were thoroughly genotyped, meaning their genetic makeup was analyzed using genome-wide data, hierarchical clustering (a method for grouping similar genetic profiles), and coalescence analyses (a technique for tracing genetic lineages back to their common ancestors). Alongside this detailed genetic work, comprehensive morphological traits were documented. This included features of the entire coral colony, known as the corallum, and the individual cups where coral polyps reside, called corallites.

The core of their method involved training "Random Forest models." A Random Forest is a type of machine learning algorithm that constructs many decision trees and combines their outputs to make more accurate predictions. In this context, the models were trained to associate specific morphological traits with the genetically defined species. Two distinct models were developed for each coral genus. The first was designed for identifying species from in-situ photographs, meaning pictures taken directly in the ocean, by analyzing corallum traits. The second, more detailed model was for integrative species identification, combining both corallum and corallite data obtained from high-resolution scanning electron micrographs.

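To make the Random Forest idea concrete, here is a minimal pure-Python sketch of the technique: many small decision trees, each grown on a bootstrap resample with a random subset of traits per split, combined by majority vote. It is not the study's code. The trait columns, the measurement values, the lineage labels, and every function name are invented for illustration, and the toy data are deliberately well separated, whereas real coral measurements overlap far more.

```python
import random
from collections import Counter

# Invented toy measurements (NOT from the study). Hypothetical columns:
# corallite diameter (mm), corallite wall thickness (mm), branch width (cm).
ROWS = [
    [1.0, 0.20, 3.0], [1.1, 0.22, 3.2], [1.2, 0.25, 3.5],
    [1.0, 0.21, 3.1], [1.1, 0.24, 3.4], [1.2, 0.23, 3.3],
    [1.6, 0.40, 1.0], [1.7, 0.42, 1.2], [1.8, 0.45, 1.5],
    [1.6, 0.41, 1.1], [1.7, 0.44, 1.4], [1.8, 0.43, 1.3],
]
# Hypothetical labels standing in for genetically defined lineages.
LABELS = ["lineage_A"] * 6 + ["lineage_B"] * 6


def gini(labels):
    """Gini impurity: 0.0 when every label in the node is the same."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in feature_ids:
        values = sorted({row[f] for row in rows})
        # Candidate thresholds are midpoints between neighbouring observed values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, rng, n_features, depth=0, max_depth=5):
    """Grow one tree. A leaf is a label; an internal node is a 4-tuple."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority
    # Each split only sees a random subset of the traits.
    feature_ids = rng.sample(range(len(rows[0])), n_features)
    split = best_split(rows, labels, feature_ids)
    if split is None:
        return majority
    f, t = split

    def grow(keep):
        idx = [i for i, row in enumerate(rows) if keep(row[f])]
        return build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                          rng, n_features, depth + 1, max_depth)

    return (f, t, grow(lambda v: v <= t), grow(lambda v: v > t))


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap resample of the rows."""
    rng = random.Random(seed)
    n_features = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng, n_features))
    return forest


def predict_forest(forest, row):
    """Majority vote across all trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    forest = fit_forest(ROWS, LABELS)
    print(predict_forest(forest, [1.1, 0.22, 3.2]))
    print(predict_forest(forest, [1.7, 0.43, 1.2]))
```

In practice a maintained library implementation would normally replace this hand-rolled version. The point here is only that the lineage labels come from the genetics and the forest learns which trait combinations predict them.
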
Traditional methods for analyzing morphological data, such as Principal Component Analysis (PCA) or Factor Analysis of Mixed Data (FAMD) followed by clustering techniques like k-means or hierarchical clustering, often struggle when morphological variations overlap significantly between genetic lineages. The University of Tsukuba's research demonstrated that their Random Forest models consistently outperformed these traditional approaches, accurately classifying corals into their correct genetic lineages even when their physical appearances were highly similar or overlapping.

This advancement is particularly relevant given past taxonomic difficulties, such as those within the Poritidae family, where molecular studies were crucial to resolve relationships and revise genera like Goniopora, Machadoporites, and Poritipora that were morphologically indistinguishable[5]. The ability of machine learning to bridge the gap between morphology and genetics offers a powerful tool for these complex taxonomic revisions, providing a means to reconcile the often conflicting evidence from physical appearance and genetic data.

This machine learning approach offers a reproducible, cost-effective, and accessible way to identify coral species. It reduces the reliance on extensive taxonomic expertise, which is often a bottleneck in large-scale ecological and conservation studies. By complementing traditional molecular and phylogenetic studies, this method has the potential to significantly advance an "integrative taxonomy" workflow for corals, where genetic and morphological data are combined to provide a more accurate picture of coral diversity and aid in their protection and restoration. The ongoing challenges of phenotypic plasticity and cryptic species, highlighted by earlier studies[2][3][4], underscore the critical need for such innovative tools to truly understand and conserve these vital marine ecosystems.
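As a rough illustration of why the clustering baselines mentioned above can struggle, the sketch below runs plain k-means on invented trait values in which each hypothetical lineage occurs in two growth forms. It skips the PCA or FAMD step, and the data, labels, and function names are all assumptions made for this example rather than anything taken from the study. Because k-means never sees the genetic labels, it groups colonies by shape, and each resulting cluster mixes both lineages.

```python
from collections import Counter

# Invented toy data (NOT from the study): (corallite diameter mm, branch width cm).
# Each hypothetical lineage appears in two growth forms, mimicking plasticity.
POINTS = [
    (1.0, 2.0), (1.2, 2.1), (3.0, 5.0), (3.2, 5.1),  # lineage_A
    (1.1, 2.0), (1.3, 2.2), (3.1, 5.2), (3.3, 5.0),  # lineage_B
]
LINEAGES = ["lineage_A"] * 4 + ["lineage_B"] * 4


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=50):
    """Lloyd's algorithm with a deterministic start: the first k points."""
    centroids = list(points[:k])
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members)
                                     for col in zip(*members)))
            else:
                updated.append(centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def cluster_composition(labels, lineages):
    """Count how many samples of each lineage fall in each cluster."""
    table = {}
    for lab, lineage in zip(labels, lineages):
        table.setdefault(lab, Counter())[lineage] += 1
    return table


if __name__ == "__main__":
    labels, _ = kmeans(POINTS, k=2)
    for cluster, counts in sorted(cluster_composition(labels, LINEAGES).items()):
        print(cluster, dict(counts))
```

On this toy data the two clusters correspond to the two growth forms, not to the two lineages. A supervised model trained on genetically assigned labels is designed to avoid exactly that mismatch.
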

References

Main Study

1) Morphological traits and machine learning for genetic lineage prediction of two reef-building corals

Published 18th June, 2025

https://doi.org/10.1371/journal.pone.0326095


Related Studies

2) Rare coral under the genomic microscope: timing and relationships among Hawaiian Montipora.

https://doi.org/10.1186/s12862-019-1476-2


3) Morphological stasis masks ecologically divergent coral species on tropical reefs.

https://doi.org/10.1016/j.cub.2021.03.028


4) Disparate genetic divergence patterns in three corals across a pan-Pacific environmental gradient highlight species-specific adaptation.

https://doi.org/10.1038/s44185-023-00020-8


5) A phylogeny of the family Poritidae (Cnidaria, Scleractinia) based on molecular and morphological analyses.

https://doi.org/10.1371/journal.pone.0098406


