Sequence vs Structure: Grouping Antibodies in Simulated Data

Jenn Hoskins
2nd June, 2025

The constructed dataset establishes a diverse ground truth for evaluating clustering algorithms by including antibody pairs with varying CDRH3 lengths (b) and antigen targets, primarily SARS-CoV-2 (a), while crucially demonstrating the existence of functionally converged antibodies that share high epitope overlap despite low sequence identity (c).

Image adapted from: Waury et al. / CC BY (Source)

Key Findings

  • In a study from Amsterdam and partner institutions, adding 3D structural details to antibody clustering revealed new, functionally related groups beyond traditional sequence methods
  • The new structure-based methods grouped antibodies with similar functions even when their sequences differed, though SPACE2 is limited by its need for identical CDR lengths

A recent study[1] by researchers at Vrije Universiteit Amsterdam, Utrecht University, ENPICOM B.V., The Hyve B.V., and Sorbonne University explores how the analysis of antibody repertoire sequencing data can be improved by incorporating structural information into clustering methods. The work examines ways to group antibodies by similarities in their three-dimensional structures rather than relying solely on sequence information.

Antibody repertoire sequencing is a technique used to analyze the diverse collection of antibodies produced by B cells, the immune cells that make antibodies to help defend the body against infections. Traditional methods for grouping these antibodies, known as clonotyping, usually focus on the similarity of specific parts of the antibody sequence, such as the CDRH3 region, together with the use of V/J gene segments. However, research has shown that important functional similarities may exist between antibodies that do not have highly similar sequences[2]. Earlier studies have emphasized that understanding the differences between clones and their sequence diversity is central to unraveling the complexity of immune responses[2]. There is also evidence that looking at convergent features in antibody repertoires can help identify functionally similar antibodies that target the same region on an antigen[3].

The current study tackles the challenge of identifying antibodies that are functionally related but may not exhibit a high degree of sequence similarity. To do so, the researchers compared conventional sequence-based methods with structure-based clustering algorithms. Two structure-based methods, SAAB+ and SPACE2, were evaluated to see whether they could group antibodies that share a common binding site (epitope) on an antigen, even if their sequences differ significantly.

To test these methods, the team curated a dataset of well-annotated antibody pairs. Each pair was selected because both antibodies bind to the same region on their antigen, meaning that despite potential differences in sequence, they perform similar functions. The curated dataset was then introduced into a simulated repertoire, a model that represents the diverse population of antibodies found in a typical immune response. With this setup, the researchers could compare how effectively each clustering method grouped functionally related antibodies.

The findings show that the structure-based methods SAAB+ and SPACE2 grouped more antibodies together than the conventional clonotyping approach. This suggests that incorporating structural information can reveal links between antibodies that would appear unrelated if only sequence information were considered.

However, the study also identified some limitations. One notable challenge with SPACE2 is its reliance on CDR regions of the same length, which can restrict its usefulness when dealing with natural variations in antibody structures. This limitation highlights an area in need of further development so that structure-based clustering can become more broadly applicable to diverse repertoires.

The current work builds on earlier insights into the importance of antibody diversity and the potential for functional convergence among antibodies[2][3]. Those studies focused primarily on understanding the immune repertoire by exploring sequence variations and identifying convergent patterns after vaccination or infection. By introducing structural information into the clustering methods, the new study extends these approaches and offers a promising way to overcome the limitations of purely sequence-based methods.

The implications of this research are significant for both basic immunology and applied medical science. In drug discovery, for example, finding antibodies that target a specific epitope, even when their sequences differ, can be crucial for developing effective therapeutics. Similarly, in the study of immune responses, a more accurate grouping of antibodies based on structural similarities could lead to better interpretations of how the immune system adapts during infection or in disease states.

The study's evaluation design is also notable. Rather than testing the clustering methods on a small, uniform set of antibodies, the researchers evaluated them on a simulated repertoire designed to mimic the complexity found in natural immune responses. Testing under these more realistic conditions gives greater confidence that the findings apply to actual antibody repertoires.

In summary, the study demonstrates that moving beyond sequence-based grouping to include structural information can lead to improved clustering of antibodies. While the new methods are promising, especially in their ability to identify groups with low sequence identity that share functional properties, challenges remain, such as handling varying CDR lengths. Nonetheless, the work bridges previous findings that highlighted the significance of antibody sequence analysis[2] and those that pointed to convergent functional properties[3]. The research sets a foundation for future efforts aimed at refining structure-based clustering methods to fully harness the complexity and potential of the immune repertoire.
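
To make the comparison concrete, the sequence-based clonotyping described in this article can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's pipeline: the function names, the 80% CDRH3 identity threshold, the requirement of equal CDRH3 length, and the four toy antibodies (including their sequences and the "same epitope" labels) are all invented for the example. It groups antibodies that share V and J genes and have similar CDRH3 sequences, then reports what fraction of known same-epitope pairs ended up in the same cluster.

```python
from itertools import combinations


def cdrh3_identity(a: str, b: str) -> float:
    """Fraction of matching positions; 0.0 if the lengths differ."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def clonotype(antibodies, min_identity=0.8):
    """Single-linkage grouping on V gene, J gene and CDRH3 identity.

    The 0.8 threshold is an assumption chosen for illustration only.
    Returns a dict mapping antibody id -> cluster label.
    """
    parent = list(range(len(antibodies)))

    def find(i):
        # Follow parent links to the cluster root (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(antibodies)), 2):
        a, b = antibodies[i], antibodies[j]
        if (a["v_gene"], a["j_gene"]) != (b["v_gene"], b["j_gene"]):
            continue
        if cdrh3_identity(a["cdrh3"], b["cdrh3"]) >= min_identity:
            parent[find(i)] = find(j)  # merge the two clusters

    return {ab["id"]: find(i) for i, ab in enumerate(antibodies)}


def co_clustered_fraction(labels, same_epitope_pairs):
    """Share of known same-epitope pairs that landed in one cluster."""
    hits = sum(labels[a] == labels[b] for a, b in same_epitope_pairs)
    return hits / len(same_epitope_pairs)


# Toy data: sequences and epitope annotations are made up for this sketch.
antibodies = [
    {"id": "ab1", "v_gene": "IGHV3-53", "j_gene": "IGHJ6", "cdrh3": "ARDLGDYGMDV"},
    {"id": "ab2", "v_gene": "IGHV3-53", "j_gene": "IGHJ6", "cdrh3": "ARDLGEYGMDV"},
    {"id": "ab3", "v_gene": "IGHV1-69", "j_gene": "IGHJ4", "cdrh3": "ARGSYYDSSGYFDY"},
    {"id": "ab4", "v_gene": "IGHV3-30", "j_gene": "IGHJ4", "cdrh3": "AKDRWELLHFDY"},
]
# ab3 and ab4 stand in for a functionally converged, low-identity pair.
same_epitope_pairs = [("ab1", "ab2"), ("ab3", "ab4")]

labels = clonotype(antibodies)
fraction = co_clustered_fraction(labels, same_epitope_pairs)
print(f"Same-epitope pairs grouped together: {fraction:.0%}")
```

In this toy case the near-identical pair is grouped while the pair with different genes and CDRH3 lengths is missed, which is the kind of gap the structure-based methods aim to close. A structure-based method would replace the sequence comparison with a comparison of predicted 3D structures, which this sketch does not attempt.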


References

Main Study

1) Comparison of sequence- and structure-based antibody clustering approaches on simulated repertoire sequencing data

Published 30th May, 2025

https://doi.org/10.1371/journal.pcbi.1013057


Related Studies

2) The analysis of clonal expansions in normal and autoimmune B cell repertoires.

https://doi.org/10.1098/rstb.2014.0239


3) Current strategies for detecting functional convergence across B-cell receptor repertoires.

https://doi.org/10.1080/19420862.2021.1996732


