Algorithm-Driven VHH Discovery

Isogenica enables AI- and ML-ready antibody discovery, combining large-scale NGS data with CIS Display to power rapid make–test–learn cycles — especially for difficult and noisy targets.

Book a Discovery Call

From Enrichment Bias to Data-Driven VHH Selection

Traditional VHH antibody discovery relies on the natural enrichment of target-binding sequences in an in vitro display system such as phage display, with enriched clones identified by Sanger sequencing. While valuable, this approach is vulnerable to biases that can artificially enrich or entirely mask binders, so rare and valuable properties can be missing from the resulting lead panels. In recent years, next-generation sequencing (NGS) has profoundly advanced our ability to select binders with desired properties. Studies have shown that NGS of phage display outputs captures not only the sequences identified by enrichment and Sanger sequencing, but also many additional clones that traditional methods would have missed.

When to use NGS and analysis?

A common question is when to use NGS and analysis to support antibody discovery. During our pre-project consultation, we work with customers to identify the project requirements or antigen types that are well suited to it. In general, NGS analysis is most useful where the panel of VHH binders is expected to be limited or where we are using ‘noisy’ types of antigen. Project features that may limit the number of binders include a small epitope, closely related proteins to be avoided, GPCR or ion channel targets, stringent project criteria, rare desired functions, desired antagonist activity, or a junctional epitope. Difficult targets may require specialised antigens and biopanning strategies, such as using nanodiscs or cell lines to display the target in a physiologically relevant conformation. In these cases, sequencing the outputs from both the target-containing antigens and their empty counterparts allows us to separate non-specific binders from the rare binders that are exclusive to the target.
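As an illustration only, the comparison of target-containing and empty antigen outputs can be sketched as a frequency ratio per sequence. This is a minimal sketch and not Isogenica's in-house software: the function name `target_specific`, the pseudocount, the log2 threshold and the toy read labels are all assumptions made for the example.

```python
import math
from collections import Counter


def target_specific(target_reads, control_reads, min_log2_ratio=2.0, pseudocount=1.0):
    """Score sequences by log2(frequency in target output / frequency in empty control).

    Hypothetical helper: the threshold and pseudocount are illustrative choices.
    """
    target_counts = Counter(target_reads)
    control_counts = Counter(control_reads)
    all_seqs = set(target_counts) | set(control_counts)

    # The pseudocount keeps sequences absent from one output from dividing by zero.
    target_total = len(target_reads) + pseudocount * len(all_seqs)
    control_total = len(control_reads) + pseudocount * len(all_seqs)

    scores = {}
    for seq in all_seqs:
        target_freq = (target_counts[seq] + pseudocount) / target_total
        control_freq = (control_counts[seq] + pseudocount) / control_total
        scores[seq] = math.log2(target_freq / control_freq)

    # Keep only sequences clearly over-represented on the target-containing antigen.
    return {seq: score for seq, score in scores.items() if score >= min_log2_ratio}


# Toy data: CDR_B is abundant but also binds the empty control (non-specific),
# while CDR_C is rare but seen only on the target.
target_reads = ["CDR_A"] * 40 + ["CDR_B"] * 50 + ["CDR_C"] * 10
control_reads = ["CDR_B"] * 90 + ["CDR_D"] * 10

hits = target_specific(target_reads, control_reads)
print(sorted(hits))
```

In this toy case the most abundant sequence in the target output is excluded because it is equally abundant on the empty control, while the rare target-only sequence is retained.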

How does it work?

Different approaches exist, but at Isogenica we collect VHH proteins that have or have not bound to the desired targets over multiple rounds of enrichment and sequence those outputs by NGS. From there, we apply a range of analyses using our in-house software and tools. These help us identify rare binders that would have been missed using traditional Sanger sequencing alone, and define families of binders that form a reference bank from which we can pull additional sequences with similar functions as the project progresses. We might, for example, learn that a particular function is linked to a specific sequence motif or binder family and use the NGS data to identify additional sequences with those desired features. These sequence families, containing both positive and negative binders, can also provide valuable insights for downstream affinity maturation and antibody engineering. Combining NGS analysis with our cell-free CIS Display discovery platform, using either our in-house VHH libraries or your custom libraries, removes biological biases and uses the high capacity of CIS Display for parallel conditions to correlate sequences specifically with phenotypes.
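To make the idea of round-over-round enrichment and binder families concrete, here is a hedged sketch. It assumes each read has already been reduced to a CDR3 string, groups CDR3s of equal length that differ by at most one residue, and compares frequencies between an early and a late round. The function names, the Hamming-distance rule and the toy sequences are assumptions for illustration and do not describe Isogenica's actual tools.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def cluster_families(cdr3s, max_mismatches=1):
    """Single-linkage grouping of equal-length CDR3s (illustrative rule only)."""
    families = []
    for seq in sorted(set(cdr3s)):
        merged, rest = [seq], []
        for family in families:
            linked = any(
                len(member) == len(seq) and hamming(member, seq) <= max_mismatches
                for member in family
            )
            # A new sequence can bridge several existing families into one.
            if linked:
                merged.extend(family)
            else:
                rest.append(family)
        families = rest + [merged]
    return [sorted(family) for family in families]


def fold_enrichment(early_reads, late_reads, pseudocount=1.0):
    """Ratio of late-round to early-round frequency for every sequence."""
    early, late = Counter(early_reads), Counter(late_reads)
    seqs = set(early) | set(late)
    early_total = len(early_reads) + pseudocount * len(seqs)
    late_total = len(late_reads) + pseudocount * len(seqs)
    return {
        seq: ((late[seq] + pseudocount) / late_total)
        / ((early[seq] + pseudocount) / early_total)
        for seq in seqs
    }


# Toy CDR3 reads from an early and a late selection round (made-up sequences).
round1 = ["ARDYW"] * 5 + ["ARDFW"] * 5 + ["GKTLY"] * 90
round3 = ["ARDYW"] * 60 + ["ARDFW"] * 20 + ["GKTLY"] * 20

families = cluster_families(round1 + round3)
enrichment = fold_enrichment(round1, round3)
for family in families:
    print(family, [round(enrichment[seq], 2) for seq in family])
```

Grouping before ranking means a less enriched relative of a strong binder stays visible as part of the same family, rather than being discarded by a per-sequence cut-off.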

Why VHH Antibodies?

VHH single-domain antibodies are uniquely suited to algorithm-driven discovery:

Compact, stable sequences simplify model training and interpretation
High tolerance for mutation supports large-scale sequence exploration
Single-domain architecture reduces confounding structural variables
Well suited to difficult targets and non-canonical epitopes

These properties make VHHs ideal for data-dense discovery programmes, where the goal is not just to find a binder — but to understand why it binds.

Key Advantages of Algorithm Testing

Sequence Clustering

Deep sequence clustering analysis

Rare Binder Detection

Identify rare binders

Sequence Traceability

Build comprehensive sequence history for your lead panel

Panel Diversity

Keep lead panels wider for longer

Outcome Mapping

Associate sequence patterns to positive and negative outcomes

Genotype–Phenotype Linking

Link genotype to phenotype

IP Expansion

Widen patent application claims

Iterative Learning

Drive make–test cycles

AI/ML Training Data

Train AI/ML prediction algorithms

Engineering Success

Improve success of downstream affinity maturation and antibody engineering with more complete sequence information
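As a rough illustration of associating sequence patterns with positive and negative outcomes, the sketch below counts short motifs (k-mers) in sequences labelled as positive or negative in an assay and ranks them by a smoothed odds ratio. The labels, the choice of k, the pseudocount and the toy sequences are assumptions for the example; a real analysis would need larger panels and a proper statistical test.

```python
from collections import Counter


def kmers(seq, k):
    """Set of distinct k-length substrings in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def motif_odds(positives, negatives, k=3, pseudocount=0.5):
    """Smoothed odds ratio of each motif appearing in positive vs negative sequences."""
    pos_hits = Counter(m for seq in positives for m in kmers(seq, k))
    neg_hits = Counter(m for seq in negatives for m in kmers(seq, k))

    odds = {}
    for motif in set(pos_hits) | set(neg_hits):
        pos_with = pos_hits[motif] + pseudocount
        pos_without = len(positives) - pos_hits[motif] + pseudocount
        neg_with = neg_hits[motif] + pseudocount
        neg_without = len(negatives) - neg_hits[motif] + pseudocount
        odds[motif] = (pos_with * neg_without) / (pos_without * neg_with)
    return odds


# Made-up sequences labelled by a hypothetical functional assay.
positives = ["ARDYWGQ", "TRDYWAS", "ARDYFGQ"]
negatives = ["AKTLYGQ", "ARSLWGQ", "GKTLYAS"]

odds = motif_odds(positives, negatives)
top_motif = max(odds, key=odds.get)
print(top_motif, odds[top_motif])
```

A motif ranked highly in this way could then be used to pull further candidate sequences out of the NGS data for testing.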

Where can Algorithm Testing be Used?

Algorithm-driven discovery supports:

Early hit discovery and triage
Difficult target programmes
Functional screening (e.g. antagonists vs agonists)
Affinity maturation and optimisation
Data generation for AI-guided antibody engineering

It integrates seamlessly with:

Publications

Next-generation sequencing enables the discovery of more diverse positive clones from a phage-displayed antibody library

This study in Experimental & Molecular Medicine investigates NGS-enabled phage display analysis. The authors show that sequencing-based screening identifies a broader and more diverse set of functional antibody clones than conventional colony screening methods.

Assessing antibody and nanobody nativeness for hit selection and humanization with AbNatiV

In Nature Machine Intelligence, researchers present a framework for antibody and VHH nativeness. AbNatiV enables nativeness scoring and humanization guidance, supporting hit selection while retaining binding and stability, including an automated nanobody humanization pipeline.

Prediction of Antibody-Antigen Binding via Machine Learning: Development of Data Sets and Evaluation of Methods

Published on ScienceDirect, this paper analyses machine-learning approaches to antibody–antigen binding. It evaluates datasets and early ML methods for predicting antibody specificity directly from sequence, outlining key limitations and opportunities for scalable, sequence-based binding prediction.

IgBlend: Unifying 3D Structures and Sequences in Antibody Language Models

A recent bioRxiv preprint examines structure-aware antibody language models. IgBlend combines antibody sequences with 3D structural coordinates, improving sequence recovery, CDR editing and inverse folding, with model likelihoods correlating with experimental binding affinity.

Accurate prediction of antibody function and structure using bio-inspired antibody language model

External research in Briefings in Bioinformatics explores antibody language models. This study introduces BALM and BALMFold, showing how large-scale antibody sequence models enable accurate antigen-binding and full atomic structure prediction, outperforming established structure-prediction methods.

By-passing in vitro screening—next generation sequencing technologies applied to antibody display and in silico candidate selection

Read about bypassing classical in vitro antibody screening in Nucleic Acids Research. This study shows how combining display technologies with next-generation sequencing enables in silico candidate selection, revealing high-affinity antibodies missed by conventional screening workflows.

More technical resources

Turn Discovery Campaigns into Learning Engines

If your programme requires deeper mining of noisy data, faster iteration, or AI-ready discovery workflows, Isogenica can help you design campaigns that learn — not just select.
