Algorithm-Driven VHH Discovery
From Enrichment Bias to Data-Driven VHH Selection
When to use NGS analysis?
A common question is when to use next-generation sequencing (NGS) and analysis to support antibody discovery. During our pre-project consultation, Isogenica works with customers to identify the project requirements and antigen types that are well suited to this approach. In general, NGS analysis is most useful where the panel of VHH binders is expected to be limited or where we are using ‘noisy’ types of antigen. Project features that tend to limit the number of binders include a small epitope, closely related proteins that must be avoided, GPCR or ion channel targets, stringent project criteria, rare desired functions, desired antagonist activity, or a junctional epitope. Difficult targets may require specialised antigens and biopanning strategies, such as using nanodiscs or cell lines to display the target in a physiologically relevant conformation. In these cases, sequencing the outputs of selections against both the target-containing antigen and its empty version allows us to separate non-specific binders from the rare binders that are exclusive to the target.
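As an illustration of the target-versus-empty comparison described above, the sketch below scores each sequence by the log2 ratio of its frequency in the target selection output to its frequency in the empty-control output. This is a minimal sketch and not Isogenica's in-house software: the function name `target_specificity`, the pseudocount smoothing, and the toy reads are all assumptions made for the example.

```python
import math
from collections import Counter

def target_specificity(target_reads, empty_reads, pseudocount=1.0):
    """Score each sequence by log2(frequency with target / frequency with empty control).

    Hypothetical helper: the pseudocount keeps sequences that are absent
    from one output from causing a division by zero.
    """
    target_counts = Counter(target_reads)
    empty_counts = Counter(empty_reads)
    all_seqs = set(target_counts) | set(empty_counts)
    # Smoothed totals, so the adjusted frequencies still sum to 1 in each output.
    target_total = sum(target_counts.values()) + pseudocount * len(all_seqs)
    empty_total = sum(empty_counts.values()) + pseudocount * len(all_seqs)
    scores = {}
    for seq in all_seqs:
        target_freq = (target_counts[seq] + pseudocount) / target_total
        empty_freq = (empty_counts[seq] + pseudocount) / empty_total
        scores[seq] = math.log2(target_freq / empty_freq)
    return scores

# Toy reads (made-up CDR3-like strings, not real data).
target_output = ["ARDYW"] * 8 + ["GGSTY"] * 2
empty_output = ["GGSTY"] * 9 + ["AKLPF"] * 1
scores = target_specificity(target_output, empty_output)
for seq, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{seq}\t{score:+.2f}")
```

In this toy case only the sequence seen exclusively with the target scores above zero, while the sequence that dominates the empty control scores below zero. A real analysis would also need to account for sequencing depth and replicate variation.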
How does it work?
Different approaches exist, but at Isogenica we collect the VHHs that have and have not bound to the desired target over multiple rounds of enrichment and sequence those outputs by NGS. Using our in-house software and tools, we then apply a range of analyses to identify rare binders that would have been missed by traditional Sanger sequencing alone, and to find and define families of binders. These families form a reference bank from which we can pull additional sequences with similar functions as the project progresses. We might, for example, learn that a particular function is linked to a specific sequence motif or binder family and use the NGS data to identify additional sequences with those features. Sequence families containing both positive and negative binders can also inform downstream affinity maturation and antibody engineering. Combining our cell-free CIS Display discovery platform, using either our in-house VHH libraries or your custom libraries, with NGS analysis removes biological bias and uses the high capacity of CIS Display for parallel conditions to correlate sequences with phenotypes.
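One simple way to picture how binder families might be defined is to group CDR3 sequences that differ by only a few residues. The sketch below is an assumption-laden example rather than the method used in our in-house tools: it links equal-length sequences within one mismatch (Hamming distance) and merges linked sequences into families by single linkage. The function names, the threshold, and the toy sequences are hypothetical.

```python
from itertools import combinations

def hamming(a, b):
    """Number of mismatched positions, or None if the lengths differ."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))

def cluster_families(seqs, max_mismatches=1):
    """Group sequences into families by single-linkage on Hamming distance."""
    seqs = sorted(set(seqs))
    parent = {s: s for s in seqs}  # union-find forest, one tree per family

    def find(s):
        # Walk up to the family root, shortening the path as we go.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(seqs, 2):
        d = hamming(a, b)
        if d is not None and d <= max_mismatches:
            parent[find(a)] = find(b)  # merge the two families

    families = {}
    for s in seqs:
        families.setdefault(find(s), []).append(s)
    # Largest families first.
    return sorted(families.values(), key=len, reverse=True)

# Toy CDR3-like strings (made up for illustration).
cdr3s = ["ARDYW", "ARDFW", "ARNFW", "GGSTY", "ARDYWW"]
families = cluster_families(cdr3s)
for family in families:
    print(family)
```

Here ARDYW and ARNFW differ at two positions but still fall into one family because each is one mismatch from ARDFW. Once a family is defined, low-abundance members can be pulled from the NGS data even if they were never picked as colonies. Production pipelines typically use more robust distance measures and handle length variation.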

Why VHH Antibodies?
- Compact, stable sequences simplify model training and interpretation
- High tolerance for mutation supports large-scale sequence exploration
- Single-domain architecture reduces confounding structural variables
- Well suited to difficult targets and non-canonical epitopes
These properties make VHHs ideal for data-dense discovery programmes, where the goal is not just to find a binder but to understand why it binds.
Key Advantages of Algorithm Testing

Sequence Clustering
Deep sequence clustering analysis

Rare Binder Detection
Identify rare binders

Sequence Traceability
Build comprehensive sequence history for your lead panel

Panel Diversity
Keep lead panels wider for longer

Outcome Mapping
Associate sequence patterns with positive and negative outcomes

Genotype–Phenotype Linking
Link genotype to phenotype

IP Expansion
Widen patent application claims

Iterative Learning
Drive make–test cycles

AI/ML Training Data
Train AI/ML prediction algorithms

Engineering Success
Improve the success of downstream affinity maturation and antibody engineering with more complete sequence information
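To make the idea of associating sequence patterns with outcomes more concrete, the sketch below asks whether a motif is over-represented among sequences with a positive outcome, using a one-sided Fisher's exact test computed from the hypergeometric distribution. It is an illustrative example only: the function name `motif_enrichment_p`, the plain substring matching, and the toy data are assumptions and do not describe a specific Isogenica workflow.

```python
from math import comb

def motif_enrichment_p(sequences, outcomes, motif):
    """One-sided Fisher's exact p-value that `motif` is enriched among positives.

    `outcomes` is a list of booleans aligned with `sequences`
    (True = positive outcome, e.g. functional in an assay).
    """
    with_motif_pos = sum(1 for s, o in zip(sequences, outcomes) if motif in s and o)
    n = len(sequences)
    n_pos = sum(1 for o in outcomes if o)
    n_with = sum(1 for s in sequences if motif in s)
    # Probability of seeing at least this many motif-bearing positives
    # if the motif were unrelated to the outcome.
    upper = min(n_pos, n_with)
    numerator = sum(
        comb(n_pos, k) * comb(n - n_pos, n_with - k)
        for k in range(with_motif_pos, upper + 1)
    )
    return numerator / comb(n, n_with)

# Toy data (made-up sequences and outcomes).
seqs = ["ARDYW", "AKDYF", "GTDYS", "GGSTY", "AKLPF", "ARNFW"]
positive = [True, True, True, False, False, False]
p_value = motif_enrichment_p(seqs, positive, "DY")
print(f"p = {p_value:.3f}")
```

With only six toy sequences the p-value cannot be very small. In practice, the depth of NGS data is what gives this kind of comparison its statistical power, and testing many motifs would call for multiple-testing correction.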
Where can Algorithm Testing be Used?
- Early hit discovery and triage
- Difficult target programmes
- Functional screening (e.g. antagonists vs agonists)
- Affinity maturation and optimisation
- Data generation for AI-guided antibody engineering
Publications
Next-generation sequencing enables the discovery of more diverse positive clones from a phage-displayed antibody library
This study in Experimental & Molecular Medicine investigates NGS-enabled phage display analysis. The authors show that sequencing-based screening identifies a broader and more diverse set of functional antibody clones than conventional colony screening methods.
Assessing antibody and nanobody nativeness for hit selection and humanization with AbNatiV
In Nature Machine Intelligence, researchers present a framework for antibody and VHH nativeness. AbNatiV enables nativeness scoring and humanization guidance, supporting hit selection while retaining binding and stability, including an automated nanobody humanization pipeline.
Prediction of Antibody-Antigen Binding via Machine Learning: Development of Data Sets and Evaluation of Methods
Published on ScienceDirect, this paper analyses machine-learning approaches to antibody–antigen binding. It evaluates datasets and early ML methods for predicting antibody specificity directly from sequence, outlining key limitations and opportunities for scalable, sequence-based binding prediction.
IgBlend: Unifying 3D Structures and Sequences in Antibody Language Models
A recent bioRxiv preprint examines structure-aware antibody language models. IgBlend combines antibody sequences with 3D structural coordinates, improving sequence recovery, CDR editing and inverse folding, with model likelihoods correlating with experimental binding affinity.
Accurate prediction of antibody function and structure using bio-inspired antibody language model
External research in Briefings in Bioinformatics explores antibody language models. This study introduces BALM and BALMFold, showing how large-scale antibody sequence models enable accurate antigen-binding and full atomic structure prediction, outperforming established structure-prediction methods.
By-passing in vitro screening—next generation sequencing technologies applied to antibody display and in silico candidate selection
Read about bypassing classical in vitro antibody screening in Nucleic Acids Research. This study shows how combining display technologies with next-generation sequencing enables in silico candidate selection, revealing high-affinity antibodies missed by conventional screening workflows.






