HCMV: one weekend, two pathways

Case study of using Boltz-2 to screen two small molecules against hundreds of host proteins in a single weekend.

Summary

We started with a simple question that is usually answered the hard way: what do two promising small molecules actually bind? Rather than test one protein at a time, we used Boltz-2 to virtually screen each compound against roughly three hundred candidate protein targets. Boltz-2 predicts each protein–ligand complex together with a binding affinity estimate, and that estimate is what separates a pose that only looks plausible from one that is likely to bind.

That workflow let us cut through the noise and surface a short list of strong predicted binders for each compound, pointing to two distinct host pathways that HCMV relies on. Wet-lab validation is ongoing, with a second part of this case study coming soon, but the takeaway is clear: structure plus affinity beats structure alone when you are deciding what to test next.

Note on privacy: we ran the full screen ourselves. No one at Vici.bio handled or viewed the underlying data. The platform enforces encryption and data isolation, and this project stayed entirely under our control.

Introduction

If you work on HCMV, you know the pattern: a very common virus, real risk in immunosuppressed patients, limited drugs, and biology that likes to hide. HCMV establishes lifelong latency, reactivates when immunity dips, and leans heavily on host pathways to get what it needs. Clinically, there is still a trade-off between toxicity and resistance with current antivirals, so new mechanisms matter.

In our project, two small molecules had already shown a clear reduction in HCMV activity. That is a good start, but “works” is not the same as “we know why.” To move forward confidently, we needed to work out the protein targets so we could design focused assays instead of broad fishing expeditions. In other words, we needed target deconvolution rather than another round of guesswork.

The traditional route involves pull-downs, proteomics, and a lot of careful detective work. That approach is valuable and still essential, but it is slow and expensive if you do not already have a short list. Here the goal was different: use computational screening to do the triage first, then spend wet-lab time on the most informative experiments.

Methodology: why Boltz-2 helps

The challenge looked simple on paper: two compounds and about three hundred protein targets. The real question was which pairs are likely to be true binders. Structure-only models can return high pLDDT and similar confidence scores for a complex, yet those scores say nothing about whether the ligand actually binds. In a screen of this size, that kind of signal quickly saturates and you are left without a ranking.

Boltz-2 closes that gap by co-modelling the protein with the small molecule and predicting both the three-dimensional complex and an affinity signal. That affinity readout is the “moneyball” statistic for this use case. It lets you separate complexes that could in principle bind from those predicted to bind strongly enough to be worth testing.

Screen configuration

  • Targets. Around three hundred protein sequences, curated from earlier work and from host factors that are relevant to HCMV biology.
  • Compounds. Two small molecules encoded as SMILES strings.
  • Boltz-2 settings. recycles = 3, diffusion_steps = 200, samples = 1, ensemble = 1, and affinity = True for every compound–protein pair. This is a fast configuration designed to separate likely binders from non-binders rather than to micromanage poses.
  • Ligand limits. Affinity is currently supported for small molecules binding proteins. Ligand sizes up to about 128 atoms are accepted, with best reliability when you stay at or below roughly 56 atoms after hydrogen removal.
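To make the configuration concrete, here is a minimal sketch of how one input file per compound–protein pair could be generated. It is not the code used for this screen, which ran on the platform. The YAML layout follows the open-source Boltz input schema as we understand it (a protein chain, a ligand given as SMILES, and an affinity property naming the ligand as binder). The target names, sequences, SMILES strings, and file-naming scheme are placeholders, MSA handling is left out, and the run settings listed above would be supplied when the job is launched rather than stored in this file.

```python
from itertools import product


def boltz_affinity_yaml(protein_seq: str, ligand_smiles: str) -> str:
    """Build the YAML text for one protein-ligand pair with affinity requested.

    Chain ids "A" (protein) and "B" (ligand) are arbitrary choices for this sketch.
    """
    # Single quotes inside a single-quoted YAML string are escaped by doubling.
    safe_smiles = ligand_smiles.replace("'", "''")
    return "\n".join([
        "version: 1",
        "sequences:",
        "  - protein:",
        "      id: A",
        f"      sequence: {protein_seq}",
        "  - ligand:",
        "      id: B",
        f"      smiles: '{safe_smiles}'",
        "properties:",
        "  - affinity:",
        "      binder: B",
        "",
    ])


def build_jobs(targets: dict[str, str], compounds: dict[str, str]) -> dict[str, str]:
    """Return {file name: YAML text} for every compound-target combination."""
    jobs = {}
    for (cname, smiles), (tname, seq) in product(compounds.items(), targets.items()):
        jobs[f"{cname}__{tname}.yaml"] = boltz_affinity_yaml(seq, smiles)
    return jobs


# Placeholder inputs, not the targets or compounds from the screen.
targets = {"target_001": "MKTAYIAKQR", "target_002": "GSHMLEDPVA"}
compounds = {"cmpd_1": "CCO", "cmpd_2": "c1ccccc1O"}

jobs = build_jobs(targets, compounds)
print(len(jobs), "input files")
print(jobs["cmpd_1__target_001.yaml"])
```

With two compounds and about three hundred targets, the same loop yields roughly six hundred independent jobs, which is what makes the screen easy to batch.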

Affinity outputs that matter

  • affinity_probability_binary. A score between 0 and 1 that asks “binder or decoy.” This is the main signal for hit discovery and the first-pass filter that we used in the rapid screen.
  • affinity_pred_value. An estimate of log10(IC50) for the predicted complex, with IC50 expressed in micromolar. Lower values correspond to stronger predicted binders and can be used to compare molecules once you are already inside the hit space.

As a rough guide, a predicted value near −3 corresponds to an IC50 of about 1 nM, a value around 0 corresponds to about 1 µM, and values near 2 or higher (about 100 µM or weaker) look more like weak binders or decoys. For this first screen we focused on the binary probability and treated the continuous estimate as a secondary ranking signal.
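The unit conversion behind that rough guide is a single exponent, sketched below. The function names and the band cut-offs at 1 µM and 100 µM are assumptions of this sketch, chosen to mirror the guide above. They are not part of the Boltz-2 output.

```python
def ic50_micromolar(affinity_pred_value: float) -> float:
    """Convert log10(IC50 in micromolar) back to an IC50 in micromolar."""
    return 10.0 ** affinity_pred_value


def potency_band(affinity_pred_value: float) -> str:
    """Label a prediction with an illustrative band (cut-offs are assumptions)."""
    ic50_um = ic50_micromolar(affinity_pred_value)
    if ic50_um < 1.0:
        return "sub-micromolar"
    if ic50_um < 100.0:
        return "micromolar"
    return "weak binder or decoy"


for value in (-3.0, 0.0, 2.0):
    ic50_um = ic50_micromolar(value)
    # 1 micromolar = 1000 nanomolar
    print(f"{value:+.1f} -> {ic50_um:g} uM ({ic50_um * 1000:g} nM): {potency_band(value)}")
```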

Quick triage rule

For each compound–protein pair, we kept results where affinity_probability_binary in ensemble 1 was above 0.5 and dropped the rest. That simple threshold produced a manageable shortlist of candidate targets per compound that still captured clear patterns at the pathway level.
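A minimal sketch of that triage rule follows, assuming the per-pair outputs have already been collected into simple records. The `Prediction` record, the placeholder names, and the scores are invented for illustration. Only the two field names and the 0.5 threshold come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    compound: str
    target: str
    affinity_probability_binary: float  # 0..1, binder vs decoy
    affinity_pred_value: float          # log10(IC50 in micromolar)


def triage(predictions: list[Prediction], threshold: float = 0.5) -> list[Prediction]:
    """Keep pairs strictly above the threshold, strongest predicted binders first."""
    kept = [p for p in predictions if p.affinity_probability_binary > threshold]
    # Primary key: binder probability (descending).
    # Secondary key: predicted log10(IC50) (ascending, lower means stronger).
    return sorted(kept, key=lambda p: (-p.affinity_probability_binary, p.affinity_pred_value))


# Invented example values, not results from the screen.
predictions = [
    Prediction("cmpd_1", "target_001", 0.82, -0.4),
    Prediction("cmpd_1", "target_002", 0.31, 1.7),
    Prediction("cmpd_1", "target_003", 0.82, -1.1),
    Prediction("cmpd_1", "target_004", 0.50, 0.2),  # exactly 0.5 is dropped
]

shortlist = triage(predictions)
for p in shortlist:
    print(p.compound, p.target, p.affinity_probability_binary, p.affinity_pred_value)
```

Sorting on the binary probability first and the continuous estimate second matches the roles the two outputs played in this screen: the probability decides what is a hit, and the predicted value only orders molecules that are already hits.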

Screening pipeline in four steps

  1. Batch run. Generate candidate complexes and affinity outputs for every compound–protein combination in the panel.
  2. Filter. Keep the pairs with high binder probability and sanity-check the poses, contacts, and gross stability of the predicted complexes.
  3. Group by pathway. Map high-probability targets to their host pathways to build a pathway-level hypothesis for each compound.
  4. Hand off to the lab. Design a minimal set of high-signal assays that probe those pathways directly instead of chasing many unrelated candidates.
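Step 3 can be sketched as a simple count of shortlisted targets per pathway for each compound. This is an illustration, not the analysis code from the project. The target-to-pathway mapping would come from whatever annotation source you trust, and every name below is a placeholder.

```python
from collections import Counter, defaultdict


def pathway_counts(
    hits: list[tuple[str, str]],
    target_to_pathway: dict[str, str],
) -> dict[str, Counter]:
    """Count shortlisted targets per pathway, separately for each compound.

    `hits` holds (compound, target) pairs that passed the filter.
    Targets without an annotation are counted under "unmapped".
    """
    counts: dict[str, Counter] = defaultdict(Counter)
    for compound, target in hits:
        pathway = target_to_pathway.get(target, "unmapped")
        counts[compound][pathway] += 1
    return counts


# Placeholder hits and annotations, not results from the screen.
hits = [
    ("cmpd_1", "target_003"),
    ("cmpd_1", "target_001"),
    ("cmpd_1", "target_007"),
    ("cmpd_2", "target_010"),
    ("cmpd_2", "target_011"),
]
target_to_pathway = {
    "target_001": "pathway_A",
    "target_003": "pathway_A",
    "target_010": "pathway_B",
    "target_011": "pathway_B",
}

counts = pathway_counts(hits, target_to_pathway)
for compound, counter in counts.items():
    pathway, n = counter.most_common(1)[0]
    print(f"{compound}: {n} shortlisted targets in {pathway}")
```

Keeping an explicit "unmapped" bucket makes it obvious when a compound's shortlist is dominated by targets with no pathway annotation, which is a cue to revisit the mapping before designing assays.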

Results and next steps

After filtering on the affinity signal and grouping by pathway, a clean pattern emerged. For the first compound, the surviving targets clustered in a host pathway that supports cell entry and trafficking. For anyone used to thinking about HCMV, that fits with a virus that leans heavily on host machinery to get in and move around.

For the second compound, the screen highlighted targets associated with RNA processing and splicing. That is also consistent with what we know about viral dependence on host gene expression and post-transcriptional control. In other words, the two molecules appear to press on two very different but biologically sensible host pathways.

We are deliberately not naming the individual proteins here. Validation is ongoing and there are constraints around how much detail can be shared before the experiments are complete. The important point at this stage is the pattern: a small, coherent set of targets per compound that maps neatly onto two distinct host pathways that HCMV relies on.

In practical terms, this saves a significant amount of experimental effort. Instead of dozens of pull-downs and broad surveys built around “maybe” targets, we can move directly to a handful of well-designed assays that confirm or reject a small number of hypotheses at the pathway level. That is the kind of shortcut that matters when time, budget, or precious samples are limited.

The next step is simple: pressure-test these AI-derived hits in the lab, report how they hold up, and then dig into the mechanistic story if the data supports it. A second part of this case study will cover that follow-up. If you want to run a similar screen without a computational background, for example two compounds against hundreds of proteins, that is exactly the sort of use case the Vici.bio platform is designed to support.

References

  1. Boltz-2: Towards Accurate and Efficient Binding Affinity Prediction. bioRxiv (2025). doi: 10.1101/2025.06.14.659707.