Meet Chai-1: AlphaFold 3 power without the limitations.

An open AlphaFold3-class engine for proteins, nucleic acids, ligands, and glycans on Vici.bio.

Summary

Chai-1 is a cutting-edge multimodal structure predictor, the open answer to AlphaFold 3. It was trained to co-fold proteins, nucleic acids, ligands, ions, and glycans simultaneously, so the same network can reason about whole molecular assemblies instead of treating partners separately. In practice Chai-1 reaches AlphaFold 3-level accuracy across complexes, protein–ligand systems, and nucleic-acid interactions, making it a go-to engine for drug discovery, bioengineering, and structural biology. Unlike AlphaFold 3, which is encumbered by restricted licensing, Chai-1 is freely available for all uses. The network respects stereochemistry and keeps cofactors explicit, so predicted complexes align with experimental intuition. You can run lightweight single-sequence jobs in minutes, incorporate alignments and templates when evolutionary context matters, or inject restraints to steer the model using structural biology data.

Pair Chai-1 with validation metrics like DockQ, pLDDT, and ipTM to loop quickly between prediction, assessment, and design. Chai-1 gives you state-of-the-art structural predictions comparable to AlphaFold 3, without the red tape. It’s integrated into the Vici.bio platform so you can run it easily through our web interface, no coding or special hardware needed on your end. Whether you want to model a single protein or a complex molecular assembly, Chai-1 on Vici.bio provides fast, high-quality predictions and flexibility to incorporate your experimental knowledge.

How to use Chai-1 on Vici.bio

Paste at least one sequence into the molecules box and hit Execute. This runs a Chai-1 job under default settings. The sections below explain what each parameter does and why tuning it to your task will optimise your output.

You can run Chai-1 as a Solo job or incorporate it into a NEXO workflow.

Inputs for Chai-1

  • Protein sequences. Paste each chain’s FASTA sequence as a separate molecule entry. To model a multimer, repeat sequences for every copy in the stoichiometry.
  • Nucleic acids. Provide DNA or RNA strings when your complex includes them; Chai-1 natively co-folds nucleic acid chains with proteins.
  • Ligands, ions, cofactors. Add small molecules via SMILES or CCD identifiers: anything from drug-like compounds to metal ions such as [Zn+2].
  • Glycans. Supply CCD codes for glycans and other modifications so the model keeps them explicit in the 3D result.
  • Assembly size. There is no hard cap on chains, but large systems demand more GPU time; trimming to functional domains keeps iteration fast.
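The input list above can be pictured as a simple job spec. A minimal sketch follows; the dictionary keys are hypothetical illustrations of the information the Vici.bio web form collects, not an official API schema.

```python
# Illustrative job spec for a homodimer plus a small molecule and a metal ion.
# Field names ("molecules", "type", "sequence", "smiles") are hypothetical;
# the Vici.bio interface gathers the same information interactively.
job = {
    "molecules": [
        {"type": "protein", "sequence": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"},
        {"type": "protein", "sequence": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"},  # second copy -> homodimer
        {"type": "ligand",  "smiles": "CC(=O)Oc1ccccc1C(=O)O"},                # example drug-like compound
        {"type": "ligand",  "smiles": "[Zn+2]"},                               # explicit metal ion
    ],
}

def count_chains(job, kind):
    """Count molecule entries of a given type in the spec."""
    return sum(1 for m in job["molecules"] if m["type"] == kind)

print(count_chains(job, "protein"))  # -> 2
```

Repeating a sequence entry is how stoichiometry is expressed: two identical protein entries model a homodimer.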

MSA & template options

  • None: run Chai-1 in single-sequence mode for maximum speed. It still rivals other no-MSA models thanks to its protein language model backbone.
  • MMseqs2: the default when you want evolutionary context without long queues; ideal for well-studied families.
  • JackHMMER: the deep search option. It uncovers remote homologs that can sharpen tough complexes at the cost of compute.
  • Templates: enable when you trust a homologous PDB entry; Chai-1 blends template geometry with its own reasoning.

Start with single-sequence mode for novel antibodies, freshly designed proteins, or when you need an answer in minutes; if the result looks promising, rerun with MMseqs2 or JackHMMER to squeeze out extra accuracy. Toggle templates only when a reliable structure exists; they can accelerate convergence but should not override creative designs.
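That decision logic can be condensed into a few lines. This helper is purely illustrative (its name and return strings are assumptions, not Vici.bio parameters), but it captures the guidance above.

```python
def choose_msa_mode(novel_design: bool, deep_search: bool) -> str:
    """Heuristic MSA-mode picker mirroring the guidance above.
    Illustrative only -- not an official Vici.bio setting name."""
    if novel_design:
        return "none"        # de novo proteins / fresh antibodies: speed first
    if deep_search:
        return "jackhmmer"   # tough complexes needing remote homologs
    return "mmseqs2"         # well-studied families: fast evolutionary context

print(choose_msa_mode(novel_design=True, deep_search=False))  # -> none
```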

Refinement controls

  • Recycles: three passes are the sweet spot for quick screens; bump to 6–8 (or even 12) when you need to resolve stubborn loops or inter-chain clashes.
  • Diffusion steps: 200–300 keeps rapid prototyping snappy, 300–600 is the balanced production range, and 600+ is for squeezing out the last angstrom on finalists.
  • Seed: fix the seed to reproduce a model exactly; vary it when you want alternative diffusion trajectories or multiple binding hypotheses.
  • Samples: 1–5 samples explore most conformational space; go higher for flexible antibodies, ligands with many poses, or when combining with varied seeds.

Mix higher recycles with extra samples when wrestling with flexible complexes: the recycles polish geometry while the sampling surfaces alternate poses. For large campaigns, start with a light configuration (MSA = None, 200 diffusion steps, one sample), tag the best seeds, then rerun those hits with deeper settings and alignments.
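The light-then-deep campaign strategy above can be sketched as two configurations, where the deep rerun reuses the seeds tagged during triage. The dictionaries below are illustrative and their keys are assumptions, not a Vici.bio schema.

```python
# Two-stage campaign sketch: a light triage pass, then a deep rerun of hits.
# Parameter names mirror the controls described above; values follow the
# recommended ranges in the text.
TRIAGE = {"msa": "none",    "recycles": 3, "diffusion_steps": 200, "samples": 1}
POLISH = {"msa": "mmseqs2", "recycles": 8, "diffusion_steps": 600, "samples": 5}

def rerun_config(hit_seeds):
    """Build deep-run configs that reuse the seeds tagged during triage."""
    return [dict(POLISH, seed=s) for s in hit_seeds]

configs = rerun_config([7, 42])
print(configs[1]["seed"], configs[1]["diffusion_steps"])  # -> 42 600
```

Fixing the seed in the rerun is what makes the deep pass reproduce and refine the same diffusion trajectory that looked promising in triage.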

Restraints

  • Pocket restraint: pin a ligand or partner near a known pocket or epitope loop.
  • Contact restraint: enforce residue-residue proximity from cross-linking, mutational, or literature data.
  • Covalent restraint: lock atoms together for covalent inhibitors, fusion junctions, or engineered disulfides.

Use restraints sparingly: they supercharge tough cases but should reflect real hypotheses or data.
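Concretely, the three restraint types might be recorded like this. The keys, chain IDs, and residue numbers below are hypothetical examples chosen for illustration, not the exact Vici.bio input format.

```python
# Illustrative entries for the three restraint types described above.
restraints = [
    {"type": "pocket",   "chain": "A", "residues": list(range(100, 111))},   # epitope loop 100-110
    {"type": "contact",  "pair": [("A", 105), ("H", 52)], "max_dist": 8.0},  # e.g. cross-linking datum
    {"type": "covalent", "atoms": [("A", "CYS45", "SG"), ("LIG", "C7")]},    # covalent inhibitor link
]

def restraint_types(rs):
    """List the distinct restraint kinds present in a job."""
    return sorted({r["type"] for r in rs})

print(restraint_types(restraints))  # -> ['contact', 'covalent', 'pocket']
```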

Case study

Imagine engineering a new antibody against a viral antigen. You can load the heavy and light chains plus the antigen sequence as three molecules and run Chai-1 in single-sequence mode (MSA = None) to get a fast baseline prediction. If epitope mapping suggests residues 100–110 on the antigen form the binding site, add a pocket restraint or explicit contact restraints that focus Chai-1 on that loop.

Request a handful of samples with different seeds to explore binding orientations. When the job completes, compare the models using interface confidence, pLDDT colour maps, and optionally DockQ if you have an experimental reference. In tests, top-ranked samples often nail the Ig fold, surface complementarity, and epitope contacts, letting you pick the most plausible pose for downstream design.

That workflow illustrates how restraints and sampling become design tools: you can screen multiple hypotheses rapidly, feed leading poses into Antibody MPNN, and iterate without leaving the platform.

Six ways to use Chai-1

Drug discovery

Protein–ligand complexes in context.

  • Model how small-molecule candidates sit in active sites and assess key contacts before synthesis.
  • Use Chai-1’s ~77% ligand RMSD success (within 2 Å) to triage chemotypes and plan structure-based iterations.

Antibody engineering

Antibody–antigen docking with loops.

  • Predict CDR loop conformations and epitope footprints to accelerate vaccine or therapeutic programs.
  • Feed trusted complexes into Antibody MPNN or mutational scans to propose affinity-improving variants.

Protein assemblies

Multimers, receptors, and adaptors.

  • Explore dimer, trimer, or higher-order stoichiometries to map interaction networks or mutation effects.
  • Compare ipTM and DockQ across samples to prioritise realistic assemblies.

Enzyme mechanism

Cofactors and metals in place.

  • Include metals or cofactors directly in the job to recover active-site geometry in one pass.
  • Visualise enzyme–substrate complexes to infer catalytic mechanisms or design transition-state mimics.

Nucleic acid complexes

DNA/RNA with proteins or ligands.

  • Predict protein–DNA promoter binding or RNA–protein recognition with the native nucleic acid sequence.
  • Design genome-editing tools by inspecting how guides, Cas proteins, and accessory factors co-fold.

De novo design

Validate creative folds fast.

  • Validate hypothetical proteins or peptides using fast single-sequence runs before investing lab effort.
  • Incorporate low-resolution restraints (cryo-EM, SAXS) so Chai-1 builds atomic models consistent with experimental envelopes.

Tips and tricks for best results

  • Leverage single-sequence mode for speed. Skip alignments when exploring novel proteins or screening many variants—Chai-1’s language model keeps accuracy high even without MSAs.
  • Bring in MSAs or templates for tough folds. Well-studied enzymes and large assemblies often benefit from evolutionary context or a trusted template, but don’t overuse them when you’re testing new designs.
  • Dial recycles and diffusion steps to suit the task.
    • Use ~3 recycles and 200 diffusion steps for rapid triage.
    • Push toward 8 recycles and 600+ steps when polishing finalists or stubborn loops.
  • Sample broadly when systems are flexible. Request multiple samples and vary seeds to uncover alternative binding modes for antibodies, ligands, or multi-domain proteins.
  • Include metals and unusual cofactors explicitly. Encode them as ligands (e.g., [Zn+2]) so the diffusion module positions them correctly in the active site.
  • Use restraints as experimental hypotheses. Pocket, contact, and covalent restraints help test “what if” scenarios, but sanity-check that the resulting models remain chemically plausible.
  • Chain Chai-1 with validation metrics. Combine DockQ, RMSD, or binding-energy checks to quantify accuracy and prioritise follow-up work.
  • Watch for disorder. Low pLDDT regions often reflect genuinely floppy segments; treat hallucinated order with caution and lean on experiments when available.

Validation companions

  • pLDDT: highlight confident vs. uncertain loops; low scores often mean disorder or alternative states.
  • ipTM: gauge assembly confidence; combine with DockQ or interface inspection.
  • Clash flags: resolve steric overlaps with more recycles or quick minimization.
  • DockQ / RMSD: compare against experiments or trusted models for quantitative agreement.
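A typical triage pass combines these metrics into a shortlist. The score records below are mock values standing in for a real Vici.bio job output; the thresholds are reasonable starting points, not official cutoffs.

```python
# Mock per-sample confidence records, as you might collect from a job output.
samples = [
    {"name": "seed_1", "plddt": 88.2, "iptm": 0.81, "clashes": 0},
    {"name": "seed_2", "plddt": 91.5, "iptm": 0.62, "clashes": 3},
    {"name": "seed_3", "plddt": 84.0, "iptm": 0.86, "clashes": 0},
]

def shortlist(samples, min_plddt=80.0, min_iptm=0.7):
    """Keep clash-free samples that clear both confidence thresholds,
    ordered best interface confidence first."""
    ok = [s for s in samples
          if s["plddt"] >= min_plddt and s["iptm"] >= min_iptm and s["clashes"] == 0]
    return sorted(ok, key=lambda s: s["iptm"], reverse=True)

print([s["name"] for s in shortlist(samples)])  # -> ['seed_3', 'seed_1']
```

Note that seed_2 has the highest pLDDT yet is dropped: a confident fold with a weak interface (low ipTM) is usually the wrong pick for complex work.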

What Chai-1 optimises

𝓛 = 𝓛distance(S, S*) + 𝓛angle(S, S*) + 𝓛ligand(S, S*) + …

During training the network minimises a composite loss that balances global geometry (𝓛distance), local stereochemistry (𝓛angle), ligand and cofactor placement (𝓛ligand), and auxiliary penalties for clashes, torsions, and restraint satisfaction.

  • Distance and pair losses teach the Pairformer to honour coarse-grained contact maps and inter-chain separations.
  • Angle and torsion losses keep bond lengths realistic, maintain chirality, and reinforce secondary structure motifs.
  • Ligand-aware terms penalise incorrect coordination, RMSD, and contact counts so small molecules, ions, and glycans settle into sensible poses.

The diffusion sampler acts like an iterative solver for this objective: each denoising step nudges noisy coordinates toward the minima implied by the loss, rather than attempting to guess the final structure in one pass. Sampling with different seeds explores alternative low-loss solutions, which is why multiple recycles and samples can reveal distinct conformations.
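The "iterative solver" picture can be made concrete with a toy analogue: gradient descent on a double-well energy, where different random starts (seeds) settle into different minima. This is a deliberately simplified stand-in for the learned composite loss, not Chai-1's actual sampler.

```python
import random

def loss_grad(x):
    # Toy double-well "energy" (x^2 - 1)^2 with minima at x = -1 and x = +1,
    # standing in for the learned composite loss; its gradient is 4x(x^2 - 1).
    return 4 * x * (x * x - 1)

def denoise(seed, steps=200, lr=0.05):
    """Start from noise and repeatedly step downhill, mimicking how each
    diffusion step nudges coordinates toward a low-loss structure."""
    rng = random.Random(seed)
    x = rng.uniform(-2, 2)           # noisy initial "coordinate"
    for _ in range(steps):
        x -= lr * loss_grad(x)
    return round(x, 3)

# Different seeds settle into different minima -- the analogue of
# alternative conformations from varied diffusion trajectories.
print({denoise(s) for s in range(6)})  # a subset of {-1.0, 1.0}
```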

Conceptually, Chai-1 has learned a data-driven energy landscape. Minimising 𝓛 is analogous to relaxing a physics-based force field, but the gradients are supplied by neural networks trained on millions of experimentally solved structures.

Chai-1 vs. other tools

  • AlphaFold 2 / AlphaFold-Multimer. Stellar for protein-only targets, but lacks direct support for ligands, nucleic acids, or glycans. Licensing and hardware demands complicate industrial deployment, whereas Chai-1 ships under Apache 2.0 and is tuned for cloud throughput.
  • AlphaFold 3. Introduced the Pairformer and diffusion advances, yet the official weights are closed. Chai-1 mirrors its architecture and benchmark performance while staying openly accessible and restraint-aware on Vici.bio.
  • Boltz-2. Another open AF3-class release that performs similarly on antibodies and complexes. Many teams run both, using Chai-1 for its restraint support, workflow hooks, and ligand training while cross-checking against Boltz-2 when extra consensus helps.
  • ESMFold. Single-sequence models like ESMFold are fast but drop accuracy on multimers and heteromolecular assemblies. Chai-1 leverages language embeddings when MSAs are absent and benefits from alignments, so it outperforms ESMFold on monomers and vastly surpasses it on complexes.
  • Classical docking engines. Docking keeps the receptor rigid and scores hundreds of poses. Chai-1 co-folds proteins with ligands so induced-fit side-chain motions emerge naturally; you can still pass the resulting poses into docking or free-energy tools for affinity ranking.

In practice you can ensemble predictions, run fast MSA-free screens with Chai-1, and only escalate to alternative engines when you need orthogonal confirmation.
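Cross-checking engines usually starts with superposing their models and measuring RMSD. A minimal Kabsch superposition in plain NumPy (illustrative, e.g. for comparing CA traces of one chain from Chai-1 and Boltz-2) looks like:

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two N x 3 coordinate sets after optimal rigid-body
    superposition (Kabsch algorithm)."""
    P = P - P.mean(axis=0)               # centre both point clouds
    Q = Q - Q.mean(axis=0)
    H = P.T @ Q                          # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])           # guard against reflections
    R = Vt.T @ D @ U.T                   # optimal rotation
    return float(np.sqrt(((P @ R.T - Q) ** 2).sum() / len(P)))

# Sanity check: a rotated-and-translated copy superposes exactly.
pts = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], dtype=float)
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
moved = pts @ Rz.T + np.array([3.0, -2.0, 5.0])
print(round(kabsch_rmsd(pts, moved), 6))  # -> 0.0
```

Low cross-engine RMSD at the interface is a useful consensus signal; large disagreements flag regions worth more sampling or experimental follow-up.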

Trust it when

  • Targets are structured domains or well-behaved complexes with clear interfaces.
  • pLDDT and ipTM agree and restraints (if any) are satisfied without stress.
  • Ligands land in chemically sensible pockets with correct coordination.

Add extra verification when

  • Regions are intrinsically disordered; the model may hallucinate secondary structure.
  • Assemblies mix many chains with little co-evolution—sample broadly and compare poses.
  • Ligands are exotic, macrocyclic, or highly flexible. Check chemistry and consider docking or MD follow-up.
  • Experimental data disagrees. Reconcile with mutagenesis, cryo-EM density, or orthogonal metrics.

How it works

Chai-1 is a multimodal transformer that fuses sequence intelligence with geometric reasoning. It keeps the spirit of AlphaFold2’s Evoformer while upgrading to the Pairformer backbone and a diffusion-based structure module introduced with AlphaFold 3. The result is a single network that understands proteins, nucleic acids, ligands, ions, and glycans together.

Self-distillation on model-generated examples further sharpens the network, allowing Chai-1 to stabilise long loops, respect stereochemistry, and keep cofactors seated even without experimental templates. Paired with fast MMseqs2 searches, it recovers co-evolutionary signals when they exist yet remains robust when you switch alignments off entirely.

Neural network pipeline

  1. Input embedder. Encodes amino acids, nucleotides, ligand atoms, and optional MSA rows into rich token representations. When no alignment is supplied, Chai-1 falls back to a pretrained protein language model embedding.
  2. Pairformer. Transformer blocks focus on pairwise relationships between every token pair (residue↔residue, residue↔ligand atom, etc.), learning contact maps and interaction intent without the rotational constraints of the original Evoformer.
  3. Structure module with diffusion. Instead of a one-shot geometry head, Chai-1 iteratively denoises a noisy 3D scaffold through hundreds of diffusion steps, guided by the pair representation and optional restraints.

Restraint-aware conditioning

  • Spatial masks bias attention toward residues that anchor a pocket restraint, steering ligands into chemically sensible cavities.
  • Distance priors convert contact restraints into soft penalties inside the diffusion loss, tightening interfaces without forcing unrealistic geometry.
  • Covalent links are treated as fixed edges in the internal graph representation, so covalent restraints behave like true bonds during sampling.

Multimodal inputs

  • Protein chains (with or without alignments) are tokenised and augmented with evolutionary or language-derived context.
  • Nucleic acids use their own alphabet yet share the same latent space, enabling protein–DNA/RNA complexes to co-fold naturally.
  • Ligands and cofactors are embedded from SMILES/graph descriptions so their atoms can interact with residues during folding.
  • Custom restraints inject prior knowledge: pocket regions bias ligand placement, contact distance priors enforce experimentally observed pairs, and covalent bonds turn partners into a continuous chain.

Diffusion-based structure generation

The diffusion module starts from noise, then repeatedly refines coordinates. Each recycle feeds the partially denoised structure back into the network, letting Chai-1 resolve clashes, tighten side chains, and align ligands with induced-fit adjustments. Sampling with different seeds explores alternative minima on the learned energy landscape—perfect for flexible antibodies or small-molecule poses.

The model also predicts per-atom uncertainties during denoising, so Vici.bio can warn you when a ligand pose still looks ambiguous or when an interface needs more sampling. Those signals feed directly into the UI overlays that highlight low-confidence loops or metal coordination that may require extra recycles.

Training & performance

  • Trained as a foundation model on a broad mix of protein complexes, nucleic acid assemblies, and protein–ligand structures.
  • Matches reported AlphaFold 3 accuracy across monomers, protein–protein interfaces, and ligand placement (~77% within 2 Å ligand RMSD).
  • Single-sequence mode still reaches AlphaFold-Multimer-level TM-scores, making Chai-1 ideal when alignments are sparse.
  • Template conditioning, restraints, and extra samples extend the model to challenging targets without retraining.
  • Weights are published alongside inference code and evaluation scripts, enabling reproducible benchmarking and on-prem deployments.
  • Quantisation-aware checkpoints are available for edge accelerators, so you can run Chai-1 locally if regulatory requirements keep data in-house.

References

  1. Chai Discovery team (2024) Chai-1: Decoding the molecular interactions of life. bioRxiv 2024.10.10.615955.
  2. Abramson, J. et al. (2024) Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500.