
Gen sweep#208

Open
Sidney-Lisanza wants to merge 33 commits into main from gen_sweep

Conversation

@Sidney-Lisanza
Collaborator

Description

Brief description of changes made

Type of Change

  • [ ] Bug fix
  • [x] New feature
  • [x] Documentation update
  • [ ] Performance improvement
  • [ ] Code refactoring

Lisanza and others added 22 commits November 6, 2025 14:55
…-ligand training

- Add max_cluster_replicates parameter to StructureLightningDataModule to cap
  upsampling of small datasets in balanced training mode
- Add data configs: structure_ligand_all (7-dataset combined), PLINDER baseline,
  distillation, and intermediate configs for protein-ligand training
- Fix elif→if in gen_ume protein-ligand model to allow simultaneous IF/FF eval
- Fix PDB loading edge cases in latent_generator io
- Add structure transforms for protein-ligand data handling
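
One way to picture the max_cluster_replicates cap is as a ceiling on the per-dataset replication factor used when balancing. The sketch below is only an illustration under assumptions: `replicate_factor` and `target_size` are hypothetical names, and the rule "upsample each dataset toward a common target size" is assumed rather than taken from StructureLightningDataModule.

```python
import math


def replicate_factor(dataset_size, target_size, max_cluster_replicates=None):
    """Number of times a dataset is repeated to approach target_size.

    Hypothetical helper: the balancing rule is an assumption, not the
    data module's actual logic.
    """
    if dataset_size <= 0:
        raise ValueError("dataset_size must be positive")
    # Uncapped balancing: repeat until the dataset reaches the target size.
    factor = max(1, math.ceil(target_size / dataset_size))
    # The cap keeps very small datasets from being repeated excessively.
    if max_cluster_replicates is not None:
        factor = min(factor, max_cluster_replicates)
    return factor
```

With a cap of 10, a 100-example dataset balanced against a 10,000-example target would be repeated 10 times instead of 100.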
…line eval

- Add compute_protein_ligand_contacts and compute_aligned_ligand_rmsd to
  generation utils as reusable standalone functions
- Add contact-based ligand_in_pocket metric to forward folding evaluator:
  checks whether the predicted ligand contacts ground-truth (GT) pocket
  residues (replaces the centroid-based check)
- Add ligand_contacts_protein metric (any protein-ligand contact within 6 Å)
- Allow skipping ESMFold in conditioned generation (plm_fold=None)
- Add best-of-N display and ligand placement stats to FF cmdline output
- Add LigandMPNN inverse folding baseline evaluator and cmdline script
- Update inverse folding evaluator with pocket-aware metrics
- Update conditioned gen cmdline with additional generation parameters
- Update forward folding and inverse folding callbacks with ligand support
- Update hydra callback configs with protein-ligand evaluation parameters
- Add save_structures and minimize_ligand options to callback configs
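
The contact-based placement metrics described above can be sketched roughly as follows. This is a minimal illustration, not the code in generation utils: the function names are hypothetical, coordinates are plain (x, y, z) tuples with one CA atom per residue, and treating a single contacted GT pocket residue as "in pocket" is an assumption.

```python
import math


def residues_in_contact(ca_coords, ligand_coords, cutoff=6.0):
    """Indices of residues whose CA atom is within `cutoff` Å of any ligand atom."""
    return {
        i
        for i, ca in enumerate(ca_coords)
        if any(math.dist(ca, atom) <= cutoff for atom in ligand_coords)
    }


def ligand_contacts_protein(ca_coords, ligand_coords, cutoff=6.0):
    """True if the ligand touches any residue at all."""
    return bool(residues_in_contact(ca_coords, ligand_coords, cutoff))


def ligand_in_pocket(ca_coords, ligand_coords, gt_pocket_idx, cutoff=6.0):
    """True if the predicted ligand contacts at least one GT pocket residue.

    Assumption: one shared residue is enough; the real metric may differ.
    """
    contacts = residues_in_contact(ca_coords, ligand_coords, cutoff)
    return bool(contacts & set(gt_pocket_idx))
```

Unlike a centroid-distance check, this stays meaningful for elongated ligands whose centroid sits far from the residues they actually touch.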
…ct-based ligand placement

- Add good_fold_and_in_pocket_fraction (TM > 0.5 AND ligand in correct pocket)
  to FF evaluator summary and cmdline output
- Update merge_cofold_results.py to use contact-based ligand_in_pocket
  (CA within 6 Å of GT pocket residues) instead of centroid distance
- Add cofold_ligand_contacts_protein and cofold_n_pocket_contacts metrics
- Report good_fold_and_in_pocket in merge summary
- Restructure run_full_eval.sh: Phase 2 supports rf3, boltz, or both
  backends with configurable task selection (COFOLD_TASKS=if,ff,cg,lmpnn)
- RF3 co-folding runs in parallel chunks across multiple GPUs
- Boltz2 co-folding uses SLURM array jobs (one per sample)
- Phase 3 merges co-fold results from either backend
- Add benchmark_conditioned_gen.py for Gen-UME vs Proteina-Complexa comparison
  with ESMFold pre-filtering and per-design timing
- Add run_rf3_ff_baseline.py for RF3 co-folding on designed sequences
- Add submit_cofold_batch.py and run_cofold_local.py for batch co-folding
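
For reference, the good_fold_and_in_pocket_fraction summary (TM > 0.5 AND ligand in the correct pocket) reduces to a joint success rate. The sketch below assumes per-sample TM scores and boolean pocket flags are already available; the function signature is hypothetical.

```python
def good_fold_and_in_pocket_fraction(tm_scores, in_pocket, tm_threshold=0.5):
    """Fraction of samples with TM > tm_threshold and the ligand in the GT pocket."""
    if len(tm_scores) != len(in_pocket):
        raise ValueError("tm_scores and in_pocket must have the same length")
    if not tm_scores:
        return 0.0
    # A sample counts only when both conditions hold at once.
    hits = sum(1 for tm, ok in zip(tm_scores, in_pocket) if tm > tm_threshold and ok)
    return hits / len(tm_scores)
```

Note that the comparison is strict, so a sample with TM exactly 0.5 does not count as a good fold here.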
Lisanza and others added 4 commits April 10, 2026 17:34
… docs

- Add ligand-conditioned generation and LigandMPNN baseline sections
- Document evaluation pipeline (Phase 1-3) with RF3/Boltz2 co-folding
- Document contact-based ligand placement metrics and good_fold_and_in_pocket
- Add training data configs and training commands
- Document benchmark script for Gen-UME vs Proteina-Complexa
- Add best-of-N forward folding and aligned ligand RMSD
- Update PoseBusters benchmark description
…ses)

Implements the ProteinMPNN-analog AR-MC pseudo-likelihood estimator for
Gen-UME (sequence + structure heads), plus best-of-N selection drivers and
analysis scripts derived from the PLL signal.

Scoring + correlation:
- score_gen_ume_pll.py: K-draw stratified-t MC PLL scorer (score_unif /
  score_arllh / fixed-t variants; per-modality and joint_true_2)
- correlate_pll_with_quality.py: pooled + per-length Pearson/Spearman vs
  task quality CSVs
- score_gen_ume_pll_failed_attempts.py / compare_pll_vs_sr_gate.py: PLL
  scoring of SR-rejected attempts and PLL-vs-SR-gate decision comparison
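
The K-draw stratified-t idea can be illustrated with a toy estimator. This is a sketch under assumptions, not the implementation in score_gen_ume_pll.py: `stratified_mask_rates`, `mc_pll`, and `log_prob_fn` are hypothetical names, and normalizing each draw by the number of masked positions is an assumed choice.

```python
import math
import random


def stratified_mask_rates(k, rng):
    """One mask rate per stratum: the i-th rate falls in the i-th of k equal bins of (0, 1]."""
    return [(i + 1 - rng.random()) / k for i in range(k)]


def mc_pll(tokens, log_prob_fn, k=8, seed=0):
    """Monte Carlo pseudo-log-likelihood averaged over k stratified mask rates.

    log_prob_fn(tokens, masked) is a stand-in for a model call that returns
    one log-probability per masked position.
    """
    rng = random.Random(seed)
    n = len(tokens)
    per_draw = []
    for t in stratified_mask_rates(k, rng):
        masked = [i for i in range(n) if rng.random() < t]
        if not masked:
            # Guarantee at least one scored position per draw.
            masked = [rng.randrange(n)]
        log_probs = log_prob_fn(tokens, masked)
        per_draw.append(sum(log_probs) / len(masked))
    return sum(per_draw) / k


def uniform_model(tokens, masked):
    """Toy model: every masked token gets probability 1/4."""
    return [math.log(0.25)] * len(masked)
```

Stratifying the mask rate spreads the K draws evenly over light and heavy masking, which typically lowers the variance of the estimate compared with K independent uniform draws.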

Best-of-N drivers + analyzers:
- forward_fold_bestofN_pll.py / analyze_bestofN_ff.py
- inverse_fold_bestofN_pll.py / analyze_bestofN_if.py
- unconditional_bestofN_pll.py / analyze_bestofN_uc.py
- analyze_bestofN_topk_softpick.py: hard-argmin vs top-K soft pick
- plot_ff_struc_pll_per_target.py: per-target within-correlation diagnostic
- regen_top_K_by_nll_uc.py: replay top-K-by-NLL UC candidates for ESMFold
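
The difference between a hard-argmin pick and a top-K soft pick can be sketched as below. The function names, the softmax weighting over negative NLL, and the `temperature` parameter are assumptions for illustration, not the logic of analyze_bestofN_topk_softpick.py.

```python
import math
import random


def hard_argmin_pick(nll):
    """Index of the candidate with the lowest NLL."""
    return min(range(len(nll)), key=nll.__getitem__)


def top_k_soft_pick(nll, k=3, temperature=1.0, rng=None):
    """Sample one index among the k lowest-NLL candidates.

    Weights are softmax(-nll / temperature) restricted to the top k.
    """
    rng = rng or random.Random(0)
    top = sorted(range(len(nll)), key=nll.__getitem__)[:k]
    best = nll[top[0]]
    # Subtract the best score before exponentiating for numerical stability.
    weights = [math.exp(-(nll[i] - best) / temperature) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]
```

As the temperature approaches zero, the soft pick concentrates on the argmin candidate; a larger temperature spreads selections across the top-K set.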

Self-reflection IF:
- inverse_fold_self_reflection.py / analyze_if_self_reflection.py
- plot_if_sr_jump.py: per-target before/after waterfall + scatter

SLURM drivers:
- score_gen_ume_pll.sh
- run_forward_fold_bestofN_pll.sh
- run_unconditional_bestofN_pll.sh

Co-authored-by: Cursor <[email protected]>
…tricsCSVWriter

GenUME unconditional benchmark vs LaProteina/DPLM2 + Self-Reflection
threshold sweep (SR>=0.833 vs SR>=0.9) + SR-QC vs ESMFold-QC concordance
analyses.

New benchmark scripts:
- eval_competitor_unconditional.py: subsample 100 PDBs/length, ESMFold,
  compute designability + clustering + SSE + novelty for LaProteina/DPLM2
- convert_afdb_cluster_reps_to_pdb.py: AFDB SwissProt cluster reps -> PDB
- analyze_tm_score_novelty.py: foldseek easy-search of cluster reps vs
  PDB / AFDB / DeNovo reference sets
- compile_benchmark_table.py: stitch GenUME + competitor results
- analyze_selfreflection_paired.py / build_selective_sr_table.py:
  per-sample paired SR analysis + length-selective SR policy
- analyze_sr_threshold_sweep.py / plot_sr_qc_threshold_sweep.py /
  plot_sr_qc_tm_sweep_balanced.py: SR forward-fold-TM gate sweep
  (T=0/0.833/0.9), pooled + length-balanced
- plot_per_length_designability_bars.py: per-length designability bars
  with Fisher's exact significance
- plot_forward_vs_esmfold_tm.py / plot_forward_vs_esmfold_tm_cameo.py:
  lobster forward-fold TM vs ESMFold TM scatter (uncond + CAMEO)
- esmfold_failed_attempts.py / build_sr_esmfold_concordance.py /
  plot_sr_vs_esmfold_unconditional.py: post-hoc ESMFold of SR-rejected
  attempts + 2x2 concordance matrix vs ESMFold gate
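
A 2x2 concordance matrix between two pass/fail gates can be tallied as in the sketch below. It assumes each attempt has already been reduced to a boolean per gate (the gate thresholds are applied upstream), and the dictionary keys are hypothetical labels rather than the columns emitted by build_sr_esmfold_concordance.py.

```python
from collections import Counter


def concordance_2x2(sr_pass, esm_pass):
    """Count agreement and disagreement between the SR gate and the ESMFold gate."""
    if len(sr_pass) != len(esm_pass):
        raise ValueError("inputs must have the same length")
    counts = Counter(zip(map(bool, sr_pass), map(bool, esm_pass)))
    table = {
        "both_pass": counts[(True, True)],
        "sr_only": counts[(True, False)],
        "esm_only": counts[(False, True)],
        "both_fail": counts[(False, False)],
    }
    total = len(sr_pass)
    agree = table["both_pass"] + table["both_fail"]
    table["agreement"] = agree / total if total else float("nan")
    return table
```

The off-diagonal cells are the interesting ones: `esm_only` counts attempts the SR gate rejected that ESMFold would have accepted.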

SLURM:
- eval_gen_ume_denovo_sr_tm0p9.sh: SR runs at the tighter 0.9 gate
- eval_competitor_unconditional.sh: LaProteina/DPLM2 ESMFold + clustering

Source changes:
- generate.py: add _save_failed_self_reflection_attempt() so SR runs with
  generation.self_reflection.save_failed_attempts=true persist the
  initial seq+backbone of every forward-fold-TM-rejected attempt for the
  ESMFold-concordance follow-up
- _generation_utils.py: fix MetricsCSVWriter column-shift bug (drop two
  *_kabsch keys that had no header columns; add ESMFold-agreement
  comparison columns); add compute_complex_metrics_vs_gt() shared helper
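
A column shift of this kind arises when a row carries values that have no matching header column. One general guard, shown here as a sketch and not as the MetricsCSVWriter fix itself, is to write rows by key with the standard-library `csv.DictWriter`, which raises on unknown keys by default. The column names below are hypothetical.

```python
import csv
import io

HEADER = ["name", "tm_score", "rmsd"]  # hypothetical columns


def write_rows(rows, header=HEADER):
    """Serialize dict rows to CSV text, failing loudly on keys missing from the header."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=header, extrasaction="raise", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        # Raises ValueError if the row has a key that is not in `header`.
        writer.writerow(row)
    return buf.getvalue()
```

Because values are placed by key and not by position, an extra metric can no longer push later values into the wrong columns without an error.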

Co-authored-by: Cursor <[email protected]>
… eval callback

Mirrors the protein-only PLL study on the 4-modality protein-ligand
checkpoint (sequence, protein-structure, ligand-atom, ligand-structure).

PLL scoring + best-of-N:
- score_gen_ume_protein_ligand_pll.py: 4-modality AR-MC PLL scorer with
  per-modality (seq / struc / lig_atom / lig_struc) scores, additive joints
  (joint_protein, joint_ligand, joint_all), and a true 4-way joint
  (joint_true_4) computed via one extra forward per K with all four
  modalities masked simultaneously
- forward_fold_bestofN_pll_ligand.py / inverse_fold_bestofN_pll_ligand.py
  / conditioned_gen_bestofN_pll_ligand.py: per-task best-of-N drivers on
  PoseBusters with the new pickers (per-modality + joint variants +
  Boltz2 iptm / TM oracles)
- benchmark_conditioned_gen.py: refactored CG benchmark with picker
  comparison
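
The additive joints can be read as sums of per-modality scores. The sketch below assumes exactly that (joint_protein = seq + struc, joint_ligand = lig_atom + lig_struc, joint_all = all four); the scorer may combine them differently, and joint_true_4 is not reproduced here because it needs an extra model forward pass.

```python
def additive_joints(scores):
    """Combine per-modality PLL scores into additive joint scores.

    `scores` maps "seq", "struc", "lig_atom", "lig_struc" to floats.
    The grouping is an assumption based on the score names.
    """
    joint_protein = scores["seq"] + scores["struc"]
    joint_ligand = scores["lig_atom"] + scores["lig_struc"]
    return {
        "joint_protein": joint_protein,
        "joint_ligand": joint_ligand,
        "joint_all": joint_protein + joint_ligand,
    }
```

An additive joint treats the modalities as if they were scored independently, which is why a separately computed joint with all modalities masked at once can give a different ranking.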

CG Boltz callback:
- _cg_boltz_eval.py + cg_boltz_eval.yaml: lightweight Boltz2 cofold
  evaluator wired into the protein-ligand training callbacks suite

Source updates:
- evaluate_ligand_conditioned_protein_generation.py: tune default
  temperature/stochasticity per-modality (seq/struc/ligand) for CG
  benchmark hyperparameters
- ligand_conditioned_protein_generation.py: prefer ligand_data["smiles"]
  over re-reading from SDF when available
- submit_cofold_batch.py: --max_concurrent for SLURM array throttle and
  skip-if-result-already-complete guard
- run_full_eval.sh: CG_NUM_LIGANDS / CG_NUM_DESIGNS / CG_DATA_DIR knobs;
  default CG to Proteina-style 4 ligands x 10 designs at nsteps=200

Co-authored-by: Cursor <[email protected]>

1 participant