Gen sweep by Sidney-Lisanza · Pull Request #208 · prescient-design/lobster

Sidney-Lisanza · 2025-11-04T14:52:55Z

Description

Brief description of changes made

Type of Change

- [ ] Bug fix
- [X] New feature
- [X] Documentation update
- [ ] Performance improvement
- [ ] Code refactoring



…-ligand training
- Add max_cluster_replicates parameter to StructureLightningDataModule to cap upsampling of small datasets in balanced training mode
- Add data configs: structure_ligand_all (7-dataset combined), PLINDER baseline, distillation, and intermediate configs for protein-ligand training
- Fix elif→if in gen_ume protein-ligand model to allow simultaneous IF/FF eval
- Fix PDB loading edge cases in latent_generator io
- Add structure transforms for protein-ligand data handling
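A minimal sketch of how a replicate cap can bound upsampling in a balanced mode. Only the parameter name max_cluster_replicates comes from this PR; the helper name, the ceiling-based rounding rule, and the dataset sizes are illustrative assumptions, not the StructureLightningDataModule implementation.

```python
import math


def balanced_replicates(dataset_sizes, max_cluster_replicates=None):
    """Return how often each dataset is repeated to match the largest one.

    Hypothetical helper: the rounding rule and return type are assumptions.
    """
    target = max(dataset_sizes.values())
    replicates = {}
    for name, size in dataset_sizes.items():
        # Repeats needed for this dataset to reach the size of the largest one.
        factor = math.ceil(target / size)
        if max_cluster_replicates is not None:
            # The cap keeps a tiny dataset from being repeated without bound.
            factor = min(factor, max_cluster_replicates)
        replicates[name] = factor
    return replicates


# Illustrative sizes only; these are not the real dataset sizes.
sizes = {"large_set": 100_000, "small_set": 500}
print(balanced_replicates(sizes))
print(balanced_replicates(sizes, max_cluster_replicates=20))
```

Without the cap the small dataset would be repeated 200 times in this toy case; with a cap of 20 it is repeated 20 times, trading exact balance for less memorization of the small set.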

…line eval
- Add compute_protein_ligand_contacts and compute_aligned_ligand_rmsd to generation utils as reusable standalone functions
- Add contact-based ligand_in_pocket metric to forward folding evaluator: checks if predicted ligand contacts GT pocket residues (replaces centroid-based)
- Add ligand_contacts_protein metric (any protein-ligand contact at 6 Å)
- Allow skipping ESMFold in conditioned generation (plm_fold=None)
- Add best-of-N display and ligand placement stats to FF cmdline output
- Add LigandMPNN inverse folding baseline evaluator and cmdline script
- Update inverse folding evaluator with pocket-aware metrics
- Update conditioned gen cmdline with additional generation parameters
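The contact-based placement check described above could look roughly like the sketch below. The 6 Å cutoff and the metric names come from the commit message; the function signatures, the use of one coordinate per residue, and the "at least one shared residue" rule are assumptions, not the actual compute_protein_ligand_contacts code.

```python
import math


def contact_residues(protein_xyz, ligand_xyz, cutoff=6.0):
    """Indices of protein residues with any ligand atom within `cutoff` Å."""
    hits = set()
    for i, residue in enumerate(protein_xyz):
        if any(math.dist(residue, atom) <= cutoff for atom in ligand_xyz):
            hits.add(i)
    return hits


def ligand_contacts_protein(protein_xyz, ligand_xyz, cutoff=6.0):
    """True if the ligand touches the protein anywhere."""
    return bool(contact_residues(protein_xyz, ligand_xyz, cutoff))


def ligand_in_pocket(protein_xyz, ligand_xyz, gt_pocket_residues, cutoff=6.0):
    """True if the predicted ligand contacts at least one ground-truth pocket residue.

    Assumption: a single shared residue is enough to count as "in pocket".
    """
    predicted = contact_residues(protein_xyz, ligand_xyz, cutoff)
    return bool(predicted & set(gt_pocket_residues))


# Toy geometry: three residues on a line, one ligand atom next to residue 1.
protein = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
ligand = [(11.0, 0.0, 0.0)]
print(ligand_in_pocket(protein, ligand, gt_pocket_residues={1}))
print(ligand_in_pocket(protein, ligand, gt_pocket_residues={0}))
```

Unlike a centroid-distance test, this check depends only on which residues the ligand touches, so it does not need the predicted and ground-truth structures to share a coordinate frame.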


…ct-based ligand placement
- Add good_fold_and_in_pocket_fraction (TM > 0.5 AND ligand in correct pocket) to FF evaluator summary and cmdline output
- Update merge_cofold_results.py to use contact-based ligand_in_pocket (CA within 6 Å of GT pocket residues) instead of centroid distance
- Add cofold_ligand_contacts_protein and cofold_n_pocket_contacts metrics
- Report good_fold_and_in_pocket in merge summary
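As a rough illustration of the combined metric, the fraction can be computed from per-sample results as below. The TM > 0.5 threshold and the metric name are from the commit message; the per-sample dictionary keys are hypothetical.

```python
def good_fold_and_in_pocket_fraction(samples, tm_threshold=0.5):
    """Fraction of samples with TM-score above threshold AND ligand in the pocket.

    `samples` is a list of dicts with assumed keys "tm_score" and
    "ligand_in_pocket"; the evaluator's real record layout may differ.
    """
    if not samples:
        return 0.0
    hits = sum(
        1
        for s in samples
        if s["tm_score"] > tm_threshold and s["ligand_in_pocket"]
    )
    return hits / len(samples)


results = [
    {"tm_score": 0.8, "ligand_in_pocket": True},   # counts
    {"tm_score": 0.8, "ligand_in_pocket": False},  # good fold, wrong placement
    {"tm_score": 0.3, "ligand_in_pocket": True},   # right placement, poor fold
    {"tm_score": 0.5, "ligand_in_pocket": True},   # not strictly above 0.5
]
print(good_fold_and_in_pocket_fraction(results))
```

Requiring both conditions avoids crediting a correctly placed ligand on a misfolded protein, or a well-folded protein with the ligand outside its pocket.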

- Restructure run_full_eval.sh: Phase 2 supports rf3, boltz, or both backends with configurable task selection (COFOLD_TASKS=if,ff,cg,lmpnn)
- RF3 co-folding runs in parallel chunks across multiple GPUs
- Boltz2 co-folding uses SLURM array jobs (one per sample)
- Phase 3 merges co-fold results from either backend
- Add benchmark_conditioned_gen.py for Gen-UME vs Proteina-Complexa comparison with ESMFold pre-filtering and per-design timing
- Add run_rf3_ff_baseline.py for RF3 co-folding on designed sequences
- Add submit_cofold_batch.py and run_cofold_local.py for batch co-folding

… docs
- Add ligand-conditioned generation and LigandMPNN baseline sections
- Document evaluation pipeline (Phase 1-3) with RF3/Boltz2 co-folding
- Document contact-based ligand placement metrics and good_fold_and_in_pocket
- Add training data configs and training commands
- Document benchmark script for Gen-UME vs Proteina-Complexa
- Add best-of-N forward folding and aligned ligand RMSD
- Update PoseBusters benchmark description

…ses)

Implements the ProteinMPNN-analog AR-MC pseudo-likelihood estimator for Gen-UME (sequence + structure heads), plus best-of-N selection drivers and analysis scripts derived from the PLL signal.

Scoring + correlation:
- score_gen_ume_pll.py: K-draw stratified-t MC PLL scorer (score_unif / score_arllh / fixed-t variants; per-modality and joint_true_2)
- correlate_pll_with_quality.py: pooled + per-length Pearson/Spearman vs task quality CSVs
- score_gen_ume_pll_failed_attempts.py / compare_pll_vs_sr_gate.py: PLL scoring of SR-rejected attempts and PLL-vs-SR-gate decision comparison

Best-of-N drivers + analyzers:
- forward_fold_bestofN_pll.py / analyze_bestofN_ff.py
- inverse_fold_bestofN_pll.py / analyze_bestofN_if.py
- unconditional_bestofN_pll.py / analyze_bestofN_uc.py
- analyze_bestofN_topk_softpick.py: hard-argmin vs top-K soft pick
- plot_ff_struc_pll_per_target.py: per-target within-correlation diagnostic
- regen_top_K_by_nll_uc.py: replay top-K-by-NLL UC candidates for ESMFold

Self-reflection IF:
- inverse_fold_self_reflection.py / analyze_if_self_reflection.py
- plot_if_sr_jump.py: per-target before/after waterfall + scatter

SLURM drivers:
- score_gen_ume_pll.sh
- run_forward_fold_bestofN_pll.sh
- run_unconditional_bestofN_pll.sh

Co-authored-by: Cursor <[email protected]>
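The "K-draw stratified-t" idea can be sketched generically: rather than drawing K independent values of t, draw one t from each of K equal-width strata so the draws cover the interval evenly. The code below shows only that sampling scheme with a toy integrand; the function names are hypothetical, and the real scorer's masking schedule, weighting, and model calls are not shown in this PR and are not reproduced here.

```python
import random


def stratified_t_draws(k, rng):
    """One t per stratum [i/k, (i+1)/k), so the K draws span the unit interval."""
    return [(i + rng.random()) / k for i in range(k)]


def mc_estimate(score_at_t, k=8, seed=0):
    """Average a per-t score over K stratified draws of t.

    `score_at_t` is a stand-in for whatever the model computes at a given t
    (assumed here to be any callable from float to float).
    """
    rng = random.Random(seed)
    ts = stratified_t_draws(k, rng)
    return sum(score_at_t(t) for t in ts) / k


# Toy integrand: the true mean of f(t) = t over the unit interval is 0.5.
print(mc_estimate(lambda t: t, k=8))
```

For this toy integrand the stratified estimate is guaranteed to lie within 1/(2K) of 0.5, which plain independent sampling does not guarantee; that variance reduction is the usual motivation for stratifying.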

…tricsCSVWriter

GenUME unconditional benchmark vs LaProteina/DPLM2 + Self-Reflection threshold sweep (SR>=0.833 vs SR>=0.9) + SR-QC vs ESMFold-QC concordance analyses.

New benchmark scripts:
- eval_competitor_unconditional.py: subsample 100 PDBs/length, ESMFold, compute designability + clustering + SSE + novelty for LaProteina/DPLM2
- convert_afdb_cluster_reps_to_pdb.py: AFDB SwissProt cluster reps -> PDB
- analyze_tm_score_novelty.py: foldseek easy-search of cluster reps vs PDB / AFDB / DeNovo reference sets
- compile_benchmark_table.py: stitch GenUME + competitor results
- analyze_selfreflection_paired.py / build_selective_sr_table.py: per-sample paired SR analysis + length-selective SR policy
- analyze_sr_threshold_sweep.py / plot_sr_qc_threshold_sweep.py / plot_sr_qc_tm_sweep_balanced.py: SR forward-fold-TM gate sweep (T=0/0.833/0.9), pooled + length-balanced
- plot_per_length_designability_bars.py: per-length designability bars with Fisher's exact significance
- plot_forward_vs_esmfold_tm.py / plot_forward_vs_esmfold_tm_cameo.py: lobster forward-fold TM vs ESMFold TM scatter (uncond + CAMEO)
- esmfold_failed_attempts.py / build_sr_esmfold_concordance.py / plot_sr_vs_esmfold_unconditional.py: post-hoc ESMFold of SR-rejected attempts + 2x2 concordance matrix vs ESMFold gate

SLURM:
- eval_gen_ume_denovo_sr_tm0p9.sh: SR runs at the tighter 0.9 gate
- eval_competitor_unconditional.sh: LaProteina/DPLM2 ESMFold + clustering

Source changes:
- generate.py: add _save_failed_self_reflection_attempt() so SR runs with generation.self_reflection.save_failed_attempts=true persist the initial seq+backbone of every forward-fold-TM-rejected attempt for the ESMFold-concordance follow-up
- _generation_utils.py: fix MetricsCSVWriter column-shift bug (drop two *_kabsch keys that had no header columns; add ESMFold-agreement comparison columns); add compute_complex_metrics_vs_gt() shared helper

Co-authored-by: Cursor <[email protected]>
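The 2x2 concordance between the self-reflection (SR) gate and an ESMFold gate can be tabulated as in the sketch below. The 0.833 and 0.9 thresholds come from the commit message; the function name, the input layout, and the example values are hypothetical and do not reflect build_sr_esmfold_concordance.py.

```python
from collections import Counter


def concordance_2x2(sr_forward_fold_tm, esmfold_pass, sr_threshold=0.9):
    """Count agreement between the SR gate (TM >= threshold) and an ESMFold gate.

    `esmfold_pass` is assumed to be a precomputed boolean per attempt; how the
    ESMFold gate itself is defined is outside this sketch.
    """
    counts = Counter()
    for tm, esm_ok in zip(sr_forward_fold_tm, esmfold_pass):
        counts[(tm >= sr_threshold, bool(esm_ok))] += 1
    total = sum(counts.values())
    table = {
        "both_pass": counts[(True, True)],
        "sr_only": counts[(True, False)],
        "esmfold_only": counts[(False, True)],
        "both_fail": counts[(False, False)],
    }
    agree = table["both_pass"] + table["both_fail"]
    table["agreement"] = agree / total if total else 0.0
    return table


# Made-up values for illustration only.
tm_scores = [0.95, 0.85, 0.92, 0.40]
esm_ok = [True, True, False, False]
for threshold in (0.833, 0.9):
    print(threshold, concordance_2x2(tm_scores, esm_ok, sr_threshold=threshold))
```

Sweeping the threshold this way shows how tightening the SR gate moves attempts between the "both_pass" and "esmfold_only" cells, which is the trade-off a threshold sweep is meant to expose.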

… eval callback

Mirrors the protein-only PLL study on the 4-modality protein-ligand checkpoint (sequence, protein-structure, ligand-atom, ligand-structure).

PLL scoring + best-of-N:
- score_gen_ume_protein_ligand_pll.py: 4-modality AR-MC PLL scorer with per-modality (seq / struc / lig_atom / lig_struc) scores, additive joints (joint_protein, joint_ligand, joint_all), and a true 4-way joint (joint_true_4) computed via one extra forward per K with all four modalities masked simultaneously
- forward_fold_bestofN_pll_ligand.py / inverse_fold_bestofN_pll_ligand.py / conditioned_gen_bestofN_pll_ligand.py: per-task best-of-N drivers on PoseBusters with the new pickers (per-modality + joint variants + Boltz2 iptm / TM oracles)
- benchmark_conditioned_gen.py: refactored CG benchmark with picker comparison

CG Boltz callback:
- _cg_boltz_eval.py + cg_boltz_eval.yaml: lightweight Boltz2 cofold evaluator wired into the protein-ligand training callbacks suite

Source updates:
- evaluate_ligand_conditioned_protein_generation.py: tune default temperature/stochasticity per-modality (seq/struc/ligand) for CG benchmark hyperparameters
- ligand_conditioned_protein_generation.py: prefer ligand_data["smiles"] over re-reading from SDF when available
- submit_cofold_batch.py: --max_concurrent for SLURM array throttle and skip-if-result-already-complete guard
- run_full_eval.sh: CG_NUM_LIGANDS / CG_NUM_DESIGNS / CG_DATA_DIR knobs; default CG to Proteina-style 4 ligands x 10 designs at nsteps=200

Co-authored-by: Cursor <[email protected]>

Lisanza and others added 7 commits November 4, 2025 14:21

readme and submit with wandb agents

0399115

update readme

8fa8783

update readme

4ef6d05

slim readme

ed81a0d

lg training code refactoring

be476a9

minor update

02d3dcb

update param

f8fdf72

Sidney-Lisanza requested a review from karinazad November 6, 2025 14:33

Lisanza and others added 22 commits November 6, 2025 14:55

update readme

538160d

update readme

0394fe5

ligand training and better analysis code

4198d65

added eval script and then fixed bug in callbacks

45fbb35

i forgot

30f3231

forward folding callback

c0faf3e

protein-ligand refactoring and forward_folding callback

7e43889

adding ligand capabilities

12b85aa

protein-ligand training and inference eval

70ebeb7

data locations

2a92d7f

update documentations

6d97923

update hydra callback

6629fbc

fixes for pl gen callbacks and data handling

bb32366

update readme

cdd12f2

Update callbacks, metrics, latent_generator, and gen_ume protein-liga…

aa655eb

…nd components

Co-authored-by: Cursor <[email protected]>

Add callbacks, cmdline, metrics, model, and hydra configs for protein…

89bc408

…-ligand pipeline

Co-authored-by: Cursor <[email protected]>

update readme with examples

cb2b2d2

Update callbacks and hydra configs for protein-ligand training eval

2390cb3

- Update forward folding and inverse folding callbacks with ligand support
- Update hydra callback configs with protein-ligand evaluation parameters
- Add save_structures and minimize_ligand options to callback configs

Lisanza and others added 4 commits April 10, 2026 17:34

Sidney-Lisanza wants to merge 33 commits into main from gen_sweep
