Dev -> master for nf-core/proteinfold 2.0.0 #540
Improve version reporting and refactor inline scripts
Update utils_nfschema to fix help message with strict syntax
Move foldseek logic to post_processing
| @@ -0,0 +1,7 @@ | |||
| name: multifasta_to_csv | |||
channels missing yes
| .out | ||
| .pdb | ||
| .map { it -> | ||
| it[0].model = "boltz" |
Claude says something like this, not sure how true though: every other workflow uses def meta = it[0].clone() and then sets meta.model = "...". Boltz does it correctly at line 77 but then skips cloning at lines 141-172. The fix is straightforward, just follow the same pattern:
.map { it ->
    def meta = it[0].clone()
    meta.model = "boltz"
    [meta, it[1]]
}
Verdict: real issue, easy fix. The practical risk is low in this case since all mutations set the same value, but it should be fixed for correctness and consistency.
Yes, actually mutating meta in-place can be risky as it could cause side effects if the same meta object is used downstream by other channels. Good spot!
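For illustration, a minimal plain-Groovy sketch of the side effect discussed above. It assumes two channels carry the same meta map object; the lists and variable names are hypothetical stand-ins, not pipeline code.

```groovy
// Hypothetical stand-ins: two "channels" modelled as lists whose items share one meta map.
def meta      = [id: 'sample1']
def pdb_items = [[meta, 'sample1.pdb']]
def msa_items = [[meta, 'sample1.a3m']]

// Cloning first: only the copy gets the new key, the shared map is untouched.
def tagged = pdb_items.collect { it ->
    def new_meta = it[0].clone()
    new_meta.model = 'boltz'
    [new_meta, it[1]]
}
assert tagged[0][0].model == 'boltz'
assert !msa_items[0][0].containsKey('model')

// In-place mutation: the shared map changes, so the other "channel" sees it too.
pdb_items.each { it -> it[0].model = 'boltz' }
assert msa_items[0][0].model == 'boltz'
```

Note that clone() on a map is a shallow copy, so nested values inside meta would still be shared.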
| ch_multiqc_files = ch_multiqc_files.mix(ch_methods_description.collectFile(name: 'methods_description_mqc.yaml')) | ||
| ch_multiqc_files = ch_multiqc_files.mix(ch_collated_versions) | ||
| ch_multiqc_rep |
| // WORKFLOW: Run Boltz | ||
| // | ||
| if (params.mode.toLowerCase().split(",").contains("boltz")) { |
| if (params.mode.toLowerCase().split(",").contains("boltz")) { | |
| if (requested_modes.contains("boltz")) { |
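How requested_modes is built is not shown in this thread. A minimal plain-Groovy sketch, assuming it is parsed once from params.mode so that each workflow check can reuse it (the params map below is a stand-in for Nextflow's params, and the trim() step is an assumption):

```groovy
// Stand-in for Nextflow's params object (assumption for this sketch).
def params = [mode: 'AlphaFold2, boltz']

// Parse the comma-separated mode string once: lowercase, split, strip whitespace.
def requested_modes = params.mode.toLowerCase().split(',').collect { it.trim() }

// Each workflow guard then becomes a simple membership check.
assert requested_modes.contains('boltz')
assert requested_modes.contains('alphafold2')
assert !requested_modes.contains('colabfold')
```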
| ch_rnacentral = channel.value(file(rnacentral_active_seq_path)) | ||
| ch_nt_rna = channel.value(file(nt_rna_2023_02_23_path)) | ||
| ch_rfam = channel.value(file(rfam_path)) |
| ch_rnacentral = channel.value(file(rnacentral_active_seq_path)) | |
| ch_nt_rna = channel.value(file(nt_rna_2023_02_23_path)) | |
| ch_rfam = channel.value(file(rfam_path)) | |
| ch_rnacentral = channel.value(file(rnacentral_active_seq_path), checkIfExists: true) | |
| ch_nt_rna = channel.value(file(nt_rna_2023_02_23_path), checkIfExists: true) | |
| ch_rfam = channel.value(file(rfam_path), checkIfExists: true) |
These are optionals, which is why they are not checked
| # The update to 2.2.1, was complaining about this missing dependencies | ||
| RUN pip install --no-cache-dir \ | ||
| cuequivariance_ops_cu12==0.8.1\ |
| cuequivariance_ops_cu12==0.8.1\ | |
| cuequivariance_ops_cu12==0.8.1 \ |
just add space
| params.colabfold_db, | ||
| params.colabfold_server, | ||
| params.use_msa_server, | ||
| params.colabfold_alphafold2_params_path, |
Is it fine to not have these params defined in nextflow.config?
params: colabfold_alphafold2_params_link, colabfold_alphafold2_params_path, colabfold_alphafold2_params_tags, foldseek_db, foldseek_db_path, alphafold2_params_prefix
colabfold_alphafold2_params_link, colabfold_alphafold2_params_path, colabfold_alphafold2_params_tags and alphafold2_params_prefix are modified in a function, and otherwise this cannot be done (due to the resolution order). I have moved foldseek_db and foldseek_db_path to nextflow.config (they were assigned to null in db.config).
FYI - I refactored these colabfold params in #527 on the hackathon branch which should be much cleaner moving forward.
| }, | ||
| "uniref30_prefix": { | ||
| "type": "string", | ||
| "default": "UniRef30_2023_02", |
suzannejin left a comment
Hello @JoseEspinosa,
I just gave some minor suggestions; otherwise the PR looks good to me :)
Address release 2.0.0 review suggestions
…mer, server and local
docker.pullStrategy = 'lazy'"
Update parameters table in changelog
Release 2.0.0
New Features:
- --mode values (PR Enable running multiple modes in parallel #178)
- --split_fasta parameter to split multi-sequence FASTA files for parallel folding
- --db global database flag to simplify database path configuration (PR Add global db flag #315)
Bug Fixes:
- --full_dbs as a global option (Readd a global --full_dbs flag #382)
- obsolete.dat path error (AlphaFold2_standard No such file or directory: 'pdb_mmcif/obsolete.dat' #387) and nested obsolete PDBs from pdb70 (Twice obsolete structures crash AF2 monomer #378)
- No such file or directory: 'T1024_predicted_aligned_error_v1.json #388), monomer ID inheritance (Running colabfold with 2 monomers in the same samplesheet will crash if the fasta header is the same #455), and multimer weight downloads (ColabFold workflow always tries to re-download model weights when run in multimer mode #457)
- mmcif symlinking causing I/O issues (PR Stops symlinking every mmcif_file #287)
- --nv argument passing for Apptainer/Singularity (Argument --nv not correctly set #281)
Improvements:
- /reports output directory (PR HTML reports -> /reports not /generate #469)
Renamed Parameters:
- --small_bfd_link -> --alphafold2_small_bfd_link
- --mgnify_link -> --alphafold2_mgnify_link
- --pdb_mmcif_link -> --alphafold2_pdb_mmcif_link
- --uniref30_alphafold2_link -> --alphafold2_uniref30_link
- --uniref90_link -> --alphafold2_uniref90_link
- --pdb_seqres_link -> --alphafold2_pdb_seqres_link
- --small_bfd_path -> --alphafold2_small_bfd_path
- --mgnify_path_alphafold2 -> --alphafold2_mgnify_path
- --pdb_mmcif_path -> --alphafold2_pdb_mmcif_path
- --uniref30_alphafold2_path -> --alphafold2_uniref30_path
- --uniref90_path -> --alphafold2_uniref90_path
- --pdb_seqres_path -> --alphafold2_pdb_seqres_path
- --uniprot_path -> --alphafold2_uniprot_path
- --colabfold_server -> --use_msa_server
- --host_url -> --msa_server_url
New Parameters:
- --alphafold3_db, --alphafold3_small_bfd_*, --alphafold3_mgnify_*, --alphafold3_uniref90_*, --alphafold3_pdb_seqres_*, --alphafold3_uniprot_*, --alphafold3_pdb_mmcif_*, --alphafold3_params_path, --alphafold3_rnacentral_*, --alphafold3_nt_rna_*, --alphafold3_rfam_*
- --boltz_db, --boltz_model, --boltz_use_potentials, --boltz_use_kernels, --boltz_ccd_*, --boltz_model_*, --boltz2_aff_*, --boltz2_conf_*, --boltz2_mols_*
- --helixfold3_db, --helixfold3_precision, --helixfold3_infer_times, --helixfold3_max_template_date, and mode-specific database path/link parameters
- --rosettafold_all_atom_db, and mode-specific database path/link parameters
- --rosettafold2na_db, and mode-specific database path/link parameters
- --db, --save_intermediates, --split_fasta, --alphafold2_full_dbs, --skip_visualisation, --skip_foldseek, --foldseek_easysearch_arg, --alphafold2_random_seed, --alphafold2_pdb_obsolete_path
Removed Parameters:
- --max_memory, --max_cpus, --max_time