Bulk MultiQC reporting as a pipeline-installed Python module #450
keiran-rowell-unsw wants to merge 22 commits
`pip` installing will break in a container due to unwritable locations. Also, since it's pipeline-specific, I can't post-process with [...]. Leave this for now because the refactor effort would fit into [...].
Pull request overview
Implements “bulk” ProteinFold MultiQC reporting by packaging a pipeline-specific MultiQC plugin (Python module) and wiring the pipeline’s MULTIQC process to install/use it when generating reports.
Changes:
- Add a Python package (`multiqc_proteinfold`) providing a MultiQC module that parses pipeline TSV metrics and generates general stats + pLDDT line plots.
- Add `setup.py` with MultiQC entry-point registration for the plugin.
- Update the nf-core `multiqc` module to install the local plugin at runtime / via conda environment config.
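As a rough illustration of the entry-point registration mentioned above, the sketch below builds the keyword arguments a `setup.py` might pass to `setuptools.setup()`. The `multiqc.modules.v1` group is the one MultiQC documents for plugin modules; the class name `MultiqcModule`, the version string, and the dependency list are assumptions for illustration and are not taken from this PR.

```python
# Hypothetical sketch of the metadata a setup.py for this plugin could declare.
# Names marked "assumed" do not come from the PR.

PLUGIN_PACKAGE = "multiqc_proteinfold"  # package directory listed in the PR
MODULE_NAME = "proteinfold"             # from multiqc_proteinfold/proteinfold.py
MODULE_CLASS = "MultiqcModule"          # assumed class name

SETUP_KWARGS = {
    "name": PLUGIN_PACKAGE,
    "version": "0.1.0",                 # assumed placeholder version
    "packages": [PLUGIN_PACKAGE],
    "include_package_data": True,       # ship multiqc_config.yaml with the package
    "install_requires": ["multiqc"],    # assumed dependency list
    "entry_points": {
        # MultiQC discovers plugin modules through this entry-point group.
        "multiqc.modules.v1": [
            f"{MODULE_NAME} = {PLUGIN_PACKAGE}.{MODULE_NAME}:{MODULE_CLASS}",
        ],
    },
}

# In a real setup.py this dict would be followed by:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
print(SETUP_KWARGS["entry_points"]["multiqc.modules.v1"][0])
```

Once a package declaring such an entry point is installed in the same environment as MultiQC, the module should be discoverable without patching MultiQC itself, which is the point of shipping it as a pipeline-specific plugin.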
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| setup.py | Defines the plugin package + MultiQC entry point for module discovery. |
| multiqc_proteinfold/proteinfold.py | Implements the MultiQC module parsing metrics and adding report sections. |
| multiqc_proteinfold/multiqc_config.yaml | Adds MultiQC search pattern config for proteinfold TSVs and logo settings. |
| multiqc_proteinfold/__init__.py | Exposes the module and package version. |
| modules/nf-core/multiqc/main.nf | Installs the plugin before running multiqc. |
| modules/nf-core/multiqc/environment.yml | Attempts to add pip-based installation of the local plugin into the conda env. |
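To make the "parses pipeline TSV metrics" step in the table more concrete, here is a minimal sketch of reading a tab-separated metrics file into the `{sample: {metric: value}}` shape that MultiQC general-stats tables are typically fed. The column names (`sample`, `mean_plddt`, `ptm`) and the values in the example are made up for illustration; the actual file layout used by the pipeline is not shown in this PR page.

```python
import csv
import io
from typing import Dict, TextIO


def parse_metrics_tsv(handle: TextIO) -> Dict[str, Dict[str, float]]:
    """Parse a tab-separated metrics table into {sample: {metric: value}}.

    Assumes a header row with a 'sample' column (hypothetical name) and
    numeric values in every other column; empty cells are skipped.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    data: Dict[str, Dict[str, float]] = {}
    for row in reader:
        sample = row.pop("sample")
        data[sample] = {
            key: float(value)
            for key, value in row.items()
            if value not in ("", None)
        }
    return data


# Illustrative input only: sample names, columns, and values are invented.
EXAMPLE_TSV = (
    "sample\tmean_plddt\tptm\n"
    "T1024\t87.5\t0.81\n"
    "T1026\t62.25\t0.44\n"
)

metrics = parse_metrics_tsv(io.StringIO(EXAMPLE_TSV))
print(metrics["T1024"]["mean_plddt"])
```

Inside a real MultiQC module, a dictionary of this shape would be handed to the module's general-stats and plotting helpers rather than printed.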
Comments suppressed due to low confidence (2)
modules/nf-core/multiqc/environment.yml:10
The conda `environment.yml` has invalid syntax for pip-installed dependencies (`- pip` followed by a nested list). Conda expects a `pip:` mapping (e.g., `- pip:` with a list under it). As written, environment creation will fail.
- bioconda::multiqc=1.32
- pip
- ${projectDir} # Install proteinfold_multiqc as a local plugin
modules/nf-core/multiqc/environment.yml:10
`${projectDir}` in `environment.yml` will not be interpolated by conda/mamba, so it won't resolve to the pipeline path. If you need to install the local plugin, do it in the process script (or bake it into the container/image) rather than relying on a conda env file variable.
- pip
- ${projectDir} # Install proteinfold_multiqc as a local plugin
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Warning: Newer version of the nf-core template is available. Your pipeline is using an old version of the nf-core template: 3.5.1. For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.
…l-Biology-Computing/proteinfold into multiqc_bulk_report
@JoseEspinosa the CI seems to be failing on the bioflow metromap generation (#426), but I don't understand this automated metromap system.
The [...]
Just ignore it for the moment.
Placeholder draft PR for implementing 'bulk' MultiQC reporting as per #439.
The recommendation was to deploy it as a `setup.py` module, as it's pipeline-specific and won't work out of the box on the underlying tools (see MultiQC PR).
Still testing on a local UNSW HPC and then an OoD-proteinfold deployment. Works fine on pre-computed input.