feat: per-ORF P-site quantification #188

Draft
pinin4fjords wants to merge 7 commits into feat/167-orf-catalogue from feat/166-orf-quantification

@pinin4fjords
Member

Summary

Adds a per-ORF P-site quantification path that runs additively to the existing gene-level Plastid path. Expands the cohort BED12 catalogue (#167) into codon-start BED6 positions, then runs bedtools intersect per sample against plastid wiggle tracks and assembles an ORF x sample count matrix.
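As a rough illustration of what the per-sample intersect and groupby step computes, here is a minimal Python sketch. It is not the module's implementation (that uses bedtools); the function name `sum_psites_per_orf` and the in-memory data layout are assumptions made purely for illustration.

```python
from collections import defaultdict


def sum_psites_per_orf(codon_starts, psite_counts):
    """Sum P-site counts over codon-start positions, grouped by ORF.

    codon_starts: iterable of (chrom, pos, strand, orf_id) tuples
                  (hypothetical layout standing in for the BED6 positions).
    psite_counts: dict mapping (chrom, pos, strand) -> count
                  (hypothetical stand-in for one sample's wiggle track).
    """
    totals = defaultdict(int)
    for chrom, pos, strand, orf_id in codon_starts:
        # Positions with no P-site signal contribute 0, so every ORF
        # that has at least one codon start still appears in the result.
        totals[orf_id] += psite_counts.get((chrom, pos, strand), 0)
    return dict(totals)
```

P-sites at positions that are not codon starts (for example `chr1:101` when only `chr1:100` and `chr1:103` are listed) are ignored by construction.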

Changes

  • New local subworkflow QUANTIFY_ORF_PSITE chaining CUSTOM_BED12CODONPOSITIONS (cohort-level codon-start expansion) + QUANTIFY_INFRAME_PSITE_ORF (per-sample bedtools intersect + groupby) + ORF_COUNT_MATRIX (pivot to ORF x sample matrix, zero-fill for missing ORFs).
  • New local modules: quantify_inframe_psite_orf, orf_count_matrix.
  • Workflow integration in the plastid block: invokes QUANTIFY_ORF_PSITE when --extended_orf_analysis true, at least one ORF caller is enabled, and plastid is not skipped.
  • conf/modules.config: publish dirs under <outdir>/orf_quantification/.
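The pivot performed by ORF_COUNT_MATRIX can be pictured with the following Python sketch. This is an illustrative assumption about the behaviour described above (ORF x sample layout, zero-fill for ORFs missing from a sample), not the module's actual code; `build_count_matrix` and its arguments are hypothetical names.

```python
def build_count_matrix(catalogue_orfs, per_sample_counts):
    """Pivot per-sample ORF counts into an ORF x sample matrix.

    catalogue_orfs: ordered list of ORF IDs from the cohort catalogue.
    per_sample_counts: dict mapping sample name -> {orf_id: count}.
    Returns (header, rows); ORFs absent from a sample are filled with 0.
    """
    samples = sorted(per_sample_counts)  # stable column order
    header = ["orf_id"] + samples
    rows = [
        [orf] + [per_sample_counts[s].get(orf, 0) for s in samples]
        for orf in catalogue_orfs
    ]
    return header, rows
```

Driving the rows from the cohort catalogue rather than from the per-sample tables is what keeps ORFs with no counts in the matrix.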

Codon-start counting (not span counting)

Only P-sites falling exactly at the first position of each in-frame codon are counted (positions 0, 3, 6, ... relative to each ORF's own ATG). This maximises in-frame purity, but yields only ~1/3 of the raw counts that span counting would. The frame is taken from the ORF's own start codon, not from GTF phase, so novel ORFs without GTF annotation are handled correctly.
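A minimal Python sketch of the codon-start expansion follows. It assumes each BED12 record's blocks cover exactly the ORF from start codon to stop, and that offsets are walked in transcript order (reversed on the minus strand). The upstream custom/bed12codonpositions module may differ in detail, and `codon_start_positions` is a hypothetical name.

```python
def codon_start_positions(bed12_line):
    """Expand one BED12 ORF record into single-base BED6 codon-start records."""
    f = bed12_line.rstrip("\n").split("\t")
    chrom, start, name, strand = f[0], int(f[1]), f[3], f[5]
    sizes = [int(x) for x in f[10].rstrip(",").split(",")]
    offsets = [int(x) for x in f[11].rstrip(",").split(",")]

    # Genomic 0-based coordinate of every ORF base, exon by exon.
    bases = []
    for size, offset in zip(sizes, offsets):
        bases.extend(range(start + offset, start + offset + size))

    # On the minus strand the ATG is at the highest coordinate,
    # so walk the bases in reverse to follow transcript order.
    if strand == "-":
        bases.reverse()

    # Every third base, starting from the first base of the ATG
    # (ORF-relative offsets 0, 3, 6, ...).
    return [(chrom, pos, pos + 1, name, 0, strand) for pos in bases[::3]]
```

Because the offsets are counted over spliced ORF bases, a codon that straddles an intron still contributes exactly one codon-start position.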

🚨 Upstream dependency

Uses custom/bed12codonpositions from nf-core/modules#11733 (open). modules.json is pinned to that PR branch. Before this PR can leave draft:

Stacked PR notes

Twelfth in the stack splitting #174. Targets #187 (feat/167-orf-catalogue).

Closes #166

🤖 Generated with Claude Code

Expand the cohort BED12 catalogue into codon-start BED6 positions
(frame defined by each ORF's own ATG, not GTF `phase`), then run
per-sample bedtools intersect against plastid wiggle tracks to
assemble an ORF x sample count matrix.

Gated on --extended_orf_analysis true + at-least-one-caller +
--skip_plastid false. Matrix published under
<outdir>/orf_quantification/.

Uses upstream custom/bed12codonpositions (nf-core/modules#11733
pending).
@nf-core-bot
Member

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.5.1.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.
