feat: per-ORF P-site quantification #188
Draft
pinin4fjords wants to merge 7 commits into
Expand the cohort BED12 catalogue into codon-start BED6 positions (frame defined by each ORF's own ATG, not GTF `phase`), then run per-sample `bedtools intersect` against plastid wiggle tracks to assemble an ORF x sample count matrix. Gated on `--extended_orf_analysis true`, at least one ORF caller enabled, and `--skip_plastid false`. Matrix published under `<outdir>/orf_quantification/`. Uses upstream `custom/bed12codonpositions` (nf-core/modules#11733 pending).
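For reviewers who want a concrete picture of the steps above, here is a minimal Python sketch of the idea. It is not the code in `custom/bed12codonpositions` or the local modules. The function names (`codon_start_positions`, `orf_counts`, `count_matrix`) are hypothetical, a plain dict keyed by `(chrom, position, strand)` stands in for the plastid wiggle tracks, and it assumes each BED12 record's blocks cover exactly the ORF, starting at its own start codon.

```python
from typing import Dict, List, Tuple

Bed6 = Tuple[str, int, int, str, str, str]
PsiteTrack = Dict[Tuple[str, int, str], int]  # stand-in for a wiggle track


def codon_start_positions(bed12_line: str) -> List[Bed6]:
    """Expand one BED12 ORF record into 1-bp BED6 codon-start intervals.

    Assumption: the blocks span the ORF exactly, so the first base in
    transcript order is the first base of the start codon.
    """
    f = bed12_line.rstrip("\n").split("\t")
    chrom, start, name, score, strand = f[0], int(f[1]), f[3], f[4], f[5]
    sizes = [int(x) for x in f[10].rstrip(",").split(",")]
    offsets = [int(x) for x in f[11].rstrip(",").split(",")]
    # Genomic coordinate of every ORF base, in ascending order.
    bases = [start + off + i for size, off in zip(sizes, offsets) for i in range(size)]
    if strand == "-":
        bases.reverse()  # transcript order runs right-to-left on the minus strand
    # Positions 0, 3, 6, ... relative to the ORF's own start codon.
    return [(chrom, p, p + 1, name, score, strand) for p in bases[::3]]


def orf_counts(positions: List[Bed6], psites: PsiteTrack) -> Dict[str, int]:
    """Sum P-site counts that fall exactly on codon-start positions."""
    totals: Dict[str, int] = {}
    for chrom, start, _end, name, _score, strand in positions:
        totals[name] = totals.get(name, 0) + psites.get((chrom, start, strand), 0)
    return totals


def count_matrix(per_sample: Dict[str, Dict[str, int]], orf_ids: List[str]) -> List[list]:
    """Pivot per-sample counts into an ORF x sample table, zero-filling gaps."""
    samples = sorted(per_sample)
    rows: List[list] = [["orf_id"] + samples]
    for orf in orf_ids:
        rows.append([orf] + [per_sample[s].get(orf, 0) for s in samples])
    return rows


if __name__ == "__main__":
    # Two-block ORF of 9 bases (3 codons) on the plus strand.
    record = "chr1\t100\t120\torfA\t0\t+\t100\t120\t0\t2\t4,5\t0,15"
    starts = codon_start_positions(record)
    print([p[1] for p in starts])  # [100, 103, 117]
    # The P-site at 101 is out of frame, so it is not counted.
    track = {("chr1", 100, "+"): 5, ("chr1", 101, "+"): 7, ("chr1", 117, "+"): 2}
    print(count_matrix({"s1": orf_counts(starts, track), "s2": {}}, ["orfA", "orfB"]))
```

The second codon start lands at 103 and the third at 117 because the frame is carried across the intron, which is the reason for walking bases in transcript order instead of stepping by three in genomic coordinates.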
Warning: A newer version of the nf-core template is available. Your pipeline is using an old version of the nf-core template: 3.5.1. For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.
… under correct repo URL [skip ci]
Summary
Adds a per-ORF P-site quantification path that runs additively to the existing gene-level Plastid path. Expands the cohort BED12 catalogue (#167) into codon-start BED6 positions, then runs `bedtools intersect` per sample against plastid wiggle tracks and assembles an ORF x sample count matrix.

Changes
- `QUANTIFY_ORF_PSITE`, chaining `CUSTOM_BED12CODONPOSITIONS` (cohort-level codon-start expansion) + `QUANTIFY_INFRAME_PSITE_ORF` (per-sample bedtools intersect + groupby) + `ORF_COUNT_MATRIX` (pivot to ORF x sample matrix, zero-fill for missing ORFs).
- `quantify_inframe_psite_orf`, `orf_count_matrix`.
- Runs `QUANTIFY_ORF_PSITE` when `--extended_orf_analysis true`, at least one ORF caller is enabled, and plastid is not skipped.
- Matrix published under `<outdir>/orf_quantification/`.

Codon-start counting (not span counting)
Only P-sites falling exactly at the first position of each in-frame codon are counted (positions 0, 3, 6, ... relative to each ORF's own ATG). This maximises in-frame purity at the cost of retaining roughly 1/3 of the raw counts relative to span counting. The frame is taken from the ORF's own start codon, not from GTF `phase`, so novel ORFs without GTF annotation work correctly.

🚨 Upstream dependency
Uses `custom/bed12codonpositions` from nf-core/modules#11733 (open). `modules.json` is pinned to that PR branch. Before this PR can leave draft:

- nf-core/modules#11733 must be merged to `master`.
- Re-run `nf-core modules install custom/bed12codonpositions` so `modules.json` shows the master SHA.

Stacked PR notes
Twelfth in the stack splitting #174. Targets #187 (`feat/167-orf-catalogue`).

Closes #166
🤖 Generated with Claude Code