Support NCBI microbe GTF/GFF with no transcripts (CDS only) #1627
Open
nuno-agostinho wants to merge 7 commits into
added 3 commits
September 5, 2024 14:05
…embl-vep into gtf/ncbi-cds-microbes
Fixes #1620
Support NCBI GTF/GFF annotation files that contain only CDS lines: these CDS lines are children of genes (rather than of transcripts, as is usual in Ensembl annotation files) and have no exons as children.
If a CDS is a child of a gene and has no exons of its own, parse the feature as a single-exon transcript with the same strand, start and end as the CDS.
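For reviewers who want the rule spelled out, here is a minimal Python sketch of the idea. It is not the code in this PR (VEP itself is written in Perl); the `Feature` class, the `cds_as_transcripts` function and the output dictionary layout are all hypothetical names chosen for illustration. It also assumes each CDS appears as a single record with a unique ID.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Feature:
    """Hypothetical, simplified view of one GTF/GFF record."""
    fid: str
    ftype: str          # e.g. "gene", "mRNA", "exon", "CDS"
    start: int
    end: int
    strand: str         # "+" or "-"
    parent: Optional[str] = None


def cds_as_transcripts(features: List[Feature]) -> List[dict]:
    """Turn gene-parented, exon-less CDS records into single-exon transcripts."""
    by_id = {f.fid: f for f in features}
    # IDs of every feature that has at least one exon child
    has_exon_children = {f.parent for f in features if f.ftype == "exon"}

    transcripts = []
    for f in features:
        if f.ftype != "CDS":
            continue
        parent = by_id.get(f.parent)
        # Only CDS whose direct parent is a gene are converted
        if parent is None or parent.ftype != "gene":
            continue
        # Skip CDS that already have exons of their own
        if f.fid in has_exon_children:
            continue
        # The single exon copies strand, start and end from the CDS
        exon = {"start": f.start, "end": f.end, "strand": f.strand}
        transcripts.append({
            "id": f.fid,
            "gene": parent.fid,
            "start": f.start,
            "end": f.end,
            "strand": f.strand,
            "exons": [exon],
        })
    return transcripts
```

In this sketch, a CDS whose parent is a transcript (the usual Ensembl layout) is left untouched, so regular annotation files would be handled by the existing code path.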
TODO
- `--cds_as_transcript_gxf`
- `--cds_as_transcript_gxf` in public docs
Testing
Example files for avian paramyxovirus 1
Example VCF
Example command
Test conditions
- `--cds_as_transcript_gxf` should return a warning if there are CDS in the annotation whose parent is a gene record
- `--cds_as_transcript_gxf` should successfully use the CDS in the annotation as single-exon transcripts