Hi,
We have some GFF3 files that contain ? in the strand column. These features are not genes (in our case, oriC in bacteria). Nonetheless, when I try to extract protein and CDS sequences, gffread fails with an error:
gffread r1.gff3 -S -g chromosomes.fasta -y protein.fasta
Error parsing strand (?) from GFF line:
chr1 PRODIGALEXEX oriC 663246 664269 . ? . ID=JMHBPODKGB_1056;Name=origin of replication;product=origin of replication;inference=similar to DNA sequence
Using ? is valid according to the GFF3 specification (https://github.com/the-sequence-ontology/specifications/blob/master/gff3.md):
Column 7: "strand"
The strand of the feature. + for positive strand (relative to the landmark), - for minus strand, and . for features that are not stranded. In addition, ? can be used for features whose strandedness is relevant, but unknown.
Would it be possible for gffread to ignore features with ? as the strand and simply not output protein/CDS sequences for them, or to add a parameter that switches this behaviour on? Or is there already a way to do this that I have missed in the help?
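A possible workaround in the meantime would be to pre-filter the GFF3 file before passing it to gffread. The following is only a minimal Python sketch, not a gffread feature: the function names and the output file name are hypothetical, and it assumes tab-separated columns and that the dropped features (such as oriC) have no child features that would be orphaned.

```python
def drop_unknown_strand(lines):
    """Yield GFF3 lines, skipping features whose strand column is '?'."""
    for line in lines:
        # Keep comments, directives (##...), and blank lines untouched.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        # Column 7 (index 6) is the strand; '?' means relevant but unknown.
        if len(cols) >= 7 and cols[6] == "?":
            continue
        yield line


def filter_gff3(src_path, dst_path):
    """Write a copy of src_path without '?'-stranded features (hypothetical helper)."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(drop_unknown_strand(src))
```

For example, calling `filter_gff3("r1.gff3", "r1.filtered.gff3")` and then running the same gffread command on `r1.filtered.gff3` should avoid the strand error, at the cost of losing the oriC features from the input.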
Many thanks!