Commit 49ab85e

Added support for MetaGeneMark 3.25 (gene column now CDS)
1 parent ff67c52 commit 49ab85e

1 file changed: +13 -3 lines changed

gff/convert_metagenemark_gff_to_gff3.py

@@ -35,6 +35,16 @@
 ##SQKEVEVTYDLQPQAGQDNLNSNGTANLFDIDESTGTYKETIYVNNKQREQNNTRILIEN
 
 
+EXPECTED INPUT (v3.25+)
+
+In my tests of v 3.25 the use of 'gene' in the column above has been replaced with CDS instead.
+The rest appears to be the same.
+
+SRS019986_Baylor_scaffold_53 GeneMark.hmm CDS 3 512 -635.997977 + 0 gene_id=18, length=510, gene_score=-635.997977, rbs_score=-0.020000, rbs_spacer=-1, stop_enforced=N, start_codon=0, logodd=37.242660
+SRS019986_Baylor_scaffold_53 GeneMark.hmm CDS 530 1501 -1202.233893 + 0 gene_id=19, length=972, gene_score=-1202.233893, rbs_score=-0.020000, rbs_spacer=-1, stop_enforced=N, start_codon=0, logodd=62.393607
+SRS019986_Baylor_scaffold_53 GeneMark.hmm CDS 1603 2109 -608.550058 + 0 gene_id=20, length=507, gene_score=-608.550058, rbs_score=-0.020000, rbs_spacer=-1, stop_enforced=N, start_codon=0, logodd=52.060246
+
+
 EXAMPLE OUTPUT:
 
 855 GeneMark.hmm gene 1 852 . - . ID=HUZ239124.gene.1
@@ -84,10 +94,10 @@ def main():
         feat_type = cols[2]
 
         ## we expect only gene types here
-        if feat_type != 'gene':
-            raise Exception("ERROR: expected only 'gene' feature types as input.")
+        if feat_type not in ['gene', 'CDS']:
+            raise Exception("ERROR: expected only 'gene' or 'CDS' feature types as input (depending on metagenemark version).")
 
-        m_gene = re.match('gene_id (\d+)', cols[8])
+        m_gene = re.match('gene_id[ =](\d+)', cols[8])
 
         if m_gene:
             gene_num = m_gene.group(1)
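
The two changed checks can be tried in isolation with a minimal sketch like the one below. It is not part of the committed script: `parse_gene_num` and `GENE_ID_RE` are hypothetical names, the input is assumed to be tab-delimited with nine columns, and the pre-3.25 row is an assumed example built only from what the old regex (`gene_id (\d+)`) implies about that format. The pattern is written as a raw string here, which newer Python versions prefer for `\d`.

```python
import re

# Hypothetical helper for illustration; mirrors the checks changed in this commit.
# The character class [ =] accepts both 'gene_id 18' and 'gene_id=18'.
GENE_ID_RE = re.compile(r'gene_id[ =](\d+)')


def parse_gene_num(line):
    """Return the gene number (as a string) from one MetaGeneMark GFF row, or None."""
    cols = line.rstrip('\n').split('\t')
    if len(cols) != 9:
        raise ValueError("expected 9 tab-delimited columns, got %d" % len(cols))

    feat_type = cols[2]
    if feat_type not in ('gene', 'CDS'):
        raise ValueError("expected only 'gene' or 'CDS' feature types, got %r" % feat_type)

    # re.match anchors at the start, so column 9 must begin with gene_id
    m = GENE_ID_RE.match(cols[8])
    return m.group(1) if m else None


# v3.25-style row, taken from the example added to the docstring in this commit
new_row = "\t".join([
    "SRS019986_Baylor_scaffold_53", "GeneMark.hmm", "CDS", "3", "512",
    "-635.997977", "+", "0",
    "gene_id=18, length=510, gene_score=-635.997977, rbs_score=-0.020000",
])

# Assumed pre-3.25 row: only the 'gene' type and the 'gene_id 18' form are implied by the old code
old_row = "\t".join([
    "SRS019986_Baylor_scaffold_53", "GeneMark.hmm", "gene", "3", "512",
    "-635.997977", "+", "0", "gene_id 18",
])

print(parse_gene_num(new_row))
print(parse_gene_num(old_row))
```

Both rows should yield the same gene number, which is the point of the change: one code path covers both output styles instead of branching on the MetaGeneMark version.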
