"- extract the gene ID from the `#query` field of the EggNOG-mapper output\n",
"- break up the content of the attributes field of the GFF file into a dictionary\n",
"- find the correct protein name for a gene ID"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"def parse_gene_id(x):\n",
" \"\"\"Extract gene ID from a string\n",
"\n",
" Parameters\n",
" ----------\n",
" x : str\n",
" A protein ID from the eggNOG-mapper output.\n",
"\n",
" Returns\n",
" -------\n",
" str\n",
" will return the gene ID in the format of 'PB.X' (PacBio genes) or 'gX' (BRAKER round 1) or 'r2_gX' (BRAKER round 2) or 'at_DNX (de-novo transcriptome-assembled genes)'\n",
- extract the gene ID from the `#query` field of the eggNOG-mapper output
- break up the content of the attributes field of the GFF file into a dictionary
- find the correct protein name for a gene ID
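
As an illustration of the second helper, here is a minimal sketch of how an attributes field could be split into a dictionary. It assumes GFF3-style attributes (semicolon-separated `key=value` pairs with percent-encoded values); the function name `parse_gff_attributes` and the example string are hypothetical and are not taken from this notebook.

```python
from urllib.parse import unquote


def parse_gff_attributes(attributes):
    """Split a GFF3-style attributes string into a dictionary.

    Hypothetical helper for illustration only; assumes
    semicolon-separated key=value pairs, as in GFF3.
    """
    result = {}
    for field in attributes.strip().split(";"):
        field = field.strip()
        if not field:
            # Skip empty fields, e.g. after a trailing semicolon.
            continue
        key, sep, value = field.partition("=")
        if not sep:
            # Skip fields that are not key=value pairs.
            continue
        # GFF3 percent-encodes reserved characters in values.
        result[key] = unquote(value)
    return result


# Made-up attributes string, not real data from the annotation.
example = "ID=PB.1.1;Parent=PB.1;Name=hypothetical%20protein"
attrs = parse_gff_attributes(example)
print(attrs["Parent"])
```

GTF-style attributes (`key "value";`) would need a different split, so this sketch only applies if the annotation file follows the GFF3 convention.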
%% Cell type:code id: tags:
``` python
def parse_gene_id(x):
    """Extract gene ID from a string

    Parameters
    ----------
    x : str
        A protein ID from the eggNOG-mapper output.

    Returns
    -------
    str
        The gene ID in the format 'PB.X' (PacBio genes), 'gX' (BRAKER round 1), 'r2_gX' (BRAKER round 2), or 'at_DNX' (de novo transcriptome-assembled genes).