proMGE: a resource for the exploration of prokaryotic Mobile Genetic Elements

Version: 1.0
Design & Development: biobyte solutions GmbH
Data: European Molecular Biology Laboratory
Contacts:

proMGE allows the exploration of 2.4 million MGEs found across 76k genomes (from species with at least 2 genomes in the proGenomes2 database). At the species level, it is possible to explore the different MGE categories, their genomic positions, their potential to carry antibiotic resistance, and the extent of their horizontal transfer across higher taxonomic levels (family and above). For each species, a visualisation of horizontal gene transfer (HGT) is provided by iTOL (Interactive Tree Of Life). Additionally, for integrons, IS elements and transposons, it is possible to explore whether they are nested within other MGE categories (under the 'Nested recombinase' tab in the species search results).

Workflow

MGE identification and classification

Note: Consider the following points when analysing and interpreting MGE predictions from this resource: (i) MGE boundaries represent upper limits of genomic regions that harbour one or more MGEs of the same or different types; (ii) not all proteins within the MGE boundaries may be sufficiently annotated, so some of our gene context-based approaches for MGE category assignment might overlook them; (iii) beyond the recombinase marker gene, phage structural genes (for phages) and conjugation machinery genes (for conjugative elements), other gene features may not be annotated or well defined for all predicted MGEs.

proMGE can annotate MGEs in user-submitted protein sequences through the Annotate page. Sequences are annotated using a collection of 68 recombinase subfamily-specific profile hidden Markov models (HMMs) (Letunic et al., 2020). The association of these HMMs with specific MGE types, as shown in the table below, allows the inference of 6 different MGE categories according to the rules described in the figure.

MGE type columns of the table: IS_Tn, Phage, CE, Integron, Cellular

Recombinase/endonuclease anchor family  HMM source               Sub-family
HUH                                     This study               huh_y1
HUH                                     This study               huh_y2
HUH                                     Pfam - PF03432           Relaxase
HUH                                     Pfam - PF01446           Rep_1
HUH                                     Pfam - PF01719           Rep_2
HUH                                     Pfam - PF00799           Gemini_AL1
HUH                                     Pfam - PF01076           Mob_Pre
HUH                                     Pfam - PF03389           MobA_MobL
HUH                                     Pfam - PF08724           Rep_N
HUH                                     Pfam - PF08751           TrwC
HUH                                     Pfam - PF05840           Phage_GPA
HUH                                     Pfam - PF02407           Viral_Rep
serine                                  This study               serine_tn
serine                                  This study               serine_ce
serine                                  This study               serine_lsr
tyrosine                                Smyshlyaev et al., 2019  Arch1
tyrosine                                Smyshlyaev et al., 2019  Arch2
tyrosine                                Smyshlyaev et al., 2019  Int_BPP-1
tyrosine                                Smyshlyaev et al., 2019  Int_Tn916
tyrosine                                Smyshlyaev et al., 2019  Int_Brujita
tyrosine                                Smyshlyaev et al., 2019  TnpR
tyrosine                                Smyshlyaev et al., 2019  Int_CTnDOT
tyrosine                                Smyshlyaev et al., 2019  Candidate
tyrosine                                Smyshlyaev et al., 2019  Cyan
tyrosine                                Smyshlyaev et al., 2019  Int_Des
tyrosine                                Smyshlyaev et al., 2019  IntKX
tyrosine                                Smyshlyaev et al., 2019  Integron
tyrosine                                Smyshlyaev et al., 2019  Myc
tyrosine                                Smyshlyaev et al., 2019  Int_P2
tyrosine                                Smyshlyaev et al., 2019  RitA
tyrosine                                Smyshlyaev et al., 2019  RitB
tyrosine                                Smyshlyaev et al., 2019  RitC
tyrosine                                Smyshlyaev et al., 2019  Int_SXT
tyrosine                                Smyshlyaev et al., 2019  TnpA
tyrosine                                Smyshlyaev et al., 2019  Xer
tyrosine                                Pfam - PF00589           phage_integrase
DDE                                     Pfam - PF02371           Transposase_20 (HHH)
DDE                                     Pfam - PF01610           DDE_Tnp_ISL3
DDE                                     Pfam - PF00872           Transposase_mut
DDE                                     Pfam - PF01385           OrfB_IS605
DDE                                     Pfam - PF13610           DDE_Tnp_IS240
DDE                                     Pfam - PF13586           DDE_Tnp_1_2
DDE                                     Pfam - PF08722           Tn7_Tnp_TnsA_N (PDDEXK)
DDE                                     Pfam - PF01526           DDE_Tnp_Tn3
DDE                                     Pfam - PF13612           DDE_Tnp_1_3
DDE                                     Pfam - PF04754           Transposase_31 (PDDEXK)
DDE                                     Pfam - PF03050           DDE_Tnp_IS66
DDE                                     Pfam - PF12762           DDE_Tnp_IS1595
DDE                                     Pfam - PF12784           PDDEXK_2 (PDDEXK)
DDE                                     Pfam - PF02992           Transposase_21
DDE                                     Pfam - PF13358           DDE_3
DDE                                     Pfam - PF01609           DDE_Tnp_1
DDE                                     Pfam - PF13359           DDE_Tnp_4
DDE                                     Pfam - PF02914           DDE_2
DDE                                     Pfam - PF13701           DDE_Tnp_1_4
DDE                                     Pfam - PF13546           DDE_5
DDE                                     Pfam - PF04693           DDE_Tnp_2
DDE                                     Pfam - PF01548           DEDD_Tnp_IS110 (DEDD)
DDE                                     Pfam - PF03400           DDE_Tnp_IS1
DDE                                     Pfam - PF07592           DDE_Tnp_ISAZ013
DDE                                     Pfam - PF13737           DDE_Tnp_1_5
DDE                                     Pfam - PF13751           DDE_Tnp_1_6
DDE                                     Pfam - PF03184           DDE_1
DDE                                     Pfam - PF13843           DDE_Tnp_1_7
DDE                                     Pfam - PF00665           rve
DDE                                     Pfam - PF13333           rve_2
DDE                                     Pfam - PF13683           rve_3
cas1 solo                               This study               cas1
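
To make the HMM-to-family lookup concrete, the following Python sketch shows one way recombinase hits could be mapped to their anchor family. It is an illustration under stated assumptions, not proMGE's actual pipeline: it assumes the hits come from HMMER's `hmmsearch --tblout` output, the function names `best_hits` and `anchor_families` are hypothetical, the E-value cutoff of 1e-5 is an arbitrary placeholder, and the lookup dictionary holds only a few rows of the table. The MGE category rules themselves are defined in the figure and are not reproduced here.

```python
from typing import Dict, Iterable, Tuple

# Excerpt of the sub-family -> anchor family table (illustrative subset,
# not all 68 entries).
ANCHOR_FAMILY: Dict[str, str] = {
    "huh_y1": "HUH",
    "Relaxase": "HUH",
    "serine_tn": "serine",
    "Integron": "tyrosine",
    "Xer": "tyrosine",
    "phage_integrase": "tyrosine",
    "DDE_Tnp_IS66": "DDE",
    "rve": "DDE",
    "cas1": "cas1 solo",
}


def best_hits(tblout_lines: Iterable[str],
              max_evalue: float = 1e-5) -> Dict[str, Tuple[str, float]]:
    """Return the highest-scoring HMM per protein from hmmsearch --tblout lines.

    max_evalue is a hypothetical cutoff chosen for this sketch only.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in tblout_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        # tblout columns: target name, target accession, query name,
        # query accession, full-sequence E-value, full-sequence score, ...
        protein, hmm = fields[0], fields[2]
        evalue, score = float(fields[4]), float(fields[5])
        if evalue > max_evalue:
            continue
        if protein not in best or score > best[protein][1]:
            best[protein] = (hmm, score)
    return best


def anchor_families(hits: Dict[str, Tuple[str, float]]) -> Dict[str, str]:
    """Map each protein's best HMM to its anchor family ('unknown' if absent)."""
    return {p: ANCHOR_FAMILY.get(hmm, "unknown") for p, (hmm, _) in hits.items()}


# Made-up example lines in tblout layout (values are not real results).
EXAMPLE_TBLOUT = [
    "# target name  accession  query name  accession  E-value  score  bias",
    "prot_1 - Xer - 1.2e-50 170.2 0.1 1.5e-50 169.9 0.1 1.0 1 0 0 1 1 1 1 -",
    "prot_1 - phage_integrase PF00589.24 3e-30 105.0 0.2 4e-30 104.6 0.2 1.0 1 0 0 1 1 1 1 -",
    "prot_2 - rve PF00665.28 2e-25 88.4 0.0 3e-25 87.9 0.0 1.1 1 0 0 1 1 1 1 -",
    "prot_3 - cas1 - 0.5 10.1 0.0 0.7 9.8 0.0 1.0 1 0 0 1 1 0 0 -",
]

if __name__ == "__main__":
    print(anchor_families(best_hits(EXAMPLE_TBLOUT)))
```

Keeping only the best-scoring profile per protein is a simplification; a protein matching several sub-family profiles may need family-specific handling.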