FEV_KEGG.Experiments.28 module

Context

The approach of building a consensus/majority graph of enzymes/EC numbers to find a core metabolism shared among several organisms has to be validated against previous research. One such study deals with E. coli, but uses a different approach, asking not what the core metabolism ‘can do’, but what it ‘always does’. Almaas et al. (2005) list core reactions calculated via flux analysis in table S1 (https://doi.org/10.1371/journal.pcbi.0010068.st001), some of which are annotated with EC numbers. These EC numbers are used to validate the approach of this library. Multifunctional enzymes and EC numbers containing wildcards (e.g. 1.2.-.-) are excluded on both sides to minimise statistical skew. This leaves 62 EC numbers from Almaas et al.

Question

Does the consensus/majority graph approach to core metabolism yield a similar set of EC numbers as the approach of Almaas et al. (2005)?

Method

  • extract EC numbers from Almaas et al. (2005) by hand
  • get group of organisms ‘Escherichia coli’
  • REPEAT for varying majority-percentages:
      • calculate EC numbers occurring in group’s core metabolism
      • overlap Almaas’ set with ours and print the number of EC numbers inside the intersection and falling off either side
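The majority and overlap steps above can be sketched with plain Python sets. This is only an illustration and does not use this library's own API: the function names `majority_ecs` and `overlap`, the toy EC sets, and the rounding rule for the majority threshold (round up, at least one organism) are all assumptions.

```python
from collections import Counter
from math import ceil


def majority_ecs(ec_sets, majority_percent):
    """Return EC numbers found in at least majority_percent of the organisms.

    Hypothetical helper; the threshold rounding is an assumption.
    """
    if not ec_sets:
        return set()
    # Minimum number of organisms that must contain an EC number (at least one).
    threshold = max(1, ceil(len(ec_sets) * majority_percent / 100))
    # Count in how many organisms each EC number occurs.
    counts = Counter(ec for ecs in ec_sets for ec in set(ecs))
    return {ec for ec, n in counts.items() if n >= threshold}


def overlap(others, ours):
    """Return the counts (only in others, in both, only in ours)."""
    return len(others - ours), len(others & ours), len(ours - others)


# Toy data, not taken from KEGG or from Almaas et al.
organisms = [
    {"1.1.1.1", "2.7.1.1", "4.2.1.2"},
    {"1.1.1.1", "2.7.1.1"},
    {"1.1.1.1", "5.3.1.9"},
]
reference = {"1.1.1.1", "2.7.1.1", "6.3.1.2"}

for percent in (100, 60, 1):
    only_ref, both, only_ours = overlap(reference, majority_ecs(organisms, percent))
    print(f"{percent:>3}%: {only_ref} {both} {only_ours}")
```

Lowering the majority percentage can only grow the majority set, so the ‘both’ count never decreases as the percentage drops.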

Result

Maj. %   others    both    ours
100%:    28        34      381
 90%:    19        43      491
 80%:    19        43      499
 70%:    19        43      510
 60%:    19        43      518
 50%:    19        43      522
 40%:    19        43      531
 30%:    19        43      542
 20%:    19        43      550
 10%:    19        43      564
  1%:    19        43      602

Conclusion

At a 90% majority and below, the number of overlapping ECs no longer increases. This indicates that, at least for E. coli, a 90% majority is enough to create a stable core metabolism, diminishing the skew exerted by unusually specialised organisms. In the case of E. coli these could be soil-based strains, but this remains to be researched.

About 69% of the ECs in the reaction-based core metabolism, as postulated by Almaas et al., are also included in the majority-based core metabolism of our approach. Because some reactions in Almaas’ table S1 lack EC annotations, the true percentage could be even higher. This substantial overlap shows that most essential reactions are also covered by a majority approach. However, two observations stand out:

  1. 31% of essential reactions are not included in any majority, not even in a single organism from KEGG at 1% majority (effectively n=1).
    This could be because of a flaw in either approach, or because the data Almaas et al. use stems from the year 2000 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC25862/), when Escherichia coli MG1655 may have been annotated with different ECs than in today’s KEGG database. This has to be investigated.
  2. Only 8% of majority ECs (at 90%) are essential reactions. This indicates that while E. coli organisms share many ECs, most of them are only active at certain times.
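
The percentages quoted in this conclusion can be reproduced from the 90% row of the result table. The short check below assumes that the ‘others’ and ‘ours’ columns count EC numbers found on one side only, which is consistent with the 62 EC numbers from Almaas et al.; the variable names are illustrative.

```python
# Counts from the 90% majority row of the result table.
only_almaas, both, only_ours = 19, 43, 491

almaas_total = only_almaas + both   # 62 EC numbers from Almaas et al.
ours_total = both + only_ours       # 534 EC numbers in our 90% majority set

covered = 100 * both / almaas_total          # Almaas' ECs that we also find
missed = 100 * only_almaas / almaas_total    # Almaas' ECs that we never find
essential_share = 100 * both / ours_total    # our ECs that are also in Almaas' set

print(f"{covered:.1f}% {missed:.1f}% {essential_share:.1f}%")
# prints "69.4% 30.6% 8.1%"
```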