FEV_KEGG.Experiments.22 module

Context

Experiment 19 found many EC numbers that are new in an example group of Enterobacteriales compared to the super group of Gammaproteobacteria: “108/190 -> 56.8% of EC numbers in Enterobacteriales are new, compared to Gammaproteobacteria consensus”

Question

Could this be due to incomplete data in KEGG? In 19, the consensus is formed by intersecting substance-EC graphs, so gaps in the graph data might remove EC numbers from the consensus. Does the same result occur when the sets of EC numbers themselves are intersected?

Method

Similar to 19, but intersecting sets instead of networks.
- Create a group of example organisms of Order Enterobacteriales.
- Create a group of example organisms of Class Gammaproteobacteria, including the same organisms as the group of Enterobacteriales.
- Get the set of EC numbers from each organism's graph.
- Calculate the consensus set for both groups (Order and Class), leaving only EC numbers which occur in all organisms of the group.
- Calculate the difference of the two consensus sets, leaving only the EC numbers which occur in the Enterobacteriales consensus, but not in the Gammaproteobacteria consensus.
- Print these EC numbers and their percentage of all EC numbers in the Enterobacteriales consensus, i.e. how many of the EC numbers in Enterobacteriales do not exist in the Gammaproteobacteria consensus.
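The set operations in the steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the module's implementation: the helper names `consensus_ec_set` and `new_ec_numbers` are hypothetical, and the organisms and EC numbers are toy data rather than anything retrieved from KEGG.

```python
def consensus_ec_set(ec_sets):
    """Return the EC numbers that occur in every set of the group."""
    ec_sets = [set(s) for s in ec_sets]
    if not ec_sets:
        return set()
    return set.intersection(*ec_sets)


def new_ec_numbers(order_sets, class_sets):
    """Return EC numbers in the order consensus but not in the class
    consensus, plus their share (in percent) of the order consensus."""
    order_consensus = consensus_ec_set(order_sets)
    class_consensus = consensus_ec_set(class_sets)
    new = order_consensus - class_consensus
    share = 100.0 * len(new) / len(order_consensus) if order_consensus else 0.0
    return new, share


# Toy data (hypothetical): one EC set per organism.
enterobacteriales = [
    {"1.1.1.1", "2.7.1.1", "4.2.1.11"},
    {"1.1.1.1", "2.7.1.1", "4.2.1.11", "3.1.3.1"},
]
# The class group includes the same organisms as the order group.
gammaproteobacteria = enterobacteriales + [
    {"1.1.1.1", "4.2.1.11", "5.3.1.9"},
]

new, share = new_ec_numbers(enterobacteriales, gammaproteobacteria)
print(sorted(new), round(share, 1))
```

Because the class group contains every organism of the order group, the class consensus is always a subset of the order consensus, so the difference is exactly what the additional organisms removed from the consensus.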

Result

107/190 -> 56.3% of EC numbers in Enterobacteriales are new, compared to Gammaproteobacteria consensus

Conclusion

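Only one EC number (1/190, about 0.5%) was lost due to the way the consensus is calculated: 19 reported 108 new EC numbers using graph intersection, while set intersection yields 107. This kind of error in KEGG data is therefore unlikely to make much difference in results.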
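The percentages can be checked with a few lines of Python. The counts are taken from the results quoted in this experiment and in 19; the constant and function names are only illustrative.

```python
TOTAL_ORDER_CONSENSUS = 190  # EC numbers in the Enterobacteriales consensus
NEW_VIA_GRAPHS = 108         # reported in 19 (graph intersection)
NEW_VIA_SETS = 107           # reported here (set intersection)


def percent(part, total):
    """Share of `part` in `total`, in percent, rounded to one decimal."""
    return round(100.0 * part / total, 1)


lost = NEW_VIA_GRAPHS - NEW_VIA_SETS
print(percent(NEW_VIA_GRAPHS, TOTAL_ORDER_CONSENSUS))  # 56.8
print(percent(NEW_VIA_SETS, TOTAL_ORDER_CONSENSUS))    # 56.3
print(lost, percent(lost, TOTAL_ORDER_CONSENSUS))      # 1 0.5
```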

FEV_KEGG.Experiments.22.enterobacterialesEcSets()
FEV_KEGG.Experiments.22.gammaproteobacteriaEcSets()