FEV_KEGG.Experiments.22 module
Context
Experiment 19 found many EC numbers that are new in an example group of Enterobacteriales compared with its super group, Gammaproteobacteria.
“108/190 -> 56.8% of EC numbers in Enterobacteriales are new, compared to Gammaproteobacteria consensus”
Question
Could this be due to incomplete data in KEGG, given that substance-EC graphs are intersected to form the consensus? Does the same result occur when the sets of EC numbers themselves are intersected?
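To make the distinction concrete, here is a minimal sketch, assuming each organism's graph can be modelled as a set of (substrate, product, EC number) edges. The organisms, compounds, and EC numbers below are made up for illustration; they are not KEGG data and this is not the FEV_KEGG API. Under that assumption, an EC number present in every organism can still drop out of an edge-wise consensus if it is attached to different edges.

```python
# Hypothetical toy data: each organism is a set of (substrate, product, EC) edges.
org_a = {("C1", "C2", "1.1.1.1"), ("C2", "C3", "2.7.1.1")}
org_b = {("C1", "C2", "1.1.1.1"), ("C4", "C3", "2.7.1.1")}


def ec_numbers(edges):
    """Collect the EC numbers that label a set of edges."""
    return {ec for _, _, ec in edges}


# Consensus of graphs: intersect the edges first, then read off the EC numbers.
graph_consensus_ecs = ec_numbers(org_a & org_b)

# Consensus of sets: read off the EC numbers first, then intersect them.
set_consensus_ecs = ec_numbers(org_a) & ec_numbers(org_b)

print(sorted(graph_consensus_ecs))
print(sorted(set_consensus_ecs))
```

In this toy case the set-based consensus keeps an EC number that the edge-based consensus loses, which is the kind of discrepancy this experiment checks for.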
Method
Similar to experiment 19, but intersect sets of EC numbers instead of networks.
- Create a group of example organisms of Order Enterobacteriales.
- Create a group of example organisms of Class Gammaproteobacteria, including the same organisms as the group of Enterobacteriales.
- Get the set of EC numbers from each organism's graph.
- Calculate the consensus set for both groups (Order and Class), leaving only EC numbers which occur in all organisms of the group.
- Calculate the difference of the two sets of consensus EC numbers, leaving only the EC numbers which occur in Enterobacteriales consensus, but not in Gammaproteobacteria consensus.
- Print these EC numbers and their percentage of all EC numbers in Enterobacteriales, i.e. how many of the EC numbers in Enterobacteriales do not exist in the Gammaproteobacteria consensus.
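The steps above can be sketched with plain Python sets. This is an illustrative sketch only, not the module's actual code: the helper names `consensus` and `new_ec_numbers`, the organism labels, and the EC numbers are hypothetical, and it assumes the percentage is taken relative to the size of the Enterobacteriales consensus set.

```python
def consensus(ec_sets):
    """Return the EC numbers that occur in every set (empty if no sets are given)."""
    ec_sets = [set(s) for s in ec_sets]
    if not ec_sets:
        return set()
    return set.intersection(*ec_sets)


def new_ec_numbers(order_sets, class_sets):
    """Return (new EC numbers, order consensus, percentage of new EC numbers)."""
    order_consensus = consensus(order_sets)
    class_consensus = consensus(class_sets)
    # EC numbers in the order consensus that are missing from the class consensus.
    new = order_consensus - class_consensus
    # Assumption: the denominator is the size of the order consensus.
    percentage = 100 * len(new) / len(order_consensus) if order_consensus else 0.0
    return new, order_consensus, percentage


# Hypothetical toy data: organism label -> set of EC numbers.
enterobacteriales = {
    "org1": {"1.1.1.1", "2.7.1.1", "3.5.1.1"},
    "org2": {"1.1.1.1", "2.7.1.1", "3.5.1.1", "4.1.1.1"},
}
# The class group includes the same organisms as the order group, plus others.
gammaproteobacteria = {**enterobacteriales, "org3": {"1.1.1.1", "4.1.1.1"}}

new, order_consensus, percentage = new_ec_numbers(
    enterobacteriales.values(), gammaproteobacteria.values()
)
print(sorted(new))
print(f"{len(new)}/{len(order_consensus)} -> {percentage:.1f}%")
```

Because the class group contains every organism of the order group, the class consensus is always a subset of the order consensus, so the difference is exactly the part of the order consensus lost when the remaining organisms are added.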
Result
107/190 -> 56.3% of EC numbers in Enterobacteriales are new, compared to Gammaproteobacteria consensus