FEV_KEGG.Graph.Elements module

exception FEV_KEGG.Graph.Elements.DrugIdError[source]

Bases: Exception

Raised if a SubstanceID is created from a drug ID, because only compounds and glycans are useful in our model of metabolism.

class FEV_KEGG.Graph.Elements.EcNumber(ecNumberString: 4.2.3.1)[source]

Bases: FEV_KEGG.Graph.Elements.Element

Represents an enzyme of metabolism by EC number, e.g. ‘4.2.3.1’.

Parameters:

ecNumberString (str) – EC number represented as a string. Will be checked for correct formatting!

Variables:
  • self.ecNumberString (str) – E.g. ‘4.2.3.-’.
  • self.ecNumberLevels (List[str]) – E.g. [‘4’, ‘2’, ‘3’, ‘-’].
  • self.ecNumberLevelsInteger (List[int]) – E.g. [4, 2, 3, -1]. A wildcard is translated to -1.
  • self.description (str) – Descriptive name of the enzymes behind this EC number. Often None. Usually a list of synonymous names.
  • self.name (str) – Short name of the enzymes behind this EC number. Often None. The shortest name occurring in description.
  • self.reaction (str) – IUBMB string describing the reaction formula. Often None.
Raises:

ValueError – If EC number is not formatted correctly.

See also

FEV_KEGG.Graph.SubstanceGraphs.SubstanceEcGraph.addEcDescriptions
The function to download and add self.description, self.name, and self.reaction.
REGEX_PATTERN = re.compile('^[1-7]\\.(([1-9][0-9]{0,1})|\\-)\\.(((?<!\\-\\.)([1-9][0-9]{0,1}))|\\-)\\.(((?<!\\-\\.)([1-9][0-9]{0,2}))|\\-)$')
WILDCARD = '-'
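
The following standalone sketch illustrates how the pattern and wildcard above relate to ecNumberLevels and ecNumberLevelsInteger. It does not import FEV_KEGG; parse_levels is a hypothetical helper, and the real constructor may perform its check differently.

```python
import re

# Pattern copied verbatim from EcNumber.REGEX_PATTERN above.
EC_PATTERN = re.compile(
    r'^[1-7]\.(([1-9][0-9]{0,1})|\-)\.(((?<!\-\.)([1-9][0-9]{0,1}))|\-)\.(((?<!\-\.)([1-9][0-9]{0,2}))|\-)$'
)
WILDCARD = '-'


def parse_levels(ec_string):
    """Hypothetical helper: validate, then split into str and int levels."""
    if EC_PATTERN.match(ec_string) is None:
        raise ValueError('not a valid EC number: ' + ec_string)
    levels = ec_string.split('.')
    # A wildcard is translated to -1, as described for ecNumberLevelsInteger.
    levels_int = [-1 if level == WILDCARD else int(level) for level in levels]
    return levels, levels_int


print(parse_levels('4.2.3.-'))  # (['4', '2', '3', '-'], [4, 2, 3, -1])
```

The negative lookbehinds in the pattern reject a number that follows a wildcard, so ‘4.-.3.1’ does not match while ‘4.-.-.-’ does.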
addDescription()[source]

Query KEGG and add further description to this EC number.

Warning

Much slower than doing addEcDescriptions() for several EC numbers in bulk!

static addEcDescriptions(ecNumbers: Iterable[T_co])[source]

Query KEGG for further descriptions and add them to each EC number in ecNumbers.

contains(ecNumber: FEV_KEGG.Graph.Elements.EcNumber) → bool[source]

Check whether this EC number is a superset of ecNumber, which is made possible by wildcards.

Parameters:ecNumber (EcNumber) – The EC number to compare against.
Returns:True, if the other EC number is part of the set of EC numbers defined by wildcard dashes in the levels of this EC number. For example 1.2.3.- contains 1.2.3.1 up to 1.2.3.999, but 1.2.3.4 can only contain itself.
Return type:bool
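
The superset rule can be sketched on plain strings as follows. This is an illustration under assumptions, not the library's code: contains here is a hypothetical free function, and treating two identical wildcard numbers as containing each other is a choice of this sketch.

```python
WILDCARD = '-'


def contains(outer, inner):
    """Hypothetical helper: True if every level of `outer` is a wildcard
    or equals the corresponding level of `inner`."""
    outer_levels = outer.split('.')
    inner_levels = inner.split('.')
    return all(o == WILDCARD or o == i
               for o, i in zip(outer_levels, inner_levels))


print(contains('1.2.3.-', '1.2.3.1'))  # True
print(contains('1.2.3.4', '1.2.3.5'))  # False
```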
ecNumberLevelsInteger
classmethod fromArray(ecNumberLevels: Iterable[T_co]) → FEV_KEGG.Graph.Elements.EcNumber[source]

Creates EcNumber object from single EC number levels.

Parameters:ecNumberLevels (Iterable) – Iterable of the EC number levels, each either int or str. A wildcard can only be given as str.
Raises:ValueError – If the resulting EC number is not formatted correctly.
hasWildcard() → bool[source]

Whether this EC number contains a wildcard.

Returns:True if this EC number contains a wildcard (-) at any level, otherwise, returns False.
Return type:bool
static insertWildcards(ecNumbers: Iterable[T_co], keepLevels=3, allowHigherWildcards=True, returnSet=True, deduplicateList=False) → Iterable[T_co][source]

Turns EC numbers without wildcards into EC numbers with wildcards.

Returning them in a list preserves order.

Parameters:
  • ecNumbers (Iterable) – The EcNumber objects to abstract using wildcards.
  • keepLevels (int, optional) – The first x levels of each EC number are kept intact. If keepLevels == 3, turns 1.2.3.4 into 1.2.3.-. Only 1, 2, 3, and 4 are allowed. EC numbers already containing wildcards are left unchanged.
  • allowHigherWildcards (bool, optional) – If False, EC numbers with a wildcard in a level above ‘keepLevels’ are removed. For example, with keepLevels == 3: 1.2.3.4 -> 1.2.3.- and 2.3.4.- -> 2.3.4.-, but 3.4.-.- is removed completely.
  • returnSet (bool, optional) – If True, returns results in a set. Takes precedence over ‘deduplicateList’, as sets automatically deduplicate.
  • deduplicateList (bool, optional) – If True, result list is deduplicated before returning, preserving order.
Returns:

Either a list or a set of abstracted EC numbers.

Return type:

Iterable

Raises:

ValueError – If keepLevels is not one of [1, 2, 3, 4].
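
A rough sketch of the abstraction on plain strings, returning a list to preserve order. It assumes that "a wildcard in a level above keepLevels" means a wildcard within the first keepLevels levels, which is consistent with the example given for allowHigherWildcards. The function and parameter names are hypothetical, and set return and deduplication are left out.

```python
WILDCARD = '-'


def insert_wildcards(ec_strings, keep_levels=3, allow_higher_wildcards=True):
    """Hypothetical helper mirroring the documented behaviour on strings."""
    if keep_levels not in (1, 2, 3, 4):
        raise ValueError('keep_levels must be one of [1, 2, 3, 4]')
    result = []
    for ec in ec_strings:
        levels = ec.split('.')
        if WILDCARD in levels:
            # Already abstracted: drop it only if a wildcard sits within
            # the kept levels and that is not allowed; otherwise keep as-is.
            if not allow_higher_wildcards and WILDCARD in levels[:keep_levels]:
                continue
            result.append(ec)
        else:
            kept = levels[:keep_levels] + [WILDCARD] * (4 - keep_levels)
            result.append('.'.join(kept))
    return result


print(insert_wildcards(['1.2.3.4', '2.3.4.-', '3.4.-.-'],
                       keep_levels=3, allow_higher_wildcards=False))
# ['1.2.3.-', '2.3.4.-']
```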

matchingLevels(ecNumber: FEV_KEGG.Graph.Elements.EcNumber, wildcardMatchesNumber=True) → int[source]

Determines the number of levels which match between this EC number and ecNumber.

This could act as a coarse distance measure for EC numbers.

Parameters:
  • ecNumber (EcNumber) – The EC number to compare against.
  • wildcardMatchesNumber (bool, optional) – If True, a wildcard acts as a sure match: ‘1.-.-.-’.matchingLevels(‘1.2.3.4’) = 4. If False, a wildcard only matches another wildcard.
Returns:

Number of consecutive levels that match, if any, starting with the first (leftmost). ‘1.2.3.4’.matchingLevels(‘1.2.6.7’) = 2 because the first two levels match consecutively. ‘1.2.3.4’.matchingLevels(‘2.2.3.4’) = 0 because the very first level does not match.

Return type:

int
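
The counting rule can be reproduced on plain strings with a short sketch. matching_levels is a hypothetical stand-in for the method, written only from the description above.

```python
WILDCARD = '-'


def matching_levels(a, b, wildcard_matches_number=True):
    """Hypothetical helper: count consecutive matching levels from the left."""
    count = 0
    for x, y in zip(a.split('.'), b.split('.')):
        if x == y or (wildcard_matches_number and WILDCARD in (x, y)):
            count += 1
        else:
            break  # only consecutive matches from the first level count
    return count


print(matching_levels('1.2.3.4', '1.2.6.7'))  # 2
print(matching_levels('1.2.3.4', '2.2.3.4'))  # 0
print(matching_levels('1.-.-.-', '1.2.3.4'))  # 4
```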

static removeWildcards(ecNumbers: Iterable[T_co]) → Iterable[T_co][source]

Remove EC numbers containing wildcards from an Iterable.

Parameters:ecNumbers (Iterable[EcNumber]) – The EcNumber objects to check for wildcards.
Returns:A new Iterable of the same type, containing only EC numbers which do not have a wildcard (-) anywhere. This does not deduplicate EC numbers.
Return type:Iterable[EcNumber]
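
A small sketch of the filtering on plain strings. It assumes the input is a list, set, or tuple, so that a new container of the same type can be built from the kept items; remove_wildcards is a hypothetical name.

```python
def remove_wildcards(ec_strings):
    """Hypothetical helper: keep only EC numbers without any wildcard level."""
    kept = [ec for ec in ec_strings if '-' not in ec.split('.')]
    # Rebuild the same container type; duplicates in a list are preserved.
    return type(ec_strings)(kept)


print(remove_wildcards(['1.2.3.4', '1.2.3.-', '1.2.3.4']))
# ['1.2.3.4', '1.2.3.4']
```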
class FEV_KEGG.Graph.Elements.Element(uniqueID: str)[source]

Bases: object

Generic graph element with a uniqueID.

Comparable (==, !=, <, >, <=, >=) and hashable by this unique ID. Converting to a string returns the uniqueID.

Parameters:uniqueID (str) – String uniquely identifying this element among all other possible elements.
Variables:self.uniqueID (str) – Unique element ID.
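
One way to obtain the comparison, hashing, and string behaviour described above is sketched below. ElementSketch is a hypothetical class for illustration; the actual Element implementation may differ.

```python
from functools import total_ordering


@total_ordering
class ElementSketch:
    """Hypothetical stand-in: compared, hashed, and printed by its unique ID."""

    def __init__(self, unique_id):
        self.uniqueID = unique_id

    def __eq__(self, other):
        if not isinstance(other, ElementSketch):
            return NotImplemented
        return self.uniqueID == other.uniqueID

    def __lt__(self, other):
        if not isinstance(other, ElementSketch):
            return NotImplemented
        return self.uniqueID < other.uniqueID

    def __hash__(self):
        return hash(self.uniqueID)

    def __str__(self):
        return self.uniqueID


elements = {ElementSketch('R01899'), ElementSketch('R01899'), ElementSketch('C01102')}
print(sorted(str(e) for e in elements))  # ['C01102', 'R01899']
```

Because equality and hashing both use the ID, two objects with the same ID collapse into one entry in a set or as a dict key.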
getRestUrl()[source]

Get the link to KEGG’s REST-API for this element.

Essentially the same as getUrl(), but meant to be read by machines, therefore no eye-candy.

Returns:URL to KEGG’s REST-API
Return type:str
getUrl()[source]

Get the link to KEGG for this element.

Returns:URL to KEGG.
Return type:str
toHtml(short=False, noTd=False)[source]

Get the Element’s string representation surrounded by its URL as an HTML line.

class FEV_KEGG.Graph.Elements.Enzyme(organismAbbreviation: eco, geneName: b0004, ecNumberStrings: List[str], name: thrC = None, description: (RefSeq) hydrogenase 4, subunit = None)[source]

Bases: FEV_KEGG.Graph.Elements.Element

Represents an enzyme of metabolism.

It has exactly one GeneID, which is its unique identifier.

Parameters:
  • organismAbbreviation (str) – Abbreviation string of the organism this enzyme belongs to, as known to KEGG, e.g. ‘eco’. Must be unique and exist in KEGG.
  • geneName (str) – Name of the gene which represents this enzyme, e.g. ‘b0004’. Will be combined with organismAbbreviation to form the unique GeneID. Thus, must be unique within the organism.
  • ecNumberStrings (List[str]) – List of strings representing the EC numbers associated with this enzyme. Will be split and parsed into EcNumber objects.
  • name (str, optional) – Colloquial name of this enzyme, e.g. ‘thrC’. Not used for automatic identification, so it may be None.
  • description (str, optional) – Full description of this enzyme from KEGG, e.g. ‘(RefSeq) hydrogenase 4, subunit’. Not used for automatic identification, so it may be None.
Variables:
  • self.organismAbbreviation (str) –
  • self.geneName (str) –
  • self.geneID (GeneID) –
  • self.name (str) –
  • self.ecNumbers (Set[EcNumber]) –
  • self.description (str) –
Raises:

ValueError – If organismAbbreviation and geneName do not form a valid gene ID, or if any of the EC numbers in ecNumberStrings is not a valid EC number.

Note

This does not check if the organism, gene ID, EC numbers, or anything else actually exist in KEGG! You will find out eventually when trying to retrieve information about them.

classmethod fromGene(gene: FEV_KEGG.KEGG.DataTypes.Gene) → FEV_KEGG.Graph.Elements.Enzyme[source]

Creates an Enzyme from a FEV_KEGG.KEGG.DataTypes.Gene.

Parameters:gene (Gene) – Gene object, retrieved and parsed from KEGG GENE at some point.
Returns:An enzyme object.
Return type:Enzyme
Raises:ValueError – If organismAbbreviation and geneName do not form a valid gene ID, or if any of the EC numbers in ecNumberStrings is not a valid EC number.
getEcNumbersString()[source]

EC numbers associated with this enzyme as a string.

Returns:EC numbers associated with this enzyme in a string, e.g. ‘1.2.3.4, 2.3.4.5’
Return type:str
class FEV_KEGG.Graph.Elements.EnzymeComplete(gene: FEV_KEGG.KEGG.DataTypes.Gene)[source]

Bases: FEV_KEGG.Graph.Elements.Enzyme

Represents an enzyme of metabolism, saving the original underlying gene description in the gene attribute for later manual use.

The underlying gene description is usually not necessary. Use the parent class to save memory.

Parameters:gene (Gene) – Gene object, retrieved and parsed from KEGG GENE at some point. Will be kept in memory in the gene attribute.
Variables:self.gene (FEV_KEGG.KEGG.DataTypes.Gene) – Original underlying gene description.
Raises:ValueError – See parent class.
class FEV_KEGG.Graph.Elements.GeneID(geneIDString: eco:b0004)[source]

Bases: FEV_KEGG.Graph.Elements.Element

Represents an enzyme of metabolism by gene ID, e.g. ‘eco:b0004’.

Parameters:geneIDString (str) – Gene ID represented by a string, e.g. ‘eco:b0004’. Will be checked for correct formatting!
Variables:self.geneIDString (str) –
Raises:ValueError – If gene ID is not formatted correctly.
REGEX_PATTERN = re.compile('^[a-z]{3,4}:[a-zA-Z0-9_\\-\\.]+$')
geneName

Returns:‘b0004’ from ‘eco:b0004’.
Return type:str

organismAbbreviation

Returns:‘eco’ from ‘eco:b0004’.
Return type:str
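
A standalone sketch of validating a gene ID against the pattern above and splitting it into the two parts. split_gene_id is a hypothetical helper and does not reflect how the GeneID properties are implemented.

```python
import re

# Pattern copied verbatim from GeneID.REGEX_PATTERN above.
GENE_ID_PATTERN = re.compile(r'^[a-z]{3,4}:[a-zA-Z0-9_\-\.]+$')


def split_gene_id(gene_id):
    """Hypothetical helper: return (organismAbbreviation, geneName)."""
    if GENE_ID_PATTERN.match(gene_id) is None:
        raise ValueError('not a valid gene ID: ' + gene_id)
    # The pattern allows exactly one colon, so a single split is enough.
    organism, gene_name = gene_id.split(':', 1)
    return organism, gene_name


print(split_gene_id('eco:b0004'))  # ('eco', 'b0004')
```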

class FEV_KEGG.Graph.Elements.KeggOrthologyID(keggOrthologyIDString: K01733)[source]

Bases: FEV_KEGG.Graph.Elements.Element

Represents an enzyme of metabolism by KEGG Orthology ID.

Parameters:keggOrthologyIDString (str) – String representation of a KEGG Orthology ID. Will be checked for correct formatting!
Variables:self.keggOrthologyIDString (str) –
Raises:ValueError – If KEGG Orthology ID is not formatted correctly.
REGEX_PATTERN = re.compile('^K[0-9]{5}$')
class FEV_KEGG.Graph.Elements.ReactionID(keggReactionID: R01899)[source]

Bases: FEV_KEGG.Graph.Elements.Element

Represents a reaction of metabolism by reaction ID from KEGG, e.g. ‘R01899’.

Parameters:keggReactionID (str) – Unique ID of the reaction.
Variables:self.keggReactionID (str) – Unique reaction ID.

Note

This does not check if the reaction actually exists in KEGG! You will find out eventually when trying to retrieve information about it.

class FEV_KEGG.Graph.Elements.SubstanceID(keggSubstanceID: C01102)[source]

Bases: FEV_KEGG.Graph.Elements.Element

Represents a substrate/product of metabolism by compound/glycan ID from KEGG, e.g. ‘C01102’ or ‘G00160’.

Parameters:
  • keggSubstanceID (str) – Unique ID of the compound or glycan.
  • description (str, optional) – Descriptive chemical name of the compound/glycan.
Variables:
  • self.keggCompoundID (str) – Unique compound/glycan ID.
  • self.description (str) – Descriptive chemical name of the compound/glycan. Often None. Usually a list of synonymous names.
  • self.name (str) – Short chemical name of the compound/glycan. Often None. The shortest name occurring in description.
Raises:

DrugIdError – Drug IDs, e.g. D08603, raise a DrugIdError, because only compounds and glycans are useful in our model of metabolism. Use the synonymous Compound ID instead.

Note

This does not check if the compound/glycan actually exists in KEGG! You will find out eventually when trying to retrieve information about it.

See also

FEV_KEGG.Graph.SubstanceGraphs.SubstanceGraph.addSubstanceDescriptions
The function to download and add self.description, and self.name.
REGEX_PATTERN = re.compile('^C|G[0-9]{5}$')
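
In Python's re syntax, alternation binds more loosely than the anchors, so the pattern above reads as "starts with C" or "G followed by five digits at the end". The sketch below only demonstrates this behaviour of the re module. How FEV_KEGG applies the pattern is not shown here, and the stricter variant is an assumption of this sketch, not part of the library.

```python
import re

# Pattern copied verbatim from SubstanceID.REGEX_PATTERN above.
documented = re.compile(r'^C|G[0-9]{5}$')
# Assumed stricter variant for comparison; NOT taken from FEV_KEGG.
strict = re.compile(r'^[CG][0-9]{5}$')


def accepted(pattern, candidate):
    """Hypothetical helper: True if the pattern matches from the start."""
    return pattern.match(candidate) is not None


for candidate in ('C01102', 'G00160', 'D08603', 'Cabc'):
    print(candidate, accepted(documented, candidate), accepted(strict, candidate))
```

Both patterns accept ‘C01102’ and ‘G00160’ and reject the drug ID ‘D08603’. Only the documented pattern also accepts a string such as ‘Cabc’.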