Name Mapping

The name mapping module implements the functionality required to map external IDs such as HMDB, ChEBI, KEGG or InChI keys to the internal IDs used in the mantra database.

While it is possible to use the base classes for mapping between certain databases, we recommend using the wrapper class (NameMapper).

Wrapper

class pymantra.namemapping.NameMapper(**sqlite3_args)

Metabolite ID mapping class

Mapping between HMDB, ChEBI, NCBI, KEGG, Reactome, InChI, SMILES, VMH and the internal database IDs. The mappings are taken from the respective resources themselves.

Parameters:
  • hmdb (HMDBQuery) – Interface to query from or to HMDB IDs

  • chebi (ChEBIQuery) – Interface to query from or to ChEBI IDs

  • reactome (ReactomeQuery) – Interface to query from or to Reactome IDs

  • mantra_db (MantraDBQuery) – Interface to query from or to mantra-internal IDs

  • query_functions (Dict[Tuple[str, str], callable]) – Dictionary mapping (source ID type, target ID type) to the correct mapping function
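The query_functions attribute describes a dispatch table keyed by (source ID type, target ID type). The sketch below illustrates that pattern in isolation. The two lookup functions, the placeholder strings they return, and the local UnknownMappingError stand-in are hypothetical and are not pymantra code; whether the library raises its own exception in this situation is an assumption.

```python
from typing import Callable, Dict, List, Tuple


class UnknownMappingError(Exception):
    """Local stand-in (assumption) mirroring the documented constructor."""

    def __init__(self, message: str):
        super().__init__(message)
        self.message = message


def _kegg_to_reactome(id_: str) -> List[str]:
    # Hypothetical lookup: returns a placeholder, not a real Reactome ID
    return [f"reactome-placeholder-for-{id_}"]


def _reactome_to_kegg(id_: str) -> List[str]:
    # Hypothetical lookup: returns a placeholder, not a real KEGG ID
    return [f"kegg-placeholder-for-{id_}"]


# (source ID type, target ID type) -> function performing that mapping
QUERY_FUNCTIONS: Dict[Tuple[str, str], Callable[[str], List[str]]] = {
    ("kegg", "reactome"): _kegg_to_reactome,
    ("reactome", "kegg"): _reactome_to_kegg,
}


def dispatch(id_: str, id_type: str, map_to: str) -> List[str]:
    """Look up the mapping function for the requested pair and call it."""
    try:
        query = QUERY_FUNCTIONS[(id_type, map_to)]
    except KeyError:
        raise UnknownMappingError(
            f"No mapping from '{id_type}' to '{map_to}'"
        ) from None
    return query(id_)
```

Keying the table on the pair keeps each direction of a mapping as a separate entry, so an unsupported direction fails with a clear message rather than a silent empty result.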

__init__(**sqlite3_args)

Construct a NameMapper instance

Parameters:

sqlite3_args – Optional keyword arguments to be passed to sqlite3.connect() for all database connections

close()

Ensure that all database connections are closed
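Because the class exposes close(), the standard-library contextlib.closing helper can guarantee that connections are released even if a mapping call raises. The following is a minimal sketch of that pattern. StubMapper is a hypothetical stand-in so that the snippet runs without the mantra databases, and the timeout value is only an example of a keyword that sqlite3.connect() accepts.

```python
from contextlib import closing


class StubMapper:
    """Hypothetical stand-in exposing only the documented close() method."""

    def __init__(self, **sqlite3_args):
        # A real NameMapper would forward these to sqlite3.connect()
        self.sqlite3_args = sqlite3_args
        self.closed = False

    def close(self):
        self.closed = True


# With pymantra installed, NameMapper(timeout=10) would take the stub's place
with closing(StubMapper(timeout=10)) as mapper:
    forwarded = mapper.sqlite3_args

# close() has been called on leaving the with-block
assert mapper.closed
```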

inchi_to_internal(inchi_id: str) → List[str]

Map InChI key to mantra-internal ID

Parameters:

inchi_id (str) – InChI key

Returns:

List of mapped internal IDs

Return type:

List[str]
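Since the InChI-based methods expect an InChI key rather than a full InChI string, a quick format check before querying can catch mix-ups. This is a sketch under the assumption that inputs are standard 27-character InChIKeys (14 letters, 10 letters, 1 letter, separated by hyphens); the helper name is hypothetical and the check is not part of pymantra.

```python
import re

# Standard InChIKey layout: 14 uppercase letters, 10 uppercase letters,
# and 1 uppercase letter, separated by hyphens
INCHI_KEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchi_key(value: str) -> bool:
    """Return True if value has the shape of a standard InChIKey."""
    return INCHI_KEY_PATTERN.fullmatch(value.strip()) is not None


# InChIKey of water versus the corresponding full InChI string
water_key_ok = looks_like_inchi_key("XLYOFNOQVPJJNP-UHFFFAOYSA-N")
water_inchi_ok = looks_like_inchi_key("InChI=1S/H2O/h1H2")
```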

inchi_to_reactome(inchi_id: str) → List[str]

Map InChI key to Reactome ID

Parameters:

inchi_id (str) – InChI key

Returns:

List of mapped Reactome IDs

Return type:

List[str]

kegg_to_reactome(kegg_id: str) → List[str]

Map KEGG ID to Reactome ID

Parameters:

kegg_id (str) – KEGG ID

Returns:

List of mapped Reactome IDs

Return type:

List[str]

map_data(data: DataFrame | List[Dict[str, str]], remove_na: bool = False) → DataFrame | List[MetaboliteIdentification]

Maps multiple metabolites with one or multiple database identifiers to KEGG and Reactome IDs

Parameters:
  • data (Union[pd.DataFrame, List[Dict[str, str]]]) – Database identifiers to map. If a pandas DataFrame, each row represents a metabolite and each column a database (i.e. each cell is a database identifier). If a list, each list entry represents a metabolite and each key-value pair a database-ID pair.

  • remove_na (bool, default False) – If True, metabolites for which no match was found are removed. This option is only considered when data is a DataFrame.

Returns:

If data is a DataFrame, so is the output. Columns in this case are ‘kegg’ and ‘reactome’ and index is the same as data.index. Else a list of 2-tuples, where each tuple represents the database ids for KEGG (0, kegg) and Reactome (1, reactome) found for the respective input.

Return type:

Union[pd.DataFrame, List[MetaboliteIdentification]]
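The list form of data can be assembled from a simple table with the standard library alone. The sketch below builds one record per metabolite, with one key per database. The helper name is hypothetical, and the column names "hmdb" and "kegg" are assumed to match the ID type strings the mapper accepts.

```python
import csv
import io
from typing import Dict, List


def records_from_csv(text: str) -> List[Dict[str, str]]:
    """Turn a CSV table (one column per database) into a list of records."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        # Keep only the databases for which this metabolite has an identifier
        records.append({db: id_ for db, id_ in row.items() if id_})
    return records


# One metabolite per row; empty cells mean "no identifier for this database"
table = "hmdb,kegg\nHMDB0003255,\n,C00317\n"
records = records_from_csv(table)
```

With pymantra available, a list like records would be the kind of input that map_data describes for the non-DataFrame case.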

map_id(id_: str, id_type: str, map_to: str | List[str], **kwargs) → List[str] | List[tuple] | Set[tuple]

Wrapper for name mapping functions. Takes an ID from a supported database and converts it to corresponding identifiers from other databases.

Parameters:
  • id_ (str) – ID to map

  • id_type (str) – Database/ID type from which id_ originates

  • map_to (Union[str, List[str]]) – String or list of strings specifying the databases to which id_ should be mapped

Returns:

If map_to is a string, a list of strings in which each element represents a match in the target database. If map_to is a list, a list or set of tuples in which each element represents a mapping: the first element is the mapped ID and the second element is the database from which this ID originates.

Return type:

Union[List[str], List[tuple], Set[tuple]]
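Because the return shape depends on whether map_to is a string or a list, downstream code may want a single uniform shape. The following sketch converts either documented shape into a sorted list of (mapped ID, database) tuples. The helper name and the placeholder IDs are hypothetical; only the two return shapes are taken from the description above.

```python
from typing import Iterable, List, Tuple, Union


def as_pairs(
    result: Iterable, map_to: Union[str, List[str]]
) -> List[Tuple[str, str]]:
    """Normalise a map_id-style result to sorted (mapped ID, database) pairs."""
    if isinstance(map_to, str):
        # Single target: result is a list of plain ID strings
        return sorted({(mapped_id, map_to) for mapped_id in result})
    # Several targets: result is already a list or set of (ID, database) tuples
    return sorted(set(result))


# Placeholder values, not real database identifiers
single = as_pairs(["ID_B", "ID_A"], "kegg")
multi = as_pairs({("ID_X", "reactome"), ("ID_A", "kegg")}, ["kegg", "reactome"])
```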

Examples

>>> from pymantra.namemapping import NameMapper
>>> name_map = NameMapper()
>>>
>>> hmdb_ids = ["HMDB0003255", "HMDB0001051", "HMDB0006404"]
>>> hmdb_mapping = [name_map.map_id(id_, "hmdb", "internal") for id_ in hmdb_ids]
>>>
>>> kegg_ids = ["C00317", "C02154", "C05274"]
>>> for id_ in kegg_ids:
...     print(name_map.map_id(id_, "kegg", "internal"))

map_to_many(ids: Dict[str, List[str]], map_to: str | List[str], remove_duplicates: bool = True) → Dict[str, List[List[str]]] | Dict[str, List[str]] | Dict[str, List[Tuple[str, str]]]

Mapping multiple entries of multiple databases onto multiple other databases.

Parameters:
  • ids (Dict[str, List[str]]) – All ids to query (values) by id type (keys)

  • map_to (Union[str, List[str]]) – ID type(s) to map to

  • remove_duplicates (bool, default True) – If True, only the first match will be returned if multiple matches are found

Return type:

Union[Dict[str, List[List[str]]], Dict[str, List[str]], Dict[str, List[Tuple[str, str]]]]
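The ids argument groups the IDs to query by their ID type. A minimal sketch of building that dictionary from a flat list of (ID type, ID) pairs is shown below; the helper name is hypothetical, and the type strings "hmdb" and "kegg" are assumed to be the ones the mapper accepts.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_ids_by_type(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (ID type, ID) pairs into {ID type: [IDs]}, preserving order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for id_type, id_ in pairs:
        # Skip exact repeats so that every ID is queried only once
        if id_ not in grouped[id_type]:
            grouped[id_type].append(id_)
    return dict(grouped)


ids = group_ids_by_type([
    ("hmdb", "HMDB0003255"),
    ("kegg", "C00317"),
    ("hmdb", "HMDB0001051"),
    ("kegg", "C00317"),
])
```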

print_conversion_options()

Print all conversion options

reactome_to_inchi(reactome_id: str) → List[str]

Map Reactome ID to InChI key(s)

Parameters:

reactome_id (str) – Reactome ID

Returns:

List of mapped InChI keys

Return type:

List[str]

reactome_to_kegg(reactome_id: str) → List[str]

Map Reactome ID to KEGG ID(s)

Parameters:

reactome_id (str) – Reactome ID

Returns:

List of mapped KEGG IDs

Return type:

List[str]

property conversion_options

List all conversion options

Returns:

List of conversion options

Return type:

List[str]

Base Classes

class pymantra.namemapping.ChEBIQuery(*args, **kwargs)

Query class to map from and to ChEBI

__init__(*args, **kwargs)

active_connection() → bool

Verifying the connection to the database

Returns:

True if connection is established, else False

Return type:

bool
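Each query class reports its connection state through active_connection(), which allows a guard before running lookups. The sketch below shows one way to write such a guard. StubQuery is a hypothetical stand-in that exposes only the documented method, and raising ConnectionError is a design choice of this example, not library behaviour.

```python
class StubQuery:
    """Hypothetical stand-in exposing only active_connection()."""

    def __init__(self, connected: bool):
        self._connected = connected

    def active_connection(self) -> bool:
        return self._connected


def require_connection(query, name: str) -> None:
    """Raise early if the query object has no active database connection."""
    if not query.active_connection():
        raise ConnectionError(f"{name} database connection is not active")


# Passes silently for a connected object
require_connection(StubQuery(connected=True), "ChEBI")
```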

class pymantra.namemapping.HMDBQuery(*args, **kwargs)

Query class to map IDs between databases

__init__(*args, **kwargs)

active_connection() → bool

Verifying the connection to the database

Returns:

True if connection is established, else False

Return type:

bool

multi_taxonomy_from_foreign_id(id_colum: str, taxonomy_columns: List[str], foreign_id: str) → List[tuple] | Set[tuple]

Get the HMDB entries for multiple hierarchies from a non-HMDB ID

taxonomy_from_foreign_id(id_colum: str, taxonomy_column: str, foreign_id: str) → List[str]

Get an HMDB hierarchy entry from a non-HMDB ID

class pymantra.namemapping.ReactomeQuery(*args, **kwargs)

Queries to map Reactome IDs to ChEBI and NCBI

__init__(*args, **kwargs)

active_connection() → bool

Verifying the connection to the database

Returns:

True if connection is established, else False

Return type:

bool

class pymantra.namemapping.MantraDBQuery(*args, **kwargs)

Query class to map mantra IDs from and to other databases

__init__(*args, **kwargs)

active_connection() → bool

Verifying the connection to the database

Returns:

True if connection is established, else False

Return type:

bool

chebi_to_internal(chebi_id: str, single_value: bool = False) → List[str] | str | None

Map a ChEBI ID to a mantra ID

hmdb_to_internal(hmdb_id: str, single_value: bool = False) → List[str] | str | None

Map an HMDB ID to a mantra ID

kegg_to_internal(kegg_id: str, single_value: bool = False) → List[str] | str | None

Map a KEGG ID to a mantra ID

reactome_to_internal(reactome_id: str, single_value: bool = False) → List[str] | str | None

Map a Reactome ID to a mantra ID

vmh_to_internal(vmh_id: str, single_value: bool = False) → List[str] | str | None

Map a VMH ID to a mantra ID
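The *_to_internal methods are annotated as returning a list of strings, a single string, or None. Callers that want one shape can normalise the result, as in the sketch below. The helper name and the placeholder IDs are hypothetical, and the link between single_value and the string or None shapes is an assumption, since the reference only gives the type annotation.

```python
from typing import List, Union


def as_id_list(result: Union[List[str], str, None]) -> List[str]:
    """Normalise any of the three annotated return shapes to a list of IDs."""
    if result is None:
        # No match
        return []
    if isinstance(result, str):
        # Single match returned as a bare string
        return [result]
    return list(result)


# Placeholder values, not real mantra IDs
no_match = as_id_list(None)
one_match = as_id_list("ID_A")
many_matches = as_id_list(["ID_A", "ID_B"])
```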

Exceptions

class pymantra.namemapping.UnknownMappingError(message: str)

__init__(message: str)