Name Mapping
The name mapping module implements all functionality required to map external IDs such as HMDB, ChEBI, KEGG or InChI keys to the internal IDs used in the mantra database.
While it is possible to use the base classes for mapping between certain databases, we recommend using the wrapper class (NameMapper).
Wrapper
- class pymantra.namemapping.NameMapper(**sqlite3_args)
Metabolite ID mapping class
Mapping between HMDB, ChEBI, NCBI, KEGG, Reactome, InChI, SMILES, VMH and the internal database IDs. The mappings are taken from the respective resources themselves.
- Parameters:
hmdb (HMDBQuery) – Interface to query from or to HMDB IDs
chebi (ChEBIQuery) – Interface to query from or to ChEBI IDs
reactome (ReactomeQuery) – Interface to query from or to Reactome IDs
mantra_db (MantraDBQuery) – Interface to query from or to mantra-internal IDs
query_functions (Dict[Tuple[str, str], callable]) – Dictionary mapping (source ID type, target ID type) to the correct mapping function
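The `query_functions` attribute describes a dispatch table keyed by (source ID type, target ID type). The following is a minimal sketch of that pattern only, not pymantra's implementation: the lookup tables, the placeholder IDs and the names `toy_hmdb_to_kegg`, `toy_kegg_to_hmdb` and `dispatch` are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy tables with placeholder IDs (not real database content).
_HMDB_TO_KEGG: Dict[str, List[str]] = {
    "HMDB_A": ["KEGG_1"],
    "HMDB_B": ["KEGG_2", "KEGG_3"],
}
_KEGG_TO_HMDB: Dict[str, List[str]] = {"KEGG_1": ["HMDB_A"]}


def toy_hmdb_to_kegg(id_: str) -> List[str]:
    """Look up a placeholder HMDB ID; unknown IDs give an empty list."""
    return _HMDB_TO_KEGG.get(id_, [])


def toy_kegg_to_hmdb(id_: str) -> List[str]:
    """Look up a placeholder KEGG ID; unknown IDs give an empty list."""
    return _KEGG_TO_HMDB.get(id_, [])


# (source ID type, target ID type) -> mapping function
query_functions: Dict[Tuple[str, str], Callable[[str], List[str]]] = {
    ("hmdb", "kegg"): toy_hmdb_to_kegg,
    ("kegg", "hmdb"): toy_kegg_to_hmdb,
}


def dispatch(id_: str, id_type: str, map_to: str) -> List[str]:
    """Select the mapping function for the requested pair and apply it."""
    try:
        func = query_functions[(id_type, map_to)]
    except KeyError:
        raise ValueError(f"No mapping from {id_type!r} to {map_to!r}") from None
    return func(id_)


print(dispatch("HMDB_B", "hmdb", "kegg"))
```

Keeping the pairs in one dictionary means an unsupported combination fails in a single place with a clear error.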
- __init__(**sqlite3_args)
Construct NameMapper instance
- Parameters:
sqlite3_args – Optional keyword arguments to be passed to sqlite3.connect() for all database connections
- map_data(data: DataFrame | List[Dict[str, str]], remove_na: bool = False) -> DataFrame | List[MetaboliteIdentification]
Maps multiple metabolites with one or multiple database identifiers to KEGG and Reactome IDs
- Parameters:
data (Union[pd.DataFrame, List[Dict[str, str]]]) – Database identifiers to map. If a pandas DataFrame, each row represents a metabolite and each column a database (i.e. each cell is a database identifier). If a list, each entry represents a metabolite and each key-value pair a database-ID pair.
remove_na (bool, default False) – If True, metabolites for which no match was found are removed. This option is only considered when data is a DataFrame.
- Returns:
If data is a DataFrame, so is the output. In this case the columns are ‘kegg’ and ‘reactome’ and the index is the same as data.index. Otherwise, a list of 2-tuples, where each tuple holds the database IDs for KEGG (0, kegg) and Reactome (1, reactome) found for the respective input.
- Return type:
Union[pd.DataFrame, List[MetaboliteIdentification]]
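A minimal sketch of the list-of-dicts input described above. The key names ("hmdb", "kegg") are an assumption, chosen to match the ID type strings used with map_id. The helper `map_metabolites` is hypothetical and only defined here, not called, because it needs pymantra and its database to be available.

```python
from typing import Dict, List

# One dict per metabolite; each key is a database name, each value an ID.
# Assumption: the key names follow the ID type strings used with map_id.
metabolites: List[Dict[str, str]] = [
    {"hmdb": "HMDB0003255"},
    {"kegg": "C00317"},
]


def map_metabolites(records: List[Dict[str, str]]):
    """Map records to KEGG and Reactome IDs via NameMapper.map_data.

    Hypothetical helper; requires pymantra and its database.
    """
    # Imported lazily so the module loads without pymantra installed.
    from pymantra.namemapping import NameMapper

    mapper = NameMapper()
    # For list input, the documented result is one entry per metabolite.
    return mapper.map_data(records)
```

With a DataFrame, the same identifiers would sit in one column per database and one row per metabolite, and `remove_na=True` would drop unmatched rows.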
- map_id(id_: str, id_type: str, map_to: str | List[str], **kwargs) -> List[str] | List[tuple] | Set[tuple]
Wrapper for name mapping functions. Takes an ID from a supported database and converts it to corresponding identifiers from other databases.
- Parameters:
- Returns:
List of strings if map_to is a string, where each element represents a match in the target database. List or set of tuples if map_to is a list, where each element represents a mapping: the first element is the mapped ID and the second element is the database this ID comes from.
- Return type:
Union[List[str], List[tuple], Set[tuple]]
Examples
>>> from pymantra.namemapping import NameMapper
>>> name_map = NameMapper()
>>>
>>> # Map each HMDB ID to the internal mantra IDs
>>> hmdb_ids = ["HMDB0003255", "HMDB0001051", "HMDB0006404"]
>>> hmdb_mapping = [name_map.map_id(id_, "hmdb", "internal") for id_ in hmdb_ids]
>>>
>>> # Map each KEGG ID to the internal mantra IDs and print the matches
>>> kegg_ids = ["C00317", "C02154", "C05274"]
>>> for id_ in kegg_ids:
...     print(name_map.map_id(id_, "kegg", "internal"))
- map_to_many(ids: Dict[str, List[str]], map_to: str | List[str], remove_duplicates: bool = True) -> Dict[str, List[List[str]]] | Dict[str, List[str]] | Dict[str, List[Tuple[str, str]]]
Map multiple entries from multiple databases onto multiple other databases.
- Parameters:
- Returns:
Union[Dict[str, List[str]], Dict[str, List[Tuple[str, str]]]]
- Return type:
Base Classes
- class pymantra.namemapping.ChEBIQuery(*args, **kwargs)
Query class to map from and to ChEBI
- class pymantra.namemapping.HMDBQuery(*args, **kwargs)
Query class to map IDs between databases
- active_connection() -> bool
Verify the connection to the database
- Returns:
True if connection is established, else False
- Return type:
bool
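Because the constructor arguments are forwarded to sqlite3.connect(), a connection check of this kind can be illustrated with the standard sqlite3 module. This is a sketch under that assumption, not the pymantra implementation; `connection_is_active` is a hypothetical name.

```python
import sqlite3


def connection_is_active(conn: sqlite3.Connection) -> bool:
    """Return True if a trivial query succeeds on the connection."""
    try:
        conn.execute("SELECT 1")
    except sqlite3.Error:
        # Raised, for example, when the connection has been closed.
        return False
    return True


conn = sqlite3.connect(":memory:")
print(connection_is_active(conn))  # True
conn.close()
print(connection_is_active(conn))  # False
```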
- class pymantra.namemapping.ReactomeQuery(*args, **kwargs)
Queries to map Reactome IDs to ChEBI and NCBI
- class pymantra.namemapping.MantraDBQuery(*args, **kwargs)
Query class to map mantra IDs from and to other databases
- active_connection() -> bool
Verify the connection to the database
- Returns:
True if connection is established, else False
- Return type:
bool
- chebi_to_internal(chebi_id: str, single_value: bool = False) -> List[str] | str | None
Map a ChEBI ID to a mantra ID
- hmdb_to_internal(hmdb_id: str, single_value: bool = False) -> List[str] | str | None
Map an HMDB ID to a mantra ID
- kegg_to_internal(kegg_id: str, single_value: bool = False) -> List[str] | str | None
Map a KEGG ID to a mantra ID
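The three `*_to_internal` methods are annotated as returning a list of strings, a single string, or None. Calling code that wants one uniform shape could normalise the result as sketched below. `as_id_list` is a hypothetical helper, not part of pymantra, and it relies only on the annotated return type.

```python
from typing import List, Union


def as_id_list(result: Union[List[str], str, None]) -> List[str]:
    """Normalise a `List[str] | str | None` result to a plain list."""
    if result is None:
        # No match: represent as an empty list.
        return []
    if isinstance(result, str):
        # Single ID: wrap it so callers can always iterate.
        return [result]
    return list(result)
```

For example, `as_id_list(mapper.kegg_to_internal(kegg_id))` could then be iterated in every case, where `mapper` is assumed to be a MantraDBQuery instance.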