Command Line Interface
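The pysec2pri CLI provides one command per supported database, plus utility commands for exporting all formats, comparing mapping files, and updating identifiers or symbols in your own tables.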
pysec2pri
pysec2pri: Secondary to Primary ID mapping tool.
Usage
pysec2pri [OPTIONS] COMMAND [ARGS]...
Options
- --version
Show the version and exit.
all
Export all formats for specified datasources.
Usage
pysec2pri all [OPTIONS]
Options
- -o, --output-dir <output_dir>
Output directory
- --datasources <datasources>
Comma-separated list of datasources
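A minimal sketch of assembling this command from Python, assuming the value passed to --datasources is a plain comma-joined string with no spaces. The helper name build_all_command is hypothetical and not part of pysec2pri; only the flags documented above are used.

```python
from typing import List, Optional, Sequence


def build_all_command(
    datasources: Optional[Sequence[str]] = None,
    output_dir: Optional[str] = None,
) -> List[str]:
    """Build an argv list for `pysec2pri all` (hypothetical helper)."""
    argv = ["pysec2pri", "all"]
    if output_dir is not None:
        argv += ["--output-dir", output_dir]
    if datasources:
        # Assumption: datasources are joined with commas and no spaces.
        argv += ["--datasources", ",".join(datasources)]
    return argv


if __name__ == "__main__":
    print(build_all_command(["chebi", "hgnc"], "out"))
```

The resulting list can be handed to subprocess.run(argv, check=True) once pysec2pri is installed.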
chebi
Parse ChEBI data and generate mappings.
Usage
pysec2pri chebi [OPTIONS] COMMAND [ARGS]...
ids
Generate ChEBI ID mappings (secondary to primary ChEBI IDs).
Formats: sec2pri (dict), pri_ids (set of current IDs), sssom/rdf/json/owl/all.
Usage
pysec2pri chebi ids [OPTIONS] [INPUT_FILE]
Options
- -o, --output <output>
Output file or directory
- --version <data_version>
Datasource release version.
- --subset <subset>
Compound subset.
- Default:
'3star'
- Options:
3star | complete
- --format <output_format>
- Default:
'sssom'
- Options:
sssom | sec2pri | pri_ids | rdf | json | owl | all
Arguments
- INPUT_FILE
Optional argument
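The sec2pri and pri_ids formats named under chebi ids can be pictured as a dict and a set. The sketch below is illustrative only: it assumes the dict maps each secondary ID to its primary ID, that the set holds every current ID, and it says nothing about how pysec2pri writes these to disk. The function build_formats and the identifiers are placeholders.

```python
from typing import Dict, Iterable, List, Set, Tuple


def build_formats(
    records: Iterable[Tuple[str, List[str]]],
) -> Tuple[Dict[str, str], Set[str]]:
    """Derive a sec2pri dict and a pri_ids set from (primary, secondaries) records."""
    sec2pri: Dict[str, str] = {}
    pri_ids: Set[str] = set()
    for primary, secondaries in records:
        pri_ids.add(primary)  # every current ID, with or without secondaries
        for secondary in secondaries:
            sec2pri[secondary] = primary
    return sec2pri, pri_ids


if __name__ == "__main__":
    # Placeholder identifiers, not real database entries.
    records = [("NEW:1", ["OLD:1", "OLD:2"]), ("NEW:2", [])]
    mapping, current = build_formats(records)
    print(mapping)
    print(sorted(current))
```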
synonyms
Generate ChEBI synonym mappings (previous/alias name to current name).
Formats: symbol_sec2pri (dict), name2synonym, pri_symbols (set of current names), sssom/rdf/json/owl/all.
Usage
pysec2pri chebi synonyms [OPTIONS] [INPUT_FILE]
Options
- -o, --output <output>
Output file or directory
- --version <data_version>
Datasource release version.
- --subset <subset>
Compound subset.
- Default:
'3star'
- Options:
3star | complete
- --format <output_format>
- Default:
'sssom'
- Options:
sssom | symbol_sec2pri | name2synonym | pri_symbols | rdf | json | owl | all
Arguments
- INPUT_FILE
Optional argument
diff
Compare two SSSOM mapping files and show differences.
Usage
pysec2pri diff [OPTIONS] FILE1 FILE2
Options
- -o, --output <output>
Output file for diff results (TSV)
- --show-all
Show all differences
- --datasource <datasource>
Datasource name for diff summary
Arguments
- FILE1
Required argument
- FILE2
Required argument
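To illustrate what comparing two mapping files involves, here is a minimal sketch that is independent of pysec2pri's own implementation. It assumes both inputs are SSSOM TSV files with subject_id and object_id columns and that metadata lines start with "#"; the function names read_pairs and diff_pairs are hypothetical.

```python
import csv
import io
from typing import Set, TextIO, Tuple

Pair = Tuple[str, str]


def read_pairs(handle: TextIO) -> Set[Pair]:
    """Collect (subject_id, object_id) pairs from an SSSOM TSV stream."""
    # Skip the commented metadata block that precedes the header row.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    return {(row["subject_id"], row["object_id"]) for row in reader}


def diff_pairs(old: Set[Pair], new: Set[Pair]) -> Tuple[Set[Pair], Set[Pair]]:
    """Return (added, removed) mappings between two files."""
    return new - old, old - new


if __name__ == "__main__":
    # Placeholder identifiers, not real database entries.
    header = "subject_id\tpredicate_id\tobject_id\n"
    old_tsv = "#curie_map: {}\n" + header + "X:1\tp\tX:9\nX:2\tp\tX:9\n"
    new_tsv = header + "X:2\tp\tX:9\nX:3\tp\tX:8\n"
    added, removed = diff_pairs(
        read_pairs(io.StringIO(old_tsv)), read_pairs(io.StringIO(new_tsv))
    )
    print("added:", sorted(added))
    print("removed:", sorted(removed))
```

Comparing on pairs rather than whole rows keeps the sketch insensitive to metadata columns that change between releases; a stricter comparison could include predicate_id in the key.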
hgnc
Parse HGNC files and generate mappings.
Usage
pysec2pri hgnc [OPTIONS] COMMAND [ARGS]...
ids
Generate HGNC ID mappings (secondary to primary HGNC IDs).
Formats: sec2pri (dict), pri_ids (set of all current IDs), sssom/rdf/json/owl/all.
Usage
pysec2pri hgnc ids [OPTIONS] [INPUT_FILE]
Options
- -o, --output <output>
Output file or directory
- --version <data_version>
Datasource release version.
- --format <output_format>
- Default:
'sssom'
- Options:
sssom | sec2pri | pri_ids | rdf | json | owl | all
Arguments
- INPUT_FILE
Optional argument
symbols
Generate HGNC symbol mappings (previous/alias to current symbol).
Formats: symbol_sec2pri (dict), pri_symbols (set of current symbols), sssom/rdf/json/owl/all.
Usage
pysec2pri hgnc symbols [OPTIONS] [COMPLETE_SET_FILE]
Options
- -o, --output <output>
Output file or directory
- --version <data_version>
Datasource release version.
- --format <output_format>
- Default:
'sssom'
- Options:
sssom | symbol_sec2pri | pri_symbols | rdf | json | owl | all
Arguments
- COMPLETE_SET_FILE
Optional argument
hmdb
Parse HMDB XML files and generate secondary-to-primary mappings.
Usage
pysec2pri hmdb [OPTIONS]
Options
- --metabolites-file <metabolites_file>
- --proteins-file <proteins_file>
- --metabolites-only
- --proteins-only
- -o, --output <output>
Output file or directory
- --version <data_version>
Datasource release version.
- --format <output_format>
- Default:
'sssom'
- Options:
sssom | sec2pri | pri_ids | rdf | json | owl | all
ncbi
Parse NCBI Gene files and generate mappings.
Usage
pysec2pri ncbi [OPTIONS] COMMAND [ARGS]...
ids
Generate NCBI Gene ID mappings (discontinued to current Gene IDs).
Formats: sec2pri (dict), pri_ids (set of all current IDs), sssom/rdf/json/owl/all.
Usage
pysec2pri ncbi ids [OPTIONS] [INPUT_FILE]
Options
- -o, --output <output>
Output file or directory
- --tax-id <tax_id>
Taxonomy ID.
- Default:
'9606'
- --version <data_version>
Datasource release version.
- --format <output_format>
- Default:
'sssom'
- Options:
sssom | sec2pri | pri_ids | rdf | json | owl | all
Arguments
- INPUT_FILE
Optional argument
symbols
Generate NCBI Gene symbol mappings (previous to current gene symbols).
Formats: symbol_sec2pri (dict), pri_symbols (set of current symbols), sssom/rdf/json/owl/all.
Usage
pysec2pri ncbi symbols [OPTIONS] [INPUT_FILE]
Options
- -o, --output <output>
Output file or directory
- --tax-id <tax_id>
Taxonomy ID.
- Default:
'9606'
- --version <data_version>
Datasource release version.
- --format <output_format>
- Default:
'sssom'
- Options:
sssom | symbol_sec2pri | pri_symbols | rdf | json | owl | all
Arguments
- INPUT_FILE
Optional argument
uniprot
Parse UniProt secondary accessions and generate mappings.
Usage
pysec2pri uniprot [OPTIONS] [INPUT_FILE]
Options
- -o, --output <output>
Output file or directory
- --version <data_version>
Datasource release version.
- --format <output_format>
- Default:
'sssom'
- Options:
sssom | sec2pri | pri_ids | rdf | json | owl | all
- --delac-file <delac_file>
Arguments
- INPUT_FILE
Optional argument
update-ids
Resolve secondary IDs in INPUT_FILE to primary IDs using DATASOURCE mappings.
For each column specified with --at, a new column <col><suffix> is
added to the output containing the resolved primary identifiers.
Identifiers not found in the mapping are kept unchanged.
Pass --mapping to skip downloading/regenerating the mapping set and use an existing sec2pri TSV file instead.
Example:
pysec2pri update-ids my_genes.tsv hgnc --at gene_id -o my_genes_updated.tsv
pysec2pri update-ids my_genes.tsv hgnc --at gene_id --mapping hgnc_sec2pri.tsv
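The column-resolution behaviour described above can be sketched in a few lines. This is an illustration under stated assumptions, not pysec2pri's code: the mapping is taken to be a plain secondary-to-primary dict, table rows are dicts, and resolve_column is a hypothetical helper. The identifiers are placeholders.

```python
from typing import Dict, List


def resolve_column(
    rows: List[Dict[str, str]],
    column: str,
    sec2pri: Dict[str, str],
    suffix: str = "_primary",
) -> List[Dict[str, str]]:
    """Add `<column><suffix>` holding the resolved identifier for each row."""
    resolved = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows stay untouched
        value = row[column]
        # Identifiers missing from the mapping are kept unchanged.
        new_row[column + suffix] = sec2pri.get(value, value)
        resolved.append(new_row)
    return resolved


if __name__ == "__main__":
    mapping = {"OLD:1": "NEW:1"}  # placeholder IDs
    table = [{"gene_id": "OLD:1"}, {"gene_id": "NEW:2"}]
    for row in resolve_column(table, "gene_id", mapping):
        print(row)
```

Writing the result to a new column rather than overwriting the original keeps the input values available for auditing which rows changed.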
Usage
pysec2pri update-ids [OPTIONS] INPUT_FILE {chebi|hgnc|hmdb|ncbi|uniprot|wikidata}
Options
- --at <COLUMN>
Required. Column name(s) containing identifiers to resolve. Repeat for multiple columns.
- -o, --output <output_path>
Output file path (TSV or CSV).
- --suffix <suffix>
Suffix for new columns.
- Default:
'_primary'
- --sep <sep>
Delimiter (inferred from extension if omitted).
- --mapping <mapping_file>
Pre-built sec2pri TSV mapping file to use instead of regenerating.
- --version <data_version>
Datasource release version.
- --no-progress
Suppress progress bars.
Arguments
- INPUT_FILE
Required argument
- DATASOURCE
Required argument
update-symbols
Resolve previous/alias labels in INPUT_FILE to current labels using DATASOURCE.
For each column specified with --at, a new column <col><suffix> is
added containing the resolved current labels. Labels not found in the
mapping are kept unchanged.
Pass --mapping to skip downloading/regenerating the mapping set and use an existing symbol2prev TSV file instead.
Example:
pysec2pri update-symbols my_genes.tsv hgnc --at symbol -o my_genes_updated.tsv
pysec2pri update-symbols my_genes.tsv hgnc --at symbol --mapping hgnc_symbol2prev.tsv
Usage
pysec2pri update-symbols [OPTIONS] INPUT_FILE {chebi|hgnc|ncbi|wikidata}
Options
- --at <COLUMN>
Required. Column name(s) containing symbols to resolve. Repeat for multiple columns.
- -o, --output <output_path>
Output file path (TSV or CSV).
- --suffix <suffix>
Suffix for new columns.
- Default:
'_current'
- --sep <sep>
Delimiter (inferred from extension if omitted).
- --mapping <mapping_file>
Pre-built symbol2prev TSV mapping file to use instead of regenerating.
- --tax-id <tax_id>
Taxonomy ID.
- Default:
'9606'
- --entity-type <entity_type>
Entity type to query. Queries all if omitted.
- Options:
metabolites | chemicals | genes | proteins
- --subset <subset>
Compound subset.
- Default:
'3star'
- Options:
3star | complete
- --version <data_version>
Datasource release version.
- --no-progress
Suppress progress bars.
Arguments
- INPUT_FILE
Required argument
- DATASOURCE
Required argument
wikidata
Query Wikidata SPARQL for redirect mappings.
Usage
pysec2pri wikidata [OPTIONS] COMMAND [ARGS]...
ids
Generate Wikidata ID mappings (redirected to current Wikidata QIDs).
Formats: sec2pri (dict), pri_ids (set of current QIDs), sssom/rdf/json/owl/all.
Usage
pysec2pri wikidata ids [OPTIONS] [INPUT_FILE]
Options
- -o, --output <output>
Output file or directory
- --format <output_format>
- Default:
'sssom'
- Options:
sssom | sec2pri | pri_ids | rdf | json | owl | all
- --entity-type <entity_type>
Entity type to query. Queries all if omitted.
- Options:
metabolites | chemicals | genes | proteins
- --test-subset
Use test queries (LIMIT 10)
Arguments
- INPUT_FILE
Optional argument
symbols
Generate Wikidata label mappings (previous label to current label).
Formats: symbol_sec2pri (dict), pri_symbols (set of current labels), sssom/rdf/json/owl/all.
Usage
pysec2pri wikidata symbols [OPTIONS] [INPUT_FILE]
Options
- -o, --output <output>
Output file or directory
- --format <output_format>
- Default:
'sssom'
- Options:
sssom | symbol_sec2pri | pri_symbols | rdf | json | owl | all
- --entity-type <entity_type>
Entity type to query. Queries all if omitted.
- Options:
metabolites | chemicals | genes | proteins
- --test-subset
Use test queries (LIMIT 10)
Arguments
- INPUT_FILE
Optional argument