Diff

Compare SSSOM MappingSets to detect changes between versions or releases.

Comparisons use Polars so that large mapping datasets can be diffed efficiently.

class MappingDiff(old_version: str | None, new_version: str | None, datasource: str, added: polars.DataFrame = <factory>, removed: polars.DataFrame = <factory>, changed: polars.DataFrame = <factory>, intersection: polars.DataFrame = <factory>)

Result of comparing two MappingSets.

property added_count: int

Number of added mappings.

property removed_count: int

Number of removed mappings.

property changed_count: int

Number of changed mappings.

property intersection_count: int

Number of mappings in both sets.

property total_changes: int

Total number of changes.

property has_changes: bool

Whether there are any changes.
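
The relationship between these properties can be pictured with a small stand-in. The sketch below is not the library's class: MappingDiffSketch is a hypothetical name, plain lists take the place of the Polars DataFrames, and it assumes that total_changes is the sum of the added, removed, and changed counts and that has_changes is true when that sum is nonzero. The documentation above does not state either rule explicitly.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MappingDiffSketch:
    """Hypothetical stand-in for MappingDiff; rows are plain dicts, not DataFrame rows."""

    old_version: Optional[str]
    new_version: Optional[str]
    datasource: str
    added: List[dict] = field(default_factory=list)
    removed: List[dict] = field(default_factory=list)
    changed: List[dict] = field(default_factory=list)
    intersection: List[dict] = field(default_factory=list)

    @property
    def added_count(self) -> int:
        return len(self.added)

    @property
    def removed_count(self) -> int:
        return len(self.removed)

    @property
    def changed_count(self) -> int:
        return len(self.changed)

    @property
    def intersection_count(self) -> int:
        return len(self.intersection)

    @property
    def total_changes(self) -> int:
        # Assumption: rows present in both sets do not count as changes by themselves.
        return self.added_count + self.removed_count + self.changed_count

    @property
    def has_changes(self) -> bool:
        # Assumption: any added, removed, or changed row counts as a change.
        return self.total_changes > 0
```

Because every table defaults to an empty container, a diff built with only the three required fields reports zero for every count.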

diff_mapping_sets(old_set: MappingSet, new_set: MappingSet, datasource: str = 'unknown') → MappingDiff

Compare two MappingSets and find differences.

Parameters:
  • old_set – The older/previous MappingSet (sssom_schema.MappingSet).

  • new_set – The newer/current MappingSet (sssom_schema.MappingSet).

  • datasource – Name of the datasource for the diff.

Returns:

MappingDiff with added, removed, and changed mappings.
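
One way to picture the comparison is as a keyed set difference. The sketch below is an illustration in plain Python, not the library's Polars implementation, and diff_rows_sketch is a hypothetical name. It assumes that a mapping is identified by its (subject_id, predicate_id, object_id) triple, that a mapping counts as changed when the triple exists in both sets but some other field differs, and that the intersection holds every shared triple whether or not it changed.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def _key(row: Row) -> Tuple[str, str, str]:
    # Assumption: this triple identifies a mapping; the library may use another key.
    return (row["subject_id"], row["predicate_id"], row["object_id"])


def diff_rows_sketch(old_rows: List[Row], new_rows: List[Row]) -> Dict[str, List[Row]]:
    """Classify mapping rows as added, removed, changed, or shared."""
    old_by_key = {_key(r): r for r in old_rows}
    new_by_key = {_key(r): r for r in new_rows}

    # Keys only in the new set are additions; keys only in the old set are removals.
    added = [r for k, r in new_by_key.items() if k not in old_by_key]
    removed = [r for k, r in old_by_key.items() if k not in new_by_key]

    # Keys in both sets form the intersection; differing rows among them are changes.
    shared = [k for k in new_by_key if k in old_by_key]
    changed = [new_by_key[k] for k in shared if new_by_key[k] != old_by_key[k]]
    intersection = [new_by_key[k] for k in shared]

    return {
        "added": added,
        "removed": removed,
        "changed": changed,
        "intersection": intersection,
    }
```

With DataFrames, the same classification is typically expressed as joins on the key columns rather than dictionary lookups.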

diff_sssom_files(old_file: Path | str, new_file: Path | str, datasource: str = 'unknown') → MappingDiff

Compare two SSSOM TSV files and find differences.

Parameters:
  • old_file – Path to the older SSSOM file.

  • new_file – Path to the newer SSSOM file.

  • datasource – Name of the datasource.

Returns:

MappingDiff with added, removed, and changed mappings.
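
SSSOM TSV files may begin with a metadata block whose lines are prefixed with '#', followed by a tab-separated table of mappings. The sketch below shows one way such a file's rows could be read before comparison. It is a minimal illustration using only the standard library; read_sssom_rows_sketch is a hypothetical helper, and the library itself presumably relies on its own parser and on Polars.

```python
import csv
import io
from typing import Dict, List


def read_sssom_rows_sketch(text: str) -> List[Dict[str, str]]:
    """Parse SSSOM TSV text into row dicts, skipping '#'-prefixed metadata lines."""
    # Drop blank lines and the commented metadata header.
    data_lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    # The first remaining line is the column header of the mapping table.
    reader = csv.DictReader(io.StringIO("\n".join(data_lines)), delimiter="\t")
    return list(reader)
```

To read from disk, the text could be obtained with Path(old_file).read_text(), which accepts either a Path or a str path.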

summarize_diff(diff: MappingDiff) → str

Generate a human-readable summary of a diff.

Parameters:

diff – The MappingDiff to summarize.

Returns:

A formatted string summary.
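
The exact layout of the summary is not documented here. As a rough picture of what a human-readable summary built from the diff's counts might look like, the sketch below formats them into a few lines. The function name, its parameters, and the layout are all hypothetical; the real summarize_diff takes a MappingDiff and its output may differ.

```python
from typing import Optional


def summarize_counts_sketch(
    datasource: str,
    old_version: Optional[str],
    new_version: Optional[str],
    added: int,
    removed: int,
    changed: int,
) -> str:
    """Hypothetical summary layout; the real summarize_diff output may differ."""
    lines = [
        f"Diff for {datasource}: {old_version or 'unknown'} -> {new_version or 'unknown'}",
        f"  added:   {added}",
        f"  removed: {removed}",
        f"  changed: {changed}",
    ]
    # Call out the empty case explicitly so that it is easy to spot in logs.
    if added + removed + changed == 0:
        lines.append("  no changes")
    return "\n".join(lines)
```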