Analysis¶

analysis ¶

Classes¶

SHAPAnalyzer ¶

SHAPAnalyzer(backend: Backend, min_abs_shap: float = 0.0)

Analyze SHAP explanations stored in a backend.

Provides methods for computing summary statistics, comparing time periods, and detecting changes in feature importance over time.

Parameters:

Name	Type	Description	Default
`backend`	`Backend`	Backend for retrieving stored SHAP explanations.	required
`min_abs_shap`	`float`	Minimum mean absolute SHAP value threshold (default: 0.0). Features below this threshold are excluded from results. Useful for filtering out low-impact features and reducing noise.	`0.0`

Examples:

>>> from datetime import datetime
>>> from shapmonitor.backends import ParquetBackend
>>> from shapmonitor.analysis import SHAPAnalyzer
>>> backend = ParquetBackend("/path/to/shap_logs")
>>> analyzer = SHAPAnalyzer(backend, min_abs_shap=0.01)
>>> summary = analyzer.summary(datetime(2025, 1, 1), datetime(2025, 1, 31))

Source code in shapmonitor/analysis/_analyzer.py

def __init__(self, backend: Backend, min_abs_shap: float = 0.0) -> None:
    self._backend = backend
    self._min_abs_shap = min_abs_shap

Attributes¶

min_abs_shap `property` ¶

min_abs_shap: float

Get the minimum absolute SHAP value threshold.

backend `property` ¶

backend: Backend

Get the backend for retrieving explanations.

Functions¶

fetch_shap_values ¶

fetch_shap_values(**kwargs) -> DFrameLike

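Fetch raw SHAP values from the backend. Keyword arguments are forwarded unchanged to the backend's `read` method, so the available filters (for example, a date range) depend on the backend. Only columns whose names contain `shap_` are returned. If the backend returns no rows, a warning is logged and an empty DataFrame is returned.

Parameters:

Name	Type	Description	Default
`kwargs`		Backend read parameters, passed through to `backend.read`.	`{}`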

Returns:

Type	Description
`DataFrame`	Raw SHAP values indexed by timestamp.
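
The source listing for this method selects columns with `filter(like="shap_")`. The following is a minimal sketch of that selection using plain pandas. The frame and its column names (`shap_age`, `feat_age`, `prediction`) are hypothetical stand-ins for whatever a backend's `read` returns, not the library's actual schema.

```python
import pandas as pd

# Hypothetical frame standing in for the result of a backend read.
raw = pd.DataFrame(
    {
        "shap_age": [0.12, -0.05],
        "shap_income": [0.30, 0.22],
        "feat_age": [41, 29],
        "prediction": [0.8, 0.4],
    },
    index=pd.to_datetime(["2025-01-01", "2025-01-02"]),
)

# filter(like=...) keeps every column whose label contains the substring.
shap_only = raw.filter(like="shap_")
shap_columns = list(shap_only.columns)
print(shap_columns)
```

Because `like` matches the substring anywhere in the label, a column such as `feat_shap_x` would also be kept; a stricter prefix match would need `filter(regex="^shap_")`.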

Source code in shapmonitor/analysis/_analyzer.py

def fetch_shap_values(self, **kwargs) -> DFrameLike:
    """Fetch raw SHAP values from the backend within a date range.

    Parameters
    ----------
    kwargs: Backend read parameters

    Returns
    -------
    DataFrame
        Raw SHAP values indexed by timestamp.
    """
    df = self._backend.read(**kwargs)

    if df.empty:
        _logger.warning("No data found for kwargs: %s", kwargs)
        return pd.DataFrame()

    return df.filter(like="shap_")

summary ¶

summary(
    start_dt: datetime | date | None = None,
    end_dt: datetime | date | None = None,
    batch_id: str | None = None,
    model_version: str | None = None,
    sort_by: str = "mean_abs",
    top_k: int | None = None,
) -> DFrameLike

Compute summary statistics for SHAP values in a date range.

Parameters:

Name	Type	Description	Default
`start_dt`	`datetime \| date`	Start of the date range (inclusive).	`None`
`end_dt`	`datetime \| date`	End of the date range (inclusive).	`None`
`batch_id`	`str`	Batch ID to filter results to a specific batch.	`None`
`model_version`	`str`	Model version to filter results to a specific model version.	`None`
`sort_by`	`str`	Column to sort results by (default: 'mean_abs'). Options: 'mean_abs', 'mean', 'std', 'min', 'max'.	`'mean_abs'`
`top_k`	`int \| None`	If set, return only the top k features after sorting. Must be a positive integer. Default is None (return all features).	`None`

Returns:

Type	Description
`DataFrame`	Summary statistics indexed by feature name (dtype: float32).

Columns:
- mean_abs: Mean of absolute SHAP values (feature importance)
- mean: Mean SHAP value (contribution direction)
- std: Standard deviation of SHAP values
- min: Minimum SHAP value
- max: Maximum SHAP value

Attributes:
- n_samples: Total number of samples in the date range

Notes

Features with mean_abs below min_abs_shap threshold are excluded.
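
To make the statistics and the threshold concrete, here is a rough sketch of how a summary like this could be computed with pandas. `summarize_shap` is a hypothetical helper, not the library's implementation; storing `n_samples` in `DataFrame.attrs` and using feature names without a `shap_` prefix are assumptions made for illustration.

```python
from typing import Optional

import pandas as pd


def summarize_shap(
    shap_df: pd.DataFrame,
    min_abs_shap: float = 0.0,
    sort_by: str = "mean_abs",
    top_k: Optional[int] = None,
) -> pd.DataFrame:
    """Hypothetical stand-in for the summary described above."""
    if top_k is not None and top_k < 1:
        raise ValueError("top_k must be a positive integer")

    # One row per feature, one column per statistic.
    result = pd.DataFrame(
        {
            "mean_abs": shap_df.abs().mean(),
            "mean": shap_df.mean(),
            "std": shap_df.std(),
            "min": shap_df.min(),
            "max": shap_df.max(),
        }
    ).astype("float32")

    if sort_by not in result.columns:
        raise ValueError(f"Invalid sort_by value: {sort_by}")

    # Exclude features whose importance falls below the threshold.
    result = result[result["mean_abs"] >= min_abs_shap]
    result = result.sort_values(by=sort_by, ascending=False)
    if top_k is not None:
        result = result.head(top_k)

    # Assumption: metadata travels with the frame via DataFrame.attrs.
    result.attrs["n_samples"] = len(shap_df)
    return result


shap_values = pd.DataFrame(
    {
        "age": [0.2, -0.4, 0.6],
        "income": [0.001, -0.002, 0.003],
        "tenure": [-0.1, -0.1, -0.1],
    }
)
summary = summarize_shap(shap_values, min_abs_shap=0.01)
print(summary)
```

With a threshold of 0.01, `income` (mean absolute value 0.002) is dropped, while `age` and `tenure` remain, sorted by `mean_abs`.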

Source code in shapmonitor/analysis/_analyzer.py

def summary(
    self,
    start_dt: datetime | date | None = None,
    end_dt: datetime | date | None = None,
    batch_id: str | None = None,
    model_version: str | None = None,
    sort_by: str = "mean_abs",
    top_k: int | None = None,
) -> DFrameLike:
    """Compute summary statistics for SHAP values in a date range.

    Parameters
    ----------
    start_dt : datetime | date, optional
        Start of the date range (inclusive).
    end_dt : datetime | date, optional
        End of the date range (inclusive).
    batch_id : str, optional
        Batch ID to filter results to a specific batch.
    model_version : str, optional
        Model version to filter results to a specific model version.
    sort_by : str, optional
        Column to sort results by (default: 'mean_abs').
        Options: 'mean_abs', 'mean', 'std', 'min', 'max'.
    top_k : int | None, optional
        If set, return only the top k features after sorting.
        Must be a positive integer. Default is None (return all features).

    Returns
    -------
    DataFrame
        Summary statistics indexed by feature name (dtype: float32).

        Columns:
            - mean_abs: Mean of absolute SHAP values (feature importance)
            - mean: Mean SHAP value (contribution direction)
            - std: Standard deviation of SHAP values
            - min: Minimum SHAP value
            - max: Maximum SHAP value

        Attributes:
            - n_samples: Total number of samples in the date range

    Notes
    -----
    Features with mean_abs below `min_abs_shap` threshold are excluded.
    """
    self._validate_top_k(top_k)

    shap_df = self._fetch_and_strip_shap_values(
        start_dt=start_dt, end_dt=end_dt, batch_id=batch_id, model_version=model_version
    )
    result = self._construct_summary(shap_df)

    if sort_by not in result.columns:
        raise ValueError(
            f"Invalid sort_by value: {sort_by}. Must be one of {list(result.columns)}"
        )

    # TODO: Add relationship correlation with target if feature values and predictions are available

    result = result.sort_values(by=sort_by, ascending=False)
    if top_k is not None:
        result = result.head(top_k)
    return result

compare_time_periods ¶

compare_time_periods(
    period_ref: Period,
    period_curr: Period,
    sort_by: str = "psi",
    top_k: int | None = None,
) -> DFrameLike

Compare SHAP explanations between two time periods.

Useful for detecting feature importance drift, ranking changes, and sign flips in model behavior over time.

Parameters:

Name	Type	Description	Default
`period_ref`	`Period`	Tuple of (start_dt, end_dt) defining the reference date range (both inclusive).	required
`period_curr`	`Period`	Tuple of (start_dt, end_dt) defining the current date range (both inclusive).	required
`sort_by`	`str`	Column to sort results by (default: 'psi').	`'psi'`
`top_k`	`int \| None`	If set, return only the top k features after sorting. Must be a positive integer. Default is None (return all features).	`None`

Returns:

Type	Description
`DataFrame`	Comparison statistics indexed by feature name.

Columns:
- psi: Population Stability Index between periods
- mean_abs_1, mean_abs_2: Feature importance per period
- delta_mean_abs: Absolute change (period_2 - period_1)
- pct_delta_mean_abs: Percentage change from period_1
- mean_1, mean_2: Mean SHAP value (direction) per period
- rank_1, rank_2: Feature importance rank per period
- delta_rank: Rank change (positive = less important)
- rank_change: 'increased', 'decreased', or 'no_change'
- sign_flip: True if contribution direction changed

Attributes:
- n_samples_1: Sample count in period 1
- n_samples_2: Sample count in period 2

Notes

Features with mean_abs below min_abs_shap threshold are excluded. Uses outer join, so features appearing in only one period will have NaN.
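
The outer join, rank change, and sign flip described above can be sketched with pandas as follows. This is an illustrative reconstruction under assumptions, not the library's code: the per-period summaries are made up, and the exact rules the library uses for ranking ties or for sign flips involving zero are not shown in this page.

```python
import numpy as np
import pandas as pd

# Hypothetical per-period summaries; "region" appears only in the first period.
ref = pd.DataFrame(
    {"mean_abs": [0.40, 0.10, 0.05], "mean": [0.10, -0.10, 0.02]},
    index=["age", "tenure", "region"],
)
curr = pd.DataFrame(
    {"mean_abs": [0.15, 0.30], "mean": [-0.05, -0.20]},
    index=["age", "tenure"],
)

# Rank 1 is the most important feature within each period.
ref["rank"] = ref["mean_abs"].rank(ascending=False)
curr["rank"] = curr["mean_abs"].rank(ascending=False)

# Outer join keeps features seen in only one period; the missing side is NaN.
comparison = ref.add_suffix("_1").join(curr.add_suffix("_2"), how="outer")

comparison["delta_mean_abs"] = comparison["mean_abs_2"] - comparison["mean_abs_1"]
# Positive delta_rank means the feature moved down the ranking (less important).
comparison["delta_rank"] = comparison["rank_2"] - comparison["rank_1"]
# A sign flip is a change of direction; NaN on either side compares as False.
comparison["sign_flip"] = (
    np.sign(comparison["mean_1"]) * np.sign(comparison["mean_2"]) < 0
)
print(comparison)
```

Here `age` drops from rank 1 to rank 2 and changes sign, `tenure` rises without a sign change, and `region` has NaN in every `_2` column because it is absent from the second period.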

Below is a guideline for interpreting PSI values:

PSI Value	Interpretation
0	Identical distributions
< 0.1	No significant shift
0.1 - 0.25	Moderate shift, investigate
0.25 - 0.5	Significant shift
> 0.5	Severe shift
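
For orientation, the following is a minimal sketch of one common way to compute a Population Stability Index for a single feature. The binning scheme (ten quantile bins derived from the reference sample) and the `eps` clipping are assumptions; this page does not show how the library bins SHAP values, so results may differ from the `psi` column.

```python
import numpy as np


def psi(reference, current, n_bins: int = 10, eps: float = 1e-6) -> float:
    """PSI with quantile bins taken from the reference sample (illustrative)."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)

    # Interior cut points from reference quantiles; the outer bins are open-ended.
    cuts = np.quantile(reference, np.linspace(0, 1, n_bins + 1)[1:-1])

    # Fraction of each sample that falls into each bin.
    ref_frac = np.bincount(np.searchsorted(cuts, reference), minlength=n_bins) / reference.size
    cur_frac = np.bincount(np.searchsorted(cuts, current), minlength=n_bins) / current.size

    # Clip empty bins so the logarithm stays finite.
    ref_frac = np.clip(ref_frac, eps, None)
    cur_frac = np.clip(cur_frac, eps, None)

    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))


rng = np.random.default_rng(0)
ref_sample = rng.normal(0.0, 1.0, 5000)

psi_identical = psi(ref_sample, ref_sample)
psi_same_dist = psi(ref_sample, rng.normal(0.0, 1.0, 5000))
psi_shifted = psi(ref_sample, ref_sample + 1.0)
print(psi_identical, psi_same_dist, psi_shifted)
```

Comparing a sample with itself gives exactly 0, a fresh draw from the same distribution stays well under 0.1, and shifting the values by one standard deviation lands far above the 0.25 mark in the table.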

Source code in shapmonitor/analysis/_analyzer.py

def compare_time_periods(
    self,
    period_ref: Period,
    period_curr: Period,
    sort_by: str = "psi",
    top_k: int | None = None,
) -> DFrameLike:
    """Compare SHAP explanations between two time periods.

    Useful for detecting feature importance drift, ranking changes,
    and sign flips in model behavior over time.

    Parameters
    ----------
    period_ref : Period
        Tuple of (start_dt, end_dt) defining the reference date range (both inclusive).
    period_curr : Period
        Tuple of (start_dt, end_dt) defining the current date range (both inclusive).
    sort_by : str, optional
        Column to sort results by (default: 'psi').
    top_k : int | None, optional
        If set, return only the top k features after sorting.
        Must be a positive integer. Default is None (return all features).

    Returns
    -------
    DataFrame
        Comparison statistics indexed by feature name.

        Columns:
            - psi: Population Stability Index between periods
            - mean_abs_1, mean_abs_2: Feature importance per period
            - delta_mean_abs: Absolute change (period_2 - period_1)
            - pct_delta_mean_abs: Percentage change from period_1
            - mean_1, mean_2: Mean SHAP value (direction) per period
            - rank_1, rank_2: Feature importance rank per period
            - delta_rank: Rank change (positive = less important)
            - rank_change: 'increased', 'decreased', or 'no_change'
            - sign_flip: True if contribution direction changed

        Attributes:
            - n_samples_1: Sample count in period 1
            - n_samples_2: Sample count in period 2

    Notes
    -----
    Features with mean_abs below `min_abs_shap` threshold are excluded.
    Uses outer join, so features appearing in only one period will have NaN.

    Below is a guideline for interpreting PSI values:

      | PSI Value  | Interpretation              |
      |------------|-----------------------------|
      | 0          | Identical distributions     |
      | < 0.1      | No significant shift        |
      | 0.1 - 0.25 | Moderate shift, investigate |
      | 0.25 - 0.5 | Significant shift           |
      | > 0.5      | Severe shift                |


    """
    self._validate_top_k(top_k)

    shap_df_ref = self._fetch_and_strip_shap_values(
        start_dt=period_ref[0], end_dt=period_ref[1]
    )
    shap_df_curr = self._fetch_and_strip_shap_values(
        start_dt=period_curr[0], end_dt=period_curr[1]
    )

    return self._compare_shap_dataframes(shap_df_ref, shap_df_curr, sort_by, top_k)

compare_batches ¶

compare_batches(
    batch_ref: str,
    batch_curr: str,
    sort_by: str = "psi",
    top_k: int | None = None,
) -> DFrameLike

Compare SHAP explanations between two batches.

Parameters:

Name	Type	Description	Default
`batch_ref`	`str`	Identifier for the reference (first) batch.	required
`batch_curr`	`str`	Identifier for the current (second) batch.	required
`sort_by`	`str`	Column to sort results by (default: 'psi').	`'psi'`
`top_k`	`int \| None`	If set, return only the top k features after sorting. Must be a positive integer. Default is None (return all features).	`None`

Returns:

Type	Description
`DataFrame`	Comparison of SHAP statistics between the two batches.

Columns:
- psi: Population Stability Index between the two batches
- mean_abs_1, mean_abs_2: Feature importance per batch
- delta_mean_abs: Absolute change (batch 2 - batch 1)
- pct_delta_mean_abs: Percentage change from batch 1
- mean_1, mean_2: Mean SHAP value (direction) per batch
- rank_1, rank_2: Feature importance rank per batch
- delta_rank: Rank change (positive = less important)
- rank_change: 'increased', 'decreased', or 'no_change'
- sign_flip: True if contribution direction changed

Attributes:
- n_samples_1: Sample count in batch 1
- n_samples_2: Sample count in batch 2

Notes

Features with mean_abs below min_abs_shap threshold are excluded. Uses outer join, so features appearing in only one batch will have NaN.

Below is a guideline for interpreting PSI values:

PSI Value	Interpretation
0	Identical distributions
< 0.1	No significant shift
0.1 - 0.25	Moderate shift, investigate
0.25 - 0.5	Significant shift
> 0.5	Severe shift

Source code in shapmonitor/analysis/_analyzer.py

def compare_batches(
    self,
    batch_ref: str,
    batch_curr: str,
    sort_by: str = "psi",
    top_k: int | None = None,
) -> DFrameLike:
    """Compare SHAP explanations between two batches.

    Parameters
    ----------
    batch_ref : str
        Identifier for the first batch.
    batch_curr : str
        Identifier for the second batch.
    sort_by : str, optional
        Column to sort results by (default: 'psi').
    top_k : int | None, optional
        If set, return only the top k features after sorting.
        Must be a positive integer. Default is None (return all features).

    Returns
    -------
    DataFrame
        Comparison of SHAP statistics between the two batches.

        Columns:
            - psi: Population Stability Index between periods
            - mean_abs_1, mean_abs_2: Feature importance per period
            - delta_mean_abs: Absolute change (period_2 - period_1)
            - pct_delta_mean_abs: Percentage change from period_1
            - mean_1, mean_2: Mean SHAP value (direction) per period
            - rank_1, rank_2: Feature importance rank per period
            - delta_rank: Rank change (positive = less important)
            - rank_change: 'increased', 'decreased', or 'no_change'
            - sign_flip: True if contribution direction changed

        Attributes:
            - n_samples_1: Sample count in period 1
            - n_samples_2: Sample count in period 2

    Notes
    -----
    Features with mean_abs below `min_abs_shap` threshold are excluded.
    Uses outer join, so features appearing in only one period will have NaN.

    Below is a guideline for interpreting PSI values:

      | PSI Value  | Interpretation              |
      |------------|-----------------------------|
      | 0          | Identical distributions     |
      | < 0.1      | No significant shift        |
      | 0.1 - 0.25 | Moderate shift, investigate |
      | 0.25 - 0.5 | Significant shift           |
      | > 0.5      | Severe shift                |
    """
    self._validate_top_k(top_k)

    shap_df_ref = self._fetch_and_strip_shap_values(batch_id=batch_ref)
    shap_df_curr = self._fetch_and_strip_shap_values(batch_id=batch_curr)

    return self._compare_shap_dataframes(shap_df_ref, shap_df_curr, sort_by, top_k)

compare_versions ¶

compare_versions(
    model_version_ref: str,
    model_version_curr: str,
    sort_by: str = "psi",
    top_k: int | None = None,
) -> DFrameLike

Compare SHAP explanations across different model versions.

Parameters:

Name	Type	Description	Default
`model_version_ref`	`str`	Reference model version identifier.	required
`model_version_curr`	`str`	Current model version identifier.	required
`sort_by`	`str`	Column to sort results by (default: 'psi').	`'psi'`
`top_k`	`int \| None`	If set, return only the top k features after sorting. Must be a positive integer. Default is None (return all features).	`None`

Returns:

Type	Description
`DataFrame`	Comparison of SHAP statistics across model versions.

Columns:
- psi: Population Stability Index between the two model versions
- mean_abs_1, mean_abs_2: Feature importance per model version
- delta_mean_abs: Absolute change (version 2 - version 1)
- pct_delta_mean_abs: Percentage change from version 1
- mean_1, mean_2: Mean SHAP value (direction) per model version
- rank_1, rank_2: Feature importance rank per model version
- delta_rank: Rank change (positive = less important)
- rank_change: 'increased', 'decreased', or 'no_change'
- sign_flip: True if contribution direction changed

Attributes:
- n_samples_1: Sample count for model version 1
- n_samples_2: Sample count for model version 2

Notes

Features with mean_abs below min_abs_shap threshold are excluded. Uses outer join, so features appearing in only one model version will have NaN.

Below is a guideline for interpreting PSI values:

PSI Value	Interpretation
0	Identical distributions
< 0.1	No significant shift
0.1 - 0.25	Moderate shift, investigate
0.25 - 0.5	Significant shift
> 0.5	Severe shift

Source code in shapmonitor/analysis/_analyzer.py

def compare_versions(
    self,
    model_version_ref: str,
    model_version_curr: str,
    sort_by: str = "psi",
    top_k: int | None = None,
) -> DFrameLike:
    """Compare SHAP explanations across different model versions.

    Parameters
    ----------
    model_version_ref : str
        Reference model version identifier.
    model_version_curr : str
        Current model version identifier.
    sort_by : str, optional
        Column to sort results by (default: 'psi').
    top_k : int | None, optional
        If set, return only the top k features after sorting.
        Must be a positive integer. Default is None (return all features).

    Returns
    -------
    DataFrame
        Comparison of SHAP statistics across model versions.

        Columns:
            - psi: Population Stability Index between periods
            - mean_abs_1, mean_abs_2: Feature importance per period
            - delta_mean_abs: Absolute change (period_2 - period_1)
            - pct_delta_mean_abs: Percentage change from period_1
            - mean_1, mean_2: Mean SHAP value (direction) per period
            - rank_1, rank_2: Feature importance rank per period
            - delta_rank: Rank change (positive = less important)
            - rank_change: 'increased', 'decreased', or 'no_change'
            - sign_flip: True if contribution direction changed

        Attributes:
            - n_samples_1: Sample count in period 1
            - n_samples_2: Sample count in period 2

    Notes
    -----
    Features with mean_abs below `min_abs_shap` threshold are excluded.
    Uses outer join, so features appearing in only one period will have NaN.

    Below is a guideline for interpreting PSI values:

      | PSI Value  | Interpretation              |
      |------------|-----------------------------|
      | 0          | Identical distributions     |
      | < 0.1      | No significant shift        |
      | 0.1 - 0.25 | Moderate shift, investigate |
      | 0.25 - 0.5 | Significant shift           |
      | > 0.5      | Severe shift                |
    """
    self._validate_top_k(top_k)

    shap_df_ref = self._fetch_and_strip_shap_values(model_version=model_version_ref)
    shap_df_curr = self._fetch_and_strip_shap_values(model_version=model_version_curr)

    return self._compare_shap_dataframes(shap_df_ref, shap_df_curr, sort_by, top_k)

compare_adversarial ¶

compare_adversarial(
    period_ref: Period,
    period_curr: Period,
    classifier: Any | None = None,
    cv: int = 5,
    sort_by: str = "adv_importance",
    top_k: int | None = None,
    random_state: int | None = None,
) -> DFrameLike

Compare SHAP distributions between two periods using adversarial validation.

Trains a binary classifier to distinguish SHAP values from period_ref (label 0) vs period_curr (label 1). The cross-validated AUC measures overall distributional shift; per-feature importances reveal which SHAP dimensions drive the separability — complementing the univariate PSI score.

Parameters:

Name	Type	Description	Default
`period_ref`	`Period`	Tuple of (start_dt, end_dt) defining the reference date range.	required
`period_curr`	`Period`	Tuple of (start_dt, end_dt) defining the current date range.	required
`classifier`	`sklearn estimator`	Sklearn-compatible classifier with `predict_proba` and `feature_importances_`. Defaults to `RandomForestClassifier`.	`None`
`cv`	`int`	Number of stratified k-fold splits (default: 5).	`5`
`sort_by`	`str`	Column to sort results by (default: 'adv_importance').	`'adv_importance'`
`top_k`	`int \| None`	If set, return only the top k features. Must be a positive integer.	`None`
`random_state`	`int \| None`	Random state for the default classifier and CV splitter.	`None`

Returns:

Type	Description
`DataFrame`	Comparison statistics indexed by feature name.

Columns:
- adv_importance: Feature's contribution to classifier separability
- mean_abs_1, mean_abs_2: Mean absolute SHAP value per period
- delta_mean_abs: Absolute importance change (period_2 - period_1)

Attributes:
- adversarial_auc: Cross-validated AUC (0.5 = no shift, 1.0 = max shift)
- n_samples_ref: Sample count in the reference period
- n_samples_curr: Sample count in the current period

Raises:

Type	Description
`ValueError`	If top_k < 1 or sort_by is not a valid column name.

Notes

Returns an empty DataFrame if either period contains no data.

To run adversarial validation on raw input feature distributions (not SHAP), use adversarial_auc from shapmonitor.analysis.metrics directly with backend.read(...).filter(like="feat_").

AUC interpretation guide:

AUC	Interpretation
0.50	Distributions are indistinguishable
0.50–0.65	Minor differences, likely noise
0.65–0.80	Moderate shift — worth investigating
0.80–0.90	Strong shift detected
> 0.90	Severe — clearly different regimes
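
The adversarial validation idea described above can be sketched with scikit-learn as follows. This is an illustration on synthetic data, not the library's implementation: the feature names, sample sizes, and the choice to score out-of-fold probabilities with a single AUC (rather than, say, averaging per-fold AUCs) are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

rng = np.random.default_rng(0)

# Synthetic SHAP values: "age" shifts between periods, "tenure" does not.
ref = pd.DataFrame(
    {"age": rng.normal(0.0, 1.0, 300), "tenure": rng.normal(0.0, 1.0, 300)}
)
curr = pd.DataFrame(
    {"age": rng.normal(2.0, 1.0, 300), "tenure": rng.normal(0.0, 1.0, 300)}
)

# Label reference rows 0 and current rows 1, then stack them.
X = pd.concat([ref, curr], ignore_index=True)
y = np.r_[np.zeros(len(ref), dtype=int), np.ones(len(curr), dtype=int)]

clf = RandomForestClassifier(n_estimators=50, random_state=0)
splitter = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Out-of-fold probabilities keep the AUC from being inflated by training-set fit.
proba = cross_val_predict(clf, X, y, cv=splitter, method="predict_proba")[:, 1]
auc = roc_auc_score(y, proba)

# Importances from a fit on all rows show which features separate the periods.
importance = pd.Series(clf.fit(X, y).feature_importances_, index=X.columns)
print(round(auc, 3))
print(importance.sort_values(ascending=False))
```

An AUC near 0.5 would mean the classifier cannot tell the periods apart. In this synthetic case the shifted feature dominates the importances, which is the kind of signal the `adv_importance` column is meant to surface.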

Source code in shapmonitor/analysis/_analyzer.py

def compare_adversarial(
    self,
    period_ref: Period,
    period_curr: Period,
    classifier: Any | None = None,
    cv: int = 5,
    sort_by: str = "adv_importance",
    top_k: int | None = None,
    random_state: int | None = None,
) -> DFrameLike:
    """Compare SHAP distributions between two periods using adversarial validation.

    Trains a binary classifier to distinguish SHAP values from ``period_ref``
    (label 0) vs ``period_curr`` (label 1). The cross-validated AUC measures
    overall distributional shift; per-feature importances reveal which SHAP
    dimensions drive the separability — complementing the univariate PSI score.

    Parameters
    ----------
    period_ref : Period
        Tuple of (start_dt, end_dt) defining the reference date range.
    period_curr : Period
        Tuple of (start_dt, end_dt) defining the current date range.
    classifier : sklearn estimator, optional
        Sklearn-compatible classifier with ``predict_proba`` and
        ``feature_importances_``. Defaults to ``RandomForestClassifier``.
    cv : int, optional
        Number of stratified k-fold splits (default: 5).
    sort_by : str, optional
        Column to sort results by (default: 'adv_importance').
    top_k : int | None, optional
        If set, return only the top k features. Must be a positive integer.
    random_state : int | None, optional
        Random state for the default classifier and CV splitter.

    Returns
    -------
    DataFrame
        Comparison statistics indexed by feature name.

        Columns:
            - adv_importance: Feature's contribution to classifier separability
            - mean_abs_1, mean_abs_2: Mean absolute SHAP value per period
            - delta_mean_abs: Absolute importance change (period_2 - period_1)

        Attributes:
            - adversarial_auc: Cross-validated AUC (0.5 = no shift, 1.0 = max shift)
            - n_samples_ref: Sample count in the reference period
            - n_samples_curr: Sample count in the current period

    Raises
    ------
    ValueError
        If top_k < 1 or sort_by is not a valid column name.

    Notes
    -----
    Returns an empty DataFrame if either period contains no data.

    To run adversarial validation on raw input feature distributions (not SHAP),
    use ``adversarial_auc`` from ``shapmonitor.analysis.metrics`` directly with
    ``backend.read(...).filter(like="feat_")``.

    AUC interpretation guide:

      | AUC        | Interpretation                        |
      |------------|---------------------------------------|
      | 0.50       | Distributions are indistinguishable   |
      | 0.50–0.65  | Minor differences, likely noise       |
      | 0.65–0.80  | Moderate shift — worth investigating  |
      | 0.80–0.90  | Strong shift detected                 |
      | > 0.90     | Severe — clearly different regimes    |
    """
    self._validate_top_k(top_k)

    shap_df_ref = self._fetch_and_strip_shap_values(
        start_dt=period_ref[0], end_dt=period_ref[1]
    )
    shap_df_curr = self._fetch_and_strip_shap_values(
        start_dt=period_curr[0], end_dt=period_curr[1]
    )

    return self._run_adversarial_comparison(
        shap_df_ref, shap_df_curr, classifier, cv, sort_by, top_k, random_state
    )

compare_adversarial_batches ¶

compare_adversarial_batches(
    batch_ref: str,
    batch_curr: str,
    classifier: Any | None = None,
    cv: int = 5,
    sort_by: str = "adv_importance",
    top_k: int | None = None,
    random_state: int | None = None,
) -> DFrameLike

Compare SHAP distributions between two batches using adversarial validation.

Trains a binary classifier to distinguish SHAP values from batch_ref (label 0) vs batch_curr (label 1). The cross-validated AUC measures overall distributional shift; per-feature importances reveal which SHAP dimensions drive the separability — complementing the univariate PSI score.

Parameters:

Name	Type	Description	Default
`batch_ref`	`str`	Identifier for the reference batch.	required
`batch_curr`	`str`	Identifier for the current batch.	required
`classifier`	`sklearn estimator`	Sklearn-compatible classifier with `predict_proba` and `feature_importances_`. Defaults to `RandomForestClassifier`.	`None`
`cv`	`int`	Number of stratified k-fold splits (default: 5).	`5`
`sort_by`	`str`	Column to sort results by (default: 'adv_importance').	`'adv_importance'`
`top_k`	`int \| None`	If set, return only the top k features. Must be a positive integer.	`None`
`random_state`	`int \| None`	Random state for the default classifier and CV splitter.	`None`

Returns:

Type	Description
`DataFrame`	Comparison statistics indexed by feature name.

Columns:
- adv_importance: Feature's contribution to classifier separability
- mean_abs_1, mean_abs_2: Mean absolute SHAP value per batch
- delta_mean_abs: Absolute importance change (batch_2 - batch_1)

Attributes:
- adversarial_auc: Cross-validated AUC (0.5 = no shift, 1.0 = max shift)
- n_samples_ref: Sample count in the reference batch
- n_samples_curr: Sample count in the current batch

Raises:

Type	Description
`ValueError`	If top_k < 1 or sort_by is not a valid column name.

Notes

Returns an empty DataFrame if either batch contains no data.

Batch sizes sampled via sample_rate may be small. Ensure each batch has enough rows for the chosen cv splits (at least 2 * cv samples total is recommended) for stable AUC estimates.
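
A simple pre-check along these lines can catch undersized batches before running the comparison. `enough_rows_for_cv` is a hypothetical helper, not part of the library. It uses a stricter rule than the total-count recommendation above: each batch must have at least `cv` rows so that every stratified fold can contain both labels, which in turn guarantees at least `2 * cv` rows in total.

```python
def enough_rows_for_cv(n_ref: int, n_curr: int, cv: int = 5) -> bool:
    """Hypothetical pre-check for stratified k-fold on two labelled batches."""
    if cv < 2:
        raise ValueError("cv must be at least 2")
    # With fewer than cv rows in a batch, some fold would miss that label.
    return min(n_ref, n_curr) >= cv


balanced_ok = enough_rows_for_cv(40, 35, cv=5)
tiny_batch_ok = enough_rows_for_cv(40, 3, cv=5)
print(balanced_ok, tiny_batch_ok)
```

Passing this check only means the folds are feasible. AUC estimates from a few dozen rows are still noisy, so a lower `cv` or larger batches may be preferable.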

AUC interpretation guide:

AUC	Interpretation
0.50	Distributions are indistinguishable
0.50–0.65	Minor differences, likely noise
0.65–0.80	Moderate shift — worth investigating
0.80–0.90	Strong shift detected
> 0.90	Severe — clearly different regimes

Source code in shapmonitor/analysis/_analyzer.py

def compare_adversarial_batches(
    self,
    batch_ref: str,
    batch_curr: str,
    classifier: Any | None = None,
    cv: int = 5,
    sort_by: str = "adv_importance",
    top_k: int | None = None,
    random_state: int | None = None,
) -> DFrameLike:
    """Compare SHAP distributions between two batches using adversarial validation.

    Trains a binary classifier to distinguish SHAP values from ``batch_ref``
    (label 0) vs ``batch_curr`` (label 1). The cross-validated AUC measures
    overall distributional shift; per-feature importances reveal which SHAP
    dimensions drive the separability — complementing the univariate PSI score.

    Parameters
    ----------
    batch_ref : str
        Identifier for the reference batch.
    batch_curr : str
        Identifier for the current batch.
    classifier : sklearn estimator, optional
        Sklearn-compatible classifier with ``predict_proba`` and
        ``feature_importances_``. Defaults to ``RandomForestClassifier``.
    cv : int, optional
        Number of stratified k-fold splits (default: 5).
    sort_by : str, optional
        Column to sort results by (default: 'adv_importance').
    top_k : int | None, optional
        If set, return only the top k features. Must be a positive integer.
    random_state : int | None, optional
        Random state for the default classifier and CV splitter.

    Returns
    -------
    DataFrame
        Comparison statistics indexed by feature name.

        Columns:
            - adv_importance: Feature's contribution to classifier separability
            - mean_abs_1, mean_abs_2: Mean absolute SHAP value per batch
            - delta_mean_abs: Absolute importance change (batch_2 - batch_1)

        Attributes:
            - adversarial_auc: Cross-validated AUC (0.5 = no shift, 1.0 = max shift)
            - n_samples_ref: Sample count in the reference batch
            - n_samples_curr: Sample count in the current batch

    Raises
    ------
    ValueError
        If top_k < 1 or sort_by is not a valid column name.

    Notes
    -----
    Returns an empty DataFrame if either batch contains no data.

    Batch sizes sampled via ``sample_rate`` may be small. Ensure each batch
    has enough rows for the chosen ``cv`` splits (at least ``2 * cv`` samples
    total is recommended) for stable AUC estimates.

    AUC interpretation guide:

      | AUC        | Interpretation                        |
      |------------|---------------------------------------|
      | 0.50       | Distributions are indistinguishable   |
      | 0.50–0.65  | Minor differences, likely noise       |
      | 0.65–0.80  | Moderate shift — worth investigating  |
      | 0.80–0.90  | Strong shift detected                 |
      | > 0.90     | Severe — clearly different regimes    |
    """
    self._validate_top_k(top_k)

    shap_df_ref = self._fetch_and_strip_shap_values(batch_id=batch_ref)
    shap_df_curr = self._fetch_and_strip_shap_values(batch_id=batch_curr)

    return self._run_adversarial_comparison(
        shap_df_ref, shap_df_curr, classifier, cv, sort_by, top_k, random_state
    )
