Visualization Module#

Plotting helpers for before/after batch-effect inspection. All plotting functions use lazy matplotlib imports, so matplotlib is only required when a plot function is actually called.

plot_batch_umap additionally requires umap-learn, which can be installed via pip install maldibatchkit[viz].

UMAP Before/After#

maldibatchkit.viz.plot_batch_umap(before, after, batch, *, color_by='batch', species=None, random_state=42, pca_preprocess=50, ax=None)

Plot UMAP embeddings of X before and after batch correction.

Parameters:
Return type:

tuple[Any, Any]

Returns:

  • fig (matplotlib.figure.Figure) – The figure used (or the parent figure of the provided axes).

  • axes (tuple of matplotlib.axes.Axes) – The two axes that were drawn on.

Peak-Shape Overlay#

maldibatchkit.viz.plot_peak_shift(batches, X, reference=None, *, mz_values=None, ax=None, max_batches=6)

Overlay per-batch median spectra against a reference spectrum.

Parameters:
Returns:

fig, ax

Return type:

tuple[Any, Any]
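To make "per-batch median spectra" concrete, here is a plain-Python sketch of the aggregation step only (no plotting). It is an illustration under stated assumptions, not the function's real code: the helper name per_batch_median_spectra is hypothetical, and treating max_batches as "keep the first N batch labels in sorted order" is an assumption, since the selection rule is not documented here. In practice the same computation would usually be done with NumPy.

```python
from collections import defaultdict
from statistics import median


def per_batch_median_spectra(batches, X, max_batches=6):
    """Return {batch_label: median spectrum} for at most max_batches batches.

    batches: one label per spectrum; X: list of equal-length intensity rows.
    """
    rows_by_batch = defaultdict(list)
    for label, spectrum in zip(batches, X):
        rows_by_batch[label].append(spectrum)

    medians = {}
    # Assumption: cap the overlay at the first max_batches sorted labels.
    for label in sorted(rows_by_batch)[:max_batches]:
        rows = rows_by_batch[label]
        # Median across spectra, computed independently for each m/z bin.
        medians[label] = [median(column) for column in zip(*rows)]
    return medians
```

Each resulting median spectrum is one line in the overlay, drawn against the reference spectrum.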

Diagnostic Summary#

maldibatchkit.viz.plot_diagnostic_summary(report_df, *, scope='overall', ncols=None, figsize_per_plot=(3.2, 3.0), axes=None)

Plot before/after diagnostic values, one subplot per metric.

Parameters:
  • report_df (DataFrame) – Output of maldibatchkit.diagnostics.diagnostic_report().

  • scope (Union[str, Iterable[str]]) – Which slice(s) of the report to plot. Pass a single scope name ("overall", "batch_00", …) to render one pair of bars per metric, or a list (["batch_00", "batch_01"]) to group bars by scope inside each metric’s subplot.

  • ncols (int | None) – Number of subplot columns. Defaults to min(n_metrics, 4).

  • figsize_per_plot (tuple[float, float]) – Width, height of each metric’s subplot (inches). The returned figure has size (ncols * w, nrows * h).

  • axes (Any | None) – Pre-built axes grid. Must have at least n_metrics entries when flattened.

Return type:

tuple[Any, ndarray]

Returns:

  • fig (matplotlib.figure.Figure)

  • axes (np.ndarray of matplotlib.axes.Axes) – Flattened array of the axes actually used. Unused slots from a non-rectangular grid are turned off.
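The grid arithmetic implied by ncols and figsize_per_plot can be sketched as follows. The helper name plan_grid is hypothetical, and computing the row count as ceil(n_metrics / ncols) is an assumption; the documentation above only states the default for ncols and the figure-size formula.

```python
import math


def plan_grid(n_metrics, ncols=None, figsize_per_plot=(3.2, 3.0)):
    """Return (nrows, ncols, figsize, n_unused) for a metric subplot grid."""
    if n_metrics < 1:
        raise ValueError("n_metrics must be at least 1")
    if ncols is None:
        # Documented default: at most four columns.
        ncols = min(n_metrics, 4)
    # Assumption: enough rows to hold every metric.
    nrows = math.ceil(n_metrics / ncols)
    w, h = figsize_per_plot
    figsize = (ncols * w, nrows * h)
    # Slots left over when the metrics do not fill the grid exactly.
    n_unused = nrows * ncols - n_metrics
    return nrows, ncols, figsize, n_unused
```

For example, six metrics with the defaults would give a 2 x 4 grid with two leftover slots, which correspond to the axes that are turned off.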

Example#

from maldibatchkit import SpeciesAwareComBat
from maldibatchkit.diagnostics import diagnostic_report
from maldibatchkit.viz import (
    plot_batch_umap,
    plot_peak_shift,
    plot_diagnostic_summary,
)

# Assumes X, batch, species, and mz have already been loaded.
corrector = SpeciesAwareComBat(batch=batch, species=species)
X_corrected = corrector.fit_transform(X)

# Side-by-side UMAP of raw vs. corrected matrices
plot_batch_umap(X, X_corrected, batch, random_state=0)

# Per-batch median spectra overlaid on a reference
plot_peak_shift(batch, X_corrected, mz_values=mz)

# Before/after bar chart built from a diagnostic_report DataFrame
report = diagnostic_report(X, X_corrected, batch)
plot_diagnostic_summary(report, scope="overall")