API Reference

Complete reference for all public classes and functions in MaldiBatchKit, organised by module.

Correctors

Every corrector exposes the scikit-learn fit / transform / fit_transform API, accepts batch and covariates at construction time, and aligns them to X.index at each call, so the same object can be used inside Pipeline and cross_val_score without leakage.
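The alignment pattern described above can be illustrated with a minimal sketch. IndexAlignedCentering is a hypothetical class written for this example, not part of MaldiBatchKit; it assumes batch is a pandas Series indexed by sample id and that X.index has no duplicates.

```python
import pandas as pd


class IndexAlignedCentering:
    """Hypothetical corrector: stores batch at construction, aligns per call."""

    def __init__(self, batch: pd.Series):
        # Labels for ALL samples, indexed by sample id.
        self.batch = batch

    def _aligned(self, X: pd.DataFrame) -> pd.Series:
        # Keep only the labels of the rows actually present in X.
        return self.batch.loc[X.index]

    def fit(self, X: pd.DataFrame, y=None):
        # Per-batch means are learned from the rows passed to fit only.
        self.means_ = X.groupby(self._aligned(X)).mean()
        return self

    def transform(self, X: pd.DataFrame) -> pd.DataFrame:
        labels = self._aligned(X)
        # Look up each row's fitted batch mean and subtract it.
        offsets = self.means_.loc[labels.to_numpy()].to_numpy()
        return X - offsets

    def fit_transform(self, X: pd.DataFrame, y=None) -> pd.DataFrame:
        return self.fit(X, y).transform(X)


batch = pd.Series(["a", "a", "b", "b"], index=["s1", "s2", "s3", "s4"])
X = pd.DataFrame({"mz_1": [1.0, 3.0, 10.0, 14.0]}, index=batch.index)

# A train/test split only has to slice X; the corrector finds the labels itself.
train, test = X.loc[["s1", "s3"]], X.loc[["s2", "s4"]]
corrector = IndexAlignedCentering(batch).fit(train)
corrected_test = corrector.transform(test)
```

Because the labels are looked up by index rather than by position, a cross-validation splitter can subset X freely and the statistics used on held-out rows still come from the training rows alone.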

Base Class

maldibatchkit.BaseBatchCorrector

Base class for batch correctors that store batch at construction.

Subclass this to ship a custom corrector. See Extending MaldiBatchKit for a walkthrough.

ComBat Family

maldibatchkit.ComBat

Pipeline-friendly wrapper around ComBatModel.

maldibatchkit.SpeciesAwareComBat

ComBat-Fortin preset with species as a protected biological covariate.

maldibatchkit.QualityWeightedComBat

Weighted empirical-Bayes extension of Johnson-ComBat.

Linear / Non-Parametric

maldibatchkit.Limma

Linear-model batch subtraction following limma::removeBatchEffect.
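For orientation, the idea behind linear-model batch subtraction can be sketched in NumPy. This is an illustration of the general technique for the covariate-free case, not the Limma class's implementation; the function name remove_batch_effect_sketch is hypothetical.

```python
import numpy as np


def remove_batch_effect_sketch(X, batch):
    """Fit intercept + sum-to-zero batch terms, then subtract the batch terms."""
    X = np.asarray(X, dtype=float)
    levels, codes = np.unique(np.asarray(batch), return_inverse=True)
    k = len(levels)
    # Sum-to-zero coding: k-1 columns, the last level is -1 in every column.
    contrasts = np.vstack([np.eye(k - 1), -np.ones((1, k - 1))])
    batch_design = contrasts[codes]
    design = np.column_stack([np.ones(len(X)), batch_design])
    coef, *_ = np.linalg.lstsq(design, X, rcond=None)
    # Remove only the batch part of the fit; the intercept is kept.
    return X - batch_design @ coef[1:]


X = np.array([[1.0], [3.0], [10.0], [14.0]])
batch = np.array(["a", "a", "b", "b"])
corrected = remove_batch_effect_sketch(X, batch)
```

With no covariates in the design, every batch ends up with the same per-feature mean (the unweighted mean of the batch means), while within-batch differences are left untouched.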

Single-Cell-Style Integration

maldibatchkit.Harmony

Sklearn-compatible Harmony with a closed-form transform.

Simple Baselines

maldibatchkit.MedianCentering

Subtract per-batch medians from each feature.
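The operation itself is small enough to express directly in pandas. The snippet below is a sketch of the arithmetic only, assuming X and batch share the same index; it does not use or reproduce the MedianCentering class.

```python
import pandas as pd

X = pd.DataFrame(
    {"mz_1": [1.0, 2.0, 6.0, 10.0, 20.0, 30.0]},
    index=["s1", "s2", "s3", "s4", "s5", "s6"],
)
batch = pd.Series(["a", "a", "a", "b", "b", "b"], index=X.index)

# transform("median") broadcasts each batch's median back to its rows.
centered = X - X.groupby(batch).transform("median")
```

After centering, each feature has a median of zero within every batch.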

maldibatchkit.ZScorePerBatch

Per-batch z-score normalisation.

maldibatchkit.ReferenceScaling

Rescale each batch so its per-feature mean matches a reference batch.
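A minimal sketch of this kind of rescaling, assuming a multiplicative factor per batch and feature and non-zero batch means. The helper scale_to_reference_sketch is hypothetical and is not the ReferenceScaling implementation.

```python
import pandas as pd


def scale_to_reference_sketch(X: pd.DataFrame, batch: pd.Series, reference):
    """Multiply each batch so its per-feature mean equals the reference's."""
    batch_means = X.groupby(batch).mean()
    # factor[b, f] = reference mean of f / mean of f in batch b
    factors = (1.0 / batch_means).mul(batch_means.loc[reference], axis="columns")
    # Expand the factors to one row per sample, in the order of X.
    return X * factors.loc[batch.to_numpy()].to_numpy()


X = pd.DataFrame({"mz_1": [2.0, 4.0, 10.0, 20.0]}, index=["s1", "s2", "s3", "s4"])
batch = pd.Series(["a", "a", "b", "b"], index=X.index)
scaled = scale_to_reference_sketch(X, batch, reference="a")
```

The reference batch is unchanged (its factor is 1), and batch b is multiplied by 3 / 15 = 0.2 so that both batches share a mean of 3.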

MALDI-Specific Corrections

maldibatchkit.BatchAwareWarping

Per-batch m/z warping to a single global reference.

Diagnostics

Generic Batch-Mixing Metrics

maldibatchkit.diagnostics.silhouette_batch

Silhouette coefficient using batch labels as clusters.

maldibatchkit.diagnostics.kbet

k-nearest-neighbours Batch Effect Test (kBET; Büttner et al. 2019).

maldibatchkit.diagnostics.lisi

Local Inverse Simpson's Index for batch mixing.
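To make the quantity concrete, here is a simplified sketch of an inverse Simpson's index over each sample's k nearest neighbours. It uses unweighted neighbours rather than the perplexity-based weights of the original LISI method, and simple_lisi_sketch is a hypothetical name, so its values will differ from those of maldibatchkit.diagnostics.lisi.

```python
import numpy as np


def simple_lisi_sketch(X, batch, k=4):
    """Per-sample inverse Simpson's index of batch labels among k neighbours."""
    X = np.asarray(X, dtype=float)
    _, codes = np.unique(np.asarray(batch), return_inverse=True)
    n_batches = codes.max() + 1
    # Pairwise squared Euclidean distances (fine for small n).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(d2, axis=1)[:, :k]
    scores = np.empty(len(X))
    for i, idx in enumerate(neighbours):
        p = np.bincount(codes[idx], minlength=n_batches) / k
        scores[i] = 1.0 / np.sum(p ** 2)
    return scores


labels = np.array(["a", "b"] * 4)

# Batches far apart: every neighbourhood contains a single batch.
separated = np.where(labels == "a", 0.0, 100.0)[:, None] + np.arange(8)[:, None] * 0.1
separated_scores = simple_lisi_sketch(separated, labels, k=3)

# Batches interleaved along a line: neighbourhoods contain both batches.
mixed = np.arange(8, dtype=float)[:, None]
mixed_scores = simple_lisi_sketch(mixed, labels, k=4)
```

Scores range from 1 (all neighbours from one batch) up to the number of batches (neighbours evenly split), so higher values indicate better mixing.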

MALDI-Specific Metrics

maldibatchkit.diagnostics.peak_position_drift

Per-batch peak-position drift relative to a global reference.

maldibatchkit.diagnostics.tic_cov_per_batch

Per-batch coefficient of variation of the Total Ion Count.
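The underlying computation can be sketched in a few lines of pandas. The snippet assumes the TIC of a spectrum is the sum of its intensities across features and uses the sample standard deviation (ddof=1); both are assumptions of this example rather than documented behaviour of tic_cov_per_batch.

```python
import pandas as pd

X = pd.DataFrame(
    {"mz_1": [4.0, 6.0, 2.0, 10.0], "mz_2": [6.0, 4.0, 3.0, 5.0]},
    index=["s1", "s2", "s3", "s4"],
)
batch = pd.Series(["a", "a", "b", "b"], index=X.index)

tic = X.sum(axis=1)               # one total per spectrum
grouped = tic.groupby(batch)
cov = grouped.std(ddof=1) / grouped.mean()
```

Here batch a has identical totals (10 and 10), giving a coefficient of variation of 0, while batch b (5 and 15) is far less stable.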

maldibatchkit.diagnostics.per_batch_spectrum_count

Return the number of spectra per batch as a sorted Series.

Combined Report

maldibatchkit.diagnostics.diagnostic_report

Run every diagnostic on a (before, after) pair.

Benchmark

maldibatchkit.diagnostics.BatchCorrectionBenchmark

Diagnostic comparison of multiple batch correctors.

Metrics

Batch-aware metrics for downstream classifiers, intended for selecting models that generalise across sites. See Metrics Module for the full reference.

Per-batch Metric Functions

maldibatchkit.batch_roc_auc_score

Per-batch AUROC, aggregated with weights.

maldibatchkit.batch_average_precision_score

Per-batch average precision (PR-AUC), aggregated with weights.

maldibatchkit.batch_balanced_accuracy_score

Per-batch balanced accuracy, aggregated with weights.

maldibatchkit.batch_matthews_corrcoef

Per-batch Matthews correlation coefficient, aggregated with weights.

maldibatchkit.batch_f1_score

Per-batch F1 score, aggregated with weights.

maldibatchkit.batch_precision_score

Per-batch precision, aggregated with weights.

maldibatchkit.batch_recall_score

Per-batch recall, aggregated with weights.
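All of the functions above share one pattern: compute the metric within each batch, then combine the per-batch values with weights. The sketch below illustrates that pattern with a hand-written recall; per_batch_score_sketch and its weights options ("size", "uniform") are hypothetical and do not reflect the actual signatures or weighting schemes in MaldiBatchKit.

```python
import numpy as np


def recall(y_true, y_pred):
    """Fraction of true positives (label 1) that were predicted as 1."""
    positives = y_true == 1
    return float((y_pred[positives] == 1).mean())


def per_batch_score_sketch(metric, y_true, y_pred, batch, weights="size"):
    """Evaluate metric inside each batch, then take a weighted average."""
    y_true, y_pred, batch = map(np.asarray, (y_true, y_pred, batch))
    levels = np.unique(batch)
    scores = np.array(
        [metric(y_true[batch == b], y_pred[batch == b]) for b in levels]
    )
    if weights == "size":
        w = np.array([(batch == b).sum() for b in levels], dtype=float)
    else:  # "uniform": every batch counts equally
        w = np.ones(len(levels))
    return float(np.average(scores, weights=w))


y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 0]
batch = ["a", "a", "a", "a", "b", "b"]

by_size = per_batch_score_sketch(recall, y_true, y_pred, batch, weights="size")
uniform = per_batch_score_sketch(recall, y_true, y_pred, batch, weights="uniform")
```

Recall is 1.0 in batch a and 0.0 in batch b, so the choice of weights matters: weighting by batch size favours the larger batch, whereas uniform weights expose the failure on the smaller one.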

Scorer Factory

maldibatchkit.make_batch_scorer

Return a sklearn-compatible scorer(estimator, X, y).
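The scorer(estimator, X, y) calling convention can be illustrated with a small closure. Everything in this sketch is hypothetical (make_scorer_sketch, ThresholdModel, and the choice of an unweighted mean of per-batch accuracies); it shows the shape of such a factory, not the signature or behaviour of make_batch_scorer.

```python
import numpy as np
import pandas as pd


def make_scorer_sketch(batch: pd.Series):
    """Return scorer(estimator, X, y) that averages accuracy over batches."""

    def scorer(estimator, X, y):
        # Align the stored labels to whichever rows are being scored.
        labels = batch.loc[X.index].to_numpy()
        correct = np.asarray(estimator.predict(X)) == np.asarray(y)
        # Accuracy inside each batch, then an unweighted mean over batches.
        return float(pd.Series(correct).groupby(labels).mean().mean())

    return scorer


class ThresholdModel:
    """Stand-in estimator: predicts 1 when mz_1 exceeds 5."""

    def predict(self, X):
        return (X["mz_1"] > 5).astype(int).to_numpy()


X = pd.DataFrame({"mz_1": [1.0, 9.0, 2.0, 3.0]}, index=["s1", "s2", "s3", "s4"])
y = pd.Series([0, 1, 0, 1], index=X.index)
batch = pd.Series(["a", "a", "a", "b"], index=X.index)

score = make_scorer_sketch(batch)(ThresholdModel(), X, y)
```

Plain accuracy on this data would be 0.75, but the model is wrong on the only sample from batch b, so the batch-averaged score drops to 0.5.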

Visualization

maldibatchkit.viz.plot_batch_umap

Plot UMAP embeddings of X before and after batch correction.

maldibatchkit.viz.plot_peak_shift

Overlay per-batch median spectra against a reference spectrum.

maldibatchkit.viz.plot_diagnostic_summary

Plot before/after diagnostic values, one subplot per metric.

Integrations