Diagnostics Module
==================

Quantitative checks that a correction step actually removed batch
structure. The subpackage exposes two families of metrics plus a
convenience report helper:

- :mod:`maldibatchkit.diagnostics.generic` - classic batch-mixing
  metrics (silhouette by batch, kBET, LISI).
- :mod:`maldibatchkit.diagnostics.maldi` - MALDI-specific summaries
  (per-batch peak drift, TIC coefficient of variation, spectrum count).
- :func:`maldibatchkit.diagnostics.diagnostic_report` - run every
  metric on a ``(before, after)`` pair and collapse it into a tidy
  DataFrame suitable for downstream tables and plots.

All metrics take the same ``(X, batch)`` signature and return scalars or
tidy ``pandas`` objects.

.. warning::

   Batch-mixing metrics say nothing about whether biological signal is
   preserved. Always pair them with a supervised metric (AMR classifier
   AUROC, VME, ...) in any real comparison of correctors.

Generic Batch-Mixing Metrics
----------------------------

.. autofunction:: maldibatchkit.diagnostics.silhouette_batch

.. autofunction:: maldibatchkit.diagnostics.kbet

.. autofunction:: maldibatchkit.diagnostics.lisi

MALDI-Specific Metrics
----------------------

.. autofunction:: maldibatchkit.diagnostics.peak_position_drift

.. autofunction:: maldibatchkit.diagnostics.tic_cov_per_batch

.. autofunction:: maldibatchkit.diagnostics.per_batch_spectrum_count

Combined Report
---------------

.. autofunction:: maldibatchkit.diagnostics.diagnostic_report

Benchmark
---------

:class:`~maldibatchkit.diagnostics.BatchCorrectionBenchmark` runs a fixed
set of metrics across multiple correctors under a single protocol,
returning tidy per-(method, metric) summaries plus the raw long-form
observations. See :doc:`/choosing` for the recipe.

.. autoclass:: maldibatchkit.diagnostics.BatchCorrectionBenchmark
   :members:
   :undoc-members:
   :show-inheritance:

Example
-------

.. code-block:: python

    from maldibatchkit import SpeciesAwareComBat
    from maldibatchkit.diagnostics import diagnostic_report
    from maldibatchkit.viz import plot_diagnostic_summary

    # Correct and summarise
    corrector = SpeciesAwareComBat(batch=batch, species=species)
    X_corrected = corrector.fit_transform(X)

    report = diagnostic_report(
        X,
        X_corrected,
        batch,
        mz_values=mz,
        top_k_peaks=40,
    )
    print(report.head())
    #              metric    scope  value_before  value_after   delta  better
    # 0  silhouette_batch  overall         0.311        0.042  -0.269   lower
    # 1   kbet_acceptance  overall         0.124        0.561   0.437  higher
    # 2              lisi  overall         1.420        2.310   0.890  higher

    # Quick bar-chart visualisation of the overall metrics
    plot_diagnostic_summary(report, scope="overall")
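
For intuition about what a per-batch TIC coefficient of variation
measures, the following is a minimal NumPy sketch. It is not the
implementation behind ``tic_cov_per_batch``; the function name
``tic_cv_by_batch_sketch``, the use of the sample standard deviation
(``ddof=1``), and the toy data are all assumptions made for illustration.

.. code-block:: python

    import numpy as np

    def tic_cv_by_batch_sketch(X, batch):
        """Hypothetical helper: std/mean of total ion current, per batch."""
        X = np.asarray(X, dtype=float)
        batch = np.asarray(batch)
        # Total ion current: sum of intensities across m/z bins, per spectrum
        tic = X.sum(axis=1)
        result = {}
        for b in np.unique(batch):
            values = tic[batch == b]
            # Sample standard deviation (assumed); needs >= 2 spectra per batch
            result[b.item()] = values.std(ddof=1) / values.mean()
        return result

    # Toy data: four spectra, three m/z bins, two batches
    X_demo = [[1, 2, 3], [2, 4, 6], [1, 1, 1], [1, 1, 1]]
    batch_demo = ["A", "A", "B", "B"]
    cv = tic_cv_by_batch_sketch(X_demo, batch_demo)
    print(cv)

In this toy case batch ``A`` has spectra with different total
intensities, so its coefficient of variation is non-zero, while batch
``B`` holds two identical spectra and scores zero. Like the
batch-mixing metrics, this summary describes technical spread only and
says nothing about preserved biological signal.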