Integrations Module

Bridges between MaldiBatchKit and sibling toolkits in the MaldiSuite ecosystem.

MaldiSet Adapter

class maldibatchkit.integrations.MaldiSetAdapter(*, batch_column, species_column=None, quality_column=None)

Bases: object

Bridge maldiamrkit.MaldiSet and MaldiBatchKit correctors.

The adapter does three things:

  1. Extracts the feature matrix ds.X and any metadata columns (batch / species / quality) from a MaldiSet.

  2. Delegates to a MaldiBatchKit corrector (or any sklearn-compatible transformer with the right constructor signature) to produce a corrected matrix.

  3. Returns a new MaldiSet whose X property yields the corrected matrix, leaving the original dataset untouched.

Batch / covariate slicing follows the rest of the package: pass the metadata column names at construction time; the adapter pulls the aligned series from ds.meta itself so users do not have to rebuild arrays manually.
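
The three steps above can be sketched with stand-in classes. This illustrates the pattern only and is not the adapter's implementation: ToyDataset, PerBatchCentering, and toy_correct are hypothetical names, plain lists stand in for the DataFrame, and the fit_transform call assumes an sklearn-style interface.

```python
import copy
from collections import defaultdict


class ToyDataset:
    """Hypothetical stand-in for a MaldiSet: a feature matrix plus metadata."""

    def __init__(self, X, meta):
        self.X = X        # list of rows, one per sample
        self.meta = meta  # dict: column name -> list of per-sample values


class PerBatchCentering:
    """Hypothetical corrector: subtracts each batch's per-feature mean."""

    def __init__(self, batch):
        self.batch = batch

    def fit_transform(self, X):
        # Group rows by batch label, then compute per-feature means per batch.
        rows_by_batch = defaultdict(list)
        for row, label in zip(X, self.batch):
            rows_by_batch[label].append(row)
        means = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in rows_by_batch.items()
        }
        return [
            [value - mean for value, mean in zip(row, means[label])]
            for row, label in zip(X, self.batch)
        ]


def toy_correct(ds, transformer_cls, batch_column):
    # Step 1: extract the matrix and the batch labels by column name.
    X, batch = ds.X, ds.meta[batch_column]
    # Step 2: delegate to the corrector (assumed sklearn-style interface).
    corrected = transformer_cls(batch=batch).fit_transform(X)
    # Step 3: return a new dataset; the original keeps its own X.
    new_ds = copy.copy(ds)
    new_ds.X = corrected
    return new_ds


ds = ToyDataset(
    X=[[1.0, 10.0], [3.0, 14.0], [10.0, 20.0], [12.0, 22.0]],
    meta={"Batch": ["A", "A", "B", "B"]},
)
out = toy_correct(ds, PerBatchCentering, batch_column="Batch")
print(out.X)  # [[-1.0, -2.0], [1.0, 2.0], [-1.0, -1.0], [1.0, 1.0]]
```

The caller names the metadata column once and never builds a label array by hand, which is the convenience the adapter is described as providing.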

Parameters:
  • batch_column (str) – Column in ds.meta with the batch labels.

  • species_column (str | None) – Column in ds.meta with species labels (used when the chosen corrector needs a species covariate).

  • quality_column (str | None) – Column in ds.meta with per-sample quality scores (used by maldibatchkit.QualityWeightedComBat).

Examples

>>> from maldiamrkit import MaldiSet
>>> from maldibatchkit import SpeciesAwareComBat
>>> from maldibatchkit.integrations import MaldiSetAdapter
>>> ds = MaldiSet.from_directory(
...     "spectra/", "metadata.csv",
...     aggregate_by=dict(antibiotics="Ceftriaxone"),
... )
>>> adapter = MaldiSetAdapter(batch_column="Batch", species_column="Species")
>>> corrected_ds = adapter.correct(ds, SpeciesAwareComBat)
>>> corrected_ds.X.head()   # corrected feature matrix

__init__(*, batch_column, species_column=None, quality_column=None)

Parameters:
  • batch_column (str)

  • species_column (str | None)

  • quality_column (str | None)

Return type:

None

extract(ds)

Pull X, batch, species, quality from a MaldiSet.

Returns:

Dictionary with keys X (DataFrame), batch, species, and quality. The species and quality entries may be None. The series are aligned to X.index.

Return type:

dict[str, Any]

Parameters:

ds (MaldiSet)

correct(ds, transformer_cls, *, transformer_kwargs=None)

Instantiate transformer_cls(batch=..., ...), use it to correct the feature matrix of ds, and return the result as a new MaldiSet.

Parameters:
  • ds (MaldiSet) – Source dataset.

  • transformer_cls (type) – A MaldiBatchKit corrector class (or any transformer whose constructor takes batch= and, optionally, a species-style protected-covariate argument named species / discrete_covariates / design / covariates and/or a quality= argument).

  • transformer_kwargs (dict[str, Any] | None) – Extra keyword arguments forwarded to transformer_cls.
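
One way an adapter could choose the keyword for the species covariate is to inspect the constructor signature. The sketch below is an assumption about how such routing might work, not a description of MaldiSetAdapter's internals: build_kwargs, the priority order in SPECIES_ARG_NAMES, and the two toy classes are all hypothetical.

```python
import inspect

# Argument names listed in the parameter description above; the priority
# order is an assumption made for this sketch.
SPECIES_ARG_NAMES = ("species", "discrete_covariates", "design", "covariates")


def build_kwargs(transformer_cls, batch, species=None, quality=None, extra=None):
    """Hypothetical helper: map extracted metadata onto constructor keywords."""
    params = inspect.signature(transformer_cls).parameters
    kwargs = {"batch": batch}
    if species is not None:
        # Use the first species-style argument name the constructor accepts.
        for name in SPECIES_ARG_NAMES:
            if name in params:
                kwargs[name] = species
                break
    if quality is not None and "quality" in params:
        kwargs["quality"] = quality
    kwargs.update(extra or {})  # caller-supplied transformer_kwargs
    return kwargs


class TakesSpecies:
    def __init__(self, batch, species=None, alpha=1.0):
        self.batch, self.species, self.alpha = batch, species, alpha


class TakesCovariatesAndQuality:
    def __init__(self, batch, covariates=None, quality=None):
        self.batch, self.covariates, self.quality = batch, covariates, quality


batch = ["A", "B"]
species = ["E. coli", "K. pneumoniae"]
quality = [0.9, 0.7]

kw1 = build_kwargs(TakesSpecies, batch, species, quality, extra={"alpha": 0.5})
kw2 = build_kwargs(TakesCovariatesAndQuality, batch, species, quality)
print(sorted(kw1))  # ['alpha', 'batch', 'species']
print(sorted(kw2))  # ['batch', 'covariates', 'quality']
```

In this sketch a quality series is silently dropped when the constructor has no quality= argument; whether the real adapter does the same is not stated in this page.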

Returns:

A shallow copy of ds whose _X_cache is replaced by the corrected feature matrix. The returned MaldiSet shares its spectra list with ds (the same object, not a copy); labels and metadata are unchanged.

Return type:

MaldiSet
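
The shallow-copy behaviour described under Returns has a practical consequence: in-place changes to the shared spectra list show up in both datasets, while the replaced matrix does not leak back. A minimal sketch with a hypothetical stand-in class follows. ToySet is not the real MaldiSet; only the _X_cache name comes from the text above, and the spectra attribute name is assumed.

```python
import copy


class ToySet:
    """Hypothetical stand-in holding a spectra list and a cached matrix."""

    def __init__(self, spectra, x_cache):
        self.spectra = spectra
        self._X_cache = x_cache


original = ToySet(spectra=["s1", "s2"], x_cache=[[1.0], [2.0]])

# Shallow copy: attributes are rebound on the copy, not duplicated.
corrected = copy.copy(original)
corrected._X_cache = [[0.5], [1.5]]  # replace only the cached matrix

# The original matrix is untouched because the attribute was rebound.
print(original._X_cache)  # [[1.0], [2.0]]

# The spectra list is shared, so an in-place change is visible in both.
corrected.spectra.append("s3")
print(original.spectra)   # ['s1', 's2', 's3']
```

If independent spectra are needed, copying the list explicitly before modifying it avoids this coupling.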

Example

from maldiamrkit import MaldiSet
from maldibatchkit.integrations import MaldiSetAdapter
from maldibatchkit import SpeciesAwareComBat

# Load dataset (aggregated by an antibiotic label)
ds = MaldiSet.from_directory(
    "spectra/", "metadata.csv",
    aggregate_by=dict(antibiotics="Ceftriaxone"),
)

# The adapter reads batch / species / quality from ds.meta
adapter = MaldiSetAdapter(
    batch_column="Batch",
    species_column="Species",
    quality_column="SNR",
)

# Returns a NEW MaldiSet with corrected X and untouched labels
corrected_ds = adapter.correct(ds, SpeciesAwareComBat)

corrected_ds.X     # harmonised feature matrix
corrected_ds.y     # AMR labels, unchanged