Skip to content

Multi-slide pooled fit_stain_reference#1209

Open
timtreis wants to merge 4 commits into
mainfrom
stain/pr4-pooled-fit
Open

Multi-slide pooled fit_stain_reference#1209
timtreis wants to merge 4 commits into
mainfrom
stain/pr4-pooled-fit

Conversation

@timtreis

Copy link
Copy Markdown
Member

fit_stain_reference now accepts image_key: str | list[str] — a list pools several images' tissue pixels into one fit (one reference for a cohort in one SpatialData).

ref = sq.experimental.im.fit_stain_reference(sdata, ["s1", "s2"], method="macenko")

tissue_mask_key takes an order-matched list (or None for the {key}_tissue convention). Single-image fit unchanged.

`fit_stain_reference` now accepts `image_key: str | list[str]`. A list pools
the tissue pixels of several same-dtype images into one fit (decomposition
pools tissue OD, Reinhard pools tissue Lab pixels) - one reference for a cohort
held in a single SpatialData. `tissue_mask_key` accepts an order-matched list
(or None for the `{key}_tissue` convention). A blank slide raises, named; no
silent skip. Single-image fit is unchanged.

KISS replacement for the old cohort-aggregation + persistence design: no new
module, no provenance fields, no serialization format (references are plain
Python objects users serialize themselves).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@codecov

codecov Bot commented Jun 15, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.92%. Comparing base (ae5ceb4) to head (2b24b2c).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1209      +/-   ##
==========================================
+ Coverage   76.81%   76.92%   +0.11%     
==========================================
  Files          63       63              
  Lines        9267     9315      +48     
  Branches     1565     1577      +12     
==========================================
+ Hits         7118     7166      +48     
  Misses       1547     1547              
  Partials      602      602              
Files with missing lines Coverage Δ
...c/squidpy/experimental/im/_stain/_decomposition.py 92.18% <100.00%> (+0.31%) ⬆️
src/squidpy/experimental/im/_stain/_normalize.py 94.96% <100.00%> (+1.21%) ⬆️
src/squidpy/experimental/im/_stain/_reinhard.py 100.00% <100.00%> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@timtreis timtreis marked this pull request as ready for review June 15, 2026 14:06
@timtreis timtreis requested a review from selmanozleyen June 16, 2026 14:26
@timtreis timtreis self-assigned this Jun 23, 2026
Comment on lines +345 to +353
def test_validation(self, image_key, tissue_mask_key, match: str) -> None:
sdata, _ = self._cohort(n=2)
with pytest.raises(ValueError, match=match):
fit_stain_reference(sdata, image_key, method="macenko", tissue_mask_key=tissue_mask_key)

def test_list_mask_with_str_image_raises(self) -> None:
sdata, _ = self._cohort(n=1)
with pytest.raises(ValueError, match="single `tissue_mask_key`"):
fit_stain_reference(sdata, "img0", method="macenko", tissue_mask_key=["img0_tissue"])

@selmanozleyen selmanozleyen Jun 29, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can these tests be merged into a parameterized one?

Comment on lines +334 to +337
if not image_keys:
raise ValueError("`image_key` list is empty; pass at least one image key.")
if len(set(image_keys)) != len(image_keys):
raise ValueError("`image_key` list has duplicate keys.")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we use _unique_order_preserving here somehow?

Comment on lines 296 to 303
validate_rgb_range(da)
params = _resolve_method_params(method, method_params)
tissue_mask = _resolve_tissue_bool_mask(sdata, image_key, da, tissue_mask_key)
if method == "reinhard":
return fit_reinhard(da, params, tissue_mask=tissue_mask)
bg = default_white_point(da) if white_point is None else np.asarray(white_point, np.float64)
reference = RUIFROK_HE if canonical_reference is None else dict(canonical_reference)
return fit_decomposition(

@selmanozleyen selmanozleyen Jun 29, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are lines duplicated here:

You should normalize to [str] if it's str. Then you'd do this procedure for free on both

   das, masks = [], []
    for k, mk in zip(image_keys, mask_keys, strict=True):
        da = _resolve_image(sdata, k, scale, prefer="coarsest")
        validate_rgb_range(da)
        das.append(da)
        masks.append(_resolve_tissue_bool_mask(sdata, k, da, mk))
    dtypes = {str(da.dtype) for da in das}
    if len(dtypes) != 1:
        raise ValueError(f"pooled images must share a dtype; got {sorted(dtypes)}.")
    if method == "reinhard":
        return fit_reinhard_pooled(das, masks, params, image_keys)
    bg = default_white_point(das[0]) if white_point is None else np.asarray(white_point, np.float64)
    reference = RUIFROK_HE if canonical_reference is None else dict(canonical_reference)
    return fit_decomposition_pooled(
        das, masks, image_keys, method, params, bg, reference=reference, max_angle_deg=max_angle_deg
    )

I'd also do

def fit_reinhard_pooled(
    das: list[xr.DataArray],
    masks: list[np.ndarray | None],
    params: ReinhardParams,
    image_keys: list[str | None],
) -> StainReference:
    """Fit Reinhard stats from the pooled tissue Lab pixels of one or more slides."""
    cols = [_tissue_lab_pixels(da, params, m, image_key=k) for da, m, k in zip(das, masks, image_keys, strict=True)]
    return _reinhard_from_pixels(np.concatenate(cols, axis=1))


def fit_reinhard(
    image_rgb: xr.DataArray, params: ReinhardParams, *, tissue_mask: np.ndarray | None = None
) -> StainReference:
    """Single-image Reinhard fit — a pool of one."""
    return fit_reinhard_pooled([image_rgb], [tissue_mask], params, [None])

or just have fit_reainhard(str|list[str]) tbh

@selmanozleyen selmanozleyen left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please do the changes I mentioned. Overall writing the code as if all the inputs are [str] is a good idea and normalizing str to [str] will save duplicate computation and code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants