Skip to content

Claude/storage codec bugfixes#4074

Merged
d-v-b merged 14 commits into
zarr-developers:mainfrom
d-v-b:claude/storage-codec-bugfixes
Jun 18, 2026
Merged

Claude/storage codec bugfixes#4074
d-v-b merged 14 commits into
zarr-developers:mainfrom
d-v-b:claude/storage-codec-bugfixes

Conversation

@d-v-b

@d-v-b d-v-b commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

I had claude do a comprehensive review of zarr-python and look for bugs, optimizations, etc. The tracking issue is over on my fork. I will be opening claude-authored PRs that fix the issues.

Summary

This PR addresses five pre-verified bugs (B1–B5 of #181):

B1 — MemoryStore SuffixByteRequest clamp (src/zarr/storage/_utils.py)
When SuffixByteRequest.suffix > len(data), the computed start index went
negative, causing numpy slicing to return too few bytes. Fixed by clamping:
start = max(0, len(data) - byte_range.suffix). This matches how ZipStore
and LocalStore already handle this case.

B2 — LoggingStore.get_partial_values generator exhaustion (src/zarr/storage/_logging.py)
The log-hint string was built by iterating over key_ranges with a list
comprehension, which exhausted a one-shot generator before passing it to
the wrapped store. Fixed by materialising key_ranges = list(key_ranges)
first.

B3 — getsize_prefix substring over-count (src/zarr/abc/store.py)
getsize_prefix("foo") called list_prefix("foo"), which matched both
"foo/a" and "foobar/b" in prefix-matching stores. Fixed by normalising
the prefix to end with "/" (matching delete_dir / is_empty), so only
keys under the exact directory are counted.

B4 — ZipStore.close() AttributeError when never opened (src/zarr/storage/_zip.py)
_lock is only created in _sync_open(). If close() was called on a
store that was never opened (e.g. with ZipStore(...): pass), it raised
AttributeError: _lock. Fixed by returning early from close() when
_is_open is False.

B5 — Missing raise in codecs_from_list (src/zarr/core/codec_pipeline.py)
In the BytesBytesCodec branch, when prev_codec is an ArrayArrayCodec,
the error message was assigned to msg but never raised. The sibling
branches (ArrayArrayCodec and ArrayBytesCodec cases) both do
raise TypeError(msg); this one was missing the raise. Fixed by adding
raise TypeError(msg).

Notes

  • A towncrier changelog entry should be added before un-drafting this PR.
  • Addresses B1–B5 of #181.

🤖 Generated with Claude Code

dependabot Bot and others added 8 commits May 31, 2026 19:28
…#176)

Bumps the actions group with 8 updates in the / directory:

| Package | From | To |
| --- | --- | --- |
| [prefix-dev/setup-pixi](https://github.com/prefix-dev/setup-pixi) | `0.9.5` | `0.9.6` |
| [codecov/codecov-action](https://github.com/codecov/codecov-action) | `6.0.0` | `6.0.1` |
| [github/issue-metrics](https://github.com/github/issue-metrics) | `4.2.2` | `4.2.7` |
| [j178/prek-action](https://github.com/j178/prek-action) | `2.0.3` | `2.0.4` |
| [actions/upload-artifact](https://github.com/actions/upload-artifact) | `7.0.0` | `7.0.1` |
| [actions/download-artifact](https://github.com/actions/download-artifact) | `7.0.0` | `8.0.1` |
| [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish) | `1.13.0` | `1.14.0` |
| [zizmorcore/zizmor-action](https://github.com/zizmorcore/zizmor-action) | `0.5.3` | `0.5.6` |



Updates `prefix-dev/setup-pixi` from 0.9.5 to 0.9.6
- [Release notes](https://github.com/prefix-dev/setup-pixi/releases)
- [Commits](prefix-dev/setup-pixi@1b2de7f...5185adf)

Updates `codecov/codecov-action` from 6.0.0 to 6.0.1
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](codecov/codecov-action@57e3a13...e79a696)

Updates `github/issue-metrics` from 4.2.2 to 4.2.7
- [Release notes](https://github.com/github/issue-metrics/releases)
- [Commits](github-community-projects/issue-metrics@c9e9838...1e38d5e)

Updates `j178/prek-action` from 2.0.3 to 2.0.4
- [Release notes](https://github.com/j178/prek-action/releases)
- [Commits](j178/prek-action@6ad8027...bdca6f1)

Updates `actions/upload-artifact` from 7.0.0 to 7.0.1
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](actions/upload-artifact@v7...043fb46)

Updates `actions/download-artifact` from 7.0.0 to 8.0.1
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](actions/download-artifact@v7...3e5f45b)

Updates `pypa/gh-action-pypi-publish` from 1.13.0 to 1.14.0
- [Release notes](https://github.com/pypa/gh-action-pypi-publish/releases)
- [Commits](pypa/gh-action-pypi-publish@v1.13.0...cef2210)

Updates `zizmorcore/zizmor-action` from 0.5.3 to 0.5.6
- [Release notes](https://github.com/zizmorcore/zizmor-action/releases)
- [Commits](zizmorcore/zizmor-action@b1d7e1f...5f14fd0)

---
updated-dependencies:
- dependency-name: prefix-dev/setup-pixi
  dependency-version: 0.9.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
- dependency-name: codecov/codecov-action
  dependency-version: 6.0.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
- dependency-name: github/issue-metrics
  dependency-version: 4.2.7
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
- dependency-name: j178/prek-action
  dependency-version: 2.0.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
- dependency-name: actions/upload-artifact
  dependency-version: 7.0.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
- dependency-name: actions/download-artifact
  dependency-version: 8.0.1
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
- dependency-name: pypa/gh-action-pypi-publish
  dependency-version: 1.14.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: actions
- dependency-name: zizmorcore/zizmor-action
  dependency-version: 0.5.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
… close, codec order validation

- storage/_utils.py: clamp SuffixByteRequest start to max(0, ...) so
  suffix > len(data) returns all available bytes (B1)
- storage/_logging.py: materialise key_ranges into a list before building
  the log hint string, preventing one-shot generator exhaustion (B2)
- abc/store.py: normalise getsize_prefix argument to end with "/" before
  calling list_prefix, matching delete_dir/is_empty behaviour so sibling
  keys are not over-counted (B3)
- storage/_zip.py: guard ZipStore.close() with an early return when the
  store was never opened, avoiding AttributeError on missing _lock (B4)
- core/codec_pipeline.py: add the missing raise TypeError(msg) in the
  BytesBytesCodec-after-ArrayArrayCodec branch of codecs_from_list (B5)

Add regression tests for all five fixes.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@github-actions github-actions Bot added the needs release notes Automatically applied to PRs which haven't added release notes label Jun 16, 2026
@d-v-b d-v-b requested a review from dcherian June 16, 2026 12:32
@d-v-b

d-v-b commented Jun 16, 2026

Copy link
Copy Markdown
Contributor Author

@dcherian do you have any ideas for better property-based tests that we could add, given these bugs?

@codecov

codecov Bot commented Jun 16, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 60.65574% with 24 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.46%. Comparing base (7bc3017) to head (591ef45).

Files with missing lines Patch % Lines
src/zarr/testing/stateful.py 23.07% 20 Missing ⚠️
src/zarr/testing/strategies.py 25.00% 3 Missing ⚠️
src/zarr/storage/_fsspec.py 95.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4074      +/-   ##
==========================================
- Coverage   93.52%   93.46%   -0.07%     
==========================================
  Files          90       90              
  Lines       11926    11954      +28     
==========================================
+ Hits        11154    11173      +19     
- Misses        772      781       +9     
Files with missing lines Coverage Δ
src/zarr/abc/store.py 96.17% <100.00%> (+0.04%) ⬆️
src/zarr/core/codec_pipeline.py 95.03% <100.00%> (+2.86%) ⬆️
src/zarr/storage/_logging.py 100.00% <100.00%> (ø)
src/zarr/storage/_utils.py 96.42% <100.00%> (ø)
src/zarr/storage/_zip.py 97.71% <100.00%> (+0.02%) ⬆️
src/zarr/testing/store.py 99.43% <100.00%> (+<0.01%) ⬆️
src/zarr/storage/_fsspec.py 91.32% <95.00%> (ø)
src/zarr/testing/strategies.py 94.42% <25.00%> (-0.32%) ⬇️
src/zarr/testing/stateful.py 35.57% <23.07%> (-0.89%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

assert result.scheme == "c"


class TestNormalizeByteRangeIndex:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a missing property test

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I will add one

d-v-b and others added 4 commits June 18, 2026 12:27
Augment the existing property-based test infrastructure (and convert two
example-based regression tests to property tests) so that each of the five
bugs fixed in this PR is caught by a property/stateful test, verified
red-green against the pre-fix source:

- B1 (suffix clamp): broaden the `key_ranges` strategy to emit
  SuffixByteRequest / OffsetByteRequest / None with bounds that can exceed
  the value length, and give the stateful `get_partial_values` rule an
  independent oracle for every ByteRequest variant.
- B2 (generator exhaustion): the shared `StoreTests.test_get_partial_values`
  now passes `key_ranges` as a one-shot generator and asserts one result per
  request, so a store/wrapper that exhausts the iterable early is caught
  (previously masked because the expected loop was driven by the observed
  count). The stateful rule likewise passes a generator.
- B3 (getsize_prefix sibling over-counting): add a `getsize_prefix` rule to
  the store state machine that asserts only keys under `node/` are counted.
- B4 (ZipStore.close before open): replace the two example-based regression
  tests with a stateful lifecycle machine asserting `close()` never raises,
  regardless of open/IO history.
- B5 (missing raise in codec order validation): replace the single example
  with a property test over random {AA,AB,BB} codec orderings whose expected
  outcome (TypeError / ValueError / ok) is modelled independently.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Make each of the five bugs fixed in this PR catchable by a property/stateful
or shared-harness test (verified red-green against the pre-fix source), and
remove the per-store special-case tests whose coverage is now subsumed:

- B1 (suffix clamp): broaden the `key_ranges` strategy to emit
  SuffixByteRequest / OffsetByteRequest / None with bounds that can exceed
  the value length, and give the stateful `get_partial_values` rule an
  independent oracle for every ByteRequest variant. The pure-function unit
  test `TestNormalizeByteRangeIndex` is kept as a fast deterministic guard.
- B2 (generator exhaustion): the shared `StoreTests.test_get_partial_values`
  now passes `key_ranges` as a one-shot generator and asserts one result per
  request, so any store/wrapper that exhausts the iterable early is caught
  across all stores. The logging-only regression test is removed as redundant.
- B3 (getsize_prefix sibling over-counting): the shared
  `StoreTests.test_getsize_prefix` now includes a sibling key ("cc/0") that
  must be excluded, covering every store deterministically. Add a matching
  `getsize_prefix` rule to the store state machine. The memory-only
  regression test is removed as redundant.
- B4 (ZipStore.close before open): replace the two example-based regression
  tests with a stateful lifecycle machine asserting `close()` never raises,
  regardless of open/IO history.
- B5 (missing raise in codec order validation): replace the single example
  with a property test over random {AA,AB,BB} codec orderings whose expected
  outcome (TypeError / ValueError / ok) is modelled independently.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
FsspecStore.get_partial_values guarded its body with a bare `if key_ranges:`
before materialising the argument. Because `key_ranges` is typed as a generic
Iterable, a one-shot generator is always truthy even when empty, so the
`else: return []` fast path was unreachable for generator inputs (it happened
to be harmless only because fsspec's _cat_ranges returns [] for empty input).

Materialise `key_ranges` into a list first, then check `if not key_ranges`,
matching the iterable contract for any Iterable (list or generator).

Also enable mypy's `truthy-iterable` error code, which flags exactly this
class of bug (truthiness test on a non-Collection Iterable). There are no
existing violations in src/zarr or tests, so this is a zero-cost guard against
recurrence.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread pyproject.toml
strict = true
warn_unreachable = true
enable_error_code = ["ignore-without-code", "redundant-expr", "truthy-bool"]
enable_error_code = ["ignore-without-code", "redundant-expr", "truthy-bool", "truthy-iterable"]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flagging this change to our mypy config. the fsspecstore codebase was relying on if Iterator: ... which I think we should avoid. so now mypy helps us avoid it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot removed the needs release notes Automatically applied to PRs which haven't added release notes label Jun 18, 2026
The pre-fix MemoryStore behavior returned incorrect data (a wrong slice via a
negative start index), not merely fewer bytes; note the HTTP suffix-range
semantics that define the correct behavior.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@d-v-b d-v-b enabled auto-merge (squash) June 18, 2026 11:23
@d-v-b d-v-b merged commit de89d4d into zarr-developers:main Jun 18, 2026
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants