Claude/storage codec bugfixes#4074
Merged
d-v-b merged 14 commits intoJun 18, 2026
Merged
Conversation
…#176) Bumps the actions group with 8 updates in the / directory: | Package | From | To | | --- | --- | --- | | [prefix-dev/setup-pixi](https://github.com/prefix-dev/setup-pixi) | `0.9.5` | `0.9.6` | | [codecov/codecov-action](https://github.com/codecov/codecov-action) | `6.0.0` | `6.0.1` | | [github/issue-metrics](https://github.com/github/issue-metrics) | `4.2.2` | `4.2.7` | | [j178/prek-action](https://github.com/j178/prek-action) | `2.0.3` | `2.0.4` | | [actions/upload-artifact](https://github.com/actions/upload-artifact) | `7.0.0` | `7.0.1` | | [actions/download-artifact](https://github.com/actions/download-artifact) | `7.0.0` | `8.0.1` | | [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish) | `1.13.0` | `1.14.0` | | [zizmorcore/zizmor-action](https://github.com/zizmorcore/zizmor-action) | `0.5.3` | `0.5.6` | Updates `prefix-dev/setup-pixi` from 0.9.5 to 0.9.6 - [Release notes](https://github.com/prefix-dev/setup-pixi/releases) - [Commits](prefix-dev/setup-pixi@1b2de7f...5185adf) Updates `codecov/codecov-action` from 6.0.0 to 6.0.1 - [Release notes](https://github.com/codecov/codecov-action/releases) - [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md) - [Commits](codecov/codecov-action@57e3a13...e79a696) Updates `github/issue-metrics` from 4.2.2 to 4.2.7 - [Release notes](https://github.com/github/issue-metrics/releases) - [Commits](github-community-projects/issue-metrics@c9e9838...1e38d5e) Updates `j178/prek-action` from 2.0.3 to 2.0.4 - [Release notes](https://github.com/j178/prek-action/releases) - [Commits](j178/prek-action@6ad8027...bdca6f1) Updates `actions/upload-artifact` from 7.0.0 to 7.0.1 - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](actions/upload-artifact@v7...043fb46) Updates `actions/download-artifact` from 7.0.0 to 8.0.1 - [Release notes](https://github.com/actions/download-artifact/releases) - [Commits](actions/download-artifact@v7...3e5f45b) Updates `pypa/gh-action-pypi-publish` from 1.13.0 to 1.14.0 - [Release notes](https://github.com/pypa/gh-action-pypi-publish/releases) - [Commits](pypa/gh-action-pypi-publish@v1.13.0...cef2210) Updates `zizmorcore/zizmor-action` from 0.5.3 to 0.5.6 - [Release notes](https://github.com/zizmorcore/zizmor-action/releases) - [Commits](zizmorcore/zizmor-action@b1d7e1f...5f14fd0) --- updated-dependencies: - dependency-name: prefix-dev/setup-pixi dependency-version: 0.9.6 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: actions - dependency-name: codecov/codecov-action dependency-version: 6.0.1 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: actions - dependency-name: github/issue-metrics dependency-version: 4.2.7 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: actions - dependency-name: j178/prek-action dependency-version: 2.0.4 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: actions - dependency-name: actions/upload-artifact dependency-version: 7.0.1 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: actions - dependency-name: actions/download-artifact dependency-version: 8.0.1 dependency-type: direct:production update-type: version-update:semver-major dependency-group: actions - dependency-name: pypa/gh-action-pypi-publish dependency-version: 1.14.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: actions - dependency-name: zizmorcore/zizmor-action dependency-version: 0.5.6 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: actions ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
… close, codec order validation - storage/_utils.py: clamp SuffixByteRequest start to max(0, ...) so suffix > len(data) returns all available bytes (B1) - storage/_logging.py: materialise key_ranges into a list before building the log hint string, preventing one-shot generator exhaustion (B2) - abc/store.py: normalise getsize_prefix argument to end with "/" before calling list_prefix, matching delete_dir/is_empty behaviour so sibling keys are not over-counted (B3) - storage/_zip.py: guard ZipStore.close() with an early return when the store was never opened, avoiding AttributeError on missing _lock (B4) - core/codec_pipeline.py: add the missing raise TypeError(msg) in the BytesBytesCodec-after-ArrayArrayCodec branch of codecs_from_list (B5) Add regression tests for all five fixes. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Contributor
Author
|
@dcherian do you have any ideas for better property-based tests that we could add, given these bugs? |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4074 +/- ##
==========================================
- Coverage 93.52% 93.46% -0.07%
==========================================
Files 90 90
Lines 11926 11954 +28
==========================================
+ Hits 11154 11173 +19
- Misses 772 781 +9
🚀 New features to boost your workflow:
|
dcherian
reviewed
Jun 16, 2026
| assert result.scheme == "c" | ||
|
|
||
|
|
||
| class TestNormalizeByteRangeIndex: |
Contributor
There was a problem hiding this comment.
Looks like a missing property test
dcherian
approved these changes
Jun 16, 2026
Augment the existing property-based test infrastructure (and convert two
example-based regression tests to property tests) so that each of the five
bugs fixed in this PR is caught by a property/stateful test, verified
red-green against the pre-fix source:
- B1 (suffix clamp): broaden the `key_ranges` strategy to emit
SuffixByteRequest / OffsetByteRequest / None with bounds that can exceed
the value length, and give the stateful `get_partial_values` rule an
independent oracle for every ByteRequest variant.
- B2 (generator exhaustion): the shared `StoreTests.test_get_partial_values`
now passes `key_ranges` as a one-shot generator and asserts one result per
request, so a store/wrapper that exhausts the iterable early is caught
(previously masked because the expected loop was driven by the observed
count). The stateful rule likewise passes a generator.
- B3 (getsize_prefix sibling over-counting): add a `getsize_prefix` rule to
the store state machine that asserts only keys under `node/` are counted.
- B4 (ZipStore.close before open): replace the two example-based regression
tests with a stateful lifecycle machine asserting `close()` never raises,
regardless of open/IO history.
- B5 (missing raise in codec order validation): replace the single example
with a property test over random {AA,AB,BB} codec orderings whose expected
outcome (TypeError / ValueError / ok) is modelled independently.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Make each of the five bugs fixed in this PR catchable by a property/stateful
or shared-harness test (verified red-green against the pre-fix source), and
remove the per-store special-case tests whose coverage is now subsumed:
- B1 (suffix clamp): broaden the `key_ranges` strategy to emit
SuffixByteRequest / OffsetByteRequest / None with bounds that can exceed
the value length, and give the stateful `get_partial_values` rule an
independent oracle for every ByteRequest variant. The pure-function unit
test `TestNormalizeByteRangeIndex` is kept as a fast deterministic guard.
- B2 (generator exhaustion): the shared `StoreTests.test_get_partial_values`
now passes `key_ranges` as a one-shot generator and asserts one result per
request, so any store/wrapper that exhausts the iterable early is caught
across all stores. The logging-only regression test is removed as redundant.
- B3 (getsize_prefix sibling over-counting): the shared
`StoreTests.test_getsize_prefix` now includes a sibling key ("cc/0") that
must be excluded, covering every store deterministically. Add a matching
`getsize_prefix` rule to the store state machine. The memory-only
regression test is removed as redundant.
- B4 (ZipStore.close before open): replace the two example-based regression
tests with a stateful lifecycle machine asserting `close()` never raises,
regardless of open/IO history.
- B5 (missing raise in codec order validation): replace the single example
with a property test over random {AA,AB,BB} codec orderings whose expected
outcome (TypeError / ValueError / ok) is modelled independently.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-python into claude/storage-codec-bugfixes
FsspecStore.get_partial_values guarded its body with a bare `if key_ranges:` before materialising the argument. Because `key_ranges` is typed as a generic Iterable, a one-shot generator is always truthy even when empty, so the `else: return []` fast path was unreachable for generator inputs (it happened to be harmless only because fsspec's _cat_ranges returns [] for empty input). Materialise `key_ranges` into a list first, then check `if not key_ranges`, matching the iterable contract for any Iterable (list or generator). Also enable mypy's `truthy-iterable` error code, which flags exactly this class of bug (truthiness test on a non-Collection Iterable). There are no existing violations in src/zarr or tests, so this is a zero-cost guard against recurrence. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
d-v-b
commented
Jun 18, 2026
| strict = true | ||
| warn_unreachable = true | ||
| enable_error_code = ["ignore-without-code", "redundant-expr", "truthy-bool"] | ||
| enable_error_code = ["ignore-without-code", "redundant-expr", "truthy-bool", "truthy-iterable"] |
Contributor
Author
There was a problem hiding this comment.
flagging this change to our mypy config. the fsspecstore codebase was relying on if Iterator: ... which I think we should avoid. so now mypy helps us avoid it.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The pre-fix MemoryStore behavior returned incorrect data (a wrong slice via a negative start index), not merely fewer bytes; note the HTTP suffix-range semantics that define the correct behavior. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
I had claude do a comprehensive review of zarr-python and look for bugs, optimizations, etc. The tracking issue is over on my fork. I will be opening claude-authored PRs that fix the issues.
Summary
This PR addresses five pre-verified bugs (B1–B5 of #181):
B1 — MemoryStore SuffixByteRequest clamp (
src/zarr/storage/_utils.py)When
SuffixByteRequest.suffix > len(data), the computedstartindex wentnegative, causing numpy slicing to return too few bytes. Fixed by clamping:
start = max(0, len(data) - byte_range.suffix). This matches howZipStoreand
LocalStorealready handle this case.B2 — LoggingStore.get_partial_values generator exhaustion (
src/zarr/storage/_logging.py)The log-hint string was built by iterating over
key_rangeswith a listcomprehension, which exhausted a one-shot generator before passing it to
the wrapped store. Fixed by materialising
key_ranges = list(key_ranges)first.
B3 — getsize_prefix substring over-count (
src/zarr/abc/store.py)getsize_prefix("foo")calledlist_prefix("foo"), which matched both"foo/a"and"foobar/b"in prefix-matching stores. Fixed by normalisingthe prefix to end with
"/"(matchingdelete_dir/is_empty), so onlykeys under the exact directory are counted.
B4 — ZipStore.close() AttributeError when never opened (
src/zarr/storage/_zip.py)_lockis only created in_sync_open(). Ifclose()was called on astore that was never opened (e.g.
with ZipStore(...): pass), it raisedAttributeError: _lock. Fixed by returning early fromclose()when_is_openisFalse.B5 — Missing raise in codecs_from_list (
src/zarr/core/codec_pipeline.py)In the
BytesBytesCodecbranch, whenprev_codecis anArrayArrayCodec,the error message was assigned to
msgbut never raised. The siblingbranches (
ArrayArrayCodecandArrayBytesCodeccases) both doraise TypeError(msg); this one was missing theraise. Fixed by addingraise TypeError(msg).Notes
🤖 Generated with Claude Code