Use np.load(..., mmap_mode="r") for time_vector in `BaseRecording._extra_metadata_from_folder` by grahamfindlay · Pull Request #4608 · SpikeInterface/spikeinterface

grahamfindlay · 2026-06-08T00:57:44Z

BaseRecording._extra_metadata_from_folder was eagerly loading a BinaryFolderRecording's time vector, which can be tens of GB on long recordings.

I ran into this because run_sorter_by_property (I was sorting tetrodes) reconstructs the recording once per joblib worker, so every worker was loading the whole time vector, even when only a short frame slice was needed.

Another consequence of the eager loading was that si.load on a 48h recording with a time_vector was using 1.6GB of memory, even if the time vector was never touched.

I switched the load to mmap_mode="r", so memory use no longer scales with both recording duration and number of workers. It is bounded to what's actually touched, and the memmap can be shared by the joblib workers.
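As a rough illustration of what the memory-mapped load buys, here is a minimal sketch using plain NumPy. The file name, the sampling rate, and the array size are assumptions made for the example and are not taken from the SpikeInterface code.

```python
import os
import tempfile

import numpy as np

# Hypothetical file standing in for a saved time vector.
path = os.path.join(tempfile.mkdtemp(), "times.npy")
times = np.arange(1_000_000, dtype="float64") / 30_000.0  # assumed 30 kHz
np.save(path, times)

# With mmap_mode="r", np.load returns a np.memmap backed by the file
# instead of reading the whole array into memory.
mapped = np.load(path, mmap_mode="r")

# Slicing touches only the pages that back the slice; np.array makes an
# in-memory copy of just that slice.
chunk = np.array(mapped[1000:2000])
```

Each process that opens the same file this way maps the same on-disk data, so a short frame slice costs roughly the size of the slice rather than the size of the full vector.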

The cost is that this time vector is read-only, so in-place operations on it would raise an error.
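The read-only behaviour can be reproduced with NumPy alone. This is a small sketch with a hypothetical temporary file. It shows that an in-place add on a read-only memmap raises `ValueError`, while the out-of-place form returns a new, writeable array.

```python
import os
import tempfile

import numpy as np

# Hypothetical file standing in for a saved time vector.
path = os.path.join(tempfile.mkdtemp(), "times.npy")
np.save(path, np.linspace(0.0, 1.0, 5))

times = np.load(path, mmap_mode="r")

try:
    times += 0.5  # in-place write into a read-only buffer
    raised = False
except ValueError:
    raised = True

# The out-of-place form allocates a fresh array and leaves the file untouched.
shifted = times + 0.5
```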

Thankfully there was only 1: TimeSeries.shift_times! I made it shift in place only when the vector is writeable and fall back to an out-of-place op for the now read-only memmap. (For simplicity, we could scrap the conditional altogether and just keep the out-of-place path, since this is presumably a pretty rare op -- but I chose to keep the in-place, no-copy path for best performance).

alejoe91 · 2026-06-08T12:41:22Z


            if self.has_time_vector(segment_index=segment_index):
-                rs.time_vector += shift
+                if rs.time_vector.flags.writeable:


we should check if zarr.Array also implements this flag

actually, it doesn't have this flag, so we should do:

Suggested change

-                if rs.time_vector.flags.writeable:
+                if isinstance(rs.time_vector, np.ndarray) and rs.time_vector.flags.writeable:

Oh, wow, I didn't even realize time_vector could be a zarr.Array! In that case, the out-of-place op is also going to fail. And the in-place op is probably already failing -- I guess it just never surfaced.

So I think we actually need something like:

            if isinstance(rs.time_vector, np.ndarray) and rs.time_vector.flags.writeable:
                rs.time_vector += shift  # in-place, no copy
            else:
                rs.time_vector = np.asarray(rs.time_vector) + shift  # read-only memmap or zarr.Array

Recordings loaded from disk may have time vectors that are out-of-memory (OOM) arrays: read-only memmaps or zarr.Array. These can't be shifted in place, and need to be materialized as numpy arrays before shifting.
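The branching discussed in this thread can be tried out in isolation. In the sketch below, `shifted_time_vector` and `ArrayLikeStandIn` are hypothetical names invented for the example. The stand-in class only mimics the one property that matters here (no `.flags` attribute, convertible through `np.asarray`) and is not zarr's actual API.

```python
import numpy as np


class ArrayLikeStandIn:
    """Hypothetical non-NumPy container: no .flags, convertible via __array__."""

    def __init__(self, data):
        self._data = np.array(data, dtype="float64")

    def __array__(self, dtype=None, copy=None):
        return self._data if dtype is None else self._data.astype(dtype)


def shifted_time_vector(time_vector, shift):
    """Return the shifted vector, reusing the buffer only when that is safe."""
    if isinstance(time_vector, np.ndarray) and time_vector.flags.writeable:
        time_vector += shift  # in-place, no copy
        return time_vector
    # Read-only memmap or array-like: materialize as ndarray, then shift.
    return np.asarray(time_vector) + shift


writeable = np.array([0.0, 1.0, 2.0])
read_only = np.array([0.0, 1.0])
read_only.flags.writeable = False
array_like = ArrayLikeStandIn([0.0, 1.0])

out_writeable = shifted_time_vector(writeable, 1.0)
out_read_only = shifted_time_vector(read_only, 0.5)
out_array_like = shifted_time_vector(array_like, 2.0)
```

Checking `isinstance` first matters because the `and` short-circuits: `.flags` is never accessed on an object that is not an ndarray.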

grahamfindlay and others added 2 commits June 7, 2026 19:41

[pre-commit.ci] auto fixes from pre-commit.com hooks

46a717b

for more information, see https://pre-commit.ci

alejoe91 reviewed Jun 8, 2026

Comment thread src/spikeinterface/core/time_series.py Outdated

alejoe91 added the core Changes to core module label Jun 8, 2026

alejoe91 reviewed Jun 8, 2026

grahamfindlay and others added 3 commits June 9, 2026 11:34

Shift times using writable copies of OOM arrays

3e42928

[pre-commit.ci] auto fixes from pre-commit.com hooks

9b7e836

for more information, see https://pre-commit.ci

Document time_vector types

18e10e1

alejoe91 approved these changes Jun 10, 2026

alejoe91 merged commit 1d2fab0 into SpikeInterface:main Jun 10, 2026
15 checks passed

alejoe91 merged 5 commits into SpikeInterface:main from grahamfindlay:bugfix/excessive_eager_time_vector
