Merged
55 changes: 55 additions & 0 deletions docs/nwb_contents.md
@@ -0,0 +1,55 @@
# Final NWB Contents

This document describes the contents of the NWB file produced by this
repository. It is a companion to issue
[#12](https://github.com/AllenNeuralDynamics/dynamic-foraging-processing/issues/12),
which serves as the authoritative changelog for content decisions.

When the NWB contents change, update both the [Changelog](#changelog) below
and the relevant section in this document. Each entry should record at minimum
the date, what changed, and why.

## Acquisition

The `acquisition` container holds the HARP streams from the rig (e.g.
VR Foraging) along with four behavior-derived series carried over from the
NWB produced by the combined dynamic foraging + FIP pipeline:

- `left_lick_time`
- `right_lick_time`
- `left_reward_delivery_time`
- `right_reward_delivery_time`

Each series stores both timestamps and a parallel `data` array. For the
reward delivery series, `data` annotates each reward as `earned`, `manual`,
or `automatic`.

See [`trials_table_mapping.md`](trials_table_mapping.md#acquisition-container)
for the raw sources backing each of these four series.

## Events

The `events` container follows the conventions in
[aind-physio-arch#1072](https://github.com/AllenNeuralDynamics/aind-physio-arch/issues/1072).

The events sidecar will be version-controlled in this repository for now so that
changes can be tracked alongside the code.

Events are on pause pending validation by the HED team. See the
[Changelog](#changelog) for details.

## Trials

The `trials` table is built from the raw acquisition streams. The full
column-by-column mapping is documented in
[`trials_table_mapping.md`](trials_table_mapping.md), and the source-of-truth
discussion lives in issue
[#5](https://github.com/AllenNeuralDynamics/dynamic-foraging-processing/issues/5).

## Changelog

| Date | Section | Change | Reason |
| --- | --- | --- | --- |
| 2026-06-03 | acquisition / trials | Initial scope confirmed: HARP streams + `{left,right}_lick_time` and `{left,right}_reward_delivery_time` in `acquisition`; trials mapping per issue #5. | Meeting with Alex. |
| 2026-06-05 | events | Events on pause. | Pending validation by the HED team. |
| 2026-06-08 | acquisition | Documented `data` arrays alongside timestamps; reward delivery series annotate each reward as `earned`, `manual`, or `automatic`. | Clarify acquisition contents. |
212 changes: 212 additions & 0 deletions docs/qc_upgrade_plan.md
@@ -0,0 +1,212 @@
# QC Upgrade Plan

Plan for upgrading
[`aind-dynamic-foraging-qc/code/run_capsule.py`](https://github.com/AllenNeuralDynamics/aind-dynamic-foraging-qc/blob/main/code/run_capsule.py)
to:

1. Conform to the current
[`aind_data_schema.core.quality_control`](https://github.com/AllenNeuralDynamics/aind-data-schema/blob/dev/src/aind_data_schema/core/quality_control.py)
schema (v2.4.1).
2. Operate on primitive structures (numpy arrays, pandas DataFrames). QC
functions are agnostic to where the data came from — the caller is free
to load from
[`RawDataLoader`](../src/dynamic_foraging_processing/raw_data_loader/loader.py),
an NWB file, or anything else, as long as the primitives match the
expected shape.

This document is a design reference only. Implementation will happen on a
separate branch.

## 1. Schema changes

The schema removed `QCEvaluation`. The new `QualityControl` object holds a flat
`metrics: List[QCMetric | CurationMetric]` and groups metrics via per-metric
`tags`. Each `QCMetric` now requires `modality` and `stage` directly (these
moved off `QCEvaluation`), and `QualityControl` requires `default_grouping`.

### Field-by-field migration

| Old (capsule) | New (schema v2.4.1) |
| --- | --- |
| `QCEvaluation(name, modality, stage, metrics, description, allow_failed_metrics)` | Removed. Replace each evaluation with one or more `QCMetric`s sharing a tag. |
| `QCMetric(name, value, status_history, description?, reference?)` | `QCMetric(name, modality, stage, value, status_history, description?, reference?, tags={}, evaluated_assets?)` |
| `QualityControl(evaluations=[...])` | `QualityControl(metrics=[...], default_grouping=[...], key_experimenters?, notes?, allow_tag_failures?)` |
| n/a | `Status.PENDING` is now a valid third state alongside `PASS` / `FAIL`. |
| `allow_failed_metrics=True` on an evaluation | `allow_tag_failures=["<tag value>"]` on the top-level `QualityControl`. |

### Tag convention

Each ported behavior metric is tagged with `{"behavior": "<metric name>"}` —
the key is the group, the value is the metric's name. Contraqctor results
use a fixed `"test_suite"` key plus a dynamic per-suite key (see
[Contraqctor-based QA suites](#contraqctor-based-qa-suites-per-meeting-with-alex-2026-06-03)).

### Helper rewrites

`Bool2Status` keeps its shape but must produce timezone-aware timestamps
(schema uses `AwareDatetimeWithDefault`). The existing `datetime.now(seattle_tz)`
already satisfies this.

`create_evaluation(...)` is deleted. Replace with a small `make_metric(...)`
helper that stamps `modality`, `stage`, and `tags` onto each `QCMetric`.
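
A minimal sketch of the two helpers described above. The `Status`, `QCStatus`, and `QCMetric` classes are simplified stand-ins so the example runs on its own; they are not the real `aind_data_schema` models, and the `make_metric` signature, the string values for `modality` and `stage`, and the name `bool_to_status` are assumptions. UTC is used for the timestamp; the existing `datetime.now(seattle_tz)` is timezone-aware in the same way.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Any, Dict, List, Optional


class Status(str, Enum):
    """Stand-in for the schema's Status enum (illustration only)."""

    PASS = "Pass"
    FAIL = "Fail"
    PENDING = "Pending"


@dataclass
class QCStatus:
    """Stand-in for the schema's QCStatus model."""

    evaluator: str
    status: Status
    timestamp: datetime


@dataclass
class QCMetric:
    """Stand-in carrying only the fields discussed in this plan."""

    name: str
    modality: str
    stage: str
    value: Any
    status_history: List[QCStatus]
    description: Optional[str] = None
    reference: Optional[str] = None
    tags: Dict[str, str] = field(default_factory=dict)


def bool_to_status(passed: bool) -> QCStatus:
    """Turn a pass/fail boolean into a status with a timezone-aware timestamp."""
    return QCStatus(
        evaluator="Automated",
        status=Status.PASS if passed else Status.FAIL,
        timestamp=datetime.now(timezone.utc),
    )


def make_metric(name, value, passed, tags, modality="behavior", stage="Raw",
                description=None, reference=None) -> QCMetric:
    """Hypothetical helper: stamp modality, stage, and tags onto one metric."""
    return QCMetric(
        name=name,
        modality=modality,
        stage=stage,
        value=value,
        status_history=[bool_to_status(passed)],
        description=description,
        reference=reference,
        tags=dict(tags),
    )
```

With the real schema, the stand-in classes would be replaced by the schema's own models and enum members; only the stamping pattern is meant to carry over.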

## 2. Data inputs

The old capsule consumed a single `behavior.json` (e.g. `B_Bias`,
`B_LeftLickTime`, `B_RightLickTime`, `B_StagePositions`, `drop_frames_tag`,
`Experimenter`, `dirty_files`, ...). The new pipeline does not produce this
file.

QC functions now take primitive structures directly. The entry point is
responsible for producing those primitives — whether it pulls them from
`RawDataLoader.get_all_raw_data()`, an NWB file, or any other source is
out of scope for the QC module. This keeps the QC logic testable without
any dataset on disk.

### Primitive inputs per metric

| Primitive | Type | Old `behavior.json` analogue |
| --- | --- | --- |
| `left_lick_times` | `np.ndarray` of seconds | `B_LeftLickTime` |
| `right_lick_times` | `np.ndarray` of seconds | `B_RightLickTime` |
| `animal_response` | `np.ndarray` of `{0,1,2}` per trial | `B_AnimalResponseHistory` |
| `go_cue_times` | `np.ndarray` of seconds | `B_GoCueTimeSoundCard` |
| `rewarded_history` | `pd.DataFrame` with `left`/`right` boolean columns | `B_RewardedHistory` |
| `stage_positions` | `pd.DataFrame` with `x`/`y`/`z` columns per trial | `B_StagePositions` |
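
As a rough illustration of what "the primitives match the expected shape" could mean in practice, the sketch below validates three of the inputs from the table. The function name `check_primitives` and the specific checks are assumptions, not part of the existing capsule.

```python
import numpy as np
import pandas as pd


def check_primitives(animal_response: np.ndarray,
                     rewarded_history: pd.DataFrame,
                     stage_positions: pd.DataFrame) -> None:
    """Raise ValueError if an input does not match the table above."""
    # animal_response is coded per trial as 0, 1, or 2.
    if not np.isin(animal_response, (0, 1, 2)).all():
        raise ValueError("animal_response must contain only 0, 1, or 2")

    # rewarded_history needs boolean left/right columns.
    missing = {"left", "right"} - set(rewarded_history.columns)
    if missing:
        raise ValueError(f"rewarded_history is missing columns: {sorted(missing)}")

    # stage_positions needs x/y/z columns, one row per trial.
    missing = {"x", "y", "z"} - set(stage_positions.columns)
    if missing:
        raise ValueError(f"stage_positions is missing columns: {sorted(missing)}")
    if len(stage_positions) != len(animal_response):
        raise ValueError("stage_positions and animal_response differ in length")
```

Because the check only touches arrays and DataFrames, it can be exercised in a unit test with synthetic data and no dataset on disk.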

### Out-of-scope (no equivalent in the new data, drop the check)

- `drop_frames_tag`, `frame_num`, `trigger_length` — dropped-frames check.
- `Experimenter`, `dirty_files`, `repo_dirty_flag` — basic-configuration check.
- `B_Bias`, `B_Bias_CI` — pre-computed side bias; recompute from
`animal_response` instead (rolling fraction of right vs. left choices).

## 3. Metrics in the new capsule

Keep only what maps cleanly. All metrics get `stage=Stage.RAW` and
`modality=Modality.BEHAVIOR` unless noted.

### Side bias (`tags={"behavior": "average side bias"}`)

- Input: `animal_response: np.ndarray` (`0=left`, `1=right`, `2=ignore`).
- Average bias = `mean(is_right) - mean(is_left)` over responded trials (or
the rolling form, matching the old `B_Bias`).
- Metric: `"average side bias"`, pass when `abs(mean_bias) < 0.5`.
- `reference="side_bias.png"`.
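
The non-rolling form of the calculation above can be sketched as follows. The function names are hypothetical, and returning NaN (which then fails the check) when no trial has a response is an assumption the plan does not specify.

```python
import numpy as np


def average_side_bias(animal_response) -> float:
    """mean(is_right) - mean(is_left) over responded trials (0=left, 1=right, 2=ignore)."""
    responses = np.asarray(animal_response)
    responded = responses[responses != 2]  # drop ignored trials
    if responded.size == 0:
        return float("nan")  # assumption: no responses means no defined bias
    is_right = responded == 1
    is_left = responded == 0
    return float(is_right.mean() - is_left.mean())


def side_bias_passes(mean_bias: float) -> bool:
    """Pass rule from this plan: abs(mean_bias) < 0.5."""
    return bool(abs(mean_bias) < 0.5)
```

The result ranges from -1 (all left) to +1 (all right), so the `0.5` threshold flags sessions where one side accounts for more than three quarters of the responded trials.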

### Lick intervals

Port `calculate_lick_intervals` verbatim. Inputs are
`left_lick_times: np.ndarray` and `right_lick_times: np.ndarray`, extracted
from the `Behavior.Lickometer` stream at the entry point.

Emit the same four metrics, each tagged with its own name under the
`behavior` key:

| Metric | Tag | Pass rule |
| --- | --- | --- |
| `Left Lick Interval (%)` | `{"behavior": "Left Lick Interval (%)"}` | `< 10` |
| `Right Lick Interval (%)` | `{"behavior": "Right Lick Interval (%)"}` | `< 10` |
| `Cross Side Lick Interval (%)` | `{"behavior": "Cross Side Lick Interval (%)"}` | `< 10` |
| `Artifact Percent (%)` | `{"behavior": "Artifact Percent (%)"}` | `< 1` |

All carry `reference="lick_intervals.png"`.
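
The pass rules in the table can be applied with a small lookup, sketched below. This does not reproduce `calculate_lick_intervals`; it assumes the four percentages have already been computed and only shows the thresholding and tagging step. The names `LICK_THRESHOLDS` and `evaluate_lick_metrics` are hypothetical.

```python
# Upper bounds (exclusive) taken from the pass-rule table above.
LICK_THRESHOLDS = {
    "Left Lick Interval (%)": 10.0,
    "Right Lick Interval (%)": 10.0,
    "Cross Side Lick Interval (%)": 10.0,
    "Artifact Percent (%)": 1.0,
}


def evaluate_lick_metrics(values: dict) -> list:
    """Return one record per lick metric with its pass flag, tag, and reference."""
    records = []
    for name, threshold in LICK_THRESHOLDS.items():
        value = values[name]
        records.append({
            "name": name,
            "value": value,
            "passed": value < threshold,
            "tags": {"behavior": name},  # tag value is the metric's own name
            "reference": "lick_intervals.png",
        })
    return records
```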

### Plots to keep

- `lick_intervals.png` — five-panel histogram of inter-lick intervals
(`left licks`, `right licks`, `left to right licks`, `right to left licks`,
`all licks`); inputs are `left_lick_times` and `right_lick_times`.
- `side_bias.png` — four-panel figure:
- Side bias trace (with confidence interval band) — rolling `B_Bias` /
`B_Bias_CI` recomputed from `animal_response`.
- Lickspout position over trials — `stage_positions` (x / y1 / y2 / z,
relative to session start, in mm).
- Behavior event raster — `animal_response` (L/R choice, ignore),
`rewarded_history` (L/R earned water), manual water times, and
`auto_water` (L/R) per trial.
- Reward probabilities — `reward_probabilityL` / `reward_probabilityR`
per trial.

### Contraqctor-based QA suites (per meeting with Alex, 2026-06-03)

This follows the same approach as VR foraging QA.

The runner is provided by
[`aind_behavior_dynamic_foraging.data_qc.suite.make_qc_runner(dataset)`](https://github.com/AllenNeuralDynamics/Aind.Behavior.DynamicForaging/blob/main/src/aind_behavior_dynamic_foraging/data_qc/suite.py),
so the capsule only needs to call it on `loader.dataset` and convert
the results. `make_qc_runner` already wires up:

- `ContractTestSuite` (dataset loading errors, excluding Harp command streams)
- `HarpDeviceTestSuite` for every `HarpDevice` under `Behavior`
- `HarpHubTestSuite`
- `HarpLicketySplitTestSuite` for the left and right lickometers
- `HarpSniffDetectorTestSuite` / `HarpEnvironmentSensorTestSuite` (conditional on the rig)
- `CameraTestSuite` for every camera in `BehaviorVideos` (uses `rig.triggered_camera_controller.frame_rate`)
- `CsvTestSuite` for every CSV stream
- `DynamicForagingQcSuite` (currently `test_end_session_exists`)

#### Result → `QCMetric` conversion

Map contraqctor statuses onto schema statuses:

```python
status_converter = {
qc.Status.PASSED: Status.PASS,
qc.Status.SKIPPED: Status.PASS,
qc.Status.WARNING: Status.PENDING,
qc.Status.FAILED: Status.FAIL,
qc.Status.ERROR: Status.FAIL,
}
```

For each `qc.Result`:

- `name = f"{result.suite_name}::{result.test_name}"`
- `description = f"Test: {result.description} // Message: {result.message}"`
- `value = convert_numpy_to_python_data_type(result.result)`
- `status_history = [QCStatus(evaluator="Automated", status=..., timestamp=now_utc)]`
- `modality = Modality.BEHAVIOR`, `stage = Stage.RAW`
- `tags = {"test_suite": result.suite_name, result.suite_name: group_name}`
— one fixed `"test_suite"` key whose value is the suite name, plus a
dynamic key (the suite name) whose value is the runner group (defaulting
to `"NoGroup"`).
- `reference`: if `result.context["asset"]` is a `matplotlib.figure.Figure`,
save it under the results folder and store the relative path.
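
The field mapping above can be sketched without either library installed. `RunnerStatus`, `SchemaStatus`, and `Result` below are stand-ins for the contraqctor and schema types, and the `status` attribute on the result is an assumption; the numpy conversion of `value` and the figure-saving step for `reference` are left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class RunnerStatus(Enum):
    """Stand-in for the contraqctor result statuses."""
    PASSED = "passed"
    SKIPPED = "skipped"
    WARNING = "warning"
    FAILED = "failed"
    ERROR = "error"


class SchemaStatus(Enum):
    """Stand-in for the schema statuses."""
    PASS = "Pass"
    FAIL = "Fail"
    PENDING = "Pending"


STATUS_CONVERTER = {
    RunnerStatus.PASSED: SchemaStatus.PASS,
    RunnerStatus.SKIPPED: SchemaStatus.PASS,
    RunnerStatus.WARNING: SchemaStatus.PENDING,
    RunnerStatus.FAILED: SchemaStatus.FAIL,
    RunnerStatus.ERROR: SchemaStatus.FAIL,
}


@dataclass
class Result:
    """Stand-in with the attributes the conversion reads."""
    suite_name: str
    test_name: str
    description: str
    message: str
    result: Any
    status: RunnerStatus


def result_to_metric_fields(result: Result, group_name: str = "NoGroup") -> dict:
    """Build the QCMetric fields for one runner result."""
    return {
        "name": f"{result.suite_name}::{result.test_name}",
        "description": f"Test: {result.description} // Message: {result.message}",
        "value": result.result,
        "status": STATUS_CONVERTER[result.status],
        # Fixed "test_suite" key plus a dynamic key named after the suite.
        "tags": {"test_suite": result.suite_name, result.suite_name: group_name},
    }
```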

#### Updated tag / grouping plan

| Tag key | Values |
| --- | --- |
| `behavior` | metric name (e.g. `average side bias`, `Left Lick Interval (%)`) |
| `test_suite` | only on contraqctor metrics; suite name (e.g. `HarpEnvironmentSensorTestSuite`) |

`default_grouping` tells the QC portal which tag *keys* to use when
laying out the metrics hierarchically (see the schema field's
[description](https://github.com/AllenNeuralDynamics/aind-data-schema/blob/dev/src/aind_data_schema/core/quality_control.py)).
Each entry is a tag key (or a list of tag keys at the same level); the
portal walks them in order and groups metrics by the values it finds for
those keys.

So `behavior` and `test_suite` are siblings at the top level; a metric
ends up under whichever one its tags match. They don't overlap because
the two groups of metrics carry disjoint tag keys.

Sample portal layout:

```
behavior
Metric...
Metric...

test_suite
Metric...
Metric...
```
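
To make the sibling grouping concrete, the sketch below buckets metrics by the first-level tag keys. It illustrates the idea only and is not the QC portal's implementation; the function name and the dictionary representation of a metric are assumptions.

```python
from collections import defaultdict


def group_top_level(metrics: list, sibling_keys: list) -> dict:
    """Group metric names by tag value under each sibling tag key."""
    layout = {key: defaultdict(list) for key in sibling_keys}
    for metric in metrics:
        for key in sibling_keys:
            if key in metric["tags"]:
                layout[key][metric["tags"][key]].append(metric["name"])
    # Convert the inner defaultdicts to plain dicts for a clean result.
    return {key: dict(groups) for key, groups in layout.items()}


example_metrics = [
    {"name": "average side bias", "tags": {"behavior": "average side bias"}},
    {
        "name": "HarpHubTestSuite::test_clock",
        "tags": {"test_suite": "HarpHubTestSuite", "HarpHubTestSuite": "NoGroup"},
    },
]
layout = group_top_level(example_metrics, ["behavior", "test_suite"])
```

Each metric lands under exactly one top-level key here because behavior metrics never carry a `test_suite` tag and contraqctor metrics never carry a `behavior` tag.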

## Changelog

| Date | Section | Change | Reason |
| --- | --- | --- | --- |
| 2026-06-03 | metrics | Confirmed kept QC metrics: side bias, lick intervals, and Harp/contract QA via `make_qc_runner`. Dropped checks tied to old `behavior.json` (dropped frames, basic configuration). | Meeting with Alex. |
| 2026-06-03 | qa | Adopt contraqctor `qc.Runner` output (`make_qc_runner(dataset)`) as the source for Harp / camera / contract / DynamicForaging QA, converted into `QCMetric`s. | Meeting with Alex. |
124 changes: 124 additions & 0 deletions docs/trials_table_mapping.md
@@ -0,0 +1,124 @@
# Mapping of Raw Acquisition Streams to the NWB Trials Table

This document describes how the NWB `acquisition` container and the `trials` table
are constructed from the raw dynamic foraging acquisition streams.

Reference asset used while mapping:
[behavior_836626_2026-05-20_14-19-10_processed_2026-05-21_17-40-47](https://codeocean.allenneuraldynamics.org/data-assets/49d1b596-c1a0-4c52-a3dd-26181f4b2b55/behavior_836626_2026-05-20_14-19-10_processed_2026-05-21_17-40-47).

Trial column descriptions are derived from
[`nwb_trial_column_info.json`](https://github.com/AllenNeuralDynamics/aind-fip-nwb-base-capsule/blob/main/code/util/nwb_trial_column_info.json)
in the combined pipeline.

> **Note:** Any column related to `autoTrain` can be disregarded (per meeting with
> Alex on June 3rd, 2026).

## Acquisition Container

The NWB `acquisition` container holds four behavior-related time series:

| Acquisition series | Source stream | Notes |
| --- | --- | --- |
| `left_lick_time` | `Behavior/Lickometer` | |
| `right_lick_time` | `Behavior/Lickometer` | |
| `left_reward_delivery_time` | `Behavior/HarpBehavior` `OutputSet` (`SupplyPort0`, `WRITE` messages) | Same as left valve open. |
| `right_reward_delivery_time` | `Behavior/HarpBehavior` `OutputSet` (`SupplyPort1`, `WRITE` messages) | Same as right valve open. |

Earlier mapping used `Response.json` (`SoftwareEvents`) for lick times (where
`Item1` is the time and `Item2` is `left`/`right`) and `TrialOutcome.json`
(filtered on `is_rewarded`, then `left`/`right`) for reward delivery times.
Lick times now come from the `Behavior/Lickometer` stream, and reward delivery
times use the Harp valve open times.

## Trials Table

Columns are grouped by the raw source they map from.

### From `task_logic_input` (under `Logs`, `trial_generator` key)

| Trials column | Source field |
| --- | --- |
| `ITI_beta`, `ITI_min`, `ITI_max`, `ITI_duration` | `inter_trial_interval_duration` |
| `block_beta`, `block_duration`, `block_min`, `block_max` | `block_length` |
| `delay_beta`, `delay_duration`, `delay_min`, `delay_max` | `quiescent_duration_key` (scalar distribution, so no beta/min/max) |

### From `Response.json` (`SoftwareEvents` stream)

| Trials column | Mapping |
| --- | --- |
| `animal_response` | `0` = left choice, `1` = right choice, `2` = no response. |

### From `TrialOutcome.json` (`SoftwareEvents` stream)

> **Note:** For `is_auto_response_right`, `True` means right and `False`
> means left.

| Trials column | Mapping |
| --- | --- |
| `auto_waterL` / `auto_waterR` | From `is_auto_response_right`. `NULL` for None, `true` for right, `false` for left. Encoded `0`/`1`. |

Collaborator

In the next line, you refer to a variable called auto_response_right. here you refer to is_auto_response_right. Are these two different variables or is one of the names incorrect?

Collaborator Author

great catch! is_auto_response_right is the correct one. These are the same, it's a typo. Tried to address here: d645c2e

| `bait_left` / `bait_right` | Boolean. `bait_right` is `True` if `p_reward_right == 1` and `is_auto_response_right` is `None` or `False`. `bait_left` is `True` if `p_reward_left == 1` and `is_auto_response_right` is `None` or `True`. |
| `response_duration` | `response_deadline_duration`. |
| `reward_consumption_duration` | `Trial -> reward_consumption_duration`. |
| `reward_probabilityL` / `reward_probabilityR` | Most likely the block probability: `Trial -> Metadata -> p_reward_left` / `p_reward_right`. Confirm with Alex whether the actual lickspout probability is intended. |

Collaborator

Can you clean up the "Confirm with Alex whether the actual lickspout probability is intended." by getting that info from @alexpiet prior to merging this PR?

Collaborator Author

I added Alex as a reviewer. I think we may have talked about this already and I forgot to write it down. But from discussing with Micah, I think she said it is the probability within the block if I am understanding and remembering right

| `rewarded_historyL` / `rewarded_historyR` | Filter `is_rewarded == True`, then on `is_right_choice`. |
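
The `bait_*` and `rewarded_history*` rules in this table can be sketched per trial as below. The function names are hypothetical, the inputs are assumed to be plain Python values read from one `TrialOutcome` entry, and treating a missing `is_right_choice` as neither side is an assumption.

```python
from typing import Optional, Tuple


def bait_flags(p_reward_left: float, p_reward_right: float,
               is_auto_response_right: Optional[bool]) -> Tuple[bool, bool]:
    """Return (bait_left, bait_right) following the table's rule."""
    # bait_right: certain reward on the right, and no auto response to the right.
    bait_right = p_reward_right == 1 and is_auto_response_right is not True
    # bait_left: certain reward on the left, and no auto response to the left.
    bait_left = p_reward_left == 1 and is_auto_response_right is not False
    return bait_left, bait_right


def rewarded_flags(is_rewarded: bool,
                   is_right_choice: Optional[bool]) -> Tuple[bool, bool]:
    """Return (rewarded_historyL, rewarded_historyR) for one trial."""
    rewarded_left = bool(is_rewarded) and is_right_choice is False
    rewarded_right = bool(is_rewarded) and is_right_choice is True
    return rewarded_left, rewarded_right
```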

### From `TrialGeneratorSpec.json` (`SoftwareEvents` stream)

| Trials column | Mapping |
| --- | --- |
| `base_reward_probability_sum` | If `type == "CoupledTrialGenerator"`, look at `reward_probability_parameters`. |
| `min_reward_each_block` | Present when `type == "CoupledTrialGenerator"`; otherwise `None`. |

### From `QuiescentPeriod.json` (`SoftwareEvents` stream)

| Trials column | Mapping |
| --- | --- |
| `delay_start_time` | `timestamp`. |
| `start_time` | `timestamp` column. |

### From `ITI_period.json` (`SoftwareEvents` stream)

| Trials column | Mapping |
| --- | --- |
| `stop_time` | `timestamp` column. Possible QC check: length should match `QuiescentPeriod.json`. |
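
The length check suggested in the table could look like the sketch below, which pairs each `QuiescentPeriod.json` timestamp (`start_time`) with the `ITI_period.json` timestamp at the same index (`stop_time`). The function name and the index-based pairing are assumptions.

```python
def trial_bounds(quiescent_timestamps, iti_timestamps):
    """Pair start and stop times per trial, refusing mismatched lengths."""
    if len(quiescent_timestamps) != len(iti_timestamps):
        raise ValueError(
            f"QuiescentPeriod has {len(quiescent_timestamps)} entries but "
            f"ITI_period has {len(iti_timestamps)}"
        )
    return list(zip(quiescent_timestamps, iti_timestamps))
```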

### From `HarpBehavior` (`OutputSet`)

| Trials column | Mapping |
| --- | --- |
| `left_valve_open_time` | `SupplyPort0`. |
| `right_valve_open_time` | `SupplyPort1`. |

Cross-correlate with software-event manual-reward times from the UI against
trial `start_time`/`stop_time` to disambiguate manual valve opens. Double-check
this.

Collaborator

What does your "Double-check this" note here mean? Can you confirm before committing this file?

Collaborator Author

Added Alex as reviewer for this. I think from meeting with Micah, this is asking if manual valve openings need to be kept track of as well

### From `SoundCard` (`WRITE` messages)

| Trials column | Mapping |
| --- | --- |
| `goCue_start_time` | `PlaySoundOrFrequency` `WRITE` message. |

### From `InitialManipulatorPosition` (software event)

| Trials column | Mapping |
| --- | --- |
| `lickspout_positions` | `data` field. |

### From `trainer_state.json` and `acquisition.json` (autoTrain — can be disregarded)

These were mapped during exploration but are no longer in scope:

- `auto_train_curriculum_name` / `auto_train_curriculum_schema_version` —
`trainer_state.json` (top level).
- `auto_train_engaged` — Boolean flag in `acquisition.json` indicating whether
the curriculum is running.
- `auto_train_stage` — `stage` in `trainer_state.json` (should always exist).
- `auto_train_stage_overridden` — `True` when `on_curriculum` in
`acquisition.json` is `False`.

### Not applicable to this task

| Trials column | Mapping |
| --- | --- |
| `reward_random_L` / `reward_random_R` | None — no task component drives these. |