docs: add markdown documentation #13
Changes from all commits
b6d5504
3acea0c
f9f3eb1
c4baf3f
d645c2e
be5c327
b581220
fc6d892
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,55 @@ | ||
| # Final NWB Contents | ||
|
|
||
| This document describes the contents of the NWB file produced by this | ||
| repository. It is a companion to issue | ||
| [#12](https://github.com/AllenNeuralDynamics/dynamic-foraging-processing/issues/12), | ||
| which serves as the authoritative changelog for content decisions. | ||
|
|
||
| When the NWB contents change, update both the [Changelog](#changelog) below | ||
| and the relevant section in this document. Each entry should record at minimum | ||
| the date, what changed, and why. | ||
|
|
||
| ## Acquisition | ||
|
|
||
| The `acquisition` container holds the HARP streams from the rig (e.g. | ||
| VR Foraging) along with four behavior-derived series carried over from the | ||
| NWB produced by the combined dynamic foraging + FIP pipeline: | ||
|
|
||
| - `left_lick_time` | ||
| - `right_lick_time` | ||
| - `left_reward_delivery_time` | ||
| - `right_reward_delivery_time` | ||
|
|
||
| Each series stores both timestamps and a parallel `data` array. For the | ||
| reward delivery series, `data` annotates each reward as `earned`, `manual`, | ||
| or `automatic`. | ||
|
|
||
| See [`trials_table_mapping.md`](trials_table_mapping.md#acquisition-container) | ||
| for the raw sources backing each of these four series. | ||
|
|
||
| ## Events | ||
|
|
||
| The `events` container follows the conventions in | ||
| [aind-physio-arch#1072](https://github.com/AllenNeuralDynamics/aind-physio-arch/issues/1072). | ||
|
|
||
| The events sidecar will be version-controlled in this repository for now so that | ||
| changes can be tracked alongside the code. | ||
|
|
||
| Events are on pause pending validation by the HED team. See the | ||
| [Changelog](#changelog) for details. | ||
|
|
||
| ## Trials | ||
|
|
||
| The `trials` table is built from the raw acquisition streams. The full | ||
| column-by-column mapping is documented in | ||
| [`trials_table_mapping.md`](trials_table_mapping.md), and the source-of-truth | ||
| discussion lives in issue | ||
| [#5](https://github.com/AllenNeuralDynamics/dynamic-foraging-processing/issues/5). | ||
|
|
||
| ## Changelog | ||
|
|
||
| | Date | Section | Change | Reason | | ||
| | --- | --- | --- | --- | | ||
| | 2026-06-03 | acquisition / trials | Initial scope confirmed: HARP streams + `{left,right}_lick_time` and `{left,right}_reward_delivery_time` in `acquisition`; trials mapping per issue #5. | Meeting with Alex. | | ||
| | 2026-06-05 | events | Events on pause. | Pending validation by the HED team. | | ||
| | 2026-06-08 | acquisition | Documented `data` arrays alongside timestamps; reward delivery series annotate each reward as `earned`, `manual`, or `automatic`. | Clarify acquisition contents. | |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,212 @@ | ||
| # QC Upgrade Plan | ||
|
|
||
| Plan for upgrading | ||
| [`aind-dynamic-foraging-qc/code/run_capsule.py`](https://github.com/AllenNeuralDynamics/aind-dynamic-foraging-qc/blob/main/code/run_capsule.py) | ||
| to: | ||
|
|
||
| 1. Conform to the current | ||
| [`aind_data_schema.core.quality_control`](https://github.com/AllenNeuralDynamics/aind-data-schema/blob/dev/src/aind_data_schema/core/quality_control.py) | ||
| schema (v2.4.1). | ||
| 2. Operate on primitive structures (numpy arrays, pandas DataFrames). QC | ||
| functions are agnostic to where the data came from — the caller is free | ||
| to load from | ||
| [`RawDataLoader`](../src/dynamic_foraging_processing/raw_data_loader/loader.py), | ||
| an NWB file, or anything else, as long as the primitives match the | ||
| expected shape. | ||
|
|
||
| This document is a design reference only. Implementation will happen on a | ||
| separate branch. | ||
|
|
||
| ## 1. Schema changes | ||
|
|
||
| The schema removed `QCEvaluation`. The new `QualityControl` object holds a flat | ||
| `metrics: List[QCMetric | CurationMetric]` and groups metrics via per-metric | ||
| `tags`. Each `QCMetric` now requires `modality` and `stage` directly (these | ||
| moved off `QCEvaluation`), and `QualityControl` requires `default_grouping`. | ||
|
|
||
| ### Field-by-field migration | ||
|
|
||
| | Old (capsule) | New (schema v2.4.1) | | ||
| | --- | --- | | ||
| | `QCEvaluation(name, modality, stage, metrics, description, allow_failed_metrics)` | Removed. Replace each evaluation with one or more `QCMetric`s sharing a tag. | | ||
| | `QCMetric(name, value, status_history, description?, reference?)` | `QCMetric(name, modality, stage, value, status_history, description?, reference?, tags={}, evaluated_assets?)` | | ||
| | `QualityControl(evaluations=[...])` | `QualityControl(metrics=[...], default_grouping=[...], key_experimenters?, notes?, allow_tag_failures?)` | | ||
| | n/a | `Status.PENDING` is now a valid third state alongside `PASS` / `FAIL`. | | ||
| | `allow_failed_metrics=True` on an evaluation | `allow_tag_failures=["<tag value>"]` on the top-level `QualityControl`. | ||
|
|
||
| ### Tag convention | ||
|
|
||
| Each ported behavior metric is tagged with `{"behavior": "<metric name>"}` — | ||
| the key is the group, the value is the metric's name. Contraqctor results | ||
| use a fixed `"test_suite"` key plus a dynamic per-suite key (see | ||
| [Contraqctor-based QA suites](#contraqctor-based-qa-suites-per-meeting-with-alex-2026-06-03)). | ||
|
|
||
| ### Helper rewrites | ||
|
|
||
| `Bool2Status` keeps its shape but must produce timezone-aware timestamps | ||
| (schema uses `AwareDatetimeWithDefault`). The existing `datetime.now(seattle_tz)` | ||
| already satisfies this. | ||
|
|
||
| `create_evaluation(...)` is deleted. Replace with a small `make_metric(...)` | ||
| helper that stamps `modality`, `stage`, and `tags` onto each `QCMetric`. | ||
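A minimal sketch of what such a helper might look like. It uses plain dataclasses as stand-ins for the schema's `QCMetric` and `QCStatus` so that it runs without `aind-data-schema` installed. The field names follow the migration table above; the `make_metric` signature, the `bool_to_status` name, and the string placeholders for modality and stage are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


@dataclass
class QCStatusStub:
    """Stand-in for the schema's QCStatus (not the real class)."""
    evaluator: str
    status: str
    timestamp: datetime


@dataclass
class QCMetricStub:
    """Stand-in for the schema's QCMetric (not the real class)."""
    name: str
    modality: str
    stage: str
    value: Any
    status_history: List[QCStatusStub]
    description: Optional[str] = None
    reference: Optional[str] = None
    tags: Dict[str, str] = field(default_factory=dict)


def bool_to_status(passed: bool) -> QCStatusStub:
    """Turn a boolean into a status entry with a timezone-aware timestamp."""
    return QCStatusStub(
        evaluator="Automated",
        status="Pass" if passed else "Fail",
        timestamp=datetime.now(timezone.utc),  # aware, never naive
    )


def make_metric(name, value, passed, *, group="behavior", modality="behavior",
                stage="Raw data", description=None, reference=None) -> QCMetricStub:
    """Stamp modality, stage, and a {group: name} tag onto one metric."""
    return QCMetricStub(
        name=name, modality=modality, stage=stage, value=value,
        status_history=[bool_to_status(passed)],
        description=description, reference=reference,
        tags={group: name},
    )


metric = make_metric("average side bias", 0.12, passed=True, reference="side_bias.png")
```

In the real capsule the stubs would be replaced by the schema classes, and `modality` / `stage` by the schema's own enum members.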
|
|
||
| ## 2. Data inputs | ||
|
|
||
| The old capsule consumed a single `behavior.json` (e.g. `B_Bias`, | ||
| `B_LeftLickTime`, `B_RightLickTime`, `B_StagePositions`, `drop_frames_tag`, | ||
| `Experimenter`, `dirty_files`, ...). The new pipeline does not produce this | ||
| file. | ||
|
|
||
| QC functions now take primitive structures directly. The entry point is | ||
| responsible for producing those primitives — whether it pulls them from | ||
| `RawDataLoader.get_all_raw_data()`, an NWB file, or any other source is | ||
| out of scope for the QC module. This keeps the QC logic testable without | ||
| any dataset on disk. | ||
|
|
||
| ### Primitive inputs per metric | ||
|
|
||
| | Primitive | Type | Old `behavior.json` analogue | | ||
| | --- | --- | --- | | ||
| | `left_lick_times` | `np.ndarray` of seconds | `B_LeftLickTime` | | ||
| | `right_lick_times` | `np.ndarray` of seconds | `B_RightLickTime` | | ||
| | `animal_response` | `np.ndarray` of `{0,1,2}` per trial | `B_AnimalResponseHistory` | | ||
| | `go_cue_times` | `np.ndarray` of seconds | `B_GoCueTimeSoundCard` | | ||
| | `rewarded_history` | `pd.DataFrame` with `left`/`right` boolean columns | `B_RewardedHistory` | | ||
| | `stage_positions` | `pd.DataFrame` with `x`/`y`/`z` columns per trial | `B_StagePositions` | | ||
|
|
||
| ### Out-of-scope (no equivalent in the new data; drop the check or recompute) | ||
|
|
||
| - `drop_frames_tag`, `frame_num`, `trigger_length` — dropped-frames check. | ||
| - `Experimenter`, `dirty_files`, `repo_dirty_flag` — basic-configuration check. | ||
| - `B_Bias`, `B_Bias_CI` — pre-computed side bias; recompute from | ||
| `animal_response` instead (rolling fraction of right vs. left choices). | ||
|
|
||
| ## 3. Metrics in the new capsule | ||
|
|
||
| Keep only what maps cleanly. All metrics get `stage=Stage.RAW` and | ||
| `modality=Modality.BEHAVIOR` unless noted. | ||
|
|
||
| ### Side bias (`tags={"behavior": "average side bias"}`) | ||
|
|
||
| - Input: `animal_response: np.ndarray` (`0=left`, `1=right`, `2=ignore`). | ||
| - Average bias = `mean(is_right) - mean(is_left)` over responded trials (or | ||
| the rolling form, matching the old `B_Bias`). | ||
| - Metric: `"average side bias"`, pass when `abs(mean_bias) < 0.5`. | ||
| - `reference="side_bias.png"`. | ||
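The non-rolling form of the rule above can be sketched as follows. This is only an illustration: the function names are hypothetical, the input may be any sequence of integers (a NumPy array works too), and treating a session with no responded trials as a failure is an assumption, not something the plan specifies.

```python
from typing import Iterable, Optional

LEFT, RIGHT, IGNORE = 0, 1, 2


def average_side_bias(animal_response: Iterable[int]) -> Optional[float]:
    """Return mean(is_right) - mean(is_left) over responded trials.

    Returns None when the animal never responded.
    """
    responded = [int(r) for r in animal_response if int(r) != IGNORE]
    if not responded:
        return None
    n = len(responded)
    frac_right = sum(r == RIGHT for r in responded) / n
    frac_left = sum(r == LEFT for r in responded) / n
    return frac_right - frac_left


def side_bias_passes(mean_bias: Optional[float], threshold: float = 0.5) -> bool:
    """Pass when abs(mean_bias) < threshold; no responses counts as a fail (assumption)."""
    return mean_bias is not None and abs(mean_bias) < threshold


# Three right choices, two left choices, one ignored trial.
bias = average_side_bias([1, 1, 0, 2, 1, 0])
passed = side_bias_passes(bias)
```

A positive value means a bias toward the right lickspout, a negative value toward the left.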
|
|
||
| ### Lick intervals | ||
|
|
||
| Port `calculate_lick_intervals` verbatim. Inputs are | ||
| `left_lick_times: np.ndarray` and `right_lick_times: np.ndarray`, extracted | ||
| from the `Behavior.Lickometer` stream at the entry point. | ||
|
|
||
| Emit the same four metrics, each tagged with its own name under the | ||
| `behavior` key: | ||
|
|
||
| | Metric | Tag | Pass rule | | ||
| | --- | --- | --- | | ||
| | `Left Lick Interval (%)` | `{"behavior": "Left Lick Interval (%)"}` | `< 10` | | ||
| | `Right Lick Interval (%)` | `{"behavior": "Right Lick Interval (%)"}` | `< 10` | | ||
| | `Cross Side Lick Interval (%)` | `{"behavior": "Cross Side Lick Interval (%)"}` | `< 10` | | ||
| | `Artifact Percent (%)` | `{"behavior": "Artifact Percent (%)"}` | `< 1` | | ||
|
|
||
| All carry `reference="lick_intervals.png"`. | ||
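The pass rules in the table can be applied with a small lookup, sketched below. It does not reproduce `calculate_lick_intervals`; it assumes that function has already produced one percentage per metric. The function name, the record layout, and the sample values are hypothetical.

```python
from typing import Dict, List

# Upper bounds (in percent) from the pass-rule table; a metric passes when value < bound.
PASS_BOUNDS: Dict[str, float] = {
    "Left Lick Interval (%)": 10.0,
    "Right Lick Interval (%)": 10.0,
    "Cross Side Lick Interval (%)": 10.0,
    "Artifact Percent (%)": 1.0,
}
REFERENCE = "lick_intervals.png"


def evaluate_lick_metrics(values: Dict[str, float]) -> List[dict]:
    """Build one record per metric with its pass flag, tag, and reference."""
    records = []
    for name, bound in PASS_BOUNDS.items():
        value = values[name]
        records.append({
            "name": name,
            "value": value,
            "passed": value < bound,
            "tags": {"behavior": name},  # each metric is tagged with its own name
            "reference": REFERENCE,
        })
    return records


# Made-up percentages, for illustration only.
records = evaluate_lick_metrics({
    "Left Lick Interval (%)": 2.5,
    "Right Lick Interval (%)": 12.0,
    "Cross Side Lick Interval (%)": 0.0,
    "Artifact Percent (%)": 1.0,
})
```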
|
|
||
| ### Plots to keep | ||
|
|
||
| - `lick_intervals.png` — five-panel histogram of inter-lick intervals | ||
| (`left licks`, `right licks`, `left to right licks`, `right to left licks`, | ||
| `all licks`); inputs are `left_lick_times` and `right_lick_times`. | ||
| - `side_bias.png` — four-panel figure: | ||
| - Side bias trace (with confidence interval band) — rolling `B_Bias` / | ||
| `B_Bias_CI` recomputed from `animal_response`. | ||
| - Lickspout position over trials — `stage_positions` (x / y1 / y2 / z, | ||
| relative to session start, in mm). | ||
| - Behavior event raster — `animal_response` (L/R choice, ignore), | ||
| `rewarded_history` (L/R earned water), manual water times, and | ||
| `auto_water` (L/R) per trial. | ||
| - Reward probabilities — `reward_probabilityL` / `reward_probabilityR` | ||
| per trial. | ||
|
|
||
| ### Contraqctor-based QA suites (per meeting with Alex, 2026-06-03) | ||
|
|
||
| Same approach as VR foraging QA. | ||
|
|
||
| The runner is provided by | ||
| [`aind_behavior_dynamic_foraging.data_qc.suite.make_qc_runner(dataset)`](https://github.com/AllenNeuralDynamics/Aind.Behavior.DynamicForaging/blob/main/src/aind_behavior_dynamic_foraging/data_qc/suite.py), | ||
| so the entry point just needs to call it on `loader.dataset` and convert | ||
| the results. `make_qc_runner` already wires up: | ||
|
|
||
| - `ContractTestSuite` (dataset loading errors, excluding Harp command streams) | ||
| - `HarpDeviceTestSuite` for every `HarpDevice` under `Behavior` | ||
| - `HarpHubTestSuite` | ||
| - `HarpLicketySplitTestSuite` for the left and right lickometers | ||
| - `HarpSniffDetectorTestSuite` / `HarpEnvironmentSensorTestSuite` (conditional on the rig) | ||
| - `CameraTestSuite` for every camera in `BehaviorVideos` (uses `rig.triggered_camera_controller.frame_rate`) | ||
| - `CsvTestSuite` for every CSV stream | ||
| - `DynamicForagingQcSuite` (currently `test_end_session_exists`) | ||
|
|
||
| #### Result → `QCMetric` conversion | ||
|
|
||
| Map contraqctor statuses onto schema statuses: | ||
|
|
||
| ```python | ||
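| # Assumed imports: contraqctor's qc module and the schema's Status enum. | ||
| from aind_data_schema.core.quality_control import Status | ||
| from contraqctor import qc | ||
| # Skipped tests map to PASS and warnings to PENDING; failures and errors map to FAIL. | ||
| status_converter = { | ||
|     qc.Status.PASSED: Status.PASS, | ||
|     qc.Status.SKIPPED: Status.PASS, | ||
|     qc.Status.WARNING: Status.PENDING, | ||
|     qc.Status.FAILED: Status.FAIL, | ||
|     qc.Status.ERROR: Status.FAIL, | ||
| } | ||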
| ``` | ||
|
|
||
| For each `qc.Result`: | ||
|
|
||
| - `name = f"{result.suite_name}::{result.test_name}"` | ||
| - `description = f"Test: {result.description} // Message: {result.message}"` | ||
| - `value = convert_numpy_to_python_data_type(result.result)` | ||
| - `status_history = [QCStatus(evaluator="Automated", status=..., timestamp=now_utc)]` | ||
| - `modality = Modality.BEHAVIOR`, `stage = Stage.RAW` | ||
| - `tags = {"test_suite": result.suite_name, result.suite_name: group_name}` | ||
| — one fixed `"test_suite"` key whose value is the suite name, plus a | ||
| dynamic key (the suite name) whose value is the runner group (defaulting | ||
| to `"NoGroup"`). | ||
| - `reference`: if `result.context["asset"]` is a `matplotlib.figure.Figure`, | ||
| save it under the results folder and store the relative path. | ||
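The field mapping above can be sketched without either library installed. In the sketch below, `SuiteStatus`, `SchemaStatus`, and `ResultStub` are stand-ins for contraqctor's `qc.Status`, the schema's `Status`, and `qc.Result`; the function name and the string placeholders for modality and stage are hypothetical. NumPy conversion of `value` and figure saving for `reference` are left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Any, Dict, Optional


class SuiteStatus(Enum):
    """Stand-in for contraqctor's qc.Status."""
    PASSED = "passed"
    SKIPPED = "skipped"
    WARNING = "warning"
    FAILED = "failed"
    ERROR = "error"


class SchemaStatus(Enum):
    """Stand-in for the schema's Status."""
    PASS = "Pass"
    FAIL = "Fail"
    PENDING = "Pending"


STATUS_CONVERTER = {
    SuiteStatus.PASSED: SchemaStatus.PASS,
    SuiteStatus.SKIPPED: SchemaStatus.PASS,
    SuiteStatus.WARNING: SchemaStatus.PENDING,
    SuiteStatus.FAILED: SchemaStatus.FAIL,
    SuiteStatus.ERROR: SchemaStatus.FAIL,
}


@dataclass
class ResultStub:
    """Stand-in for qc.Result, carrying only the fields used in the mapping."""
    suite_name: str
    test_name: str
    status: SuiteStatus
    result: Any = None
    description: Optional[str] = None
    message: Optional[str] = None
    context: Dict[str, Any] = field(default_factory=dict)


def result_to_metric_fields(result: ResultStub, group_name: str = "NoGroup") -> dict:
    """Return the keyword arguments that would be passed to QCMetric."""
    now_utc = datetime.now(timezone.utc)
    return {
        "name": f"{result.suite_name}::{result.test_name}",
        "description": f"Test: {result.description} // Message: {result.message}",
        "value": result.result,
        "status_history": [{
            "evaluator": "Automated",
            "status": STATUS_CONVERTER[result.status],
            "timestamp": now_utc,
        }],
        "modality": "behavior",   # placeholder for Modality.BEHAVIOR
        "stage": "Raw data",      # placeholder for Stage.RAW
        "tags": {"test_suite": result.suite_name, result.suite_name: group_name},
    }


# "test_example" is a made-up test name.
example = ResultStub("HarpHubTestSuite", "test_example", SuiteStatus.WARNING, result=3)
fields = result_to_metric_fields(example)
```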
|
|
||
| #### Updated tag / grouping plan | ||
|
|
||
| | Tag key | Values | | ||
| | --- | --- | | ||
| | `behavior` | metric name (e.g. `average side bias`, `Left Lick Interval (%)`) | | ||
| | `test_suite` | only on contraqctor metrics; suite name (e.g. `HarpEnvironmentSensorTestSuite`) | | ||
|
|
||
| `default_grouping` tells the QC portal which tag *keys* to use when | ||
| laying out the metrics hierarchically (see the schema field's | ||
| [description](https://github.com/AllenNeuralDynamics/aind-data-schema/blob/dev/src/aind_data_schema/core/quality_control.py)). | ||
| Each entry is a tag key (or a list of tag keys at the same level); the | ||
| portal walks them in order and groups metrics by the values it finds for | ||
| those keys. | ||
|
|
||
| So `behavior` and `test_suite` are siblings at the top level; a metric | ||
| ends up under whichever one its tags match. They don't overlap because | ||
| the two groups of metrics carry disjoint tag keys. | ||
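A toy model of this grouping, to make the disjoint-keys argument concrete. It is not the portal's actual algorithm: the function name is hypothetical, metrics are reduced to plain dicts, and only a single level of `default_grouping` is handled.

```python
from collections import defaultdict
from typing import Dict, List


def layout_by_tag_keys(metrics: List[dict],
                       default_grouping: List[str]) -> Dict[str, Dict[str, List[str]]]:
    """Group metric names by the first grouping key found in their tags, then by tag value."""
    layout = {key: defaultdict(list) for key in default_grouping}
    for metric in metrics:
        for key in default_grouping:
            value = metric["tags"].get(key)
            if value is not None:
                layout[key][value].append(metric["name"])
                break  # keys are disjoint, so one match is enough
    return {key: dict(groups) for key, groups in layout.items()}


# "test_example" is a made-up test name.
metrics = [
    {"name": "average side bias", "tags": {"behavior": "average side bias"}},
    {"name": "HarpHubTestSuite::test_example",
     "tags": {"test_suite": "HarpHubTestSuite", "HarpHubTestSuite": "NoGroup"}},
]
layout = layout_by_tag_keys(metrics, ["behavior", "test_suite"])
```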
|
|
||
| Sample portal layout: | ||
|
|
||
| ``` | ||
| behavior | ||
| Metric... | ||
| Metric... | ||
|
|
||
| test_suite | ||
| Metric... | ||
| Metric... | ||
| ``` | ||
|
|
||
| ## Changelog | ||
|
|
||
| | Date | Section | Change | Reason | | ||
| | --- | --- | --- | --- | | ||
| | 2026-06-03 | metrics | Confirmed kept QC metrics: side bias, lick intervals, and Harp/contract QA via `make_qc_runner`. Dropped checks tied to old `behavior.json` (dropped frames, basic configuration). | Meeting with Alex. | | ||
| | 2026-06-03 | qa | Adopt contraqctor `qc.Runner` output (`make_qc_runner(dataset)`) as the source for Harp / camera / contract / DynamicForaging QA, converted into `QCMetric`s. | Meeting with Alex. | |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,124 @@ | ||
| # Mapping of Raw Acquisition Streams to the NWB Trials Table | ||
|
|
||
| This document describes how the NWB `acquisition` container and the `trials` table | ||
| are constructed from the raw dynamic foraging acquisition streams. | ||
|
|
||
| Reference asset used while mapping: | ||
| [behavior_836626_2026-05-20_14-19-10_processed_2026-05-21_17-40-47](https://codeocean.allenneuraldynamics.org/data-assets/49d1b596-c1a0-4c52-a3dd-26181f4b2b55/behavior_836626_2026-05-20_14-19-10_processed_2026-05-21_17-40-47). | ||
|
|
||
| Trial column descriptions are derived from | ||
| [`nwb_trial_column_info.json`](https://github.com/AllenNeuralDynamics/aind-fip-nwb-base-capsule/blob/main/code/util/nwb_trial_column_info.json) | ||
| in the combined pipeline. | ||
|
|
||
| > **Note:** Any column related to `autoTrain` can be disregarded (per meeting with | ||
| > Alex on June 3rd, 2026). | ||
|
|
||
| ## Acquisition Container | ||
|
|
||
| The NWB `acquisition` container holds four behavior-related time series: | ||
|
|
||
| | Acquisition series | Source stream | Notes | | ||
| | --- | --- | --- | | ||
| | `left_lick_time` | `Behavior/Lickometer` | | | ||
| | `right_lick_time` | `Behavior/Lickometer` | | | ||
| | `left_reward_delivery_time` | `Behavior/HarpBehavior` `OutputSet` (`SupplyPort0`, `WRITE` messages) | Same as left valve open. | | ||
| | `right_reward_delivery_time` | `Behavior/HarpBehavior` `OutputSet` (`SupplyPort1`, `WRITE` messages) | Same as right valve open. | | ||
|
|
||
| Earlier mapping used `Response.json` (`SoftwareEvents`) for lick times (where | ||
| `Item1` is the time and `Item2` is `left`/`right`) and `TrialOutcome.json` | ||
| (filtered on `is_rewarded`, then `left`/`right`) for reward delivery times. | ||
| Lick times now come from the `Behavior/Lickometer` stream, and reward delivery | ||
| times use the Harp valve open times. | ||
|
|
||
| ## Trials Table | ||
|
|
||
| Columns are grouped by the raw source they map from. | ||
|
|
||
| ### From `task_logic_input` (under `Logs`, `trial_generator` key) | ||
|
|
||
| | Trials column | Source field | | ||
| | --- | --- | | ||
| | `ITI_beta`, `ITI_min`, `ITI_max`, `ITI_duration` | `inter_trial_interval_duration` | | ||
| | `block_beta`, `block_duration`, `block_min`, `block_max` | `block_length` | | ||
| | `delay_beta`, `delay_duration`, `delay_min`, `delay_max` | `quiescent_duration_key` (scalar distribution, so no beta/min/max) | | ||
|
|
||
| ### From `Response.json` (`SoftwareEvents` stream) | ||
|
|
||
| | Trials column | Mapping | | ||
| | --- | --- | | ||
| | `animal_response` | `0` = left choice, `1` = right choice, `2` = no response. | | ||
|
|
||
| ### From `TrialOutcome.json` (`SoftwareEvents` stream) | ||
|
|
||
| > **Note:** For `is_auto_response_right`, `True` means right and `False` | ||
| > means left. | ||
|
|
||
| | Trials column | Mapping | | ||
| | --- | --- | | ||
| | `auto_waterL` / `auto_waterR` | From `is_auto_response_right`. `NULL` for None, `true` for right, `false` for left. Encoded `0`/`1`. | | ||
| | `bait_left` / `bait_right` | Boolean. `bait_right` is `True` if `p_reward_right == 1` and `is_auto_response_right` is `None` or `False`. `bait_left` is `True` if `p_reward_left == 1` and `is_auto_response_right` is `None` or `True`. | | ||
| | `response_duration` | `response_deadline_duration`. | | ||
| | `reward_consumption_duration` | `Trial -> reward_consumption_duration`. | | ||
| | `reward_probabilityL` / `reward_probabilityR` | Most likely the block probability: `Trial -> Metadata -> p_reward_left` / `p_reward_right`. Confirm with Alex whether the actual lickspout probability is intended. | | ||
|
Collaborator
Can you clean up the "Confirm with Alex whether the actual lickspout probability is intended." by getting that info from @alexpiet prior to merging this PR?
Collaborator
Author
I added Alex as a reviewer. I think we may have talked about this already and I forgot to write it down. But from discussing with Micah, I think she said it is the probability within the block, if I am understanding and remembering right.
||
| | `rewarded_historyL` / `rewarded_historyR` | Filter `is_rewarded == True`, then on `is_right_choice`. | | ||
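One possible reading of the `TrialOutcome.json` rules in this table, sketched per trial. The function names are hypothetical. Two assumptions are made: "`is_auto_response_right` is `None` or `False`" is read as a parenthesised condition combined with the probability check, and a `None` auto response is encoded as `0` on both `auto_water` columns.

```python
from typing import Optional, Tuple


def auto_water_flags(is_auto_response_right: Optional[bool]) -> Tuple[int, int]:
    """Return (auto_waterL, auto_waterR) encoded as 0/1."""
    if is_auto_response_right is None:
        return 0, 0  # assumption: no auto water on either side
    return (0, 1) if is_auto_response_right else (1, 0)


def bait_flags(p_reward_left: float, p_reward_right: float,
               is_auto_response_right: Optional[bool]) -> Tuple[bool, bool]:
    """Return (bait_left, bait_right) following the table's conditions."""
    auto = is_auto_response_right
    bait_left = p_reward_left == 1 and (auto is None or auto is True)
    bait_right = p_reward_right == 1 and (auto is None or auto is False)
    return bait_left, bait_right


def rewarded_flags(is_rewarded: bool, is_right_choice: Optional[bool]) -> Tuple[bool, bool]:
    """Return (rewarded_historyL, rewarded_historyR)."""
    if not is_rewarded:
        return False, False
    return (not is_right_choice, bool(is_right_choice))


example_bait = bait_flags(1.0, 0.1, None)
example_auto = auto_water_flags(True)
example_rewarded = rewarded_flags(True, False)
```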
|
|
||
| ### From `TrialGeneratorSpec.json` (`SoftwareEvents` stream) | ||
|
|
||
| | Trials column | Mapping | | ||
| | --- | --- | | ||
| | `base_reward_probability_sum` | If `type == "CoupledTrialGenerator"`, look at `reward_probability_parameters`. | | ||
| | `min_reward_each_block` | Present when `type == "CoupledTrialGenerator"`; otherwise `None`. | | ||
|
|
||
| ### From `QuiescentPeriod.json` (`SoftwareEvents` stream) | ||
|
|
||
| | Trials column | Mapping | | ||
| | --- | --- | | ||
| | `delay_start_time` | `timestamp`. | | ||
| | `start_time` | `timestamp` column. | | ||
|
|
||
| ### From `ITI_period.json` (`SoftwareEvents` stream) | ||
|
|
||
| | Trials column | Mapping | | ||
| | --- | --- | | ||
| | `stop_time` | `timestamp` column. Possible QC check: length should match `QuiescentPeriod.json`. | | ||
|
|
||
| ### From `HarpBehavior` (`OutputSet`) | ||
|
|
||
| | Trials column | Mapping | | ||
| | --- | --- | | ||
| | `left_valve_open_time` | `SupplyPort0`. | | ||
| | `right_valve_open_time` | `SupplyPort1`. | | ||
|
|
||
| Cross-correlate with software-event manual-reward times from the UI against | ||
| trial `start_time`/`stop_time` to disambiguate manual valve opens. Double-check | ||
|
Collaborator
What does your "Double-check this" note here mean? Can you confirm before committing this file?
Collaborator
Author
Added Alex as reviewer for this. I think from meeting with Micah, this is asking if manual valve openings need to be kept track of as well.
||
| this. | ||
|
|
||
| ### From `SoundCard` (`WRITE` messages) | ||
|
|
||
| | Trials column | Mapping | | ||
| | --- | --- | | ||
| | `goCue_start_time` | `PlaySoundOrFrequency` `WRITE` message. | | ||
|
|
||
| ### From `InitialManipulatorPosition` (software event) | ||
|
|
||
| | Trials column | Mapping | | ||
| | --- | --- | | ||
| | `lickspout_positions` | `data` field. | | ||
|
|
||
| ### From `trainer_state.json` and `acquisition.json` (autoTrain — can be disregarded) | ||
|
|
||
| These were mapped during exploration but are no longer in scope: | ||
|
|
||
| - `auto_train_curriculum_name` / `auto_train_curriculum_schema_version` — | ||
| `trainer_state.json` (top level). | ||
| - `auto_train_engaged` — Boolean flag in `acquisition.json` indicating whether | ||
| the curriculum is running. | ||
| - `auto_train_stage` — `stage` in `trainer_state.json` (should always exist). | ||
| - `auto_train_stage_overridden` — `True` when `on_curriculum` in | ||
| `acquisition.json` is `False`. | ||
|
|
||
| ### Not applicable to this task | ||
|
|
||
| | Trials column | Mapping | | ||
| | --- | --- | | ||
| | `reward_random_L` / `reward_random_R` | None — no task component drives these. | | ||
In the next line, you refer to a variable called `auto_response_right`. Here you refer to `is_auto_response_right`. Are these two different variables or is one of the names incorrect?

Great catch! `is_auto_response_right` is the correct one. These are the same; it's a typo. Tried to address here: d645c2e