fix: isolate anonymous file statistics cache by kumarUjjawal · Pull Request #22950 · apache/datafusion

kumarUjjawal · 2026-06-15T05:01:40Z

Which issue does this PR close?

Closes #22935 (panic: ProjectionExprs::project_statistics index out of bounds).

Rationale for this change

Within a single session, anonymous file reads can target the same path with different explicit schemas. The shared file statistics cache was keyed by table/path metadata, but nothing validated that a cached entry had been computed for the same schema as the current read.

As a result, statistics cached for a narrower schema could be reused by a later read with a wider schema, and statistics projection would then panic with an index out of bounds.
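The failure mode above can be sketched with the standard library alone. This is a simplified model, not DataFusion's code: `FileStats`, `PathKeyedCache`, and `project` are hypothetical names invented for illustration, and `project` returns `None` at the point where direct indexing would panic.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for per-file statistics: one entry per schema column.
#[derive(Clone, Debug, PartialEq)]
pub struct FileStats {
    pub column_null_counts: Vec<usize>,
}

/// Hypothetical cache keyed by path only. It does not record which schema
/// (here: how many columns) the statistics were computed for.
#[derive(Default)]
pub struct PathKeyedCache {
    entries: HashMap<String, FileStats>,
}

impl PathKeyedCache {
    /// Return the cached statistics for `path`, computing them for
    /// `num_columns` columns only on a cache miss.
    pub fn get_or_compute(&mut self, path: &str, num_columns: usize) -> FileStats {
        self.entries
            .entry(path.to_string())
            .or_insert_with(|| FileStats {
                column_null_counts: vec![0; num_columns],
            })
            .clone()
    }
}

/// Project statistics onto the requested column indices.
/// Yields `None` where unchecked indexing would go out of bounds.
pub fn project(stats: &FileStats, indices: &[usize]) -> Option<Vec<usize>> {
    indices
        .iter()
        .map(|&i| stats.column_null_counts.get(i).copied())
        .collect()
}
```

In this model, warming the cache with a two-column read and then requesting three columns for the same path returns the two-column entry, so projecting index 2 has nothing to read.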

What changes are included in this PR?

This PR routes anonymous listing table statistics through a per-table cache instead of the shared session cache.

Named tables still use the shared session cache, since their table reference gives the cache a stable identity.
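A minimal sketch of that routing decision, again using only the standard library. The `Table` and `StatsCache` types are hypothetical and do not mirror DataFusion's actual structs; the cached value is reduced to the number of columns the statistics cover.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical cache: (table name, path) -> number of columns covered
/// by the cached statistics.
pub type StatsCache = Arc<Mutex<HashMap<(String, String), usize>>>;

/// Hypothetical stand-in for a listing table.
pub struct Table {
    name: String,
    num_columns: usize,
    cache: StatsCache,
}

impl Table {
    /// Named tables share the session cache, since the name gives entries a
    /// stable identity. Anonymous tables (`name == None`) get a private cache.
    pub fn new(name: Option<&str>, num_columns: usize, session_cache: &StatsCache) -> Self {
        let cache = match name {
            Some(_) => Arc::clone(session_cache),
            None => Arc::new(Mutex::new(HashMap::new())),
        };
        Table {
            name: name.unwrap_or("").to_string(),
            num_columns,
            cache,
        }
    }

    /// Return the column count covered by the statistics used for `path`,
    /// computing and caching it on a miss.
    pub fn stats_width(&self, path: &str) -> usize {
        let key = (self.name.clone(), path.to_string());
        let mut cache = self.cache.lock().unwrap();
        *cache.entry(key).or_insert(self.num_columns)
    }
}
```

With this split, two anonymous tables over the same path but with different widths each see statistics of their own width, and neither writes into the session cache.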

It also adds a regression test that first warms statistics with the physical schema, then reads the same Parquet file with a wider explicit schema.

Are these changes tested?

Yes, by the regression test described above.

Are there any user-facing changes?

No API changes.

fix: isolate anonymous file statistics cache

185eafa

github-actions bot added the core (Core DataFusion crate) and catalog (Related to the catalog crate) labels on Jun 15, 2026

kumarUjjawal wants to merge 1 commit into apache:main from kumarUjjawal:fix/project_statistics_bounds
