research(nightly): semantic-cache — HNSW-backed query result cache for RAG and agent memory by ruvnet · Pull Request #601 · ruvnet/RuVector

ruvnet · 2026-06-23T07:29:56Z

Summary

Adds nightly RuVector research for semantic-cache (2026-06-23).

13.49× speedup for FixedSemanticCache on near-duplicate query workloads (66.6 µs vs 899.2 µs mean latency)
100% hit rate on near-duplicate queries (FixedSemanticCache), zero false positives on random queries
103 KB cache memory for 200 entries × 128 dims
WASM-compatible; no external dependencies other than rand
SemanticCache trait: one API implemented by NoCache, FixedSemanticCache, and AdaptiveSemanticCache
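
To make the trait-based design concrete, here is a minimal sketch of what such an API could look like. The PR page does not show the real signatures, so the method names `get` and `put`, the `String` value type, and the `LinearScanCache` stand-in (a brute-force scan rather than an HNSW index) are all assumptions for illustration, not the crate's actual code.

```rust
/// Hypothetical shape of the trait; the real signatures are not shown in the PR.
pub trait SemanticCache {
    /// Return the cached value whose key is similar enough to `query`, if any.
    fn get(&mut self, query: &[f32]) -> Option<String>;
    /// Store `value` under the key vector `query`.
    fn put(&mut self, query: &[f32], value: String);
}

/// Baseline variant: never stores anything, so every lookup is a miss.
pub struct NoCache;

impl SemanticCache for NoCache {
    fn get(&mut self, _query: &[f32]) -> Option<String> {
        None
    }
    fn put(&mut self, _query: &[f32], _value: String) {}
}

/// Illustrative stand-in for a fixed-threshold cache.
/// It scans all keys linearly instead of using an HNSW index.
pub struct LinearScanCache {
    threshold: f32,
    entries: Vec<(Vec<f32>, String)>,
}

impl LinearScanCache {
    pub fn new(threshold: f32) -> Self {
        Self { threshold, entries: Vec::new() }
    }
}

/// Scale `v` to unit length; a zero vector is returned unchanged.
fn normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        v.to_vec()
    } else {
        v.iter().map(|x| x / norm).collect()
    }
}

impl SemanticCache for LinearScanCache {
    fn get(&mut self, query: &[f32]) -> Option<String> {
        let q = normalize(query);
        self.entries
            .iter()
            // Dot product of unit vectors equals cosine similarity.
            .map(|(k, v)| (k.iter().zip(&q).map(|(a, b)| a * b).sum::<f32>(), v))
            .filter(|(sim, _)| *sim >= self.threshold)
            .max_by(|a, b| a.0.total_cmp(&b.0))
            .map(|(_, v)| v.clone())
    }

    fn put(&mut self, query: &[f32], value: String) {
        self.entries.push((normalize(query), value));
    }
}
```

With a threshold of 0.92, a query such as `[0.99, 0.05]` would hit an entry stored under `[1.0, 0.0]`, while an orthogonal query `[0.0, 1.0]` would miss.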

Deliverables

Working Rust PoC — crates/ruvector-semantic-cache/ with 4 source files, 7 acceptance tests, benchmark binary
ADR-268 — docs/adr/ADR-268-semantic-cache.md
Research document — docs/research/nightly/2026-06-23-semantic-cache/README.md
Public gist — docs/research/nightly/2026-06-23-semantic-cache/gist.md

Real Benchmark Numbers (cargo run --release)

| Variant | Workload | Hit rate | Mean latency (µs) | Speedup |
|---|---|---|---|---|
| NoCache | near_dup | 0% | 899.2 | 1.00× |
| FixedSemanticCache | near_dup | 100% | 66.6 | 13.49× |
| AdaptiveSemanticCache | near_dup | 13.2% | 1,032.9 | 0.87× |
| FixedSemanticCache | mixed (50% hit) | 50% | 572.4 | 1.55× |

ACCEPTANCE: ALL PASS. Breakeven: the cache improves mean latency only at hit rates ≥ 23%.
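
A breakeven point exists because a miss pays for a failed cache lookup on top of the full uncached search, which is also why a low-hit-rate variant can end up slower than NoCache. The sketch below models this with two costs. It is a simplified model under stated assumptions, not the method used in the PR: the function names are hypothetical, and the PR does not report a per-miss latency, so the model is not claimed to reproduce the 23% figure.

```rust
/// Expected mean latency for hit rate `h` in 0.0..=1.0.
/// `hit_us` is the cost of a cache hit; `miss_us` is the cost of a miss,
/// i.e. the failed cache lookup plus the full uncached search.
pub fn expected_latency_us(h: f64, hit_us: f64, miss_us: f64) -> f64 {
    h * hit_us + (1.0 - h) * miss_us
}

/// Hit rate at which the cached path matches the uncached `baseline_us`.
/// Solves h * hit + (1 - h) * miss = baseline for h.
/// Returns None when a miss is not more expensive than a hit,
/// in which case the model has no meaningful breakeven.
pub fn breakeven_hit_rate(baseline_us: f64, hit_us: f64, miss_us: f64) -> Option<f64> {
    if miss_us <= hit_us {
        return None;
    }
    let h = (miss_us - baseline_us) / (miss_us - hit_us);
    Some(h.clamp(0.0, 1.0))
}
```

For example, with made-up costs of 900 µs uncached, 100 µs per hit, and 1,100 µs per miss, the breakeven hit rate is (1100 − 900) / (1100 − 100) = 20%.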

Research doc: docs/research/nightly/2026-06-23-semantic-cache/README.md
ADR: docs/adr/ADR-268-semantic-cache.md

🤖 Generated with claude-flow
https://claude.ai/code/session_01FW9sGTp6EzHqbyxKhvAG49

Generated by Claude Code

SOTA discovery: QVCache (EuroMLSys 2025), vCache (arXiv:2502.03771), GPTCache, and 5 other systems confirmed semantic caching is production-valuable. All are Python-first; no Rust-native HNSW-co-designed cache exists. Selected topic: HNSW-backed semantic query result cache.

Co-Authored-By: claude-flow <ruv@ruv.net>
Claude-Session: https://claude.ai/code/session_01FW9sGTp6EzHqbyxKhvAG49

Implements SemanticCache trait with three variants:
- NoCache: pure baseline (always miss)
- FixedSemanticCache: HNSW key index + fixed cosine threshold 0.92
- AdaptiveSemanticCache: HNSW key index + sliding-window percentile threshold

Internal HNSW (src/hnsw.rs) stores L2-normalized query vectors as cache keys. Cosine similarity computed from L2-squared distances on unit vectors. LRU eviction when max_entries exceeded. All measurements from cargo run --release; no invented numbers.

Co-Authored-By: claude-flow <ruv@ruv.net>
Claude-Session: https://claude.ai/code/session_01FW9sGTp6EzHqbyxKhvAG49
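
The cosine-from-L2 step relies on a standard identity: for unit vectors a and b, ‖a − b‖² = 2 − 2·cos(a, b), so cos(a, b) = 1 − ‖a − b‖² / 2. The sketch below shows that conversion and a fixed-threshold hit test. The function names are hypothetical and are not taken from src/hnsw.rs; only the 0.92 threshold comes from the commit message.

```rust
/// Scale `v` to unit length; a zero vector is returned unchanged.
pub fn l2_normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        v.to_vec()
    } else {
        v.iter().map(|x| x / norm).collect()
    }
}

/// Squared Euclidean distance between two vectors of equal length.
pub fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Cosine similarity recovered from a squared L2 distance.
/// Valid only when both vectors were unit-normalized beforehand.
pub fn cosine_from_l2_squared(d2: f32) -> f32 {
    1.0 - d2 / 2.0
}

/// Fixed-threshold hit test: normalize both vectors, measure squared L2
/// distance, convert to cosine similarity, and compare against `threshold`.
pub fn is_hit(query: &[f32], key: &[f32], threshold: f32) -> bool {
    let q = l2_normalize(query);
    let k = l2_normalize(key);
    cosine_from_l2_squared(l2_squared(&q, &k)) >= threshold
}
```

Because the conversion is monotonic, a cosine threshold of 0.92 is equivalent to requiring a squared L2 distance of at most 2 × (1 − 0.92) = 0.16 between unit vectors, so an index that ranks by L2 distance needs no separate cosine metric.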

Real cargo run --release numbers on x86_64 Linux:
- FixedSemanticCache near_dup: 100% hit rate, 66.6 µs mean (13.49× speedup)
- FixedSemanticCache mixed: 50% hit rate, 572.4 µs mean (1.55× speedup)
- NoCache near_dup: 899.2 µs mean (baseline)
- Cache memory: 103.1 KB for 200 entries × 128 dims
- Warmup: 23.4 ms for 200 entries
- Breakeven: >= 23% hit rate for latency benefit
- ACCEPTANCE: ALL PASS

Co-Authored-By: claude-flow <ruv@ruv.net>
Claude-Session: https://claude.ai/code/session_01FW9sGTp6EzHqbyxKhvAG49
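
As a sanity check on the memory figure, the raw key vectors alone can be sized with a one-line calculation. This assumes each component is a 4-byte `f32`, which the PR does not state explicitly, and the helper name is hypothetical.

```rust
/// Bytes needed for the raw key vectors only, assuming f32 components.
/// Graph links, cached values, and bookkeeping are not included.
pub fn raw_vector_bytes(entries: usize, dims: usize) -> usize {
    entries * dims * std::mem::size_of::<f32>()
}
```

For 200 entries × 128 dims this gives 102,400 bytes, slightly below the reported 103.1 KB. Under the `f32` assumption, the remainder would presumably be index and bookkeeping overhead.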

ADR-268: proposes ruvector-semantic-cache as first-class RuVector capability with SemanticCache trait, 3 variants, benchmark evidence (13.49× speedup), failure modes, security considerations, migration path.

Research doc covers:
- 2026 SOTA survey (QVCache, vCache, GPTCache, CacheRAG, Bifrost)
- Forward-looking 2036-2046 thesis (semantic manifolds for agents)
- ruvnet ecosystem fit (agent-memory, lsm-ann, proof-gate, ruFlo, MCP)
- Real benchmark results
- Memory and performance math
- 8 practical + 8 exotic applications
- Production crate layout proposal
- ADR-268

Co-Authored-By: claude-flow <ruv@ruv.net>
Claude-Session: https://claude.ai/code/session_01FW9sGTp6EzHqbyxKhvAG49

claude and others added 4 commits June 23, 2026 07:28

ruvnet mentioned this pull request Jun 25, 2026

Nightly research PR triage (2026-06-25): merge #603/#602/#604, scope #601, pass #600 #606

Open

4 tasks

ruvnet wants to merge 4 commits into main from research/nightly/2026-06-23-semantic-cache