research(nightly): semantic-cache — HNSW-backed query result cache for RAG and agent memory#601
Draft
ruvnet wants to merge 4 commits into
SOTA discovery: QVCache (EuroMLSys 2025), vCache (arXiv:2502.03771), GPTCache, and 5 other systems confirmed semantic caching is production-valuable. All are Python-first; no Rust-native HNSW-co-designed cache exists. Selected topic: HNSW-backed semantic query result cache.

Co-Authored-By: claude-flow <ruv@ruv.net>
Claude-Session: https://claude.ai/code/session_01FW9sGTp6EzHqbyxKhvAG49
Implements SemanticCache trait with three variants:
- NoCache: pure baseline (always miss)
- FixedSemanticCache: HNSW key index + fixed cosine threshold 0.92
- AdaptiveSemanticCache: HNSW key index + sliding-window percentile threshold

Internal HNSW (src/hnsw.rs) stores L2-normalized query vectors as cache keys. Cosine similarity is computed from L2-squared distances on unit vectors. LRU eviction applies when max_entries is exceeded. All measurements come from cargo run --release; no invented numbers.

Co-Authored-By: claude-flow <ruv@ruv.net>
Claude-Session: https://claude.ai/code/session_01FW9sGTp6EzHqbyxKhvAG49
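The commit message above names the trait and its variants but does not show signatures, so the following is only a minimal sketch of how the pieces might fit together. Everything here is an assumption made for illustration: the `get`/`put` method names, the `String` value type, the helper functions, and the `VecDeque`-based LRU. A linear scan stands in for the HNSW index in src/hnsw.rs. The one part taken directly from the description is the identity used for scoring: for unit vectors a and b, ||a - b||² = 2 - 2·cos(a, b), so cos = 1 - d²/2.

```rust
use std::collections::VecDeque;

/// Hypothetical trait shape; the PR names `SemanticCache` but not its methods.
pub trait SemanticCache {
    fn get(&mut self, query: &[f32]) -> Option<String>;
    fn put(&mut self, query: &[f32], value: String);
}

/// Baseline variant: every lookup is a miss.
pub struct NoCache;

impl SemanticCache for NoCache {
    fn get(&mut self, _query: &[f32]) -> Option<String> {
        None
    }
    fn put(&mut self, _query: &[f32], _value: String) {}
}

/// Scale a vector to unit L2 norm (zero vectors are returned unchanged).
fn l2_normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}

/// Squared Euclidean distance between two equal-length vectors.
fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// For unit vectors, d^2 = 2 - 2*cos, hence cos = 1 - d^2 / 2.
fn cosine_from_l2_squared(d2: f32) -> f32 {
    1.0 - d2 / 2.0
}

/// Fixed-threshold variant. Assumption: a brute-force scan replaces HNSW.
pub struct FixedSemanticCache {
    threshold: f32,
    max_entries: usize,
    // Front = least recently used, back = most recently used.
    entries: VecDeque<(Vec<f32>, String)>,
}

impl FixedSemanticCache {
    pub fn new(threshold: f32, max_entries: usize) -> Self {
        Self { threshold, max_entries, entries: VecDeque::new() }
    }
}

impl SemanticCache for FixedSemanticCache {
    fn get(&mut self, query: &[f32]) -> Option<String> {
        let q = l2_normalize(query);
        // Find the stored key with the highest cosine similarity.
        let mut best: Option<(usize, f32)> = None;
        for (i, (key, _)) in self.entries.iter().enumerate() {
            let sim = cosine_from_l2_squared(l2_squared(&q, key));
            if best.map_or(true, |(_, s)| sim > s) {
                best = Some((i, sim));
            }
        }
        let (idx, sim) = best?;
        if sim < self.threshold {
            return None; // nearest key is not similar enough: miss
        }
        // Hit: move the entry to the back to mark it most recently used.
        let entry = self.entries.remove(idx)?;
        let value = entry.1.clone();
        self.entries.push_back(entry);
        Some(value)
    }

    fn put(&mut self, query: &[f32], value: String) {
        if self.max_entries == 0 {
            return;
        }
        if self.entries.len() >= self.max_entries {
            self.entries.pop_front(); // evict the least recently used entry
        }
        self.entries.push_back((l2_normalize(query), value));
    }
}
```

Because keys are normalized on insert and queries on lookup, a query that differs from a stored key only in magnitude still scores a cosine of 1.0 and counts as a hit at the 0.92 threshold.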
Real cargo run --release numbers on x86_64 Linux:
- FixedSemanticCache near_dup: 100% hit rate, 66.6 µs mean (13.49× speedup)
- FixedSemanticCache mixed: 50% hit rate, 572.4 µs mean (1.55× speedup)
- NoCache near_dup: 899.2 µs mean (baseline)
- Cache memory: 103.1 KB for 200 entries × 128 dims
- Warmup: 23.4 ms for 200 entries
- Breakeven: >= 23% hit rate for latency benefit
- ACCEPTANCE: ALL PASS

Co-Authored-By: claude-flow <ruv@ruv.net>
Claude-Session: https://claude.ai/code/session_01FW9sGTp6EzHqbyxKhvAG49
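The commit reports a speedup and a breakeven hit rate but not the formula behind them. One common way to reason about such numbers is a two-point latency model: mean latency is the hit-rate-weighted average of hit latency and miss latency, and breakeven is the hit rate at which that average equals the uncached baseline. The sketch below assumes that model; the function names are hypothetical, and it is not claimed to reproduce the reported 23% figure, which may have been derived differently.

```rust
/// Mean latency with a cache under a two-point model (assumed, not from the PR).
/// `hit_rate` is in [0, 1]; latencies are in microseconds.
pub fn expected_latency_us(hit_rate: f64, hit_us: f64, miss_us: f64) -> f64 {
    hit_rate * hit_us + (1.0 - hit_rate) * miss_us
}

/// Ratio of uncached to cached mean latency.
pub fn speedup(baseline_us: f64, cached_us: f64) -> f64 {
    baseline_us / cached_us
}

/// Hit rate at which the cached mean equals the uncached baseline.
/// `miss_us` is the latency of a miss *with* the cache in place
/// (backend call plus lookup overhead). Returns None when a hit is
/// not faster than a miss, since no hit rate can then help.
pub fn breakeven_hit_rate(baseline_us: f64, hit_us: f64, miss_us: f64) -> Option<f64> {
    if miss_us <= hit_us {
        return None;
    }
    // Solve h * hit + (1 - h) * miss = baseline for h.
    let h = (miss_us - baseline_us) / (miss_us - hit_us);
    Some(h.clamp(0.0, 1.0))
}
```

Applied to the reported near_dup means, `speedup(899.2, 66.6)` comes out at roughly 13.5, in line with the reported 13.49× up to rounding of the printed means. With purely illustrative inputs (baseline 1000 µs, hit 100 µs, miss 1200 µs), `breakeven_hit_rate` gives about 0.18.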
ADR-268: proposes ruvector-semantic-cache as a first-class RuVector capability with SemanticCache trait, 3 variants, benchmark evidence (13.49x speedup), failure modes, security considerations, migration path.

Research doc covers:
- 2026 SOTA survey (QVCache, vCache, GPTCache, CacheRAG, Bifrost)
- Forward-looking 2036-2046 thesis (semantic manifolds for agents)
- ruvnet ecosystem fit (agent-memory, lsm-ann, proof-gate, ruFlo, MCP)
- Real benchmark results
- Memory and performance math
- 8 practical + 8 exotic applications
- Production crate layout proposal
- ADR-268

Co-Authored-By: claude-flow <ruv@ruv.net>
Claude-Session: https://claude.ai/code/session_01FW9sGTp6EzHqbyxKhvAG49
Summary
Adds nightly RuVector research for semantic-cache (2026-06-23).
- [...] rand
- SemanticCache trait: clean API for NoCache / FixedSemanticCache / AdaptiveSemanticCache

Deliverables
- crates/ruvector-semantic-cache/ with 4 source files, 7 acceptance tests, benchmark binary
- docs/adr/ADR-268-semantic-cache.md
- docs/research/nightly/2026-06-23-semantic-cache/README.md
- docs/research/nightly/2026-06-23-semantic-cache/gist.md

Real Benchmark Numbers (cargo run --release)
ACCEPTANCE: ALL PASS. Breakeven: ≥ 23% hit rate.
Research doc: docs/research/nightly/2026-06-23-semantic-cache/README.md
ADR: docs/adr/ADR-268-semantic-cache.md

🤖 Generated with claude-flow
https://claude.ai/code/session_01FW9sGTp6EzHqbyxKhvAG49
Generated by Claude Code