feat(ai): emit cost + full usage on otel spans #747
…g details)

otelMiddleware only emitted gen_ai.usage.input_tokens/output_tokens even though TokenUsage already carries provider-reported cost, total tokens, cache/reasoning breakdowns, and duration-based billing. Backends like PostHog had to re-derive cost from their own price tables, losing cache discounts and gateway markup (OpenRouter), and duration-billed activities had no cost signal at all.

A shared usageAttributes() helper now builds the full guarded attribute set at all three emission sites (RUN_FINISHED chunk, onUsage, onFinish rollup):
- gen_ai.usage.total_tokens / gen_ai.usage.cost (de-facto extensions consumed directly by PostHog and LiteLLM-style backends)
- gen_ai.usage.cache_read.input_tokens, cache_creation.input_tokens, reasoning.output_tokens (official GenAI semconv names)
- tanstack.ai.usage.duration_seconds and the upstream cost split (no semconv equivalent exists)

E2E: new /api/otel-usage route drives the existing openai-usage-details and openrouter-cost aimock mounts through otelMiddleware with a local capture tracer; middleware.spec.ts asserts the attributes land on iteration and root spans.

Fixes TanStack#721
Summary

Fixes #721.

`otelMiddleware` only emitted `gen_ai.usage.input_tokens` / `gen_ai.usage.output_tokens`, even though `TokenUsage` already carries provider-reported cost, total tokens, cache/reasoning breakdowns, and duration-based billing (the cost fields landed in #654). Backends like PostHog had to re-derive cost from tokens × their own price table, losing cache discounts and gateway markup (OpenRouter), and duration-billed activities had no cost signal at all.

What changed
A shared `usageAttributes()` helper now builds the full attribute set at all three emission sites (`RUN_FINISHED` chunk, `onUsage`, `onFinish` rollup). Every field is guarded, so spans are unchanged when a provider doesn't report it:

| Span attribute | `TokenUsage` source |
| --- | --- |
| `gen_ai.usage.total_tokens` | `totalTokens` |
| `gen_ai.usage.cost` | `cost` |
| `gen_ai.usage.cache_read.input_tokens` | `promptTokensDetails.cachedTokens` |
| `gen_ai.usage.cache_creation.input_tokens` | `promptTokensDetails.cacheWriteTokens` |
| `gen_ai.usage.reasoning.output_tokens` | `completionTokensDetails.reasoningTokens` |
| `tanstack.ai.usage.duration_seconds` | `durationSeconds` |
| `tanstack.ai.usage.upstream_cost` / `_input_cost` / `_output_cost` | `costDetails` |

Deliberately out of scope:
`unitsBilled` and per-modality token breakdowns (media activities don't flow through chat middleware; that's #720), and `providerUsageDetails` (provider-shaped bag, unsafe to spread onto spans).

Tests
- Unit tests in `packages/ai/tests/middlewares/otel.test.ts` covering all three emission sites, absent-field omission, and empty detail objects.
- E2E: new `/api/otel-usage` route drives the existing `openai-usage-details` (cache/reasoning/totals) and `openrouter-cost` (cost/cost split) aimock mounts through `otelMiddleware` with an in-memory capture tracer; `middleware.spec.ts` asserts the attributes land on both iteration and root spans.
- Docs: `docs/advanced/otel.md`.
- `pnpm test:pr` and the full middleware E2E suite pass locally.
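For readers skimming the description, the guarded mapping from `TokenUsage` fields to span attributes can be sketched roughly as below. This is an illustrative sketch, not the code in this PR: the `TokenUsage` interface is a reduced, assumed shape containing only the fields named in this description, the `set` helper is hypothetical, and the `costDetails` split is left out because its field names are not given here.

```typescript
// Assumed, reduced shape: only the fields named in the PR description.
interface TokenUsage {
  totalTokens?: number
  cost?: number
  durationSeconds?: number
  promptTokensDetails?: { cachedTokens?: number; cacheWriteTokens?: number }
  completionTokensDetails?: { reasoningTokens?: number }
}

type SpanAttributes = Record<string, number>

// Sketch of a guarded builder: an attribute is emitted only when the
// provider actually reported a finite number for the matching field.
function usageAttributes(usage: TokenUsage): SpanAttributes {
  const attrs: SpanAttributes = {}

  // Hypothetical helper: skips undefined, NaN, and infinite values.
  const set = (key: string, value: number | undefined): void => {
    if (typeof value === 'number' && Number.isFinite(value)) {
      attrs[key] = value
    }
  }

  set('gen_ai.usage.total_tokens', usage.totalTokens)
  set('gen_ai.usage.cost', usage.cost)
  set(
    'gen_ai.usage.cache_read.input_tokens',
    usage.promptTokensDetails?.cachedTokens,
  )
  set(
    'gen_ai.usage.cache_creation.input_tokens',
    usage.promptTokensDetails?.cacheWriteTokens,
  )
  set(
    'gen_ai.usage.reasoning.output_tokens',
    usage.completionTokensDetails?.reasoningTokens,
  )
  set('tanstack.ai.usage.duration_seconds', usage.durationSeconds)

  return attrs
}

// Usage with made-up numbers: the empty completionTokensDetails object
// and the missing durationSeconds contribute no attributes.
const example = usageAttributes({
  totalTokens: 30,
  cost: 0.0012,
  promptTokensDetails: { cachedTokens: 8 },
  completionTokensDetails: {},
})
console.log(Object.keys(example))
```

Building a plain object first and attaching it to the span in one call keeps the three emission sites identical, and an empty usage object yields an empty attribute set, which matches the "spans are unchanged" guarantee described above.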