fix(providers): lower GLM custom provider context #43
📝 Walkthrough
The PR adjusts the maximum context length configuration for two Zhipu AI models (…).
Changes: Zhipu Model Configuration
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~2 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@manual-templates/custom-provider-overrides.json`:
- Line 846: The contextLength entries in
manual-templates/custom-provider-overrides.json for model overrides are
inconsistent with Zhipu specs: adjust the "contextLength" values for the
glm-5v-turbo and glm-4.7 overrides (currently 204800) to match Zhipu's
documented values (~202752 for glm-5v-turbo and ~203000 for glm-4.7) or add an
inline comment/README note near those model override objects explaining why the
override intentionally differs; update the JSON entries for the model keys
"glm-5v-turbo" and "glm-4.7" (and verify "glm-5.1" and "glm-5" remain at 200000)
so the file consistently reflects the platform specs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2d7eff1b-f20e-4731-b538-f530bac2ed6a
⛔ Files ignored due to path filters (2)
- `dist/all.json` is excluded by `!**/dist/**`
- `dist/custom-provider.json` is excluded by `!**/dist/**`
📒 Files selected for processing (1)
manual-templates/custom-provider-overrides.json
  "name": "GLM-5.1",
  "family": "glm",
- "contextLength": 204800,
+ "contextLength": 200000,
🧩 Analysis chain
🌐 Web query:
What is the maximum context length for Zhipu glm-5.1 and glm-5 models?
💡 Result:
As of June 2, 2026, both the Zhipu GLM-5 and GLM-5.1 models support a context window of 200,000 (200K) tokens [1][2][3]. While some technical documentation or specific API implementations (such as the NVIDIA NIM reference for GLM-5.1) may list 131,072 tokens for specific integration environments [4], the models' native architecture is designed for and widely recognized as supporting a 200K-token context window [5][1][6][3]. This capacity is supported by the models' use of Sparse Attention mechanisms (specifically DeepSeek Sparse Attention) to maintain coherence across extended sequences, which is a key feature for their intended use in long-horizon agentic and engineering tasks [1][6][3].
Citations:
- 1: https://automatio.ai/models/glm-5-1
- 2: https://arxiv.org/html/2602.15763v1
- 3: https://huggingface.co/blog/mlabonne/glm-5
- 4: https://docs.api.nvidia.com/nim/reference/z-ai-glm5.1
- 5: https://docs.z.ai/guides/llm/glm-5.1
- 6: https://github.com/zai-org/GLM-5/blob/main/README.md
🌐 Web query:
What is the maximum context length for Zhipu glm-5v-turbo and glm-4.7 models?
💡 Result:
Both the Zhipu GLM-4.7 and GLM-5V-Turbo models support a maximum context window of approximately 203,000 tokens (often cited in documentation as 200K tokens) [1][2][3][4][5][6]. Specifically:
- GLM-4.7: Features a context window of 203,000 tokens [2][3].
- GLM-5V-Turbo: Features a context window of 202,752 tokens [7][6].
For both models, the maximum output token limit is generally 128,000 to 131,072 tokens [3][8][7][5][6].
Citations:
- 1: https://docs.z.ai/guides/llm/glm-4.7
- 2: https://nerdleveltech.com/inside-glm4-capabilities-benchmarks-and-realworld-power
- 3: https://cloudprice.net/models/zhipu-glm-4-7-flash
- 4: https://docs.z.ai/guides/vlm/glm-5v-turbo
- 5: https://cloudprice.net/models/zhipu-glm-5v-turbo
- 6: https://wavespeed.ai/blog/posts/glm-5v-turbo-developers-2026/
- 7: https://apidog.com/blog/glm-5v-turbo-api/
- 8: https://medium.com/@leucopsis/a-technical-analysis-of-glm-4-7-db7fcc54210a
Align contextLength values with Zhipu platform specs.
`manual-templates/custom-provider-overrides.json`: for `glm-5.1` (line 846) and `glm-5` (line 878), `contextLength: 200000` matches the current Zhipu docs. `glm-5v-turbo` (line 910) and `glm-4.7` (line 942) should not remain `204800`; Zhipu docs list ~202,752 for `glm-5v-turbo` and ~203,000 for `glm-4.7`. Update these values or document why the override intentionally differs.
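A check like the one this comment describes could be scripted. The following is a minimal sketch only, not the repository's tooling: the `EXPECTED` table reuses the figures quoted in this review (unverified against the Zhipu docs), the `find_mismatches` and `iter_models` helpers are hypothetical, and it assumes each model override is a JSON object carrying `name` and `contextLength` fields, as in the diff hunk above. The real layout of `custom-provider-overrides.json` may differ.

```python
from typing import Any, Iterator

# Figures quoted in the review; assumptions, not verified platform specs.
EXPECTED = {
    "glm-5.1": 200000,
    "glm-5": 200000,
    "glm-5v-turbo": 202752,
    "glm-4.7": 203000,
}


def iter_models(node: Any) -> Iterator[dict]:
    """Yield every dict that has a contextLength field, at any nesting depth."""
    if isinstance(node, dict):
        if "contextLength" in node:
            yield node
        for value in node.values():
            yield from iter_models(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_models(item)


def find_mismatches(doc: Any, expected: dict) -> list:
    """Return (name, actual, expected) for models whose contextLength differs."""
    mismatches = []
    for model in iter_models(doc):
        # Assumption: the "name" field, lowercased, identifies the model.
        name = str(model.get("name", "")).lower()
        if name in expected and model["contextLength"] != expected[name]:
            mismatches.append((name, model["contextLength"], expected[name]))
    return mismatches


# Hypothetical sample shaped like the diff hunk; not the real file contents.
sample = {
    "models": [
        {"name": "GLM-5.1", "family": "glm", "contextLength": 200000},
        {"name": "GLM-4.7", "family": "glm", "contextLength": 204800},
    ]
}

for name, actual, wanted in find_mismatches(sample, EXPECTED):
    print(f"{name}: contextLength is {actual}, expected {wanted}")
```

To run it against the real file, load it with `json.load` and pass the result to `find_mismatches`. If an override intentionally differs from the documented value, removing that model from `EXPECTED` keeps the check quiet.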