CodeRoute Model And Harness Plan
Updated July 1, 2026.
Decisions
- Treat `gpt-5.5` as the premium OpenAI frontier target.
- Evaluate hosted full GLM-5.2 before relying on local Q3_K_M for hard coding work.
- Prefer OpenRouter `z-ai/glm-5.2` first because it is available as an OpenAI-compatible hosted model with a 1M-token context window.
- Check Azure/Microsoft Foundry for GLM-5.2 availability, but do not block on it. Current public catalog evidence shows Azure GPT-5.5 and Fireworks-on-Foundry GLM-5, not a clearly listed Azure GLM-5.2 route.
- Keep local `glm-5.2-q3-k-m` as a private fallback and offline route.
- Add harness-aware routing metadata before trying to orchestrate harnesses automatically.
Target Routing
| Task | Preferred route | Fallback route |
|---|---|---|
| Repo search, summaries, docs | local cheap model | hosted GLM-5.2 |
| Simple bug fix | hosted GLM-5.2 | local GLM-5.2 Q3_K_M |
| CRUD feature | hosted GLM-5.2 | GPT-5.5 |
| Refactor | hosted GLM-5.2 | GPT-5.5 |
| Unit tests | hosted GLM-5.2 | local GLM-5.2 Q3_K_M |
| Architecture | GPT-5.5 | hosted GLM-5.2 |
| Security review | GPT-5.5 | hosted GLM-5.2 |
| Production debugging | GPT-5.5 | hosted GLM-5.2 |
| Final PR review | GPT-5.5 | hosted GLM-5.2 |
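
The table above can be read as a lookup from task type to an ordered pair of routes. The following is a minimal sketch of that lookup, not the CodeRoute implementation: the task keys, the route labels, and the `pick_route` helper are all hypothetical names chosen for illustration.

```python
from typing import Iterable, NamedTuple


class Route(NamedTuple):
    preferred: str
    fallback: str


# Hypothetical route labels standing in for the wording used in the table.
LOCAL_CHEAP = "local-cheap"
HOSTED_GLM = "hosted-glm-5.2"
LOCAL_GLM_Q3 = "local-glm-5.2-q3-k-m"
GPT55 = "gpt-5.5"

# One entry per table row; the task keys are assumed identifiers.
ROUTES = {
    "repo_search_summaries_docs": Route(LOCAL_CHEAP, HOSTED_GLM),
    "simple_bug_fix": Route(HOSTED_GLM, LOCAL_GLM_Q3),
    "crud_feature": Route(HOSTED_GLM, GPT55),
    "refactor": Route(HOSTED_GLM, GPT55),
    "unit_tests": Route(HOSTED_GLM, LOCAL_GLM_Q3),
    "architecture": Route(GPT55, HOSTED_GLM),
    "security_review": Route(GPT55, HOSTED_GLM),
    "production_debugging": Route(GPT55, HOSTED_GLM),
    "final_pr_review": Route(GPT55, HOSTED_GLM),
}


def pick_route(task_type: str, unavailable: Iterable[str] = ()) -> str:
    """Return the preferred route, or the fallback if the preferred one is down."""
    blocked = set(unavailable)
    for candidate in ROUTES[task_type]:
        if candidate not in blocked:
            return candidate
    raise LookupError(f"no available route for task type {task_type!r}")
```

For example, `pick_route("simple_bug_fix", unavailable={"hosted-glm-5.2"})` falls through to the local Q3_K_M route, which matches the second column of the table.
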
Model Work
1. Add LiteLLM alias `openrouter-glm-5.2` for `openrouter/z-ai/glm-5.2`.
2. Add or update OpenAI alias `cloud-openai-gpt55` for `gpt-5.5`.
3. If Azure OpenAI deployment exists, add `azure-gpt-5.5` and prefer it for enterprise/data-bound routes.
4. If Azure or Foundry exposes full GLM-5.2, add `azure-glm-5.2`; otherwise record `FW-GLM-5` separately and do not label it GLM-5.2.
5. Add CodeRoute registry entries:
   - `cloud-glm-5.2-full`
   - `cloud-openai-gpt55`
   - `azure-openai-gpt55` when deployed
6. Rebalance step ladders so hosted GLM-5.2 handles most implementation and GPT-5.5 handles highest-risk work.
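
As a rough illustration of steps 1 to 5, the aliases can be expressed as plain data in the `model_name` / `litellm_params` shape that LiteLLM's `model_list` uses. This is a sketch under assumptions: the mapping from CodeRoute registry entries to LiteLLM aliases is a guess (the real registry schema is not described here), `build_model_list` and `build_registry` are hypothetical helpers, and the Azure deployment name is a placeholder to be supplied by the operator. The entry shape should be checked against the LiteLLM documentation before use.

```python
from typing import Dict, List, Optional


def build_model_list(azure_deployment: Optional[str] = None) -> List[dict]:
    """Build alias entries in LiteLLM's model_list shape (plain data only)."""
    models = [
        {
            "model_name": "openrouter-glm-5.2",
            "litellm_params": {"model": "openrouter/z-ai/glm-5.2"},
        },
        {
            "model_name": "cloud-openai-gpt55",
            "litellm_params": {"model": "gpt-5.5"},
        },
    ]
    # The Azure alias is added only when a deployment actually exists (step 3).
    if azure_deployment:
        models.append(
            {
                "model_name": "azure-gpt-5.5",
                "litellm_params": {"model": f"azure/{azure_deployment}"},
            }
        )
    return models


# Assumed mapping from CodeRoute registry entry to LiteLLM alias.
REGISTRY_TO_ALIAS = {
    "cloud-glm-5.2-full": "openrouter-glm-5.2",
    "cloud-openai-gpt55": "cloud-openai-gpt55",
    "azure-openai-gpt55": "azure-gpt-5.5",
}


def build_registry(model_list: List[dict]) -> Dict[str, str]:
    """Keep only registry entries whose alias is present in the model list."""
    available = {entry["model_name"] for entry in model_list}
    return {
        name: alias
        for name, alias in REGISTRY_TO_ALIAS.items()
        if alias in available
    }
```

Filtering the registry against the model list keeps `azure-openai-gpt55` out until the Azure alias is deployed, which mirrors the "when deployed" condition in step 5.
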
Harness Work
1. Add client profile detection from metadata, API key labels, and User-Agent:
   - `opencode`
   - `codex`
   - `cline`
   - `roo`
   - `cursor`
   - `aider`
   - `continue`
   - `claude-code`
2. Store `(harness, alias, selected_model, task_type, codegraph_signal, outcome)` for route evaluation.
3. Add harness smoke tests:
   - OpenCode through `coding-auto`
   - Codex CLI through `coding-auto`
   - Aider through `coding-auto`
   - Cline/Roo manual smoke checklist
4. Add an Anthropic-compatible facade only after the OpenAI-compatible path is stable:
   - `/anthropic/v1/messages`
   - streaming compatibility
   - tool use translation
   - Claude Code smoke test
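
Steps 1 and 2 could look roughly like the sketch below. It assumes a `harness` key in request metadata and assumes that API key labels and User-Agent strings contain the profile name as a separate token; the actual strings each client sends are not specified here and would need to be captured from real traffic. The field types on `RouteRecord` (in particular `codegraph_signal` as a string) are also assumptions.

```python
import re
from dataclasses import asdict, dataclass
from typing import Mapping, Optional

KNOWN_HARNESSES = (
    "opencode", "codex", "cline", "roo",
    "cursor", "aider", "continue", "claude-code",
)

# Match a profile name only when it is not embedded in a longer word,
# so "root-key" does not count as "roo".
_PATTERNS = {
    name: re.compile(r"(?<![a-z0-9])" + re.escape(name) + r"(?![a-z0-9])")
    for name in KNOWN_HARNESSES
}


def detect_harness(
    metadata: Optional[Mapping[str, object]],
    api_key_label: Optional[str],
    user_agent: Optional[str],
) -> str:
    """Resolve a client profile: explicit metadata, then key label, then User-Agent."""
    explicit = str((metadata or {}).get("harness", "")).lower()
    if explicit in KNOWN_HARNESSES:
        return explicit
    for source in (api_key_label, user_agent):
        text = (source or "").lower()
        for name, pattern in _PATTERNS.items():
            if pattern.search(text):
                return name
    return "unknown"


@dataclass(frozen=True)
class RouteRecord:
    """One row of route-evaluation data, mirroring the tuple in step 2."""
    harness: str
    alias: str
    selected_model: str
    task_type: str
    codegraph_signal: str  # assumed type; the real signal format is not defined here
    outcome: str


record = RouteRecord(
    harness=detect_harness({}, None, "opencode/1.2.3"),  # hypothetical User-Agent
    alias="coding-auto",
    selected_model="openrouter-glm-5.2",
    task_type="refactor",
    codegraph_signal="low-fanout",  # placeholder value
    outcome="accepted",             # placeholder value
)
row = asdict(record)  # ready to write to whatever store holds evaluation data
```

Checking explicit metadata first keeps detection deterministic for clients that can be configured to identify themselves, with the string heuristics acting only as a fallback.
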
User Documentation
- Publish `/harnesses` with setup steps for OpenCode, Codex CLI, Cline, Roo Code, Cursor, Aider, Continue, and Claude Code caveats.
- Link `/harnesses` from `/`, `/meta`, README, and the quick start guide.
- Keep `/quick-start` focused on the fastest path and `/harnesses` focused on client-specific setup.
Immediate Next Steps
1. Add and deploy the harness setup guide.
2. Add OpenRouter GLM-5.2 to LiteLLM and run direct smoke tests.
3. Add GPT-5.5 to LiteLLM or update the existing OpenAI cloud alias.
4. Add CodeRoute registry entries and route ladders for hosted GLM-5.2 and GPT-5.5.
5. Run eval tasks comparing local GLM-5.2 Q3_K_M, hosted GLM-5.2, and GPT-5.5 through OpenCode and Codex.
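
The comparison in step 5 amounts to running every eval task against each model through each harness. One way to enumerate and summarize that grid is sketched below; the model labels, the task names in the usage note, and both helper functions are hypothetical, and no eval results are implied.

```python
from collections import defaultdict
from itertools import product
from typing import Dict, Iterable, List, Tuple

# Hypothetical labels for the three models and two harnesses named in step 5.
MODELS = ("local-glm-5.2-q3-k-m", "hosted-glm-5.2", "gpt-5.5")
HARNESSES = ("opencode", "codex")


def eval_matrix(tasks: Iterable[str]) -> List[dict]:
    """List every (task, model, harness) combination to run."""
    return [
        {"task": task, "model": model, "harness": harness}
        for task, model, harness in product(tasks, MODELS, HARNESSES)
    ]


def pass_rate_by_model(results: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Aggregate (model, passed) pairs into a pass rate per model."""
    totals: Dict[str, int] = defaultdict(int)
    passes: Dict[str, int] = defaultdict(int)
    for model, passed in results:
        totals[model] += 1
        passes[model] += int(passed)
    return {model: passes[model] / totals[model] for model in totals}
```

Two tasks produce 2 x 3 x 2 = 12 runs, so `len(eval_matrix(["fix-bug", "add-test"]))` is 12. Keeping the harness as a separate axis makes it possible to tell a model regression apart from a harness compatibility problem.
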