CodeRoute Model And Harness Plan

Updated July 1, 2026.

Decisions

Treat gpt-5.5 as the premium OpenAI frontier target.
Evaluate hosted full GLM-5.2 before relying on local Q3_K_M for hard coding work.
Prefer OpenRouter z-ai/glm-5.2 first because it is available as an OpenAI-compatible hosted model with a 1M-token context window.
Check Azure/Microsoft Foundry for GLM-5.2 availability, but do not block on it. The public catalog currently lists Azure GPT-5.5 and Fireworks-on-Foundry GLM-5, but no clearly identified Azure GLM-5.2 route.
Keep local glm-5.2-q3-k-m as a private fallback and offline route.
Add harness-aware routing metadata before trying to orchestrate harnesses automatically.

Target Routing

Task	Preferred route	Fallback route
Repo search, summaries, docs	local cheap model	hosted GLM-5.2
Simple bug fix	hosted GLM-5.2	local GLM-5.2 Q3_K_M
CRUD feature	hosted GLM-5.2	GPT-5.5
Refactor	hosted GLM-5.2	GPT-5.5
Unit tests	hosted GLM-5.2	local GLM-5.2 Q3_K_M
Architecture	GPT-5.5	hosted GLM-5.2
Security review	GPT-5.5	hosted GLM-5.2
Production debugging	GPT-5.5	hosted GLM-5.2
Final PR review	GPT-5.5	hosted GLM-5.2
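
The table above can be held as a small lookup structure. This is a minimal sketch, not CodeRoute's actual registry format: the task keys (for example repo_search_summaries_docs), the Route type, and pick_route are hypothetical names introduced for illustration, while the route labels are copied from the table.

```python
from typing import NamedTuple


class Route(NamedTuple):
    preferred: str
    fallback: str


# Route labels copied verbatim from the Target Routing table.
LOCAL_CHEAP = "local cheap model"
HOSTED_GLM = "hosted GLM-5.2"
LOCAL_GLM = "local GLM-5.2 Q3_K_M"
GPT = "GPT-5.5"

# Task keys are hypothetical slugs for the table's task names.
TARGET_ROUTING = {
    "repo_search_summaries_docs": Route(LOCAL_CHEAP, HOSTED_GLM),
    "simple_bug_fix": Route(HOSTED_GLM, LOCAL_GLM),
    "crud_feature": Route(HOSTED_GLM, GPT),
    "refactor": Route(HOSTED_GLM, GPT),
    "unit_tests": Route(HOSTED_GLM, LOCAL_GLM),
    "architecture": Route(GPT, HOSTED_GLM),
    "security_review": Route(GPT, HOSTED_GLM),
    "production_debugging": Route(GPT, HOSTED_GLM),
    "final_pr_review": Route(GPT, HOSTED_GLM),
}


def pick_route(task_type: str, preferred_available: bool = True) -> str:
    """Return the preferred route, or the fallback when the preferred one is down."""
    route = TARGET_ROUTING[task_type]
    return route.preferred if preferred_available else route.fallback
```

For example, pick_route("refactor") returns the hosted GLM-5.2 route, and pick_route("refactor", preferred_available=False) falls back to GPT-5.5.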

Model Work

1. Add LiteLLM alias openrouter-glm-5.2 for openrouter/z-ai/glm-5.2.
2. Add or update OpenAI alias cloud-openai-gpt55 for gpt-5.5.
3. If an Azure OpenAI deployment exists, add azure-gpt-5.5 and prefer it for enterprise/data-bound routes.
4. If Azure or Foundry exposes full GLM-5.2, add azure-glm-5.2; otherwise record FW-GLM-5 separately and do not label it GLM-5.2.
5. Add CodeRoute registry entries:

cloud-glm-5.2-full
cloud-openai-gpt55
azure-openai-gpt55 when deployed

6. Rebalance step ladders so hosted GLM-5.2 handles most implementation and GPT-5.5 handles highest-risk work.
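
Steps 1 to 3 can be sketched as plain data before they are written into the LiteLLM configuration. The entries below follow the model_name / litellm_params shape that LiteLLM model lists use. The AZURE_GPT55_DEPLOYMENT environment variable and the build_model_list helper are assumptions made for this example, and API keys and base URLs are left out.

```python
import os
from typing import Dict, List, Mapping, Optional


def build_model_list(env: Optional[Mapping[str, str]] = None) -> List[Dict]:
    """Build alias entries for the hosted GLM-5.2 and GPT-5.5 routes."""
    if env is None:
        env = os.environ

    entries = [
        # Step 1: hosted full GLM-5.2 through OpenRouter.
        {
            "model_name": "openrouter-glm-5.2",
            "litellm_params": {"model": "openrouter/z-ai/glm-5.2"},
        },
        # Step 2: premium OpenAI frontier target.
        {
            "model_name": "cloud-openai-gpt55",
            "litellm_params": {"model": "gpt-5.5"},
        },
    ]

    # Step 3: only add the Azure alias when a deployment is configured.
    # AZURE_GPT55_DEPLOYMENT is a hypothetical variable name.
    deployment = env.get("AZURE_GPT55_DEPLOYMENT")
    if deployment:
        entries.append(
            {
                "model_name": "azure-gpt-5.5",
                "litellm_params": {"model": f"azure/{deployment}"},
            }
        )
    return entries
```

Keeping the Azure entry conditional means the alias never exists without a deployment behind it, so enterprise/data-bound routes cannot silently resolve to a missing backend.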

Harness Work

1. Add client profile detection from metadata, API key labels, and User-Agent:

opencode
codex
cline
roo
cursor
aider
continue
claude-code

2. Store (harness, alias, selected_model, task_type, codegraph_signal, outcome) for route evaluation.
3. Add harness smoke tests:

OpenCode through coding-auto
Codex CLI through coding-auto
Aider through coding-auto
Cline/Roo manual smoke checklist

4. Add an Anthropic-compatible facade only after the OpenAI-compatible path is stable:

/anthropic/v1/messages
streaming compatibility
tool use translation
Claude Code smoke test
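
Steps 1 and 2 above could start from something like the following sketch. It checks request metadata first, then the API key label, then the User-Agent, and returns the first known harness name it finds. The "harness" metadata key, the substring matching, and the string field types on RouteRecord are assumptions; real clients may send different identifiers, and naive substring matching can misfire on short names such as roo.

```python
from dataclasses import asdict, dataclass
from typing import Mapping, Optional

# Harness names from the plan, checked in this order.
KNOWN_HARNESSES = (
    "claude-code", "opencode", "codex", "cline",
    "roo", "cursor", "aider", "continue",
)


def detect_harness(
    metadata: Optional[Mapping[str, str]] = None,
    key_label: str = "",
    user_agent: str = "",
) -> str:
    """Return the first known harness found in metadata, key label, or User-Agent."""
    explicit = (metadata or {}).get("harness", "")  # hypothetical metadata key
    for source in (explicit, key_label, user_agent):
        text = source.lower()
        for name in KNOWN_HARNESSES:
            if name in text:
                return name
    return "unknown"


@dataclass
class RouteRecord:
    """One row per routed request, kept for route evaluation."""
    harness: str
    alias: str
    selected_model: str
    task_type: str
    codegraph_signal: str  # type is an assumption; the plan does not define it
    outcome: str


record = RouteRecord(
    harness=detect_harness(user_agent="Aider/0.0"),  # made-up User-Agent value
    alias="coding-auto",
    selected_model="openrouter-glm-5.2",
    task_type="simple_bug_fix",
    codegraph_signal="",
    outcome="pending",
)
row = asdict(record)  # plain dict, ready to write to whatever store is chosen
```

Putting explicit metadata ahead of the User-Agent lets a client or an operator override a misleading header without changing the detection code.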

User Documentation

Publish /harnesses with setup steps for OpenCode, Codex CLI, Cline, Roo Code, Cursor, Aider, and Continue, plus caveats for Claude Code.
Link /harnesses from /, /meta, README, and the quick start guide.
Keep /quick-start focused on the fastest path and /harnesses focused on client-specific setup.

Immediate Next Steps

1. Add and deploy the harness setup guide.
2. Add OpenRouter GLM-5.2 to LiteLLM and run direct smoke tests.
3. Add GPT-5.5 to LiteLLM or update the existing OpenAI cloud alias.
4. Add CodeRoute registry entries and route ladders for hosted GLM-5.2 and GPT-5.5.
5. Run eval tasks comparing local GLM-5.2 Q3_K_M, hosted GLM-5.2, and GPT-5.5 through OpenCode and Codex.
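
The comparison in the last step amounts to running each eval task across three models and two harnesses. A minimal sketch of that run matrix follows; eval_matrix and the task name in the usage note are hypothetical, and the model and harness labels are taken from this plan.

```python
from itertools import product
from typing import Dict, Iterable, List

MODELS = ("local GLM-5.2 Q3_K_M", "hosted GLM-5.2", "GPT-5.5")
HARNESSES = ("OpenCode", "Codex")


def eval_matrix(tasks: Iterable[str]) -> List[Dict[str, str]]:
    """Enumerate every (task, model, harness) combination to run."""
    return [
        {"task": task, "model": model, "harness": harness}
        for task, model, harness in product(tasks, MODELS, HARNESSES)
    ]
```

Each task produces six runs, so eval_matrix(["simple_bug_fix"]) yields six entries, one per model and harness pair.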