ORPT-Bench model detail
Model benchmark profile

opencode/glm-5

This page uses opencode/glm-5 as the comparison baseline. Every chart and table below is intended to answer the same question: where this model leads, where it lags, and what it costs in quality, time, and request pressure.

z-ai · low price tier · standard · balanced-general
Composite: 0.623 (correctness-weighted overall standing)
Success: 78% (tasks completed successfully)
ORPT: 11.57 (requests per solved task)
Total cost: $6.4339 (observed benchmark spend)
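
As a rough illustration of how headline figures like these roll up from per-task logs, here is a minimal sketch of a success rate and a requests-per-solved-task (ORPT-style) calculation. The record fields and the averaging rule are assumptions for illustration, not the benchmark's published definition.

```python
# Hedged sketch: how Success and an ORPT-style "requests per solved task"
# number could be rolled up from per-task logs. Field names are hypothetical.
from dataclasses import dataclass

@dataclass
class TaskRun:
    solved: bool      # whether the task passed its checks
    requests: int     # API requests spent on the task

def summarize(runs: list[TaskRun]) -> dict[str, float]:
    solved = [r for r in runs if r.solved]
    success_rate = len(solved) / len(runs)
    # Assumption: ORPT averages requests over solved tasks only.
    orpt = sum(r.requests for r in solved) / max(len(solved), 1)
    return {"success": success_rate, "orpt": orpt}

print(summarize([TaskRun(True, 9), TaskRun(False, 20), TaskRun(True, 14)]))
# -> {'success': 0.666..., 'orpt': 11.5}
```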
Baseline comparison

How the field moves relative to opencode/glm-5

These charts use opencode/glm-5 as zero. Positive bars mean other models are above the baseline on that metric; negative bars mean they trail it.

Charts: composite, success, cost, and wall time deltas vs the baseline.
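
A minimal sketch of this baseline-at-zero convention, assuming per-model metrics live in a simple mapping; the illustrative values are copied from the decision table below.

```python
# Hedged sketch of the baseline-zero convention: subtract opencode/glm-5's
# value from every model's value, so positive deltas sit above the baseline.
BASELINE = "opencode/glm-5"

metrics = {  # illustrative values copied from the decision table
    "opencode/glm-5":           {"composite": 0.623, "success": 0.78},
    "opencode/gpt-5.4-nano":    {"composite": 0.789, "success": 0.85},
    "opencode/claude-opus-4-6": {"composite": 0.670, "success": 0.89},
}

def deltas(metric: str) -> dict[str, float]:
    base = metrics[BASELINE][metric]
    return {model: round(vals[metric] - base, 3) for model, vals in metrics.items()}

print(deltas("composite"))
# -> {'opencode/glm-5': 0.0, 'opencode/gpt-5.4-nano': 0.166, 'opencode/claude-opus-4-6': 0.047}
```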

Decision table

Field comparison against the baseline

Use this to decide whether another model beats opencode/glm-5 enough to justify the change.

Model Composite Composite delta Success Success delta ORPT ORPT delta Cost Cost delta Wall time
opencode/gpt-5.4-nano 0.789 +0.166 85% +7% 15.17 +3.60 $0.4215 -$6.0124 27m 33s
opencode/kimi-k2.5 0.785 +0.162 89% +11% 14.25 +2.68 $0.9122 -$5.5217 41m 05s
opencode/claude-opus-4-6 0.670 +0.047 89% +11% 14.88 +3.30 $21.8757 +$15.4417 40m 04s
opencode/glm-5 (baseline) 0.623 +0.000 78% +0% 11.57 +0.00 $6.4339 +$0.0000 20m 10s
opencode/big-pickle 0.615 -0.008 67% -11% 15.39 +3.82 $0.0000 -$6.4339 36m 28s
opencode/gpt-5.4 0.609 -0.014 78% +0% 11.00 -0.57 $8.9827 +$2.5488 32m 47s
opencode/claude-sonnet-4-6 0.593 -0.030 78% +0% 16.43 +4.86 $11.8406 +$5.4066 42m 31s
opencode/glm-5.1 0.547 -0.076 67% -11% 12.06 +0.48 $1.8816 -$4.5523 64m 39s
opencode/minimax-m2.5 0.481 -0.142 56% -22% 18.87 +7.30 $0.6413 -$5.7927 32m 15s
opencode/gpt-5.4-mini 0.425 -0.198 48% -30% 9.54 -2.03 $1.0606 -$5.3733 21m 48s
opencode/minimax-m2.5-free 0.415 -0.208 59% -19% 16.19 +4.62 $0.0000 -$6.4339 41m 34s
opencode/gemini-3-flash 0.415 -0.208 59% -19% 21.81 +10.24 $2.4307 -$4.0033 62m 52s
opencode/gemini-3.1-pro 0.291 -0.332 37% -41% 12.70 +1.13 $5.8536 -$0.5803 51m 25s
opencode/nemotron-3-super-free 0.181 -0.441 26% -52% 19.43 +7.86 $0.0000 -$6.4339 109m 00s
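
One way to read the table mechanically is to keep only challengers that clear a composite-gain threshold without exceeding a cost budget. The sketch below does that; the thresholds and the idea of hard cutoffs are assumptions chosen for illustration.

```python
# Hedged sketch: filter challengers from the decision table above by a minimum
# composite gain and a maximum added cost. Thresholds are illustrative only.
rows = [
    # (model, composite_delta, cost_delta_usd)
    ("opencode/gpt-5.4-nano",     +0.166,  -6.0124),
    ("opencode/kimi-k2.5",        +0.162,  -5.5217),
    ("opencode/claude-opus-4-6",  +0.047, +15.4417),
    ("opencode/gpt-5.4",          -0.014,  +2.5488),
]

MIN_COMPOSITE_GAIN = 0.05   # assumed "worth the switch" margin
MAX_ADDED_COST = 5.00       # assumed budget headroom in USD

worth_switching = [
    model for model, d_comp, d_cost in rows
    if d_comp >= MIN_COMPOSITE_GAIN and d_cost <= MAX_ADDED_COST
]
print(worth_switching)
# -> ['opencode/gpt-5.4-nano', 'opencode/kimi-k2.5']
```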
Task story

Where opencode/glm-5 separates

This table puts the most revealing tasks first: unsolved tasks, single-solver tasks, and tasks where the baseline trails the winner by a meaningful margin.

Task | Field read | Baseline result | Winner (score) | Gap to winner | Baseline cost | Baseline time
SELinux registry volume label repair | Clear separation | failed | opencode/kimi-k2.5 (1.0) | 1.0 | $0.1133 | 32s
RHEL k3s node preparation repair | Competitive split | failed | opencode/gpt-5.4-nano (1.0) | 1.0 | $0.1018 | 21s
Bootstrap phase validation repair | Competitive split | failed | opencode/kimi-k2.5 (0.993) | 0.993 | $0.1869 | 38s
Docker Compose observability fix | Competitive split | failed | opencode/gpt-5.4-nano (0.975) | 0.975 | $0.0964 | 16s
Pre-ArgoCD bootstrap sequencing | Competitive split | failed | opencode/gpt-5.4-nano (0.967) | 0.967 | $0.1649 | 42s
Kubernetes OIDC RBAC repair | Competitive split | failed | opencode/gpt-5.4-nano (0.95) | 0.95 | $0.1987 | 4m 43s
K3s registry mirror trust repair | Competitive split | passed | opencode/big-pickle (1.0) | 0.247 | $0.2926 | 25s
Terraform static site repair | Competitive split | passed | opencode/kimi-k2.5 (0.978) | 0.202 | $0.2463 | 28s
Event status shell summary | Competitive split | passed | opencode/big-pickle (1.0) | 0.188 | $0.0954 | 37s
Kubernetes rollout repair | Clear separation | passed | opencode/gpt-5.4-mini (1.0) | 0.181 | $0.2735 | 33s
CNPG restore manifest repair | Competitive split | passed | opencode/big-pickle (0.964) | 0.178 | $0.2697 | 25s
Log audit shell script | Competitive split | passed | opencode/gpt-5.4-nano (0.935) | 0.173 | $0.3671 | 31s
ExternalDNS RFC2136 repair | Competitive split | passed | opencode/kimi-k2.5 (0.982) | 0.168 | $0.3206 | 34s
MetalLB ingress address pool repair | Competitive split | passed | opencode/gpt-5.4-nano (0.928) | 0.168 | $0.5282 | 1m 08s
nftables router ingress repair | Competitive split | passed | opencode/gpt-5.4-nano (0.98) | 0.163 | $0.1201 | 29s
Workspace transplant bundle repair | Competitive split | passed | opencode/big-pickle (0.985) | 0.162 | $0.1595 | 27s
Log level rollup shell script | Competitive split | passed | opencode/big-pickle (0.965) | 0.158 | $0.1567 | 29s
RHEL NetworkManager bridge VLAN repair | Competitive split | passed | opencode/gpt-5.4-nano (0.951) | 0.15 | $0.1978 | 18s
MCP OpenBao contract repair | Competitive split | passed | opencode/big-pickle (0.954) | 0.144 | $0.2235 | 36s
Traefik forwarded header trust repair | Competitive split | passed | opencode/kimi-k2.5 (0.913) | 0.142 | $0.5014 | 48s
Build workspace plane convergence | Competitive split | passed | opencode/gpt-5.4-nano (0.942) | 0.141 | $0.2917 | 28s
RHEL edge firewalld router repair | Competitive split | passed | opencode/gpt-5.4-nano (0.953) | 0.132 | $0.2802 | 1m 11s
Ansible nginx role completion | Competitive split | passed | opencode/big-pickle (0.963) | 0.131 | $0.1366 | 22s
AppArmor dnsmasq profile repair | Competitive split | passed | opencode/gpt-5.4-nano (0.918) | 0.123 | $0.2683 | 1m 01s
GitOps workspace render validation | Competitive split | passed | opencode/big-pickle (0.941) | 0.117 | $0.1990 | 29s
Wildcard TLS route coverage | Competitive split | passed | opencode/kimi-k2.5 (0.929) | 0.116 | $0.2252 | 37s
Workspace runtime access convergence | Competitive split | passed | opencode/gpt-5.4-nano (0.932) | 0.112 | $0.4184 | 1m 00s
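
The ordering described above (failed tasks first, then the widest gaps to the winner) can be reproduced with a simple sort key; the sketch below is an illustration under assumed field names, not the benchmark's actual pipeline.

```python
# Hedged sketch: order tasks the way the table above describes, assuming
# each record carries the baseline result and the gap to the winning model.
tasks = [
    {"task": "Kubernetes rollout repair",            "baseline": "passed", "gap": 0.181},
    {"task": "SELinux registry volume label repair", "baseline": "failed", "gap": 1.000},
    {"task": "Wildcard TLS route coverage",          "baseline": "passed", "gap": 0.116},
]

# Failed tasks first (most revealing), then the largest gap to the winner.
ordered = sorted(tasks, key=lambda t: (t["baseline"] != "failed", -t["gap"]))
for t in ordered:
    print(f'{t["task"]}: {t["baseline"]}, gap {t["gap"]:.3f}')
```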
Head to head

Direct matchups

Pairwise task wins and top-line deltas show whether a challenger truly beats the baseline or just looks cheaper or faster in isolation. Edges here are reported as baseline minus challenger, so positive values favor opencode/glm-5.

Challenger | Task record | Composite edge | Success edge | Cost edge | Time edge | ORPT edge
opencode/nemotron-3-super-free | 21-0 (6 ties) | +0.441 | +52% | +$6.4339 | -88m 50s | -7.86
opencode/gemini-3.1-pro | 19-4 (4 ties) | +0.332 | +41% | +$0.5803 | -31m 15s | -1.13
opencode/minimax-m2.5-free | 21-3 (3 ties) | +0.208 | +19% | +$6.4339 | -21m 23s | -4.62
opencode/gemini-3-flash | 21-3 (3 ties) | +0.208 | +19% | +$4.0033 | -42m 41s | -10.24
opencode/gpt-5.4-mini | 9-13 (5 ties) | +0.198 | +30% | +$5.3733 | -1m 38s | +2.03
opencode/gpt-5.4-nano | 2-23 (2 ties) | -0.166 | -7% | +$6.0124 | -7m 22s | -3.60
opencode/kimi-k2.5 | 4-23 (0 ties) | -0.162 | -11% | +$5.5217 | -20m 54s | -2.68
opencode/minimax-m2.5 | 9-13 (5 ties) | +0.142 | +22% | +$5.7927 | -12m 05s | -7.30
opencode/glm-5.1 | 12-12 (3 ties) | +0.076 | +11% | +$4.5523 | -44m 29s | -0.48
opencode/claude-opus-4-6 | 20-6 (1 tie) | -0.047 | -11% | -$15.4417 | -19m 54s | -3.30
opencode/claude-sonnet-4-6 | 18-7 (2 ties) | +0.030 | +0% | -$5.4066 | -22m 20s | -4.86
opencode/gpt-5.4 | 16-7 (4 ties) | +0.014 | +0% | -$2.5488 | -12m 37s | +0.57
opencode/big-pickle | 5-18 (4 ties) | +0.008 | +11% | +$6.4339 | -16m 18s | -3.82
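
For reference, a pairwise record like the ones above can be tallied from per-task scores as sketched below; the tie tolerance and the score-comparison rule are assumptions, not the benchmark's documented procedure.

```python
# Hedged sketch: tally a win/loss/tie record between the baseline and one
# challenger from per-task scores. The tie tolerance is an assumed value.
TIE_EPS = 1e-6

def head_to_head(baseline_scores: list[float], challenger_scores: list[float]):
    wins = losses = ties = 0
    for b, c in zip(baseline_scores, challenger_scores, strict=True):
        if abs(b - c) <= TIE_EPS:
            ties += 1
        elif b > c:
            wins += 1
        else:
            losses += 1
    return wins, losses, ties

print(head_to_head([1.0, 0.75, 0.0, 0.9], [1.0, 0.6, 0.95, 0.4]))
# -> (2, 1, 1): baseline wins two tasks, loses one, ties one
```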
Model context

Benchmark and catalog detail

The benchmark result only matters in context: this section pairs the observed benchmark outcome with the catalog metadata and operating characteristics behind it.

Requests: 280
Wall time: 20m 10s
Average task cost: $0.2653
Benchmark support: unknown
Catalog blended price: $1.6000 / 1M tok
Catalog speed: 70 tok/s
Intelligence: 50
Agentic: n/a

OpenRouter reference blend for z-ai/glm-5 is 1.115 USD per 1M tokens using a 3:1 input:output mix.
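
The 3:1 blend quoted above follows the usual weighted-average convention; the sketch below shows the arithmetic with hypothetical per-token prices, since the underlying input and output rates are not listed on this page.

```python
# Hedged sketch: a 3:1 input:output blended price is a weighted average of the
# per-1M-token rates. The component prices below are made-up placeholders.
def blended_price(input_usd_per_m: float, output_usd_per_m: float,
                  input_weight: int = 3, output_weight: int = 1) -> float:
    total = input_weight + output_weight
    return (input_weight * input_usd_per_m + output_weight * output_usd_per_m) / total

# Example with hypothetical rates of $0.60 in / $2.66 out per 1M tokens:
print(round(blended_price(0.60, 2.66), 3))  # -> 1.115
```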