# Codex Adversarial Review History · deep-comparison-5way.md

| Pass | P1 | P2 | P3 | Verdict | Key findings |
|------|----|----|----|---------|--------------|
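| 1 | 4 | 4 | 1 | FAIL | weighted-score math wrong for 4 of 5; "TG error is the single worst" contradicts §4.1; "Vue+RSC" does not exist; AP's SecureLS UUID does not mitigate XSS |
| 2 | 2 | 2 | 0 | FAIL | §15 "clear 77 dead" reappears; §15 "add SecureLS to CP" is self-contradictory; rubric invents "Sentry-ready"/"hook"; BP runtime claims not labeled as static |
| 3 | 1 | 3 | 0 | FAIL | §14 "PWA: 4 have incomplete manifests" contradicts §7 "BP/PT have no manifest"; "6% hit rate" is a duplicated system fact; SRI evidence written up as a measured header; the 30-40% figure has no basis |
| 4 | 1 | 2 | 1 | FAIL | §2 table cell still says "hot path only 5"; §15 lists WebCrypto as an auth scheme; §8 says "none" where "not observed" is meant; §10 says "5 dimensions" but there are actually 7 |
| 5 | 2 | 1 | 0 | FAIL | §15 recommendation for AP still includes `+ WebCrypto subtle`; §13 still says "6% hot-path hit rate"; last paragraph of §4.1 treats a BP inference as an observation value |
| 6 | 1 | 1 | 0 | FAIL | BP `actual error reports = 0` should read `not observed`; §6 "CDN traffic doubles" / cache hit rate mislabeled as measured |
| **7** | **0** | **0** | **1** | **PASS** | only a P3 wording issue remains: the summary-level BP runtime/static grading |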

**Net delta**: 4+4 P1 fixed; 4+3 P2 fixed; report length: the initial version had ~16 sections / 230 lines → the final version keeps the same structure, but every claim now carries an evidence-boundary annotation

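**Key learnings**:
1. Codex is very strict on math, contracts, and factual correctness (I got 4 of the 5 hand-computed weighted scores wrong)
2. Codex repeatedly catches "soft polish" pseudo-fixes (a word changes but the substance does not)
3. Evidence boundaries get fixed in prose but not synced to summary tables, lists, and one-liners; this is a high-frequency regression point
4. "Client-side encryption = XSS protection" is a popular misconception that Codex corrected repeatedly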
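
Learning 1 suggests recomputing weighted scores mechanically instead of by hand. The following is a minimal sketch of such a check, not the rubric actually used in the report: the dimension names, weights, scores, and reported totals are all hypothetical placeholders, and the tolerance is an assumed rounding allowance.

```python
from math import isclose


def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Return the weighted mean of per-dimension scores."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same dimensions")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[dim] * weights[dim] for dim in scores) / total_weight


def find_score_mismatches(rows, weights, tolerance=0.01):
    """Return the names of rows whose reported total differs from the recomputed one."""
    mismatches = []
    for name, (scores, reported) in rows.items():
        recomputed = weighted_score(scores, weights)
        if not isclose(recomputed, reported, abs_tol=tolerance):
            mismatches.append(name)
    return mismatches


# Hypothetical rubric and rows, for illustration only (not taken from the report).
WEIGHTS = {"security": 3.0, "performance": 2.0, "maintainability": 1.0}
ROWS = {
    # name: (per-dimension scores, total as written in the report)
    "candidate_a": ({"security": 4, "performance": 3, "maintainability": 5}, 3.83),
    "candidate_b": ({"security": 2, "performance": 5, "maintainability": 3}, 3.50),
}

if __name__ == "__main__":
    print(find_score_mismatches(ROWS, WEIGHTS))
```

Running a check like this before each review pass would flag any row whose written total drifts from its inputs, which is the class of error Pass 1 reported.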