Adversarial Challenge Gate
After all five dimension specialists return their findings, the adversarial specialist runs. It does not score dimensions. It owns one thing: whether every finding holds before the orchestrator emits output.
Why it exists
A long audit pass has two drift vectors. The first is score drift: the model grows more lenient or more strict as context accumulates, so scores at the end of the pass are calibrated differently from scores at the start. The second is cross-dimension contamination: a strong Structured Data result halos a weak Citability score because the context carries a positive impression of the page.
The adversarial specialist cuts both. It reads each dimension's evidence packet and finding in isolation, without the context that built up during the scoring pass, and checks each finding independently against the rubric, the decision table, and the fix-acceptance rubric.
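The isolation described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the `Finding` type, its fields, and `build_review_inputs` are hypothetical names. The point it shows is that each review input carries exactly one dimension's finding and packet plus the shared reference documents, and nothing from the scoring pass.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Finding:
    """One dimension's result (hypothetical shape, assumed for illustration)."""
    dimension: str
    score: int
    route: str
    evidence_packet: dict[str, Any]


def build_review_inputs(findings, references):
    """Build one self-contained review input per dimension.

    `references` is assumed to hold the rubric, the decision table, and the
    fix-acceptance rubric. No scoring-pass context and no other dimension's
    finding is included, so nothing can carry over between checks.
    """
    return [
        {"finding": finding, "references": references}
        for finding in findings
    ]
```

Each element of the returned list can then be handed to the adversarial specialist as a fresh, independent request.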
What it checks
The adversarial specialist runs five checks. Checks 1–4 run per dimension; check 5 runs once across the assembled verdict.
| # | Check | What it catches |
|---|---|---|
| 1 | Score-rubric alignment | Score in a band the evidence packet cannot support: inflation or deflation relative to the rubric descriptors |
| 2 | Decision table compliance | Route (SHIP / FIX / ESCALATE / SHIP no producible gap) that does not match the canonical table given the score and evidence state |
| 3 | Evidence grounding | Facts in a FIX artifact not traceable to the packet; sameAs URLs described as confirmed but only derived from the entity name; ESCALATE blockers that are vague or invented |
| 4 | Acceptance rubric compliance | Artifacts that would not move the score, are generic enough to paste unchanged into any site, or carry a self-review that claims a check the artifact visibly fails |
| 5 | Internal consistency | FIX decisions without artifacts; orphan artifacts; score projection claiming lift on ESCALATE dimensions; split dimensions not stated in both the fix queue and escalation list |
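As one illustration, the internal consistency check can be sketched over an assembled verdict. The `Verdict` shape and the function name are assumptions made for this example, and the sketch deliberately leaves out split dimensions, whose representation the text does not specify. It covers three of the listed conditions: FIX without an artifact, an orphan artifact, and projected lift on an ESCALATE dimension.

```python
from dataclasses import dataclass, field


@dataclass
class Verdict:
    """Assembled verdict (hypothetical shape, assumed for illustration)."""
    routes: dict[str, str]                                 # dimension -> route
    artifacts: dict[str, str] = field(default_factory=dict)       # dimension -> artifact text
    projected_lift: dict[str, int] = field(default_factory=dict)  # dimension -> projected points


def consistency_problems(verdict):
    """Return a list of human-readable consistency problems (empty if none)."""
    problems = []
    for dim, route in verdict.routes.items():
        if route == "FIX" and dim not in verdict.artifacts:
            problems.append(f"{dim}: FIX decision without an artifact")
        if route == "ESCALATE" and verdict.projected_lift.get(dim, 0) > 0:
            problems.append(f"{dim}: projection claims lift on an ESCALATE dimension")
    for dim in verdict.artifacts:
        # An artifact whose dimension is not routed to FIX has no owner.
        if verdict.routes.get(dim) != "FIX":
            problems.append(f"{dim}: orphan artifact")
    return problems
```

An empty list corresponds to a CLEAR consistency line; any entry would become a consistency challenge.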
Output format
ADVERSARIAL REVIEW
Dim 1 — Structured Data: CLEAR | <one sentence confirming the finding holds>
Dim 2 — Citability: CHALLENGE | <specific claim that does not hold> — violates <rule/rubric>.
Correct route: <what it should be>.
Dim 3 — Crawl Signal: CLEAR | ...
Dim 4 — Content Freshness: CLEAR | ...
Dim 5 — Entity Authority: CHALLENGE | ...
Consistency: CLEAR | All §9 checks pass.
Summary: N challenge(s), M clear(s). Orchestrator must resolve Dim X before output.
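A small renderer for this block might look like the sketch below. The `DimensionReview` type and `render_review` are hypothetical names. Three simplifications are assumed rather than taken from the spec: the summary counts cover only the dimension lines, the consistency line is always rendered as CLEAR, and the "must resolve" clause is dropped when nothing is challenged.

```python
from dataclasses import dataclass
from typing import Optional

DASH = "\u2014"  # the dash character used as a separator in the review block


@dataclass
class DimensionReview:
    """Result of checking one dimension (hypothetical shape)."""
    name: str                            # e.g. "Citability"
    note: str = ""                       # one sentence confirming a CLEAR finding
    failed_claim: Optional[str] = None   # set only for a CHALLENGE
    violated_rule: Optional[str] = None
    correct_route: Optional[str] = None

    @property
    def challenged(self) -> bool:
        return self.failed_claim is not None


def render_review(reviews, consistency_note):
    """Render the review block as text, one line per dimension plus summary."""
    lines = ["ADVERSARIAL REVIEW"]
    for number, review in enumerate(reviews, start=1):
        prefix = f"Dim {number} {DASH} {review.name}: "
        if review.challenged:
            lines.append(
                f"{prefix}CHALLENGE | {review.failed_claim} {DASH} "
                f"violates {review.violated_rule}."
            )
            lines.append(f"Correct route: {review.correct_route}.")
        else:
            lines.append(f"{prefix}CLEAR | {review.note}")
    lines.append(f"Consistency: CLEAR | {consistency_note}")

    # Assumption: N and M count dimension lines only.
    challenged = [str(i) for i, r in enumerate(reviews, start=1) if r.challenged]
    summary = (
        f"Summary: {len(challenged)} challenge(s), "
        f"{len(reviews) - len(challenged)} clear(s)."
    )
    if challenged:
        summary += f" Orchestrator must resolve Dim {', '.join(challenged)} before output."
    lines.append(summary)
    return "\n".join(lines)
```

Requiring `failed_claim`, `violated_rule`, and `correct_route` on every challenge keeps each CHALLENGE line specific, in line with the format above.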
How challenges resolve
Every CHALLENGE requires a specific correction before the pass proceeds:
- Score challenge — re-score from the packet using the rubric band descriptors. Corrected score propagates to the total and band.
- Decision challenge — re-route per the canonical decision table. If the route changes, regenerate or remove the artifact and update the manifest.
- Evidence challenge — remove the unsupported fact. If the fix cannot be grounded without it, convert the artifact to an ESCALATE with a four-part flag.
- Acceptance challenge — regenerate the artifact or convert to ESCALATE if the gap cannot be produced as a verifiable text artifact from the packet.
- Consistency challenge — add the missing artifact, remove the orphan, correct the projection, or state the split in both lists.
The adversarial specialist does not re-run after resolution. The orchestrator applies corrections and the pass proceeds.
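The mapping from challenge type to required correction can be written as a lookup that the orchestrator walks exactly once. This is a sketch under assumptions: `ChallengeKind`, `CORRECTIONS`, and `resolution_plan` are illustrative names, and the correction strings simply paraphrase the list above.

```python
from enum import Enum


class ChallengeKind(Enum):
    SCORE = "score"
    DECISION = "decision"
    EVIDENCE = "evidence"
    ACCEPTANCE = "acceptance"
    CONSISTENCY = "consistency"


# Required correction per challenge type, paraphrased from the list above.
CORRECTIONS = {
    ChallengeKind.SCORE: "re-score from the packet using the rubric band descriptors",
    ChallengeKind.DECISION: "re-route per the canonical decision table",
    ChallengeKind.EVIDENCE: "remove the unsupported fact, or convert to ESCALATE",
    ChallengeKind.ACCEPTANCE: "regenerate the artifact, or convert to ESCALATE",
    ChallengeKind.CONSISTENCY: "fix the missing artifact, orphan, projection, or split",
}


def resolution_plan(challenges):
    """Map (dimension, kind) pairs to the corrections the orchestrator applies.

    The plan is built in a single pass. There is no loop back into the
    adversarial specialist, matching the rule that it does not re-run.
    """
    return [(dimension, CORRECTIONS[kind]) for dimension, kind in challenges]
```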
What it never does
- Issues a vague challenge ("this could be more specific").
- Re-scores a dimension.
- Generates a fix.
- Carries one dimension's result into another dimension's check.
- Blocks the pass on a CLEAR finding.
See it in action
The worked runs include adversarial challenge review blocks. The all-CLEAR case appears in Multi-Dimension FIX; a two-challenge case (score drift on Structured Data, unconfirmed sameAs URL on Entity Authority) appears in Schema-Stripped Fetch.