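QA Gate Automation Architecture

Core Principles

  1. Enforce > Suggest: the hook exits with code 2 to force a fix; it is not a text reminder
  2. An agent must not judge its own work: an independent sub-agent does the review
  3. Verify with tools, not by reasoning: Playwright CLI DOM assertions
  4. Fail fast: a syntax error blocks immediately instead of surfacing at the end
  5. Trust tool results, not the AI's self-discipline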

Trigger Mechanism

  • PostToolUse Edit|Write hook: fires only when a file is modified
  • qa-projects.conf whitelist: fires only for the project paths listed there
  • Plain chat / editing skills / editing hooks → never fires

qa-projects.conf

/home/claude/whatsapp-ai
/var/www/html
/home/claude/cc-dashboard

To enroll a new project, add one line with its path.
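
The whitelist check could look roughly like the sketch below. It assumes the file holds one absolute path per line and that a changed file counts as "in a project" when it sits anywhere under a listed path. The function names and the `#` comment handling are hypothetical, not taken from post-edit-qa.sh.

```python
from pathlib import PurePosixPath


def load_whitelist(conf_text: str) -> list[PurePosixPath]:
    """Parse qa-projects.conf content: one project path per line."""
    roots = []
    for line in conf_text.splitlines():
        line = line.strip()
        # Skipping blank lines and '#' comments is an assumption of this sketch.
        if line and not line.startswith("#"):
            roots.append(PurePosixPath(line))
    return roots


def is_whitelisted(file_path: str, roots: list[PurePosixPath]) -> bool:
    """True when file_path is one of the roots or lives underneath one."""
    target = PurePosixPath(file_path)
    return any(target == root or root in target.parents for root in roots)
```

Comparing path components rather than string prefixes keeps a sibling directory such as /home/claude/whatsapp-ai-old from matching /home/claude/whatsapp-ai.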

Two-Layer Architecture

Layer 1: Syntax Check (PostToolUse, blocking, 2 seconds)

  • .py → python3 -m py_compile
  • .js → node --check
  • .json → python3 -m json.tool
  • .sh → bash -n
  • FAIL → exit 2 → the AI is forced to fix it immediately
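
A minimal sketch of the Layer 1 dispatch, written in Python for illustration (the real hook is a shell script, and its internals are not shown here). The `CHECKERS` table and `syntax_check` name are hypothetical; `sys.executable` stands in for `python3`.

```python
import subprocess
import sys
from pathlib import Path

# Extension -> checker command, mirroring the list above.
# sys.executable replaces "python3" so the sketch uses the current interpreter.
CHECKERS = {
    ".py": [sys.executable, "-m", "py_compile"],
    ".js": ["node", "--check"],
    ".json": [sys.executable, "-m", "json.tool"],
    ".sh": ["bash", "-n"],
}


def syntax_check(file_path: str) -> int:
    """Return 0 when the file passes or has no checker, 2 when the check fails."""
    cmd = CHECKERS.get(Path(file_path).suffix)
    if cmd is None:
        return 0  # unknown extension: nothing to check
    result = subprocess.run(cmd + [file_path], capture_output=True, text=True)
    if result.returncode != 0:
        # Forward the checker's error so the caller can see what to fix.
        print(result.stderr, file=sys.stderr)
        return 2
    return 0
```

A hook wrapper would pass the edited file's path to `syntax_check` and exit with the returned code, so any failure becomes the blocking exit 2.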

Layer 2: Evaluator Sub-Agent (background, independent context)

  • Spawn claude -p --model sonnet (a separate session with no knowledge of what the generator did)
  • Functional test (always runs): Playwright CLI opens a browser → click → fill → check DOM (zero Vision tokens)
  • Visual test (only when UI files change): screenshot key states → Vision checklist
  • Chatbot test (whatsapp-ai only): send message → check reply
  • Max 20 turns
  • Uses Max Plan quota (not API billing)
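
One way the evaluator launch might be wired up, as a sketch only. It assumes the 20-turn cap is passed through the CLI's `--max-turns` option and that output goes to a log file; the function names, the prompt, and the log path are hypothetical and do not describe qa-evaluator.sh itself.

```python
import subprocess


def build_evaluator_command(prompt: str, max_turns: int = 20) -> list[str]:
    """Assemble the argv for a fresh, non-interactive evaluator session."""
    return ["claude", "-p", prompt, "--model", "sonnet", "--max-turns", str(max_turns)]


def spawn_evaluator(prompt: str, log_path: str) -> subprocess.Popen:
    """Start the evaluator in the background and return without waiting."""
    log = open(log_path, "w")
    return subprocess.Popen(
        build_evaluator_command(prompt),
        stdout=log,
        stderr=subprocess.STDOUT,
        start_new_session=True,  # detach from the caller's process group (POSIX)
    )
```

Because the prompt is the only input, the evaluator starts with none of the generator's conversation history, which is the point of the independent context.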

How the Evaluator Chooses Tests

Change whatsapp-ai/*.py → functional + chatbot test
Change *.html/css/js    → functional + visual test
Change *.py (other)     → functional only
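
The mapping above can be sketched as a small pure function. The name `select_tests` is hypothetical, and treating any file type not listed as "functional only" is an assumption based on the functional test always running.

```python
from pathlib import PurePosixPath

UI_SUFFIXES = {".html", ".css", ".js"}


def select_tests(file_path: str) -> set[str]:
    """Pick the test suites for one changed file, following the mapping above."""
    path = PurePosixPath(file_path)
    tests = {"functional"}  # the functional test always runs
    if path.suffix in UI_SUFFIXES:
        tests.add("visual")
    elif path.suffix == ".py" and "whatsapp-ai" in path.parts:
        tests.add("chatbot")
    return tests
```

For example, `select_tests("/home/claude/whatsapp-ai/bot.py")` selects the functional and chatbot suites, while a Python file in any other project gets the functional suite alone.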

Visual QA Strategy

  • visual-qa-map.json: defines the states that need screenshots (optional; without it the agent derives them itself)
  • Capture key states only: initial load, form submit, modal, error (at most 8 screenshots)
  • Batched Vision checks: desktop first → mobile → tablet checks only the states that FAILed
  • Vision checklist: layout / typography / components / mobile / colors

Loop Strategy (Progress-Based)

  • Progress (pass rate rises) → continue
  • No progress (unchanged 2 times in a row) → stop
  • Regression (pass rate drops) → git revert + stop
  • No fixed iteration limit
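
The loop rule can be sketched as a decision over the history of pass rates. This reads "unchanged 2 times in a row" as two consecutive comparisons with no change (three equal rates); that reading, the function name, and the action labels are assumptions of the sketch.

```python
def next_action(pass_rates: list[float]) -> str:
    """Decide the next step from pass rates, one per run, oldest first."""
    if len(pass_rates) < 2:
        return "continue"  # nothing to compare against yet
    previous, latest = pass_rates[-2], pass_rates[-1]
    if latest < previous:
        return "revert_and_stop"  # pass rate dropped
    if latest > previous:
        return "continue"  # pass rate improved
    # Unchanged: stop only when the comparison before this one was unchanged too.
    if len(pass_rates) >= 3 and pass_rates[-3] == previous:
        return "stop"
    return "continue"
```

With no fixed iteration cap, termination depends entirely on this rule: the loop runs as long as the pass rate keeps improving.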

Regression Protection (Git Checkpoint)

  • Once things work to a reasonable degree → git commit (checkpoint)
  • Every fix reruns all tests (not only the new ones)
  • Regression found → git revert
  • No regression → commit a new checkpoint
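
A sketch of the regression check that would sit between "rerun all tests" and the git step. It assumes test results are available as a name-to-pass/fail mapping, which is not something the notes specify; the function names are hypothetical.

```python
def find_regressions(checkpoint: dict[str, bool], current: dict[str, bool]) -> list[str]:
    """Tests that passed at the last checkpoint but fail, or are missing, now."""
    return sorted(
        name
        for name, passed in checkpoint.items()
        if passed and not current.get(name, False)
    )


def checkpoint_action(checkpoint: dict[str, bool], current: dict[str, bool]) -> str:
    """Return the git step to take after a full test run."""
    return "git revert" if find_regressions(checkpoint, current) else "git commit"
```

Comparing against the full checkpoint result set, rather than only the newly added tests, is what catches a fix that silently breaks an older behavior.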

Playwright Tool Selection

Tool                               Purpose                                    Token consumption
Playwright CLI (@playwright/cli)   Evaluator functional test                  lowest (~27K)
Playwright Python Library          take-screenshots.js / visual-qa-hook.sh
Playwright MCP                     not used                                   high (~114K)

CLI Basic Auth

npx @playwright/cli -s=qa open
npx @playwright/cli -s=qa goto "http://admin:password@localhost:8001/admin"
npx @playwright/cli -s=qa snapshot
npx @playwright/cli -s=qa click "button:has-text('發送')"
npx @playwright/cli -s=qa screenshot --filename /home/claude/test.png
npx @playwright/cli -s=qa close

Key Files

File                               Purpose
~/.claude/hooks/post-edit-qa.sh    PostToolUse hook: Layer 1 + tracking + checklist
~/scripts/qa-evaluator.sh          Layer 2 evaluator wrapper
~/scripts/parse-qa-verdict.py      JSON verdict parser
~/scripts/visual-qa-hook.sh        Playwright screenshots + hard checks
~/scripts/take-screenshots.js      Node.js screenshot tool
~/.claude/qa-projects.conf         Project whitelist
~/.claude/settings.json            Hook configuration

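Harness Philosophy

  • Don't trust the AI's self-discipline → enforce with hooks
  • Keep the writer and the reviewer separate → independent sub-agent
  • Decide with code, not by reasoning → deterministic checks first
  • Minimal context → the evaluator doesn't know how the generator reasoned
  • Fail fast → syntax errors caught in 2 seconds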