Critiq team
- philosophy
What evidence over vibes means in code review
Review comments should be defensible: tied to a rule, a line, severity, and references, not just confident prose.
Pull request review is where a team decides what ships. That decision should rest on something you can inspect, reproduce, and explain, not on how convincingly a comment reads. At Critiq we call that distinction evidence over vibes: feedback that ties to a rule, a location, a severity, and external references you can follow.
This is not an argument against human judgment. Reviewers still decide what matters, what to fix now, and what to waive. The point is that the artifact on the PR should look more like a defensible engineering judgment than a chatbot opinion. When feedback is traceable, teams merge with confidence, audit with clarity, and turn repeated decisions into policy instead of re-litigating the same thread every sprint.
What vibes review looks like
Vibes review is fluent but unverifiable. The comment sounds authoritative. It may even be correct. But you cannot answer basic follow-up questions without asking the author to reconstruct their reasoning from memory.
- "This looks risky": no stated threat model, no cited pattern, no line anchor
- "Consider using a safer approach here": no definition of safer, no link to a standard
- "I would not ship this": no severity, no remediation path, no record for the next reviewer
- "The AI flagged this": no rule, no evidence excerpt, no way to reproduce locally
Vibes feedback fails in three practical ways. First, it does not travel: the next reviewer cannot tell whether the concern was security, correctness, or style preference. Second, it does not scale: the same debate returns on every similar diff because nothing was captured as policy. Third, it is hard to defend under scrutiny, whether in an incident review, a compliance audit, or a postmortem where someone asks why a change was approved.
Tools that optimize for readable comments can amplify the problem. A well-phrased paragraph feels like progress even when it omits the parts that make review actionable. Fluency is cheap. Traceability is work, and that work is what teams need when the merge button has real consequences.
What evidence means in practice
Evidence-backed review answers a fixed set of questions for every finding. What was detected? Why does it matter? Where in the code? How severe is it? What should change? What external standard supports the call? Critiq encodes those answers in deterministic output and in the rule catalog behind it.
- Rule ID: a stable identifier such as security.no-sql-interpolation that maps to a readable rule file
- Location: file path and line range so the finding is anchored in the diff
- Severity and confidence: explicit prioritization instead of tone in a comment thread
- Remediation: a concrete fix direction, not just disapproval
- References: metadata.references entries pointing at CWE, OWASP, or other sources the team can inspect
The rule file is the contract. You can open it, read the match logic, run specs against fixtures, and decide whether the team agrees with the default severity. That inspectability is the opposite of vibes: the finding is a conclusion with a paper trail, not a personality on the PR.
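As a rough illustration, the checklist above can be expressed as a type plus a completeness check. This is a minimal sketch: the `Finding` shape and the `missingEvidence` helper are assumptions made for this example, not the official Critiq schema or API.

```typescript
// Hypothetical shape inferred from the fields listed above;
// not the official Critiq schema.
interface Finding {
  rule?: { id: string };
  locations?: { primary: { path: string; startLine: number; endLine: number } };
  severity?: "low" | "medium" | "high";
  confidence?: "low" | "medium" | "high";
  remediation?: { summary: string };
  references?: { kind: string; id?: string; title: string; url?: string }[];
}

// Returns the names of the evidence fields a finding does not carry.
function missingEvidence(f: Finding): string[] {
  const missing: string[] = [];
  if (!f.rule?.id) missing.push("rule");
  if (!f.locations?.primary.path) missing.push("location");
  if (!f.severity) missing.push("severity");
  if (!f.remediation?.summary) missing.push("remediation");
  if (!f.references || f.references.length === 0) missing.push("references");
  return missing;
}

// A "vibes" comment: a strong reaction and nothing else.
const vibes: Finding = { severity: "high" };

// A traced finding: every field a follow-up question would need.
const traced: Finding = {
  rule: { id: "security.no-sql-interpolation" },
  locations: { primary: { path: "src/db/users.ts", startLine: 42, endLine: 42 } },
  severity: "high",
  confidence: "high",
  remediation: { summary: "Use placeholder parameters instead of interpolated SQL." },
  references: [{ kind: "cwe", id: "CWE-89", title: "SQL Injection" }],
};

console.log(missingEvidence(vibes).join(", ")); // rule, location, remediation, references
console.log(missingEvidence(traced).length); // 0
```

A check like this could run over human-written review templates as well as tool output, since the question it asks is the same in both cases.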
Why teams need defensible judgments
Most engineering organizations already operate under implicit pressure to justify merges. Security champions, release managers, on-call engineers, and auditors all ask the same question in different words: why was this okay to ship?
Without evidence, review becomes a social process. The loudest reviewer wins. Waivers live in Slack. Identical patterns get blocked in one service and merged in another because no one wrote down the standard. That is fragile for teams of ten and worse for teams of a hundred.
Defensible judgments also shorten review cycles. When a finding includes a rule ID and references, disagreement moves to the right layer. Either the rule is wrong for your context (so you change the rule, disable it with documented intent, or fork a custom pack), or the code should change. You stop arguing about whether the comment "felt" right.
Critiq is built around that posture. The open source CLI runs locally, emits structured findings, and points at rules in @critiq/rules you can read in the repo. No hosted product is required to get traceable output in CI today. That is deliberate: evidence should not depend on a vendor dashboard you cannot inspect.
Walkthrough: one finding, fully traced
Imagine a TypeScript handler that builds a SQL string from request input and passes it to a raw query helper. A vibes comment might say "SQL injection risk, please fix." An evidence-backed finding names the rule, shows the matched expression, and links the standard.
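The handler pattern described here might look like the following sketch. The `db.raw` helper below is a hypothetical stub that only records what it was asked to run, and its `bindings` parameter is an assumption for illustration; a real database client will have its own placeholder syntax.

```typescript
// Hypothetical stand-in for a raw query helper: it records the SQL text and
// bindings instead of talking to a database.
type Query = { sql: string; bindings: unknown[] };

const db = {
  raw(sql: string, bindings: unknown[] = []): Query {
    return { sql, bindings };
  },
};

// The flagged pattern: request input becomes part of the SQL text.
function findUserInterpolated(id: string): Query {
  return db.raw(`SELECT * FROM users WHERE id = ${id}`);
}

// The remediation direction: SQL text stays constant, input travels as a binding.
function findUserParameterized(id: string): Query {
  return db.raw("SELECT * FROM users WHERE id = ?", [id]);
}

const hostile = "1 OR 1=1";
console.log(findUserInterpolated(hostile).sql); // SELECT * FROM users WHERE id = 1 OR 1=1
console.log(findUserParameterized(hostile).sql); // SELECT * FROM users WHERE id = ?
```

In the first function the attacker's text changes the shape of the query. In the second it can only ever be a value.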
Run critiq check locally (or in CI) with JSON output to see the structure:
    npx critiq check src/db/users.ts --format json

A simplified finding might look like this:
    {
      "findings": [
        {
          "rule": { "id": "security.no-sql-interpolation" },
          "title": "Avoid interpolated SQL in query call",
          "summary": "Request-driven values are concatenated into SQL text before execution.",
          "severity": "high",
          "confidence": "high",
          "locations": {
            "primary": {
              "path": "src/db/users.ts",
              "startLine": 42,
              "endLine": 42
            }
          },
          "evidence": [
            {
              "kind": "ast",
              "label": "matched-call",
              "excerpt": "db.raw(`SELECT * FROM users WHERE id = ${req.params.id}`)"
            }
          ],
          "remediation": {
            "summary": "Use prepared statements or placeholder parameters instead of executing interpolated SQL."
          },
          "references": [
            { "kind": "cwe", "id": "CWE-89", "title": "SQL Injection" },
            {
              "kind": "owasp",
              "title": "SQL Injection Prevention Cheat Sheet",
              "url": "https://cheatsheetseries.owasp.org/cheatsheets/SQL_Injection_Prevention_Cheat_Sheet.html"
            }
          ]
        }
      ],
      "summary": { "high": 1, "medium": 0, "low": 0 }
    }

Compare that to a vibes comment. Here the reviewer (human or tool) can point to security.no-sql-interpolation, open the rule YAML in the catalog, read the CWE-89 reference, and verify the AST excerpt matches the diff hunk. If the team uses parameterized queries in a wrapper the rule does not model yet, the debate is about the rule spec, not about whether someone sounded worried enough in the thread.
Severity and remediation turn the finding into work the author can schedule. High severity with a clear fix beats a vague "blocking" reaction that leaves the author guessing. References give security partners vocabulary they already use in assessments, so review lines up with how the organization talks about risk elsewhere.
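To make the contrast concrete, here is a minimal sketch that renders a finding shaped like the simplified JSON in this walkthrough as a plain-text review comment. The `toReviewComment` function and the comment layout are illustrative assumptions, not part of the Critiq CLI.

```typescript
// Types mirror the simplified finding in the walkthrough; treat them as
// an assumption, not the official schema.
interface Reference { kind: string; id?: string; title: string; url?: string }
interface Finding {
  rule: { id: string };
  title: string;
  severity: string;
  locations: { primary: { path: string; startLine: number; endLine: number } };
  remediation: { summary: string };
  references: Reference[];
}

// Renders the traceable fields in a fixed order so every comment answers
// the same questions: where, how severe, which rule, what to change, why.
function toReviewComment(f: Finding): string {
  const { path, startLine, endLine } = f.locations.primary;
  const range = startLine === endLine ? `${startLine}` : `${startLine}-${endLine}`;
  const refs = f.references.map((r) => r.id ?? r.title).join(", ");
  return [
    `${path}:${range} [${f.severity}] ${f.rule.id}`,
    f.title,
    `Fix: ${f.remediation.summary}`,
    `References: ${refs}`,
  ].join("\n");
}

const finding: Finding = {
  rule: { id: "security.no-sql-interpolation" },
  title: "Avoid interpolated SQL in query call",
  severity: "high",
  locations: { primary: { path: "src/db/users.ts", startLine: 42, endLine: 42 } },
  remediation: {
    summary: "Use prepared statements or placeholder parameters instead of executing interpolated SQL.",
  },
  references: [
    { kind: "cwe", id: "CWE-89", title: "SQL Injection" },
    { kind: "owasp", title: "SQL Injection Prevention Cheat Sheet" },
  ],
};

console.log(toReviewComment(finding));
// First line printed: src/db/users.ts:42 [high] security.no-sql-interpolation
```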
From repeated evidence to governance
One traced finding is useful. A pattern of traced findings is how teams build governance that lasts longer than a single PR.
When the same rule fires across services, you have signal: the catalog reflects a real recurring mistake, not a one-off nit. Teams can raise severity, add an org-specific tag, or document a waiver template for cases that genuinely differ. When the rule is too noisy, you adjust the match or scope with a versioned change, not an oral tradition that new hires never hear.
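One way to spot that signal is to group findings by rule ID across per-service reports. The sketch below assumes each service produces a list of findings with a `rule.id` field. The `ServiceReport` type, the service names, and the second rule ID are hypothetical and exist only for this example.

```typescript
type Finding = { rule: { id: string } };
type ServiceReport = { service: string; findings: Finding[] };

// Returns the rules that fired in at least `minServices` distinct services,
// mapped to the sorted list of services where they fired.
function recurringRules(
  reports: ServiceReport[],
  minServices: number,
): Record<string, string[]> {
  const servicesByRule: Record<string, string[]> = {};
  for (const report of reports) {
    for (const finding of report.findings) {
      const services = servicesByRule[finding.rule.id] ?? [];
      if (services.indexOf(report.service) === -1) services.push(report.service);
      servicesByRule[finding.rule.id] = services;
    }
  }
  const recurring: Record<string, string[]> = {};
  for (const ruleId of Object.keys(servicesByRule)) {
    if (servicesByRule[ruleId].length >= minServices) {
      recurring[ruleId] = servicesByRule[ruleId].slice().sort();
    }
  }
  return recurring;
}

// Hypothetical reports; "quality.example-rule" is a made-up ID.
const reports: ServiceReport[] = [
  { service: "billing", findings: [{ rule: { id: "security.no-sql-interpolation" } }] },
  {
    service: "accounts",
    findings: [
      { rule: { id: "security.no-sql-interpolation" } },
      { rule: { id: "quality.example-rule" } },
    ],
  },
  { service: "search", findings: [] },
];

console.log(JSON.stringify(recurringRules(reports, 2)));
// {"security.no-sql-interpolation":["accounts","billing"]}
```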
When human reviewers repeat a judgment the catalog does not capture, such as "we never call this API without a timeout", that is a candidate for a custom rule. Critiq rule packs are data: YAML with specs and fixtures. Promoting repeated review evidence into a rule means the next PR gets the same check automatically, with the same references and remediation text. Review memory becomes executable policy.
- Catalog rules cover common security and quality baselines with shared references
- Custom rules under .critiq/rules/ encode decisions your team already makes in review
- CI runs the same checks on every push so evidence is generated before merge, not after
- Waivers and overrides stay explicit because they attach to rule IDs, not forgotten comments
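The last point can be sketched as data. The example below assumes a team keeps waivers as records keyed by rule ID and path. The `Waiver` type and `applyWaivers` function are hypothetical and do not describe how Critiq itself stores or applies waivers.

```typescript
type Finding = { rule: { id: string }; path: string };
// Hypothetical waiver record: the reason is mandatory so intent is documented.
type Waiver = { ruleId: string; path: string; reason: string };

// Splits findings into those still requiring action and those covered by an
// explicit waiver, keeping the waiver reason next to the finding it covers.
function applyWaivers(findings: Finding[], waivers: Waiver[]) {
  const active: Finding[] = [];
  const waived: { finding: Finding; reason: string }[] = [];
  for (const finding of findings) {
    const match = waivers.filter(
      (w) => w.ruleId === finding.rule.id && w.path === finding.path,
    )[0];
    if (match) {
      waived.push({ finding, reason: match.reason });
    } else {
      active.push(finding);
    }
  }
  return { active, waived };
}

// Paths and the waiver reason are made up for illustration.
const findings: Finding[] = [
  { rule: { id: "security.no-sql-interpolation" }, path: "src/db/users.ts" },
  { rule: { id: "security.no-sql-interpolation" }, path: "scripts/migrate.ts" },
];
const waivers: Waiver[] = [
  {
    ruleId: "security.no-sql-interpolation",
    path: "scripts/migrate.ts",
    reason: "One-off migration script, no request input reaches this query.",
  },
];

const result = applyWaivers(findings, waivers);
console.log(result.active.length, result.waived.length); // 1 1
```

Because the waiver names a rule ID, the next reviewer can see exactly which check was set aside and why, rather than searching old threads.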
This is the durable governance half of evidence over vibes. Competitors often stop at comments. Critiq treats the review artifact as something you can compile forward into rules, CI gates, and metrics on what keeps failing, because each finding already carries the fields you need to aggregate.
What we are not claiming
Evidence over vibes is not "replace reviewers with automation." Humans still own context the static pass cannot see: product tradeoffs, rollout timing, and whether a pattern is dead code. Evidence makes that ownership easier by separating facts (what matched, where, under which rule) from decisions (merge, fix, waive).
It also does not mean every finding is correct. Rules can be wrong, incomplete, or mis-scoped. The win is that disagreement is inspectable: you edit a rule file and add a spec instead of fighting over phrasing in a comment thread.
Critiq does not ship hosted AI review on the public site today. This post is about the deterministic layer that already runs in the OSS CLI: rule-backed findings you can reproduce on your machine. When AI-assisted review arrives in the product, it should extend the same evidence graph (claims linked to rules, excerpts, and references), not replace it with fluent summaries alone.
Try it on your repo
Install the CLI, run a scan, and read the rules behind the output. Start with the getting started guide at https://docs.critiq.dev/getting-started, browse rule references at https://docs.critiq.dev/rules, and see how the OSS product page describes local and CI workflows at https://critiq.dev/products/oss.
    npm install -D @critiq/cli @critiq/rules
    npx critiq check . --format json
    critiq rules explain security.no-sql-interpolation

If a finding surprises you, open the rule YAML in the @critiq/rules catalog on GitHub. If it matches how your team already reviews, wire the same command into CI so every PR carries evidence, not vibes, into the merge decision.
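If you post-process the JSON report in CI, a gate can be a few lines. This is a minimal sketch assuming the report carries a `summary` object with `high`, `medium`, and `low` counts, as in the simplified example earlier. The `gateExitCode` helper and the `failOn` threshold are hypothetical, and the CLI may offer its own exit-code behavior that makes this unnecessary.

```typescript
type Summary = { high: number; medium: number; low: number };
type Severity = keyof Summary;

// Returns a process exit code: 1 when any finding at or above the
// `failOn` severity exists, otherwise 0.
function gateExitCode(summary: Summary, failOn: Severity): number {
  const order: Severity[] = ["low", "medium", "high"];
  // Every severity from the threshold upward blocks the merge.
  const blocking = order.slice(order.indexOf(failOn));
  const count = blocking.reduce((total, level) => total + summary[level], 0);
  return count > 0 ? 1 : 0;
}

console.log(gateExitCode({ high: 1, medium: 0, low: 0 }, "high")); // 1
console.log(gateExitCode({ high: 0, medium: 2, low: 5 }, "high")); // 0
console.log(gateExitCode({ high: 0, medium: 2, low: 5 }, "medium")); // 1
```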