Critiq is an open source code review toolchain. The public CLI (`@critiq/cli`) and rules catalog (`@critiq/rules`) scan your repository locally and return deterministic findings you can use in the terminal or CI.

Does Critiq send my code to the cloud?

The OSS CLI runs locally on your machine or in your CI runner. It analyzes files in your checkout and does not upload your source to Critiq by default. Hosted Critiq Cloud features are optional and separate from the open source CLI.

How is this different from AI review tools?

Critiq findings come from explicit, testable rules, not probabilistic model output. You can inspect each rule, reproduce results, and tune what runs in CI. That makes feedback consistent and auditable.

Is Critiq open source?

Yes. The core engine, public CLI, and OSS rules catalog are open source under the Critiq GitHub organization. Premium rule packs and hosted product features are separate offerings.


May 23, 2026 · Critiq team

ai
rules

What evidence over vibes means in code review

Evidence-based code review ties findings to a rule, line, severity, and references, not confident prose or AI vibes on the pull request.

Pull request review is where a team decides what ships. That decision should rest on something you can inspect, reproduce, and explain, not on how convincingly a comment reads. At Critiq we call that distinction evidence over vibes: feedback that ties to a rule, a location, a severity, and external references you can follow. The /blog/trust-gap-ai-assisted-coding post explains why that gap shows up when AI-assisted coding meets opaque PR bots.

This is not an argument against human judgment. Reviewers still decide what matters, what to fix now, and what to waive. The point is that the artifact on the PR should look more like a defensible engineering judgment than a chatbot opinion. When feedback is traceable, teams merge with confidence, audit with clarity, and turn repeated decisions into policy instead of re-litigating the same thread every sprint.

What vibes-based code review looks like

Vibes review is fluent but unverifiable. The comment sounds authoritative. It may even be correct. But you cannot answer basic follow-up questions without asking the author to reconstruct their reasoning from memory.

"This looks risky": no stated threat model, no cited pattern, no line anchor
"Consider using a safer approach here": no definition of safer, no link to a standard
"I would not ship this": no severity, no remediation path, no record for the next reviewer
"The AI flagged this": no rule, no evidence excerpt, no way to reproduce locally

Vibes feedback fails in three practical ways. First, it does not travel: the next reviewer cannot tell whether the concern was security, correctness, or style preference. Second, it does not scale: the same debate returns on every similar diff because nothing was captured as policy. Third, it is hard to defend under scrutiny, such as an incident review, a compliance audit, or a postmortem where someone asks why a change was approved.

Tools that optimize for readable comments can amplify the problem. A well-phrased paragraph feels like progress even when it omits the parts that make review actionable. Fluency is cheap. Traceability is work, and that work is what teams need when the merge button has real consequences.

What evidence-based code review means in practice

Evidence-backed review answers a fixed set of questions for every finding. What was detected? Why does it matter? Where in the code? How severe is it? What should change? What external standard supports the call? Critiq encodes those answers in deterministic output and in the rule catalog behind it.

Rule ID: a stable identifier such as security.no-sql-interpolation that maps to a readable rule file
Location: file path and line range so the finding is anchored in the diff
Severity and confidence: explicit prioritization instead of tone in a comment thread
Remediation: a concrete fix direction, not just disapproval
References: metadata.references entries pointing at CWE, OWASP, or other sources the team can inspect
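
The fields above can be sketched as a type plus a small completeness check. This is an illustration only: the field names follow the sample output shown in this post, the `missingEvidence` helper is hypothetical, and the actual Critiq schema may differ.

```typescript
// Assumed shape, modeled on the sample JSON in this post (not an official schema).
type Level = "low" | "medium" | "high";

interface Reference {
  kind: string;
  id?: string;
  title: string;
  url?: string;
}

interface Finding {
  rule: { id: string };
  severity: Level;
  confidence: Level;
  locations: { primary: { path: string; startLine: number; endLine: number } };
  remediation?: { summary: string };
  references?: Reference[];
}

// Hypothetical helper: lists which evidence fields a piece of feedback lacks.
function missingEvidence(f: Partial<Finding>): string[] {
  const missing: string[] = [];
  if (!f.rule?.id) missing.push("rule.id");
  if (!f.locations?.primary.path) missing.push("locations.primary");
  if (!f.severity) missing.push("severity");
  if (!f.confidence) missing.push("confidence");
  if (!f.remediation?.summary) missing.push("remediation");
  if (!f.references || f.references.length === 0) missing.push("references");
  return missing;
}

// A vibes comment carries a tone of urgency and little else.
const vibesComment: Partial<Finding> = { severity: "high" };

// A traced finding answers every question in the list.
const traced: Finding = {
  rule: { id: "security.no-sql-interpolation" },
  severity: "high",
  confidence: "high",
  locations: { primary: { path: "src/db/users.ts", startLine: 42, endLine: 42 } },
  remediation: { summary: "Use placeholder parameters instead of interpolated SQL." },
  references: [{ kind: "cwe", id: "CWE-89", title: "SQL Injection" }],
};

const gapsInVibes = missingEvidence(vibesComment);
const gapsInTraced = missingEvidence(traced);
```

Here `gapsInVibes` names everything a reader would have to ask the commenter for, while `gapsInTraced` is empty.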

The rule file is the contract. You can open it, read the match logic, run specs against fixtures, and decide whether the team agrees with the default severity. That inspectability is the opposite of vibes: the finding is a conclusion with a paper trail, not a personality on the PR.

Why teams need defensible judgments

Most engineering organizations already operate under implicit pressure to justify merges. Security champions, release managers, on-call engineers, and auditors all ask the same question in different words: why was this okay to ship?

Without evidence, review becomes a social process. The loudest reviewer wins. Waivers live in Slack. Identical patterns get blocked in one service and merged in another because no one wrote down the standard. That is fragile for teams of ten and worse for teams of a hundred.

Defensible judgments also shorten review cycles. When a finding includes a rule ID and references, disagreement moves to the right layer. Either the rule is wrong for your context (in which case you change the rule, disable it with documented intent, or fork a custom pack), or the rule is right and the code should change. You stop arguing about whether the comment "felt" right.

Critiq is built around that posture. The open source CLI (see /products/oss) runs locally, emits structured findings, and points at rules in @critiq/rules that you can read in the repo. No hosted product is required to get traceable output in CI today. That is deliberate: evidence should not depend on a vendor dashboard you cannot inspect.

Walkthrough: one finding, fully traced

Imagine a TypeScript handler that builds a SQL string from request input and passes it to a raw query helper. A vibes comment might say "SQL injection risk, please fix." An evidence-backed finding names the rule, shows the matched expression, and links the standard.
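
To make the scenario concrete, here is a minimal sketch of such a handler next to a parameterized version. The `Db` interface, the `db.raw` signature, and the `?` placeholder style are assumptions for illustration. Real drivers and query builders differ, and the stub below only records what it was asked to run.

```typescript
// Hypothetical raw-query helper. It records the call instead of hitting a database.
interface Db {
  raw(sql: string, params?: readonly unknown[]): { sql: string; params: readonly unknown[] };
}

const db: Db = {
  raw: (sql, params = []) => ({ sql, params }),
};

// Minimal stand-in for a framework request object (assumed shape).
interface Req {
  params: { id: string };
}

// The pattern the finding describes: request input becomes part of the SQL text.
function getUserUnsafe(req: Req) {
  return db.raw(`SELECT * FROM users WHERE id = ${req.params.id}`);
}

// The remediation direction: SQL text stays constant, input travels as a parameter.
function getUserSafe(req: Req) {
  return db.raw("SELECT * FROM users WHERE id = ?", [req.params.id]);
}

const hostile: Req = { params: { id: "1 OR 1=1" } };
const unsafeCall = getUserUnsafe(hostile);
const safeCall = getUserSafe(hostile);
```

With the hostile input, `unsafeCall.sql` contains the attacker's `OR 1=1` as SQL text, while `safeCall.sql` is unchanged and the input sits in `safeCall.params`.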

Run critiq check locally (or in CI) with JSON output to see the structure:

npx @critiq/cli check src/db/users.ts --format json

A simplified finding might look like this:

{
  "findings": [
    {
      "rule": { "id": "security.no-sql-interpolation" },
      "title": "Avoid interpolated SQL in query call",
      "summary": "Request-driven values are concatenated into SQL text before execution.",
      "severity": "high",
      "confidence": "high",
      "locations": {
        "primary": {
          "path": "src/db/users.ts",
          "startLine": 42,
          "endLine": 42
        }
      },
      "evidence": [
        {
          "kind": "ast",
          "label": "matched-call",
          "excerpt": "db.raw(`SELECT * FROM users WHERE id = ${req.params.id}`)"
        }
      ],
      "remediation": {
        "summary": "Use prepared statements or placeholder parameters instead of executing interpolated SQL."
      },
      "references": [
        { "kind": "cwe", "id": "CWE-89", "title": "SQL Injection" },
        {
          "kind": "owasp",
          "title": "SQL Injection Prevention Cheat Sheet",
          "url": "https://cheatsheetseries.owasp.org/cheatsheets/SQL_Injection_Prevention_Cheat_Sheet.html"
        }
      ]
    }
  ],
  "summary": { "high": 1, "medium": 0, "low": 0 }
}

Compare that to a vibes comment. Here the reviewer (human or tool) can point to security.no-sql-interpolation, open the rule YAML in the catalog, read the CWE-89 reference, and verify the AST excerpt matches the diff hunk. If the team uses parameterized queries in a wrapper the rule does not model yet, the debate is about the rule spec, not about whether someone sounded worried enough in the thread.
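
Checking that an excerpt matches the code at the reported location is mechanical enough to script. The sketch below assumes line numbers are 1-indexed and inclusive, and that comparing whitespace-normalized text is good enough. Both are assumptions about the output, and `excerptMatchesSource` is a hypothetical helper, not part of the CLI.

```typescript
// Assumed shape of a finding's primary location.
interface FindingLocation {
  path: string;
  startLine: number;
  endLine: number;
}

// True when the excerpt appears inside the reported line range of the source text.
function excerptMatchesSource(source: string, loc: FindingLocation, excerpt: string): boolean {
  // Collapse runs of whitespace so indentation differences do not matter.
  const collapse = (s: string): string => s.replace(/\s+/g, " ").trim();
  // Lines are treated as 1-indexed and inclusive (assumption).
  const span = source.split("\n").slice(loc.startLine - 1, loc.endLine).join("\n");
  return collapse(span).includes(collapse(excerpt));
}

// A three-line stand-in for src/db/users.ts. The sample finding points at line 42,
// so the location is shifted to line 2 for this toy file.
const fileText = [
  "export function getUser(req: Req) {",
  "  return db.raw(`SELECT * FROM users WHERE id = ${req.params.id}`);",
  "}",
].join("\n");

const excerpt = "db.raw(`SELECT * FROM users WHERE id = ${req.params.id}`)";

const onReportedLine = excerptMatchesSource(
  fileText,
  { path: "src/db/users.ts", startLine: 2, endLine: 2 },
  excerpt,
);
const onWrongLine = excerptMatchesSource(
  fileText,
  { path: "src/db/users.ts", startLine: 1, endLine: 1 },
  excerpt,
);
```

A mismatch like `onWrongLine` is itself useful evidence: either the file changed since the scan or the finding deserves a second look.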

Severity and remediation turn the finding into work the author can schedule. High severity with a clear fix beats a vague "blocking" reaction that leaves the author guessing. References give security partners vocabulary they already use in assessments, so review lines up with how the organization talks about risk elsewhere.
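
Because severity is a field rather than a tone, a merge gate can be a few lines of code. The following is a sketch under assumptions: it reads only the `findings[].severity` and `rule.id` fields from the sample output, treats the levels as low, medium, and high, and uses a hypothetical `blockingFindings` helper. The second rule ID is made up for the example.

```typescript
type Severity = "low" | "medium" | "high";

interface ReportFinding {
  rule: { id: string };
  severity: Severity;
}

interface ScanReport {
  findings: ReportFinding[];
}

// Higher number means more severe.
const severityRank: Record<Severity, number> = { low: 0, medium: 1, high: 2 };

// Returns the findings at or above the chosen threshold.
function blockingFindings(report: ScanReport, threshold: Severity): ReportFinding[] {
  return report.findings.filter(
    (f) => severityRank[f.severity] >= severityRank[threshold],
  );
}

// Trimmed copy of the sample output, plus one hypothetical low-severity finding.
const scanReport = JSON.parse(`{
  "findings": [
    { "rule": { "id": "security.no-sql-interpolation" }, "severity": "high" },
    { "rule": { "id": "example.hypothetical-style-rule" }, "severity": "low" }
  ]
}`) as ScanReport;

const blockers = blockingFindings(scanReport, "high");
// A CI wrapper could exit with this code so only high-severity findings block the merge.
const exitCode = blockers.length > 0 ? 1 : 0;
```

The threshold is a team decision. The point is that it is written down once instead of being renegotiated in each thread.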

From repeated evidence to governance

One traced finding is useful. A pattern of traced findings is how teams build governance that lasts longer than a single PR.

When the same rule fires across services, you have signal: the catalog reflects a real recurring mistake, not a one-off nit. Teams can raise severity, add an org-specific tag, or document a waiver template for cases that genuinely differ. When the rule is too noisy, you adjust the match or scope with a versioned change, not an oral tradition that new hires never hear.
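
Spotting a rule that fires across services is a small aggregation once findings carry rule IDs. In this sketch the service names, the second rule ID, and the `recurringRules` helper are all hypothetical. It assumes you have already collected findings from each service's scan.

```typescript
// One row per finding, tagged with the service it came from (assumed shape).
interface ServiceFinding {
  service: string;
  ruleId: string;
}

// Returns rule IDs that fired in at least `minServices` distinct services, sorted by ID.
function recurringRules(findings: ServiceFinding[], minServices: number): string[] {
  const servicesByRule = new Map<string, Set<string>>();
  for (const f of findings) {
    const services = servicesByRule.get(f.ruleId) ?? new Set<string>();
    services.add(f.service);
    servicesByRule.set(f.ruleId, services);
  }
  return [...servicesByRule.entries()]
    .filter(([, services]) => services.size >= minServices)
    .map(([ruleId]) => ruleId)
    .sort();
}

const collected: ServiceFinding[] = [
  { service: "billing", ruleId: "security.no-sql-interpolation" },
  { service: "billing", ruleId: "security.no-sql-interpolation" }, // same service, counted once
  { service: "accounts", ruleId: "security.no-sql-interpolation" },
  { service: "search", ruleId: "security.no-sql-interpolation" },
  { service: "billing", ruleId: "example.hypothetical-one-off" },
];

const recurring = recurringRules(collected, 2);
```

Counting distinct services rather than raw hits keeps one noisy file from looking like an organization-wide pattern.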

When human reviewers repeat a judgment the catalog does not capture, such as "we never call this API without a timeout", that is a candidate for a custom rule. Critiq rule packs are data: YAML with specs and fixtures. Promoting repeated review evidence into a rule means the next PR gets the same check automatically, with the same references and remediation text. Review memory becomes executable policy.

Catalog rules cover common security and quality baselines with shared references
Custom rules under .critiq/rules/ encode decisions your team already makes in review
CI runs the same checks on every push so evidence is generated before merge, not after
Waivers and overrides stay explicit because they attach to rule IDs, not forgotten comments
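
One way to picture waivers that attach to rule IDs is as plain records matched against findings. The `Waiver` shape, the file paths, and the `applyWaivers` helper below are hypothetical and only illustrate the idea. They do not describe how Critiq stores or applies waivers.

```typescript
// Hypothetical waiver record: it names the rule, the file, and the documented reason.
interface Waiver {
  ruleId: string;
  path: string;
  reason: string;
}

// Flattened finding with just the fields needed for matching (assumed shape).
interface FlatFinding {
  ruleId: string;
  path: string;
}

interface Triage {
  active: FlatFinding[];
  waived: { finding: FlatFinding; waiver: Waiver }[];
}

// Splits findings into those still requiring action and those covered by a waiver.
function applyWaivers(findings: FlatFinding[], waivers: Waiver[]): Triage {
  const triage: Triage = { active: [], waived: [] };
  for (const finding of findings) {
    const waiver = waivers.find(
      (w) => w.ruleId === finding.ruleId && w.path === finding.path,
    );
    if (waiver) {
      triage.waived.push({ finding, waiver });
    } else {
      triage.active.push(finding);
    }
  }
  return triage;
}

const waivers: Waiver[] = [
  {
    ruleId: "security.no-sql-interpolation",
    path: "src/db/legacy-report.ts",
    reason: "Query text is built from a fixed allowlist; tracked for rewrite.",
  },
];

const triage = applyWaivers(
  [
    { ruleId: "security.no-sql-interpolation", path: "src/db/users.ts" },
    { ruleId: "security.no-sql-interpolation", path: "src/db/legacy-report.ts" },
  ],
  waivers,
);
```

Each waived finding keeps its reason next to the rule ID, so the next reviewer can see what was accepted and why.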

This is the durable governance half of evidence over vibes. Competitors often stop at comments. Critiq treats the review artifact as something you can compile forward into rules, CI gates, and metrics on what keeps failing, because each finding already carries the fields you need to aggregate.

What we are not claiming

Evidence over vibes is not "replace reviewers with automation." Humans still own context the static pass cannot see: product tradeoffs, rollout timing, and whether a pattern is dead code. Evidence makes that ownership easier by separating facts (what matched, where, under which rule) from decisions (merge, fix, waive).

It also does not mean every finding is correct. Rules can be wrong, incomplete, or mis-scoped. The win is that disagreement is inspectable: you edit a rule file and add a spec instead of fighting over phrasing in a comment thread.

Critiq does not ship hosted AI review on the public site today. This post is about the deterministic layer that already runs in the OSS CLI: rule-backed findings you can reproduce on your machine. When AI-assisted review arrives in the product, it should extend the same evidence graph (claims linked to rules, excerpts, and references), not replace it with fluent summaries alone.

Try it on your repo

Install the CLI, run a scan, and read the rules behind the output. Start with the getting started guide at https://docs.critiq.dev/getting-started, browse rule references at https://docs.critiq.dev/rules, and see how the OSS product page describes local and CI workflows at https://critiq.dev/products/oss.

npm install -D @critiq/cli @critiq/rules
npx @critiq/cli check . --format json
npx @critiq/cli rules explain security.no-sql-interpolation

If a finding surprises you, open the rule YAML in the @critiq/rules catalog on GitHub. If it matches how your team already reviews, wire the same command into CI so every PR carries evidence, not vibes, into the merge decision.