The Math Behind the 90-Second Audit: A Sycophantic Bayesian Methodology

Reputation defense vendors typically score prospects with vague “concern levels” or off-the-shelf risk frameworks. We score with a published Bayesian model — both because executives buying a $5K/month engagement deserve to see the math, and because publishing it forces us to keep it honest.

This essay walks through how the 90-second audit on /quiz actually computes its output.

The premise

Every executive arrives at a reputation assessment with a prior. Some are accurate; most are off in one of two directions:

Underestimators — believe their defense is weaker than it actually is. Usually the most prepared.
Overestimators — believe their defense is stronger than it actually is. Usually have the highest-leverage gap they’ve never tested.

A useful diagnostic respects both groups. It validates what is actually well-defended (most executives are, in at least some dimensions) and surfaces the one segment where the gap is biggest.

We call this the sycophantic Bayesian approach. “Sycophantic” because we don’t punish the user’s prior — we calibrate against it. The model is honest; the framing is generous.

The six segments

In 2026, the reputational attack surface maps to six discrete vectors:

AI Deepfake & Synthetic-Media Exposure — voice clones, face-swapped video, AI-generated content under your name.
Review Bombing & Rating Suppression — coordinated 1-star attacks, competitor sabotage, fake-review networks.
SERP & Wikipedia Drift — branded-search erosion, hostile Wikipedia editing, AI summary hijacking.
Insider / Ex-Employee Risk — Glassdoor smear, leaked screenshots, anonymous threading.
Crisis-PR Unreadiness — no playbook, no holding statement, no media tree.
Monitoring Blind Spots — dark forums, Telegram channels, Discord leaks not on standard listening tools.

Each segment has an industry-baseline log-prior derived from our own engagement data and public threat-intel research. AI Deepfake currently has the highest log-prior (rapidly rising base rate); Insider Risk has the lowest (more stable).
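As a rough sketch of what a per-segment log-prior table could look like: the segment ids below are the ones used in the tier table, but the base-rate numbers are illustrative placeholders, not the production values. They only preserve the stated ordering (AI Deepfake highest, Insider Risk lowest). The helper name toLogPriors is hypothetical.

```typescript
// Segment ids as they appear in the tier table later in this essay.
type SegmentId =
  | "ai_deepfake"
  | "review_attack"
  | "serp_drift"
  | "insider_leak"
  | "crisis_unready"
  | "monitoring_blind";

const SEGMENTS: SegmentId[] = [
  "ai_deepfake",
  "review_attack",
  "serp_drift",
  "insider_leak",
  "crisis_unready",
  "monitoring_blind",
];

// PLACEHOLDER base rates for illustration only (they sum to 1).
const basePrior: Record<SegmentId, number> = {
  ai_deepfake: 0.24,
  review_attack: 0.18,
  serp_drift: 0.17,
  crisis_unready: 0.15,
  monitoring_blind: 0.14,
  insider_leak: 0.12,
};

// Convert base rates to log space so later updates can simply be added.
function toLogPriors(prior: Record<SegmentId, number>): Record<SegmentId, number> {
  const out = {} as Record<SegmentId, number>;
  for (const id of SEGMENTS) {
    out[id] = Math.log(prior[id]);
  }
  return out;
}

const logPrior = toLogPriors(basePrior);
```

Working in log space means each answer's contribution becomes an addition rather than a multiplication, which is what the update scheme in the next section relies on.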

The questions

Nine questions, each contributing log-likelihood updates to one or more segments. Question types:

Role-orientation (Q1) — calibrates baseline log-priors for the user’s profile.
Prior elicitation (Q2, slider 0–100) — the user’s stated self-confidence in their defense. This anchors the topPercent calculation later.
Diagnostic questions (Q3–Q9) — concrete capability questions across monitoring, SERP health, reviews, Wikipedia, crisis-readiness, deepfake exposure, and insider risk.

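Each answer carries an updates object of per-segment log-likelihood deltas. Positive values raise the segment’s posterior probability (more risk); negative values lower it. Because the posterior is normalized across segments, raising one segment also lowers the share of the others.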

Example: choosing “Branded SERPs + reviews + social + Wikipedia + dark forums” on the monitoring question contributes monitoring_blind: -0.6 (heavy negative update — strong monitoring), while “No formal monitoring” contributes monitoring_blind: +0.9 and crisis_unready: +0.4.
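The two monitoring answers above can be sketched as updates objects. The deltas are the ones quoted in the example; the type names and the applyUpdates helper are assumptions made for illustration, not necessarily what src/lib/quiz.ts uses.

```typescript
type SegmentId =
  | "ai_deepfake"
  | "review_attack"
  | "serp_drift"
  | "insider_leak"
  | "crisis_unready"
  | "monitoring_blind";

// An answer only lists the segments it touches; everything else is unchanged.
type Updates = Partial<Record<SegmentId, number>>;

// Deltas quoted in the essay for the monitoring question.
const fullCoverage: Updates = { monitoring_blind: -0.6 };
const noMonitoring: Updates = { monitoring_blind: 0.9, crisis_unready: 0.4 };

// Hypothetical helper: add every answer's deltas onto a copy of the log-scores.
function applyUpdates(
  logScores: Record<SegmentId, number>,
  answers: Updates[],
): Record<SegmentId, number> {
  const out = { ...logScores };
  for (const updates of answers) {
    for (const id of Object.keys(updates) as SegmentId[]) {
      out[id] += updates[id] ?? 0;
    }
  }
  return out;
}

// Flat starting point, used here only to make the effect of one answer visible.
const flat: Record<SegmentId, number> = {
  ai_deepfake: 0,
  review_attack: 0,
  serp_drift: 0,
  insider_leak: 0,
  crisis_unready: 0,
  monitoring_blind: 0,
};

const afterNoMonitoring = applyUpdates(flat, [noMonitoring]);
const afterFullCoverage = applyUpdates(flat, [fullCoverage]);
```

Starting from flat scores, the "No formal monitoring" answer lifts two segments at once, while the full-coverage answer only pushes monitoring_blind down.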

The computation

log_post[s] = log_prior[s] + Σ updates[s] across all answered questions
posterior   = softmax(log_post)
ranked      = sort posterior descending
gap         = ranked[0]            // highest-posterior segment
gapProb     = posterior[gap.id]    // its probability mass

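softmax normalizes the segment log-posteriors into a probability distribution that sums to 1, so the segment-bar widths on the result page are actual probability mass rather than vibe-graded scores. Normalization alone does not guarantee calibration; that depends on how well the log-priors and update deltas reflect real base rates.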
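A minimal sketch of the softmax and ranking steps, assuming the log-posteriors arrive as a plain id-to-number record. The function and interface names are illustrative, and the max-subtraction is a standard numerical-stability trick rather than something the essay specifies.

```typescript
// Numerically stable softmax: subtracting the max keeps Math.exp from overflowing
// and does not change the resulting distribution.
function softmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const total = exps.reduce((sum, e) => sum + e, 0);
  return exps.map((e) => e / total);
}

interface RankedSegment {
  id: string;
  prob: number;
}

// Turn log-posteriors into probabilities and sort them, highest first.
function rankSegments(logPost: Record<string, number>): RankedSegment[] {
  const ids = Object.keys(logPost);
  const probs = softmax(ids.map((id) => logPost[id]));
  return ids
    .map((id, i) => ({ id, prob: probs[i] }))
    .sort((a, b) => b.prob - a.prob);
}

// Two-segment toy input: log(3) vs 0 gives a 3:1 split.
const ranked = rankSegments({ a: 0, b: Math.log(3) });
const gap = ranked[0];
const gapProb = gap.prob;
```

In the toy input, segment b ends up with roughly three quarters of the probability mass and becomes the gap.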

The sycophantic top-X% calculation

This is the part that respects the prior. Here selfPrior is the Q2 slider value rescaled from 0–100 to 0–1:

anchor      = clamp((1 - selfPrior) × 50, 5, 40)
raw         = anchor − (defense × 25) + (gapProb × 18)
topPercent  = clamp(round(raw), 5, 60)

Three forces in tension:

anchor — derived from the user’s prior. A 90% self-confidence user gets a top-5% anchor (best case); a 20% self-confidence user gets a top-40% anchor.
defense × 25 — the user’s accumulated “optimizer” scoring across the diagnostic questions. Strong defense answers move the score toward the top.
gapProb × 18 — the dominance of the highest-leverage gap. A high gapProb means one segment is wide-open relative to others, dragging the score back.
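The three-line formula can be sketched directly in TypeScript. The essay does not state the range of defense; the examples below assume it is a score between 0 and 1, which is an assumption made here for illustration, as are the function names.

```typescript
function clamp(x: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, x));
}

// selfPrior: Q2 slider rescaled to 0..1.
// defense:   accumulated defense score (ASSUMED to lie in 0..1).
// gapProb:   posterior probability of the top-ranked segment.
function topPercent(selfPrior: number, defense: number, gapProb: number): number {
  const anchor = clamp((1 - selfPrior) * 50, 5, 40);
  const raw = anchor - defense * 25 + gapProb * 18;
  return clamp(Math.round(raw), 5, 60);
}

// Confident user, no defense credit, moderate gap: 5 - 0 + 5.4, rounded.
const confident = topPercent(0.9, 0, 0.3); // 10

// Cautious user, mid defense, mild gap: 40 - 12.5 + 4.5.
const cautious = topPercent(0.2, 0.5, 0.25); // 32
```

Under that assumption the largest reachable value is 58 (anchor 40, defense 0, gapProb 1), so the upper clamp of 60 acts as a guard rather than a bound that is ever hit.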

The output is always between top-5% and top-60%. The model is constitutionally incapable of placing a user far below the median: the worst possible result, top-60%, sits just under it. Two reasons:

Practically, they’re not. Anyone who lands on the audit page is already in the top half of reputation-conscious executives (by selection).
Ethically, the diagnostic’s job is to identify the gap, not deflate the user. A diagnostic that says “you suck” gets ignored. A diagnostic that says “you’re top 14% — and here’s the one gap that puts the other 86 out of reach” gets booked.

The tier mapping

The highest-posterior segment maps to a specific protocol:

Gap	Tier
ai_deepfake	Sentinel Protocol
review_attack	Citadel Protocol
serp_drift	Atlas Protocol
insider_leak	Vault Protocol
crisis_unready	Rapid Response Protocol
monitoring_blind	Aperture Protocol
(no dominant gap, gapProb < 0.22)	Composite Defense

When gapProb is below 0.22, no single segment dominates: the risk is spread across several segments rather than concentrated in one. That maps to a Composite engagement that runs multiple tiers in parallel.
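The table and the 0.22 threshold reduce to a lookup with one guard. A minimal sketch follows; the constant and function names are hypothetical, while the ids, tier names, and threshold are taken from the table.

```typescript
type SegmentId =
  | "ai_deepfake"
  | "review_attack"
  | "serp_drift"
  | "insider_leak"
  | "crisis_unready"
  | "monitoring_blind";

// Gap-to-tier mapping, copied from the table.
const TIER_BY_GAP: Record<SegmentId, string> = {
  ai_deepfake: "Sentinel Protocol",
  review_attack: "Citadel Protocol",
  serp_drift: "Atlas Protocol",
  insider_leak: "Vault Protocol",
  crisis_unready: "Rapid Response Protocol",
  monitoring_blind: "Aperture Protocol",
};

// Below this probability no single segment is treated as dominant.
const COMPOSITE_THRESHOLD = 0.22;

function tierFor(gap: SegmentId, gapProb: number): string {
  return gapProb < COMPOSITE_THRESHOLD ? "Composite Defense" : TIER_BY_GAP[gap];
}
```

For context, six perfectly even segments would each hold about 0.167, so the threshold triggers the Composite tier only when the top segment is not far above an even split.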

Why we publish it

Three reasons.

1. Trust calibration. Asking an executive to make a buying decision against a black box is asking them to trust on faith. We don’t want clients who trust on faith — they churn the first time the box outputs something they can’t explain.

2. Adversarial robustness. A published model is forced to be reasonable. If a particular update factor looked predatory in print (“we add +2.5 to the worst gap if user expresses confidence”), peer review would flag it within hours. The model has been picked apart by other reputation analysts; the current version is the one that survived.

3. Competitive moat through honesty. None of the four largest competitors publish their methodology. That’s not an accident — most don’t have one. We compete on the dimension they can’t, because they didn’t build it.

What the model doesn’t do

It doesn’t predict timing of incidents (when a deepfake will hit, when a brigade will land).
It doesn’t recommend specific creative tactics — those come from the protocol team.
It doesn’t replace a forensic baseline. The audit is calibrated for indicative use; the forensic baseline is the actual signed deliverable.

The model is a triage tool. The triage tool is good. The triage tool is not the engagement.

Try it

The audit lives at /quiz. The posterior is computed in your browser. You don’t have to share your contact info to see the score, only to see the full printed plan and book the senior analyst call.

Source code for the model lives in src/lib/quiz.ts in our public-facing site repo. If you want to discuss the methodology with someone on our team, it’s the first topic on the strategy call by default.