Methodology · technical reference

How Jelleo finds bugs before they become exploits.

A four-pillar autonomous audit platform for on-chain smart contracts — Solana (BPF / Rust) and Aptos (Move) today, with the same loop architecture extending to Solidity and C. This page is the public technical reference: architecture, hypothesis scoping, propagation, severity, attestation, reporting, and lifecycle. The same loop that produced F7 against Percolator (April 2026, Solana) also confirmed 2 Critical findings against the Mutatis Aptos benchmark (May 2026, Move).

v0.1 deployed F7 disclosed · Percolator PR #39 Solana + Aptos live Apache-2.0 licensed

The four pillars · the autonomous immune system

Jelleo composes four interlocking pillars. Each ships in v0.1 as a CLI command and scales in the funded build. No platform today combines all four.

P1 · detection

Counterfactual mainnet detection

Per-tx parallel simulation. State divergence between simulated and observed mainnet flags the tx within seconds of slot finality.

v0.1: shadow (commit-triggered). Funded: live mainnet ingestion + sub-slot replay + on-call routing.
P2 · propagation

Cross-protocol bug-class propagation

When a bug confirms anywhere, Jelleo extracts the structural pattern and searches every indexed protocol — vulnerable instances flagged in minutes.

v0.1: propagate (15-protocol corpus). Funded: multi-protocol indexing + disclosure-feed integrations.
P3 · fix bundle

Closed-loop fix bundle

Confirmed bug → agent-generated fix → Kani proves preservation in the bug's neighborhood → tests → opened as a single PR.

v0.1: confirm + synth-kani scaffold. Funded: end-to-end MVP at the 90-day Tranche 2 trigger.
P4 · attestation

On-chain attestation registry

Every cycle publishes a signed Merkle root: which invariants were verified at which commit SHA — on-chain, queryable, composable.

v0.1: Ed25519 signing off-chain. Funded: Anchor program on devnet day 90, mainnet day 180 post-audit.

How the pillars compose during a hunt cycle

Every cycle walks the same sequence. Pillar P1 watches mainnet for divergence; on a confirmed bug from any pillar, P2 propagates the pattern across the corpus; P3 closes the loop with a verified fix bundle; P4 records the cycle's verified invariants on-chain. The pillars are not phases — they run continuously, in parallel, and feed one another.

/ Step 01 · P1

Detect

Per-tx parallel simulation flags state divergence on mainnet within seconds of slot finality. Source: shadow command.

/ Step 02 · P2

Propagate

Confirmed bug-class signature is dispatched against every protocol in the indexed corpus. Vulnerable instances flagged in minutes.

/ Step 03 · P3

Fix bundle

Agent-authored fix → Kani proves preservation → cargo tests pass → opened as a single PR. Maintainer-ready, signed, traceable.

/ Step 04 · P4

Attest

Per-cycle Ed25519-signed Merkle root: which invariants verified, at which commit SHA. Off-chain today, on-chain in the funded build.

Feedback loop: every confirmed finding feeds back into the hypothesis library — sibling claims auto-derive, the library compounds, the next cycle's coverage is broader. The pillars run continuously and in parallel, not as phases.

Shift-left distribution

The same bug-class work also lands in your developers' IDEs. Jelleo maintains a public Solana Security Guidance ruleset for Anthropic's Claude Code security-guidance plugin — 20 rules total, headline rule SOL-001 covers two confirmed-exploitable bounty wins. Drop two files into .claude/ and the plugin flags Solana-specific bug classes as your team writes code. MIT licensed at github.com/Copenhagen0x/solana-security-guidance.

How Jelleo fits inside STRIDE.

Solana Foundation launched STRIDE on April 6, 2026 with Asymmetric Research — a tiered, foundation-funded program covering eight security pillars across the ecosystem. STRIDE is the framework. Jelleo is one of multiple artifact-producing vendors that STRIDE-aligned protocols use to satisfy the smart-contract integrity pillar at depth.

STRIDE's eight pillars span both opsec/threat-monitoring (Tier 1 surface) and smart-contract integrity (Tier 1 + Tier 2 deep verification). Jelleo's four pillars produce the artifacts that map onto the smart-contract integrity surface:

STRIDE pillar (SF program)Jelleo's contributionArtifact
Smart-contract integrity — the depth pillar STRIDE evaluators assess Continuous hypothesis-driven verification with Kani proofs and signed attestations Per-cycle signed Merkle root + per-finding signed disclosure package (P4)
Adversarial detection Counterfactual mainnet sim flags state-divergent txs (P1) Shadow-audit alerts
Coordinated disclosure Closed-loop fix bundle delivers maintainer-ready PRs (P3) GitHub PR with Kani proof + cargo test
Cross-protocol monitoring Bug-class propagation across the indexed corpus (P2) Cross-protocol propagation report

Jelleo does not compete with STRIDE — it is complementary. STRIDE evaluates programs holistically; Jelleo produces the depth artifacts that protocols need to demonstrate the smart-contract integrity pillar at higher tiers. Asymmetric Research is the program operator for STRIDE; Jelleo is one of the artifact-producing vendors operating inside it.

Year-1 OKR: open an artifact-recognition channel with STRIDE assessors so Jelleo's signed attestations are accepted directly as evidence for the smart-contract integrity pillar. (Goal, not commitment — depends on STRIDE program governance.)

Hypothesis library scoping

The hypothesis library is the platform's source of truth for "what to test." It compounds with every confirmed finding — meaning at scale (27 protocols, 1,500+ hypotheses) precision matters. Each hypothesis must declare exactly which protocols it applies to, under what conditions, and which generalized bug class it represents. Below is the schema.

Schema · per-hypothesis frontmatter

Every hypothesis is a YAML entry under hypotheses: in a hypotheses.yaml file (per workspace, per target). The required and optional fields:

id
Unique short identifier (e.g. H1-residual-conservation). Stable across cycles. Used as the cross-DB key.
class
One of: invariant_property, state_transition, authorization, arithmetic_overflow, implicit_invariant. Determines which Layer-2 / Layer-3 dispatch path the hunt loop selects.
claim
The falsifiable claim in plain English. Phrased so a clean negative result strengthens the disclosure.
severity
One of: Critical, High, Medium, Low, Info. Sets the floor on auto-derived severity (see §05).
applies_to (scoping field)
List of protocol names this hypothesis applies to. Used by the loader to filter which hypotheses run against which target. Wildcard ['*'] matches all protocols. Example: [percolator, drift, mango].
scope_conditions (optional)
Predicate(s) that must be true for the hypothesis to be relevant. E.g. has_insurance_pool, uses_pyth_oracle, perpetual_funding.
bug_class (propagation field)
Generalized class identifier for cross-protocol propagation (§04). When this hypothesis confirms TRUE on protocol A, Jelleo dispatches the same generalized class against every other protocol where applies_to matches.
target_file, relevant_constants, relevant_instructions
Anchor fields used by the recon agent to focus its initial reading window. Loader does not filter on them.

Example · F7 hypothesis with full scoping

hypotheses.yaml · F7
- id: H1-residual-conservation
  class: invariant_property
  claim: >
    The post-haircut residual cash on a market
    (vault - cash_locked_in_orderbook - claimable_pnl - insurance_counter)
    is conserved by every internal accounting helper. If any helper shrinks
    the insurance counter, it MUST also debit the vault by the same amount.
  severity: Critical
  applies_to: [percolator, drift, mango, marginfi]
  scope_conditions: [has_insurance_pool, has_haircut_accounting]
  bug_class: insurance-counter-vault-divergence    # the propagation key
  target_file: src/percolator.rs

Loader semantics

When audit-pipeline hunt runs against target T with conditions C (derived from workspace.json), the loader applies a 3-step filter to every hypothesis in hypotheses.yaml:

  1. applies_to filter: hypothesis loads only if T ∈ applies_to or '*' ∈ applies_to.
  2. scope_conditions filter: hypothesis loads only if every predicate evaluates true under target conditions.
  3. severity floor: if invoked with --min-severity High, hypotheses below that floor are skipped.

The result: at scale, a target only ever sees the hypotheses that actually apply to it. The library can grow to thousands of entries without any single cycle being polluted by irrelevant claims.

Why this matters operationally. Loader-level filtering is what makes the library safe to expand. Without scoping, a perp-DEX-specific claim ("funding rate captured before mark mutation") would be dispatched against an AMM, where the recon agent burns budget proving the claim trivially holds because there is no funding rate at all. Scoping catches this at filter time — no agent, no token spend, no false-negative noise in the report.

Engineering reference: the canonical schema spec — including the full predicate vocabulary, validation rules, and migration plan — lives at docs/HYPOTHESIS_SCHEMA.md in the platform repo. The JSON Schema at src/audit_pipeline/schemas/hypothesis.schema.json is consumed by the loader and by editor tooling.

Auto-learning across the cluster.

Bug classes recur. F7's "shrink insurance counter without debiting the vault" pattern probably exists in any protocol with insurance accounting; CatchupAccrue's "advance the clock without touching the accounts" pattern probably exists in any protocol with multi-instruction settlement. Jelleo's propagation engine is how we find them — every customer benefits from every other customer's findings.

Trigger

Whenever a hypothesis H on protocol A moves to status confirmed (PoC fired or debate-promoted), the propagation engine fires. The engine takes H's bug_class, looks up every protocol in the corpus whose applies_to includes H's class generalization, and dispatches the same hypothesis class — in parallel — against every matching protocol.

Pipeline

propagation flow
// the moment a finding moves to status=confirmed, this loop fires
1. Confirmed finding on protocol A · cycle K
2. Extract bug_class signature (regex set + structural pattern)
3. Walk the corpus (15+ protocols, indexed)
4. For each candidate file in each protocol:
     - signature match score = count(distinct signatures hit)
     - if score >= MIN_SCORE: emit candidate
5. Top candidates dispatched to Layer 1 (recon agent)
6. Layer 1 verdicts feed back into the per-protocol findings DB
7. Customer of every affected protocol gets:
     - email notification (immediate, on confirmed propagation)
     - dashboard update
     - 24h report row
8. Cluster-level metric: bug-class hit rate, time-to-detect, false-positive rate

Customer notification on cross-protocol re-test

If your protocol gets re-tested because a bug class confirmed on someone else's protocol, you receive an immediate email of the form:

email · cross-protocol re-test
Subject: [Jelleo] Cross-protocol re-test · <your-protocol> · bug_class=X

We confirmed bug class <X> on <other-protocol> on <date>.
We re-tested <your-protocol> against the same class.

Result: PASS · <n> hypotheses dispatched, all green
        OR
        REVIEW · <n> candidates flagged for Layer-2 escalation

Full report: https://jelleo.com/reports/<your-protocol>/<cycle-id>

This is the "auto-learning" property: the cluster's collective coverage compounds. Every confirmed bug class adds permanent re-test coverage across every applicable protocol, forever.

Implementation

The corpus is initialized once with audit-pipeline propagate init-corpus (15 popular Solana programs by default — Drift, Mango, Marginfi, Kamino, Phoenix, OpenBook, Orca, Meteora, Raydium, Marinade, Anchor, SPL, plus the engine under audit). Each protocol is shallow-cloned for fast indexing. The search itself is audit-pipeline propagate search -c <corpus> -s '<sig1>' -s '<sig2>', with intentionally simple regex matching to keep the indexer fast and language-agnostic. Top hits escalate to a Layer-1 agent for deep verification.

Five tiers · formal definitions

Severity is what the customer sees first. The rubric is stable, auditable, and consistent across cycles. Every finding gets a tier; tiers are derived, not negotiated.

TierDefinitionExample
Critical Direct loss of user funds or full protocol takeover with no meaningful preconditions. Reachable from a permissionless instruction by any signer. Must be patched immediately. Bypass of the authorization gate on a fund-moving instruction; price-oracle manipulation drains the pool
High Significant loss of user funds or protocol invariant violation under realistic preconditions (specific market state, signer with limited but obtainable role). Patch should ship in next release. F7 — insurance siphon under a self-trade construction; liquidation incentive overpayment
Medium Hardening issue, partial loss possible, or invariant violation requiring privileged signer or improbable state. Worth fixing in normal cadence. Admin-only DoS via fee-rate misconfiguration; a rounding asymmetry in fee accounting
Low Minor issue with no plausible path to fund loss. Code-quality or defense-in-depth concern. Missing compile-time tripwire on a magic constant; an event field not emitted
Info Informational. No security impact. Documentation or style suggestion. Inconsistent comment between two helpers that compute the same thing

Auto-derivation

If the hypothesis YAML declares an explicit severity:, that becomes the floor. Otherwise the severity is derived from (class, verdict, poc_fired, debate_promoted) per:

  1. PoC fired AND class is invariant_property or authorization  →  Critical
  2. PoC fired (any other class)  →  High
  3. Verdict TRUE + debate-promoted (no PoC yet)  →  High
  4. Verdict TRUE (no debate, no PoC)  →  Medium
  5. Verdict NEEDS_LAYER_2_TO_DECIDE  →  Low
  6. Otherwise  →  Info

Source: severity.derive_severity() in the platform package. The same function is used by the report generator and the dashboard so the customer's view never disagrees with the on-disk record.

State machine · seven states, restricted transitions

Every finding is a row in the SQLite findings DB with a status column. Transitions are restricted so the audit trail is honest — you cannot jump from a brand-new finding to "fixed" without walking the chain. Every transition is logged in an append-only transitions table.

StateMeaningValid next states
newFresh from a hunt cycle, not yet reviewedtriaged, confirmed, rejected
triagedHuman or automation confirmed it's a real candidateconfirmed, rejected
confirmedEmpirical proof exists (PoC fired)disclosed, rejected
disclosedReported to the maintainer (issue filed / email sent)fixed, rejected
fixedMaintainer shipped a patchverified, rejected
verifiedPatch confirmed effective via re-run cycle(terminal)
rejectedRefuted (debate flipped it, or PoC didn't fire)(terminal)

Source: lifecycle.py · VALID_TRANSITIONS. Invalid transitions raise InvalidTransition at the DB layer, before the row is touched. The full per-finding history is retrievable via db.transitions_for(finding_id).

Signed receipts · Ed25519.

Every audit cycle produces a signed receipt. Customers can verify the receipt against the published Jelleo public key without trusting the platform. The mechanism today is off-chain Ed25519; the funded-state delta is the on-chain Anchor registry.

What gets signed

  • The Markdown report for a hunt cycle (cycle <id>.report.md)
  • The findings JSON dump for the cycle
  • The per-finding disclosure package (when a finding moves to disclosed)
  • The weekly / monthly rollup PDF

Receipt format

cycle-2026-05-07.report.md.sig
-----BEGIN JELLEO SIGNATURE-----
Algorithm: Ed25519
Signed-At: 2026-05-07T16:23:11+00:00
Signed-File: cycle-2026-05-07.report.md
Signed-Bytes: 142853

<base64 signature>
-----END JELLEO SIGNATURE-----

Verification

The Jelleo Ed25519 public key is published in the open-source repo and on this site. To verify a signature:

$ verify a signed cycle
$ audit-pipeline sign verify <file> <file>.sig --pubkey jelleo.ed25519.pub
✓ VALID signature on cycle-2026-05-07.report.md

A valid signature confirms (a) the file's bytes have not been altered since signing, and (b) the file was signed by the platform key — meaning the cycle ran in a Jelleo workspace with a private key matching the published public key. This is the basis for the on-chain attestation registry shipping in the funded build.

On-chain registry (funded delta)

P4's funded version publishes a per-cycle Merkle root on-chain via a small (~500 LOC) Anchor program. The Merkle leaves are the per-hypothesis verdict hashes. Devnet day 90, mainnet day 180 post-external-audit.

Three cadences · one urgency channel

Customers do not wait for a report cycle to find out about a critical bug. The cadence reports are summaries; urgent findings push immediately.

Cadence

CadenceContentsChannel
Immediate Single confirmed Critical or High finding · severity · summary · repro link · suggested fix Email (transactional) · dashboard alert · optional Slack webhook
24-hour rollup Every finding moved to confirmed or rejected in the last 24h. New cycles run, total spend, throughput. Email · signed PDF attached
Weekly 7-day rollup. Severity breakdown, lifecycle transitions, propagation hits, hypothesis library deltas, methodology changes. Email · signed PDF · branded HTML dashboard link
Monthly Executive summary. Trend analysis, MTTR (mean-time-to-resolve), library coverage, Kani proof coverage, billing summary. Email · signed PDF · invoice

Trigger semantics for the immediate channel

The immediate email fires when a finding's status moves from new or triaged to confirmed, AND severity is Critical or High. There is no batching, no daily digest. The customer's primary on-call gets the email; their team alias gets a CC. The same event fires the dashboard alert pulse and (if configured) the Slack webhook.

Customer dashboard

Each customer accesses their dashboard via a token-gated portal at jelleo.com/customer/. The token resolves to a per-customer view that pulls live state from the findings DB via a static-JSON snapshot and refreshes on a configurable interval (default 60s). Status pills, severity bars, recent cycles, recent findings, daemon health. The universal demo token is demo.

What's signed, what's not

Every PDF rollup (24h / weekly / monthly) ships with a Jelleo Ed25519 signature alongside (§07). The dashboard view itself is not signed — it is a live mirror of state. Customers who need a signed point-in-time snapshot request a rollup at any moment via audit-pipeline report cycle --cycle-id <id> followed by sign sign <file>; the funded build automates this as a one-click "issue receipt" action.

Disclosure embargo

Reports include findings in the disclosed state in summary form (severity, bug-class, embargo countdown) but withhold reproduction details until the embargo lifts. Defaults: 30 days for High, extensible to 90 days; Critical findings get maximum embargo flexibility while a fix is in flight. Full policy and timeline at jelleo.com/security.html.

F7 · the inaugural finding

F7 is the proof that the methodology works on real code. It was the first hypothesis to confirm against Anatoly's Percolator engine, in April 2026 — disclosed in PR #39, closed without merge, with the regression suite landing on main at a1afd2e approximately one month later.

Bug class

insurance-counter-vault-divergence — the helper use_insurance_buffer shrinks the insurance counter without correspondingly debiting the vault. The post-haircut residual cash on the market grows by the absorbed amount, and the K/F-winning side claims it. A self-trade nets (ΔK − IM) · N and captures the insurance fund.

Each pillar's contribution to F7

  1. Hypothesis library: H1-residual-conservation (class invariant_property) declared the conservation property that use_insurance_buffer violated.
  2. Layer-1 recon (P2-adjacent): Multi-agent code review converged on the helper as the failing point.
  3. Layer-1.5 debate: An adversarial second-opinion agent flipped an early "epoch-staleness" lead that was REFUTED, and re-anchored the hunt on the residual-formula bug — which proved correct.
  4. Layer-2 PoC (P3 scaffold): Auto-generated cargo test fired against the engine helper, demonstrating the residual delta. cargo test PASS.
  5. Layer-4 LiteSVM (P3 escalation): BPF-level reproduction in LiteSVM at commit 43cdcd8 — the PoC test exists in PR #39's test suite.
  6. P4 attestation: The cycle was signed (Ed25519); the verification key is published.

What was disclosed

PR #39 contained: (a) the LiteSVM reachability test demonstrating the drain, (b) the suggested fix (debit the vault concurrently with the counter shrink), (c) the Markdown writeup explaining the bug-class. The PR was closed without merge — the maintainer landed an A1 regression suite on main at a1afd2e independently. The fix verified locally across 277 unit tests with zero regressions.

Cross-protocol propagation status

F7's bug_class (insurance-counter-vault-divergence) is a candidate for cross-protocol propagation against any indexed protocol with insurance accounting. The funded-state P2 build will run this propagation continuously. v0.1 ran the propagation manually against the corpus and found no immediate hits — the signature is structurally specific to Percolator's helper geometry. The pattern matcher remains armed.

Same four pillars · multiple smart-contract languages.

The four-pillar architecture (Detection · Propagation · Fix Bundle · Attestation) is language-agnostic. What changes per language is the adapter layer: which symbolic verifier runs at Layer 3, which fuzz harness runs at Layer 4, and which test runner produces the Layer 2 abort signal. Today: Solana / BPF and Aptos / Move are live. Solidity (Foundry / forge) and C (clang+sanitizers) ride the same plumbing — the dispatch layer routes by language.

Layer Solana adapter Aptos adapter
Layer 2 · PoC runtime cargo test --features test in the engine repo; aborts on panic! with a custom error code aptos move test --filter <slug>; aborts with a custom E_BUG_HIT code originating in the target module (stdlib aborts are filtered out by Layer 2.5)
Layer 2.5 · triage LLM judge on cargo panic + test body; STRONG / SOFT / FALSE / LOST Same judge, language-aware FALSE-pattern set (matches Move VM abort signatures + framework setup aborts that aren't real bugs)
Layer 3 · symbolic Kani bounded model checker; assertion-based spec authored by the engine; counterexample = bug confirmed Move Prover with Boogie + Z3 / CVC5 backends; spec authored as spec module { ... } blocks with pragmas and direct global accessors; counterexample = bug confirmed
Layer 4 · property fuzz LiteSVM deploy + replay of LLM-authored harness; on-chain BPF instructions; abort or PASS pattern aptos move test with property harness; FAIL pattern (inverted assertion fires) or PASS pattern (attacker scenario completes end-to-end). Either is bug-confirmation signal.
Layer P3 · fix bundle LLM-authored unified diff against Rust source; verifier applies + runs cargo test + checks well-formedness LLM-authored unified diff against Move source; same 5-gate verifier (well-formedness, single-function scope, no new deps; aptos move test backend is on the funded-tier roadmap)

Worked example · Aptos cycle 20260513-191318 (osec-aptos-small)

The Mutatis Aptos benchmark (OSec eval suite, small tier) ran end-to-end through the four pillars:

  • 40 hypotheses dispatched from the OSec Aptos hypothesis class library (authorization, arithmetic, accounting, capability handling, event auditability, oracle freshness).
  • 7 PoC fires at Layer 2 (test aborted in target module with custom error code).
  • 4 STRONG after Layer 2.5 triage filtered 1 SOFT (wrong invariant) and 2 FALSE (stdlib / setup abort, not target).
  • 2 root causes after the clustering rule collapsed 3 STRONG fires hitting transfer_admin in access_control.move into one finding.
  • 2 Critical findings published:
  • $11.62 total LLM spend, 2m 16s wall-clock.

Foundation · Production · Ceiling.

Three depth tiers, three customer profiles. Pricing is 2.4× our raw API cost (58% gross margin). Customers move between tiers month over month; engagement length is open-ended.

Tier Hyps / day Mix Engagement / mo Engagement / yr
Foundation 300 Sonnet-only · daily runs · monitoring depth $30K $360K
Production recommended 500 Sonnet + tool-using Opus + weekly Kani · default for $1B+ TVL DeFi $60K $720K
Ceiling 1,500+ Multi-round refinement · attack-chain enumeration · Kani synthesis on every Critical/High $120K $1.44M

All tiers receive: signed cycle receipts (§07), 24h / weekly / monthly reports (§08), live dashboard, immediate notification on Critical/High (§08), GitHub-issue auto-file on confirmed findings, and access to the cross-protocol propagation channel (§04). Higher tiers add: Kani proof on every Critical/High, P3 fix-bundle PRs, attack-chain enumeration depth, and on-call SLA on disclosure response.

STRIDE Tier-2 reimbursement path

STRIDE Tier-2 ($100M+ TVL) protocols qualifying for Solana Foundation formal-verification mandate funding may direct that funding toward Jelleo as the artifact-producing layer. No formal partnership today; this is a Year-1 OKR (§02).