How Jelleo finds bugs before they become exploits.
A four-pillar autonomous audit platform for on-chain smart contracts — Solana (BPF / Rust) and Aptos (Move) today, with the same loop architecture extending to Solidity and C. This page is the public technical reference: architecture, hypothesis scoping, propagation, severity, attestation, reporting, and lifecycle. The same loop that produced F7 against Percolator (April 2026, Solana) also confirmed 2 Critical findings against the Mutatis Aptos benchmark (May 2026, Move).
The four pillars · the autonomous immune system
Jelleo composes four interlocking pillars. Each ships in v0.1 as a CLI command and scales in the funded build. No platform today combines all four.
Counterfactual mainnet detection
Per-tx parallel simulation. State divergence between simulated and observed mainnet flags the tx within seconds of slot finality.
shadow (commit-triggered). Funded: live mainnet ingestion + sub-slot replay + on-call routing.Cross-protocol bug-class propagation
When a bug confirms anywhere, Jelleo extracts the structural pattern and searches every indexed protocol — vulnerable instances flagged in minutes.
propagate (15-protocol corpus). Funded: multi-protocol indexing + disclosure-feed integrations.Closed-loop fix bundle
Confirmed bug → agent-generated fix → Kani proves preservation in the bug's neighborhood → tests → opened as a single PR.
confirm + synth-kani scaffold. Funded: end-to-end MVP at the 90-day Tranche 2 trigger.On-chain attestation registry
Every cycle publishes a signed Merkle root: which invariants were verified at which commit SHA — on-chain, queryable, composable.
How the pillars compose during a hunt cycle
Every cycle walks the same sequence. Pillar P1 watches mainnet for divergence; on a confirmed bug from any pillar, P2 propagates the pattern across the corpus; P3 closes the loop with a verified fix bundle; P4 records the cycle's verified invariants on-chain. The pillars are not phases — they run continuously, in parallel, and feed one another.
Detect
Per-tx parallel simulation flags state divergence on mainnet within seconds of slot finality. Source: shadow command.
Propagate
Confirmed bug-class signature is dispatched against every protocol in the indexed corpus. Vulnerable instances flagged in minutes.
Fix bundle
Agent-authored fix → Kani proves preservation → cargo tests pass → opened as a single PR. Maintainer-ready, signed, traceable.
Attest
Per-cycle Ed25519-signed Merkle root: which invariants verified, at which commit SHA. Off-chain today, on-chain in the funded build.
Feedback loop: every confirmed finding feeds back into the hypothesis library — sibling claims auto-derive, the library compounds, the next cycle's coverage is broader. The pillars run continuously and in parallel, not as phases.
The same bug-class work also lands in your developers' IDEs. Jelleo maintains a public Solana Security Guidance ruleset for Anthropic's Claude Code security-guidance plugin — 20 rules total, headline rule SOL-001 covers two confirmed-exploitable bounty wins. Drop two files into .claude/ and the plugin flags Solana-specific bug classes as your team writes code. MIT licensed at github.com/Copenhagen0x/solana-security-guidance.
How Jelleo fits inside STRIDE.
Solana Foundation launched STRIDE on April 6, 2026 with Asymmetric Research — a tiered, foundation-funded program covering eight security pillars across the ecosystem. STRIDE is the framework. Jelleo is one of multiple artifact-producing vendors that STRIDE-aligned protocols use to satisfy the smart-contract integrity pillar at depth.
STRIDE's eight pillars span both opsec/threat-monitoring (Tier 1 surface) and smart-contract integrity (Tier 1 + Tier 2 deep verification). Jelleo's four pillars produce the artifacts that map onto the smart-contract integrity surface:
| STRIDE pillar (SF program) | Jelleo's contribution | Artifact |
|---|---|---|
| Smart-contract integrity — the depth pillar STRIDE evaluators assess | Continuous hypothesis-driven verification with Kani proofs and signed attestations | Per-cycle signed Merkle root + per-finding signed disclosure package (P4) |
| Adversarial detection | Counterfactual mainnet sim flags state-divergent txs (P1) | Shadow-audit alerts |
| Coordinated disclosure | Closed-loop fix bundle delivers maintainer-ready PRs (P3) | GitHub PR with Kani proof + cargo test |
| Cross-protocol monitoring | Bug-class propagation across the indexed corpus (P2) | Cross-protocol propagation report |
Jelleo does not compete with STRIDE — it is complementary. STRIDE evaluates programs holistically; Jelleo produces the depth artifacts that protocols need to demonstrate the smart-contract integrity pillar at higher tiers. Asymmetric Research is the program operator for STRIDE; Jelleo is one of the artifact-producing vendors operating inside it.
Year-1 OKR: open an artifact-recognition channel with STRIDE assessors so Jelleo's signed attestations are accepted directly as evidence for the smart-contract integrity pillar. (Goal, not commitment — depends on STRIDE program governance.)
Hypothesis library scoping
The hypothesis library is the platform's source of truth for "what to test." It compounds with every confirmed finding — meaning at scale (27 protocols, 1,500+ hypotheses) precision matters. Each hypothesis must declare exactly which protocols it applies to, under what conditions, and which generalized bug class it represents. Below is the schema.
Schema · per-hypothesis frontmatter
Every hypothesis is a YAML entry under hypotheses: in a hypotheses.yaml file (per workspace, per target). The required and optional fields:
- id
- Unique short identifier (e.g.
H1-residual-conservation). Stable across cycles. Used as the cross-DB key. - class
- One of:
invariant_property,state_transition,authorization,arithmetic_overflow,implicit_invariant. Determines which Layer-2 / Layer-3 dispatch path the hunt loop selects. - claim
- The falsifiable claim in plain English. Phrased so a clean negative result strengthens the disclosure.
- severity
- One of:
Critical,High,Medium,Low,Info. Sets the floor on auto-derived severity (see §05). - applies_to (scoping field)
- List of protocol names this hypothesis applies to. Used by the loader to filter which hypotheses run against which target. Wildcard
['*']matches all protocols. Example:[percolator, drift, mango]. - scope_conditions (optional)
- Predicate(s) that must be true for the hypothesis to be relevant. E.g.
has_insurance_pool,uses_pyth_oracle,perpetual_funding. - bug_class (propagation field)
- Generalized class identifier for cross-protocol propagation (§04). When this hypothesis confirms TRUE on protocol A, Jelleo dispatches the same generalized class against every other protocol where
applies_tomatches. - target_file, relevant_constants, relevant_instructions
- Anchor fields used by the recon agent to focus its initial reading window. Loader does not filter on them.
Example · F7 hypothesis with full scoping
- id: H1-residual-conservation
class: invariant_property
claim: >
The post-haircut residual cash on a market
(vault - cash_locked_in_orderbook - claimable_pnl - insurance_counter)
is conserved by every internal accounting helper. If any helper shrinks
the insurance counter, it MUST also debit the vault by the same amount.
severity: Critical
applies_to: [percolator, drift, mango, marginfi]
scope_conditions: [has_insurance_pool, has_haircut_accounting]
bug_class: insurance-counter-vault-divergence # the propagation key
target_file: src/percolator.rs
Loader semantics
When audit-pipeline hunt runs against target T with conditions C (derived from workspace.json), the loader applies a 3-step filter to every hypothesis in hypotheses.yaml:
- applies_to filter: hypothesis loads only if
T ∈ applies_toor'*' ∈ applies_to. - scope_conditions filter: hypothesis loads only if every predicate evaluates true under target conditions.
- severity floor: if invoked with
--min-severity High, hypotheses below that floor are skipped.
The result: at scale, a target only ever sees the hypotheses that actually apply to it. The library can grow to thousands of entries without any single cycle being polluted by irrelevant claims.
Engineering reference: the canonical schema spec — including the full predicate vocabulary, validation rules, and migration plan — lives at
docs/HYPOTHESIS_SCHEMA.md
in the platform repo. The JSON Schema at src/audit_pipeline/schemas/hypothesis.schema.json is consumed by the loader and by editor tooling.
Auto-learning across the cluster.
Bug classes recur. F7's "shrink insurance counter without debiting the vault" pattern probably exists in any protocol with insurance accounting; CatchupAccrue's "advance the clock without touching the accounts" pattern probably exists in any protocol with multi-instruction settlement. Jelleo's propagation engine is how we find them — every customer benefits from every other customer's findings.
Trigger
Whenever a hypothesis H on protocol A moves to status confirmed (PoC fired or debate-promoted), the propagation engine fires. The engine takes H's bug_class, looks up every protocol in the corpus whose applies_to includes H's class generalization, and dispatches the same hypothesis class — in parallel — against every matching protocol.
Pipeline
// the moment a finding moves to status=confirmed, this loop fires
1. Confirmed finding on protocol A · cycle K
2. Extract bug_class signature (regex set + structural pattern)
3. Walk the corpus (15+ protocols, indexed)
4. For each candidate file in each protocol:
- signature match score = count(distinct signatures hit)
- if score >= MIN_SCORE: emit candidate
5. Top candidates dispatched to Layer 1 (recon agent)
6. Layer 1 verdicts feed back into the per-protocol findings DB
7. Customer of every affected protocol gets:
- email notification (immediate, on confirmed propagation)
- dashboard update
- 24h report row
8. Cluster-level metric: bug-class hit rate, time-to-detect, false-positive rate
Customer notification on cross-protocol re-test
If your protocol gets re-tested because a bug class confirmed on someone else's protocol, you receive an immediate email of the form:
Subject: [Jelleo] Cross-protocol re-test · <your-protocol> · bug_class=X
We confirmed bug class <X> on <other-protocol> on <date>.
We re-tested <your-protocol> against the same class.
Result: PASS · <n> hypotheses dispatched, all green
OR
REVIEW · <n> candidates flagged for Layer-2 escalation
Full report: https://jelleo.com/reports/<your-protocol>/<cycle-id>
This is the "auto-learning" property: the cluster's collective coverage compounds. Every confirmed bug class adds permanent re-test coverage across every applicable protocol, forever.
Implementation
The corpus is initialized once with audit-pipeline propagate init-corpus (15 popular Solana programs by default — Drift, Mango, Marginfi, Kamino, Phoenix, OpenBook, Orca, Meteora, Raydium, Marinade, Anchor, SPL, plus the engine under audit). Each protocol is shallow-cloned for fast indexing. The search itself is audit-pipeline propagate search -c <corpus> -s '<sig1>' -s '<sig2>', with intentionally simple regex matching to keep the indexer fast and language-agnostic. Top hits escalate to a Layer-1 agent for deep verification.
Five tiers · formal definitions
Severity is what the customer sees first. The rubric is stable, auditable, and consistent across cycles. Every finding gets a tier; tiers are derived, not negotiated.
| Tier | Definition | Example |
|---|---|---|
| Critical | Direct loss of user funds or full protocol takeover with no meaningful preconditions. Reachable from a permissionless instruction by any signer. Must be patched immediately. | Bypass of the authorization gate on a fund-moving instruction; price-oracle manipulation drains the pool |
| High | Significant loss of user funds or protocol invariant violation under realistic preconditions (specific market state, signer with limited but obtainable role). Patch should ship in next release. | F7 — insurance siphon under a self-trade construction; liquidation incentive overpayment |
| Medium | Hardening issue, partial loss possible, or invariant violation requiring privileged signer or improbable state. Worth fixing in normal cadence. | Admin-only DoS via fee-rate misconfiguration; a rounding asymmetry in fee accounting |
| Low | Minor issue with no plausible path to fund loss. Code-quality or defense-in-depth concern. | Missing compile-time tripwire on a magic constant; an event field not emitted |
| Info | Informational. No security impact. Documentation or style suggestion. | Inconsistent comment between two helpers that compute the same thing |
Auto-derivation
If the hypothesis YAML declares an explicit severity:, that becomes the floor. Otherwise the severity is derived from (class, verdict, poc_fired, debate_promoted) per:
- PoC fired AND class is
invariant_propertyorauthorization→ Critical - PoC fired (any other class) → High
- Verdict TRUE + debate-promoted (no PoC yet) → High
- Verdict TRUE (no debate, no PoC) → Medium
- Verdict
NEEDS_LAYER_2_TO_DECIDE→ Low - Otherwise → Info
Source: severity.derive_severity() in the platform package. The same function is used by the report generator and the dashboard so the customer's view never disagrees with the on-disk record.
State machine · seven states, restricted transitions
Every finding is a row in the SQLite findings DB with a status column. Transitions are restricted so the audit trail is honest — you cannot jump from a brand-new finding to "fixed" without walking the chain. Every transition is logged in an append-only transitions table.
| State | Meaning | Valid next states |
|---|---|---|
| new | Fresh from a hunt cycle, not yet reviewed | triaged, confirmed, rejected |
| triaged | Human or automation confirmed it's a real candidate | confirmed, rejected |
| confirmed | Empirical proof exists (PoC fired) | disclosed, rejected |
| disclosed | Reported to the maintainer (issue filed / email sent) | fixed, rejected |
| fixed | Maintainer shipped a patch | verified, rejected |
| verified | Patch confirmed effective via re-run cycle | (terminal) |
| rejected | Refuted (debate flipped it, or PoC didn't fire) | (terminal) |
Source: lifecycle.py · VALID_TRANSITIONS. Invalid transitions raise InvalidTransition at the DB layer, before the row is touched. The full per-finding history is retrievable via db.transitions_for(finding_id).
Signed receipts · Ed25519.
Every audit cycle produces a signed receipt. Customers can verify the receipt against the published Jelleo public key without trusting the platform. The mechanism today is off-chain Ed25519; the funded-state delta is the on-chain Anchor registry.
What gets signed
- The Markdown report for a hunt cycle (
cycle <id>.report.md) - The findings JSON dump for the cycle
- The per-finding disclosure package (when a finding moves to
disclosed) - The weekly / monthly rollup PDF
Receipt format
-----BEGIN JELLEO SIGNATURE-----
Algorithm: Ed25519
Signed-At: 2026-05-07T16:23:11+00:00
Signed-File: cycle-2026-05-07.report.md
Signed-Bytes: 142853
<base64 signature>
-----END JELLEO SIGNATURE-----
Verification
The Jelleo Ed25519 public key is published in the open-source repo and on this site. To verify a signature:
$ audit-pipeline sign verify <file> <file>.sig --pubkey jelleo.ed25519.pub
✓ VALID signature on cycle-2026-05-07.report.md
A valid signature confirms (a) the file's bytes have not been altered since signing, and (b) the file was signed by the platform key — meaning the cycle ran in a Jelleo workspace with a private key matching the published public key. This is the basis for the on-chain attestation registry shipping in the funded build.
On-chain registry (funded delta)
P4's funded version publishes a per-cycle Merkle root on-chain via a small (~500 LOC) Anchor program. The Merkle leaves are the per-hypothesis verdict hashes. Devnet day 90, mainnet day 180 post-external-audit.
Three cadences · one urgency channel
Customers do not wait for a report cycle to find out about a critical bug. The cadence reports are summaries; urgent findings push immediately.
Cadence
| Cadence | Contents | Channel |
|---|---|---|
| Immediate | Single confirmed Critical or High finding · severity · summary · repro link · suggested fix | Email (transactional) · dashboard alert · optional Slack webhook |
| 24-hour rollup | Every finding moved to confirmed or rejected in the last 24h. New cycles run, total spend, throughput. |
Email · signed PDF attached |
| Weekly | 7-day rollup. Severity breakdown, lifecycle transitions, propagation hits, hypothesis library deltas, methodology changes. | Email · signed PDF · branded HTML dashboard link |
| Monthly | Executive summary. Trend analysis, MTTR (mean-time-to-resolve), library coverage, Kani proof coverage, billing summary. | Email · signed PDF · invoice |
Trigger semantics for the immediate channel
The immediate email fires when a finding's status moves from new or triaged to confirmed, AND severity is Critical or High. There is no batching, no daily digest. The customer's primary on-call gets the email; their team alias gets a CC. The same event fires the dashboard alert pulse and (if configured) the Slack webhook.
Customer dashboard
Each customer accesses their dashboard via a token-gated portal at jelleo.com/customer/. The token resolves to a per-customer view that pulls live state from the findings DB via a static-JSON snapshot and refreshes on a configurable interval (default 60s). Status pills, severity bars, recent cycles, recent findings, daemon health. The universal demo token is demo.
What's signed, what's not
Every PDF rollup (24h / weekly / monthly) ships with a Jelleo Ed25519 signature alongside (§07). The dashboard view itself is not signed — it is a live mirror of state. Customers who need a signed point-in-time snapshot request a rollup at any moment via audit-pipeline report cycle --cycle-id <id> followed by sign sign <file>; the funded build automates this as a one-click "issue receipt" action.
Disclosure embargo
Reports include findings in the disclosed state in summary form (severity, bug-class, embargo countdown) but withhold reproduction details until the embargo lifts. Defaults: 30 days for High, extensible to 90 days; Critical findings get maximum embargo flexibility while a fix is in flight. Full policy and timeline at jelleo.com/security.html.
F7 · the inaugural finding
F7 is the proof that the methodology works on real code. It was the first hypothesis to confirm against Anatoly's Percolator engine, in April 2026 — disclosed in PR #39, closed without merge, with the regression suite landing on main at a1afd2e approximately one month later.
Bug class
insurance-counter-vault-divergence — the helper use_insurance_buffer shrinks the insurance counter without correspondingly debiting the vault. The post-haircut residual cash on the market grows by the absorbed amount, and the K/F-winning side claims it. A self-trade nets (ΔK − IM) · N and captures the insurance fund.
Each pillar's contribution to F7
- Hypothesis library:
H1-residual-conservation(classinvariant_property) declared the conservation property thatuse_insurance_bufferviolated. - Layer-1 recon (P2-adjacent): Multi-agent code review converged on the helper as the failing point.
- Layer-1.5 debate: An adversarial second-opinion agent flipped an early "epoch-staleness" lead that was REFUTED, and re-anchored the hunt on the residual-formula bug — which proved correct.
- Layer-2 PoC (P3 scaffold): Auto-generated cargo test fired against the engine helper, demonstrating the residual delta.
cargo testPASS. - Layer-4 LiteSVM (P3 escalation): BPF-level reproduction in LiteSVM at commit 43cdcd8 — the PoC test exists in PR #39's test suite.
- P4 attestation: The cycle was signed (Ed25519); the verification key is published.
What was disclosed
PR #39 contained: (a) the LiteSVM reachability test demonstrating the drain, (b) the suggested fix (debit the vault concurrently with the counter shrink), (c) the Markdown writeup explaining the bug-class. The PR was closed without merge — the maintainer landed an A1 regression suite on main at a1afd2e independently. The fix verified locally across 277 unit tests with zero regressions.
Cross-protocol propagation status
F7's bug_class (insurance-counter-vault-divergence) is a candidate for cross-protocol propagation against any indexed protocol with insurance accounting. The funded-state P2 build will run this propagation continuously. v0.1 ran the propagation manually against the corpus and found no immediate hits — the signature is structurally specific to Percolator's helper geometry. The pattern matcher remains armed.
Same four pillars · multiple smart-contract languages.
The four-pillar architecture (Detection · Propagation · Fix Bundle · Attestation) is language-agnostic. What changes per language is the adapter layer: which symbolic verifier runs at Layer 3, which fuzz harness runs at Layer 4, and which test runner produces the Layer 2 abort signal. Today: Solana / BPF and Aptos / Move are live. Solidity (Foundry / forge) and C (clang+sanitizers) ride the same plumbing — the dispatch layer routes by language.
| Layer | Solana adapter | Aptos adapter |
|---|---|---|
Layer 2 · PoC runtime |
cargo test --features test in the engine repo; aborts on panic! with a custom error code |
aptos move test --filter <slug>; aborts with a custom E_BUG_HIT code originating in the target module (stdlib aborts are filtered out by Layer 2.5) |
Layer 2.5 · triage |
LLM judge on cargo panic + test body; STRONG / SOFT / FALSE / LOST | Same judge, language-aware FALSE-pattern set (matches Move VM abort signatures + framework setup aborts that aren't real bugs) |
Layer 3 · symbolic |
Kani bounded model checker; assertion-based spec authored by the engine; counterexample = bug confirmed | Move Prover with Boogie + Z3 / CVC5 backends; spec authored as spec module { ... } blocks with pragmas and direct global accessors; counterexample = bug confirmed |
Layer 4 · property fuzz |
LiteSVM deploy + replay of LLM-authored harness; on-chain BPF instructions; abort or PASS pattern | aptos move test with property harness; FAIL pattern (inverted assertion fires) or PASS pattern (attacker scenario completes end-to-end). Either is bug-confirmation signal. |
Layer P3 · fix bundle |
LLM-authored unified diff against Rust source; verifier applies + runs cargo test + checks well-formedness |
LLM-authored unified diff against Move source; same 5-gate verifier (well-formedness, single-function scope, no new deps; aptos move test backend is on the funded-tier roadmap) |
Worked example · Aptos cycle 20260513-191318 (osec-aptos-small)
The Mutatis Aptos benchmark (OSec eval suite, small tier) ran end-to-end through the four pillars:
- 40 hypotheses dispatched from the OSec Aptos hypothesis class library (authorization, arithmetic, accounting, capability handling, event auditability, oracle freshness).
- 7 PoC fires at Layer 2 (test aborted in target module with custom error code).
- 4 STRONG after Layer 2.5 triage filtered 1 SOFT (wrong invariant) and 2 FALSE (stdlib / setup abort, not target).
- 2 root causes after the clustering rule collapsed 3 STRONG fires hitting
transfer_admininaccess_control.moveinto one finding. - 2 Critical findings published:
- Missing auth check on
transfer_admin— covers APT1 / APT4 / APT5 hypotheses. - Permissionless
emergency_draindrains the vault — APT38.
- Missing auth check on
- $11.62 total LLM spend, 2m 16s wall-clock.
Foundation · Production · Ceiling.
Three depth tiers, three customer profiles. Pricing is 2.4× our raw API cost (58% gross margin). Customers move between tiers month over month; engagement length is open-ended.
| Tier | Hyps / day | Mix | Engagement / mo | Engagement / yr |
|---|---|---|---|---|
| Foundation | 300 | Sonnet-only · daily runs · monitoring depth | $30K | $360K |
| Production recommended | 500 | Sonnet + tool-using Opus + weekly Kani · default for $1B+ TVL DeFi | $60K | $720K |
| Ceiling | 1,500+ | Multi-round refinement · attack-chain enumeration · Kani synthesis on every Critical/High | $120K | $1.44M |
All tiers receive: signed cycle receipts (§07), 24h / weekly / monthly reports (§08), live dashboard, immediate notification on Critical/High (§08), GitHub-issue auto-file on confirmed findings, and access to the cross-protocol propagation channel (§04). Higher tiers add: Kani proof on every Critical/High, P3 fix-bundle PRs, attack-chain enumeration depth, and on-call SLA on disclosure response.
STRIDE Tier-2 reimbursement path
STRIDE Tier-2 ($100M+ TVL) protocols qualifying for Solana Foundation formal-verification mandate funding may direct that funding toward Jelleo as the artifact-producing layer. No formal partnership today; this is a Year-1 OKR (§02).