Testing & Verification
Last Updated: 2026-03-01
Our V2 smart contract testing strategy employs an 8-layer security stack to ensure security, correctness, and resilience against edge cases and attacks. All 3 V2 contracts are deployed to Base Mainnet and verified on BaseScan.
Testing Overview
V2 Test Suite (March 1, 2026 — post staging red team)
├── 95 Hardhat Unit Tests (~45s)
├── 16 Foundry Fuzz Tests @ 10K runs (160K iterations)
├── 137 Medusa Parallel Fuzz Tests (8 workers)
├── 60/64 Halmos Symbolic Proofs (4 solver timeouts — 2 resolved post-CEI refactor)
├── 19 Certora Formal Verification Rules (all re-verified post-CEI)
├── 441 Gambit Mutation Mutants Generated
└── Slither Static Analysis (0 high/medium)All 3 V2 contracts deployed to Base Mainnet with full 8-layer security verification. Red team staging engagement fixed 9 findings (1 Critical CEI, 3 High, 2 Medium). Certora re-verified all 3 contracts post-CEI-refactor.
1. Unit Tests (Hardhat — 95 Tests)
Traditional unit tests covering expected behavior, edge cases, and full lifecycle for all V2 contracts.
Coverage Areas
| Contract | Coverage |
|---|---|
| AbbaBabaScore.sol | Score deltas, tier boundaries, completion/dispute/abandonment, admin adjustments |
| AbbaBabaEscrow.sol | Escrow lifecycle, fee calculation, delivery, finalization, disputes, abandonment |
| AbbaBabaResolver.sol | Resolution outcomes, split validation, role access, admin overrides |
Running Unit Tests
cd contracts
npx hardhat test
# With coverage
npx hardhat coverage2. Fuzz Testing (Foundry — 16 Tests)
Foundry fuzz tests with 10,000 runs per test, covering fee calculations, tier boundaries, score deltas, and escrow lifecycle edge cases.
| Suite | Tests | Runs Each | Status |
|---|---|---|---|
| FuzzScoreV2 | 9/9 | 10,000 | All Pass |
| FuzzEscrowV2 | 7/7 | 10,000 | All Pass |
| Total | 16/16 | 10K each | All Pass |
160,000 total fuzz iterations with zero failures.
Running Foundry Fuzz Tests
cd contracts
forge test --fuzz-runs 100003. Parallel Fuzz Testing (Medusa — 137 Tests)
Medusa runs 8 parallel workers for faster coverage of property-based invariants.
| Contract | Tests | Status |
|---|---|---|
| MedusaScoreV2 | Score invariants (tiers, deltas, floor, ceiling) | All Pass |
| MedusaEscrowV2 | Escrow invariants (fees, balance, status, timing) | All Pass |
| Total | 137/137 | All Pass |
Note: Count decreased from 138 to 137 on Mar 1 when MockERC20.burn() was removed (real USDC has no public burn; its presence was causing false-positive invariant failures with testAllContracts: true).
Key Invariants Tested
- Platform fee is exactly 2% (200 BPS)
- Locked amount is exactly 98%
- Treasury receives fee at creation
- Balance conservation across all escrows
- Valid status transitions (never backward)
- Dispute only from Delivered state
- Abandon only from Funded after deadline
- Seller receives exactly lockedAmount
- Score can go negative (no floor on trust score)
- Score 100+ = unlimited max job value
- Score 0 = $10 floor
- Tier values are monotonically increasing
Running Medusa
cd contracts
medusa fuzz --config medusa-v2.json4. Symbolic Execution (Halmos — 60/64 Proofs)
Halmos uses symbolic execution to mathematically verify properties over all possible inputs.
| Contract | Proofs | Passed | Timeouts |
|---|---|---|---|
| HalmosScoreV2 | 26 | 26 | 0 |
| HalmosEscrowV2 | 21 | 17 | 4 |
| HalmosResolverV2 | 17 | 17 | 0 |
| Total | 64 | 60 | 4 |
4 timeout analysis: All timeouts are on nonlinear uint256 arithmetic (multiplication/division with large symbolic inputs). This is an inherent SMT solver limitation. Two timeouts resolved after the CEI refactoring in the staging red team tightened the reachable state space. Same properties verified by Certora Prover and 160K+ fuzz iterations.
Timed-out proofs: check_feeLessThanAmount, check_fullBuyerRefund, check_noOverflowOnFee, check_noOverflowOnSplit.
Running Halmos
cd contracts
forge build --config-path foundry.toml
halmos --contract HalmosScoreV2 --forge-build-out foundry-out --timeout 120
halmos --contract HalmosEscrowV2 --forge-build-out foundry-out --timeout 120
halmos --contract HalmosResolverV2 --forge-build-out foundry-out --timeout 1205. Formal Verification (Certora — 19 Rules)
Mathematical proofs that critical properties hold for all possible inputs and states.
ALL 3 CONTRACTS RE-VERIFIED on March 1, 2026 — Certora Prover returned “No errors found by Prover!” for all 3 contracts.
| Contract | Rules | Status | Report |
|---|---|---|---|
| AbbaBabaScore | 8 rules | All Verified | View |
| AbbaBabaEscrow | 6 rules | All Verified | View |
| AbbaBabaResolver | 5 rules | All Verified | View |
| Total | 19 rules | All Verified | — |
Rules Verified:
feeIs2PercentRule— PLATFORM_FEE_BPS == 200feeCalculationCorrect— Math matches specificationcompletionPointsConstantRule— COMPLETION_POINTS == 1abandonmentPenaltyConstantRule— ABANDONMENT_PENALTY == -5tierBoundariesRule— All 11 tiers correctresolverRoleConstantRule— Role hash constantdefaultAdminRoleZeroRule— Admin role is 0x00- Status transitions, access control, split validation, and more
Running Certora
# From contracts/ directory — CERTORAKEY must be exported (not just sourced)
CERTORAKEY=$(grep "^CERTORAKEY=" .env | cut -d'=' -f2) && export CERTORAKEY
certoraRun certora/conf/ScoreV2.conf --disable_local_typechecking
certoraRun certora/conf/EscrowV2.conf --disable_local_typechecking
certoraRun certora/conf/ResolverV2.conf --disable_local_typechecking6. Mutation Testing (Gambit — 441 Mutants)
Mutation testing introduces bugs into the code to verify tests catch them.
| Contract | Mutants Generated | Details |
|---|---|---|
| AbbaBabaScore | 121 | Operator, literal, and statement mutations |
| AbbaBabaEscrow | 282 | Comprehensive mutation coverage |
| AbbaBabaResolver | 38 | All mutation operators applied |
| Total | 441 | 3.2x increase from Feb 14 (138 mutants) |
Running Gambit
cd contracts
gambit mutate --filename contracts/AbbaBabaScore.sol
gambit mutate --filename contracts/AbbaBabaEscrow.sol
gambit mutate --filename contracts/AbbaBabaResolver.sol7. Static Analysis (Slither)
slither . --exclude-dependencies --filter-paths "test|mock|archive"Results: 0 High, 0 Medium, 23 Informational (lint-level, no action needed)
Test Metrics
Combined Test Count
Hardhat Unit: 95
Foundry Fuzz: 16
Medusa Parallel: 137
─────────────────────
Total Tests: 248
Halmos Proofs: 60/64
Certora Rules: 19
Gambit Mutants: 441Test Execution Time
Hardhat Unit Tests: ~45s
Foundry Fuzz (10K): ~30s
Medusa Fuzz: ~5 min
Halmos Symbolic: ~4 min
Certora: ~15 min (cloud)
Slither Static: ~10s
──────────────────────────────
Total Suite: ~25 minutesRunning the Full Test Suite
cd contracts
# Hardhat unit tests
npx hardhat test
# Foundry fuzz tests
forge test --fuzz-runs 10000
# Medusa parallel fuzz
medusa fuzz --config medusa-v2.json
# Halmos symbolic execution
halmos --solver-timeout-assertion 120
# Certora formal verification (requires API key)
certoraRun certora/conf/AbbaBabaScore.conf
certoraRun certora/conf/AbbaBabaEscrow.conf
certoraRun certora/conf/AbbaBabaResolver.conf
# Slither static analysis
slither . --exclude-dependencies --filter-paths "test|mock|archive"
# Gambit mutation testing
gambit mutate --filename contracts/AbbaBabaScore.solContinuous Integration
All tests run automatically on every push to main:
smart-contract-security-v2.yml
├── Slither Static Analysis Passing
├── Foundry Fuzz Tests 16/16 passing
├── Halmos Symbolic 58/64 passing
├── Certora Formal 19 rules verified
└── Mutation Testing 441 mutants generatedRelated Documentation
- Audit Dashboard — Full security audit results
- Methodology — Audit process and tools
- Daily Reports — Daily test reports
- Archive — Historical audit records
V2 Contracts Live on Base Mainnet: With 248 tests, 60 Halmos proofs, 19 Certora rules (re-verified post-CEI-refactor), and 441 Gambit mutants, these contracts represent the most thoroughly tested A2A settlement contracts in production.