Architecture

Acuity Index is a config-driven event indexer for Substrate chains. It decodes runtime events with subxt, derives query keys from TOML config, stores index entries in sled, and serves query access over WebSocket.

Main Components

src/main.rs: startup, config loading, database initialization, watcher setup, metrics listener, and reconnect supervisor loop
src/indexer.rs: indexing pipeline, resume logic, live-head tailing, event key derivation, notification fanout
src/config.rs: TOML schema and mapping resolution
src/websockets.rs: public API implementation, connection lifecycle, and optional finalized proof inclusion for GetEvents
src/shared.rs: wire types, sled key layouts, shared runtime state, and finalized-mode gating for proof responses
src/event_hydration.rs: decoded-event hydration plus finalized System.Events proof fetching
src/config_gen.rs: live metadata to starter spec generation
src/metrics.rs: metrics registry and HTTP export
src/synthetic_devnet.rs: synthetic local chain helpers and shared test types

The repository also includes a checked-in example spec, plus node-backed integration and benchmarking paths. The synthetic runtime is documented on its own page, Synthetic Devnet.

Startup Sequence

The normal startup path is:

parse CLI args
load the required index spec
validate config and resolve runtime options
open sled
verify or initialize genesis_hash
start long-lived tasks such as WebSocket serving and spec watching
enter the RPC reconnect and indexing supervisor loop
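
The "verify or initialize genesis_hash" step can be sketched as follows. This is a minimal illustration, not the project's code: a BTreeMap stands in for the sled root tree, and the names RootTree, GenesisCheck, and verify_or_init_genesis are hypothetical.

```rust
use std::collections::BTreeMap;

/// Stand-in for the sled `root` tree (assumption: the real code uses sled).
type RootTree = BTreeMap<Vec<u8>, Vec<u8>>;

const GENESIS_KEY: &[u8] = b"genesis_hash";

#[derive(Debug, PartialEq)]
enum GenesisCheck {
    /// Fresh database: the expected hash was written.
    Initialized,
    /// Existing database that belongs to the same chain.
    Matched,
    /// Existing database that belongs to a different chain.
    Mismatch { stored: Vec<u8>, expected: Vec<u8> },
}

/// Write the hash on first use, otherwise compare against the stored value.
fn verify_or_init_genesis(root: &mut RootTree, expected: &[u8]) -> GenesisCheck {
    match root.get(GENESIS_KEY).cloned() {
        None => {
            root.insert(GENESIS_KEY.to_vec(), expected.to_vec());
            GenesisCheck::Initialized
        }
        Some(stored) if stored == expected => GenesisCheck::Matched,
        Some(stored) => GenesisCheck::Mismatch {
            stored,
            expected: expected.to_vec(),
        },
    }
}
```

Treating a mismatch as a hard error is what keeps a database path tied to exactly one chain identity.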

Several long-lived tasks are created before entering the supervisor loop, so they outlive individual RPC connections and indexer restarts:

a bounded subscription dispatcher task
a single process-lifetime WebSocket listener
an optional metrics listener serving /metrics
an index-spec watcher for accepted file changes

The shared runtime state also tracks whether the current run is indexing finalized blocks so WebSocket proof responses stay aligned with the active mode.

Data Model

The sled database is organized into trees opened by Trees::open in src/shared.rs:

root: database-level values such as genesis_hash
span: indexed block spans for resume and reindex logic
variant: event references keyed by pallet and variant indices
index: custom and built-in query keys with (block_number, event_index) suffixes
events: decoded event JSON keyed by (block_number, event_index)
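
To illustrate why a (block_number, event_index) suffix works well in an ordered key-value store, here is a small sketch. The exact byte layout is an assumption (big-endian u32 for both fields); the authoritative layouts are the ones in src/shared.rs. The function names are hypothetical.

```rust
/// Hypothetical layout: `prefix || block_number (u32 BE) || event_index (u32 BE)`.
/// Big-endian integers sort bytewise in numeric order, so a prefix scan
/// returns events in chain order.
fn index_key(prefix: &[u8], block_number: u32, event_index: u32) -> Vec<u8> {
    let mut key = Vec::with_capacity(prefix.len() + 8);
    key.extend_from_slice(prefix);
    key.extend_from_slice(&block_number.to_be_bytes());
    key.extend_from_slice(&event_index.to_be_bytes());
    key
}

/// Split a stored key back into (prefix, block_number, event_index).
/// Returns None for keys too short to carry the suffix, so callers can
/// skip malformed entries instead of panicking.
fn split_suffix(key: &[u8]) -> Option<(&[u8], u32, u32)> {
    if key.len() < 8 {
        return None;
    }
    let (prefix, s) = key.split_at(key.len() - 8);
    let block_number = u32::from_be_bytes([s[0], s[1], s[2], s[3]]);
    let event_index = u32::from_be_bytes([s[4], s[5], s[6], s[7]]);
    Some((prefix, block_number, event_index))
}
```

With this layout, index_key(b"acct", 1, 7) sorts before index_key(b"acct", 256, 0), which would not hold for little-endian encoding.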

The two main query surfaces are:

variant queries via the variant tree
all other key queries via the index tree

Key::Custom covers every declared query key from [keys] in the TOML spec.

Indexing Flow

The main indexing loop lives in run_indexer in src/indexer.rs.

High-level behavior:

determine the starting head block
load previously indexed spans
resume an existing tail span or index the current head immediately
run backward backfill and live-head tracking concurrently

If finalized mode is enabled, the same finalized-only setting also governs whether API callers can receive verifiable proof material for GetEvents.

Per-block indexing follows this shape:

fetch block hash from RPC
create a block-scoped subxt view with api.at_block(hash)
fetch and iterate decoded runtime events
for each event:
read pallet name, event name, pallet index, and variant index
optionally write a variant index record if index_variant is enabled
decode fields schema-lessly into scale_value::Composite<()>
derive indexing keys from explicit config
write event references for each derived key
store event refs locally and hydrate decoded event payloads from the node when queries or subscriptions need them

When a GetEvents request sets includeProofs = true, the WebSocket layer also fetches one proof object per returned block. That proof is built from the block header plus state_get_read_proof over the System.Events storage key.
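
For orientation, the System.Events storage key is the concatenation of twox128("System") and twox128("Events"). The sketch below hard-codes those two well-known hashes and builds a raw JSON-RPC payload for the legacy state_getReadProof method. It only illustrates the wire shape: the helper names are hypothetical, and the project itself may issue the call through a typed RPC client rather than hand-built JSON.

```rust
/// twox128("System"), hex-encoded.
const SYSTEM_PREFIX: &str = "26aa394eea5630e07c48ae0c9558cef7";
/// twox128("Events"), hex-encoded.
const EVENTS_PREFIX: &str = "80d41e5e16056765bc8461851072c9d7";

/// Full storage key for the System.Events value.
fn system_events_key() -> String {
    format!("0x{}{}", SYSTEM_PREFIX, EVENTS_PREFIX)
}

/// Hypothetical helper: JSON-RPC body asking for a read proof of
/// System.Events at a specific block hash (hex string with 0x prefix).
fn read_proof_request(id: u64, block_hash_hex: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"state_getReadProof","params":[["{}"],"{}"]}}"#,
        id,
        system_events_key(),
        block_hash_hex
    )
}
```

A verifier can check the returned trie nodes against the state root in the block header, which is why the proof pairs the header with the read proof.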

Malformed persisted sled data is handled defensively during decode. Corrupt span records or malformed index keys are skipped with logging instead of panicking.

Historical State Requirement

Indexer::index_block requires historical state to be available for api.at_block(hash). If the node prunes historical state, the process exits with an explicit misconfiguration error and instructs the operator to run the node with --state-pruning archive-canonical.

Concurrency Model

The indexer uses one async loop that multiplexes several inputs with tokio::select!:

exit notifications
new chain head notifications from subxt
queued live-head indexing futures
queued backfill indexing futures
periodic stats logging

Two queues are maintained inside the loop:

backfill queue for descending historical blocks
live-head queue for ascending new blocks

queue_depth applies to both queues. Multiple outstanding block-indexing futures are allowed so the process can catch up against a fast-moving node.

Because futures can complete out of order, the code uses orphan maps:

orphans for backfill continuity
head_orphans for live-head continuity

Blocks only extend active spans once contiguity is satisfied.
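
The contiguity rule can be sketched for the live-head direction as follows. This is an assumption-laden illustration: HeadTracker and its fields are hypothetical names, and a BTreeSet of block numbers stands in for whatever the real head_orphans map stores.

```rust
use std::collections::BTreeSet;

/// Hypothetical tracker for ascending live-head blocks.
struct HeadTracker {
    /// Highest block number contiguous with the active span.
    span_end: u32,
    /// Finished blocks that cannot extend the span yet because a lower
    /// block is still in flight.
    head_orphans: BTreeSet<u32>,
}

impl HeadTracker {
    fn new(span_end: u32) -> Self {
        Self { span_end, head_orphans: BTreeSet::new() }
    }

    /// Record a finished block and return the (possibly advanced) span end.
    fn complete(&mut self, block: u32) -> u32 {
        if block <= self.span_end {
            return self.span_end; // already covered by the span
        }
        self.head_orphans.insert(block);
        // Absorb orphans for as long as the next expected block is present.
        while let Some(next) = self.span_end.checked_add(1) {
            if !self.head_orphans.remove(&next) {
                break;
            }
            self.span_end = next;
        }
        self.span_end
    }
}
```

Starting from span_end = 100, completing 102 and 103 leaves the span unchanged; completing 101 then advances it straight to 103. The backfill side would mirror this with a descending span start.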

Catchup And Rapidly Syncing Nodes

When a node jumps ahead quickly, the indexer does not poll for lag. It reacts to new announced head blocks, updates latest_seen_head, and fills the live-head queue immediately up to queue_depth.

This is the main reason --queue-depth is the primary tuning lever for catchup. The same mechanism keeps both head-following and backward backfill saturated.
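
A minimal sketch of that refill behavior, under stated assumptions: LiveHead and its methods are hypothetical, and plain block numbers stand in for the queued indexing futures.

```rust
use std::collections::VecDeque;

/// Hypothetical live-head scheduler state.
struct LiveHead {
    queue_depth: usize,
    latest_seen_head: u32,
    next_to_queue: u32,
    in_flight: VecDeque<u32>,
}

impl LiveHead {
    fn new(queue_depth: usize, indexed_up_to: u32) -> Self {
        Self {
            queue_depth,
            latest_seen_head: indexed_up_to,
            next_to_queue: indexed_up_to + 1,
            in_flight: VecDeque::new(),
        }
    }

    /// A new head was announced: remember it and top up the queue.
    fn on_new_head(&mut self, head: u32) -> Vec<u32> {
        self.latest_seen_head = self.latest_seen_head.max(head);
        self.fill()
    }

    /// A block finished indexing: free its slot and top up the queue.
    fn on_block_done(&mut self, block: u32) -> Vec<u32> {
        self.in_flight.retain(|b| *b != block);
        self.fill()
    }

    /// Start blocks until the queue is full or the head is reached.
    /// Returns the block numbers that were started.
    fn fill(&mut self) -> Vec<u32> {
        let mut started = Vec::new();
        while self.in_flight.len() < self.queue_depth
            && self.next_to_queue <= self.latest_seen_head
        {
            self.in_flight.push_back(self.next_to_queue);
            started.push(self.next_to_queue);
            self.next_to_queue += 1;
        }
        started
    }
}
```

No timer is involved: the queue is refilled only when a head is announced or a slot frees up, so throughput during catchup is bounded by queue_depth rather than by a polling interval.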

Span And Resume Semantics

Each stored span means a contiguous start..=end block range has been indexed for a specific config revision.

Span values persist revision boundary information derived from IndexSpec.spec_change_blocks.

When loading spans, the indexer may trim or discard stale sections if the active spec introduces new historical revision boundaries.
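
One plausible reading of the trim-or-discard rule is sketched below. Everything here is an assumption made for illustration: the Span fields, the reconcile function, and the rule that blocks at or after the first newly introduced boundary count as stale. The real semantics are defined by the span-loading code in src/indexer.rs.

```rust
/// Hypothetical persisted span: an inclusive block range plus the latest
/// revision boundary that was known when the range was indexed.
#[derive(Debug, Clone, PartialEq)]
struct Span {
    start: u32,
    end: u32,
    revision_block: u32,
}

/// Keep, trim, or discard a stored span given the active spec's
/// `spec_change_blocks`.
fn reconcile(span: Span, spec_change_blocks: &[u32]) -> Option<Span> {
    // First boundary that is newer than the one the span was indexed under.
    let stale_from = spec_change_blocks
        .iter()
        .copied()
        .filter(|b| *b > span.revision_block)
        .min();

    match stale_from {
        None => Some(span),                   // no new boundary at all
        Some(b) if b > span.end => Some(span), // boundary lies after the span
        Some(b) if b <= span.start => None,    // whole span is stale
        Some(b) => Some(Span { end: b - 1, ..span }), // keep only the older part
    }
}
```

Under these assumptions, a span 100..=200 indexed under boundary 0 is trimmed to 100..=149 when the active spec adds a boundary at block 150, and discarded entirely when the new boundary is at block 50.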

Important invariants:

the active in-memory span is not always persisted immediately
on shutdown or recoverable failure, save_current_span(...) persists progress
if the upstream block stream closes, the indexer returns a recoverable error so the supervisor loop can reconnect and resume

Changes to index_variant only trigger historical reindexing when the spec revision advances via spec_change_blocks.

RPC Reconnection And Spec Reload

src/main.rs wraps the indexer in a supervisor loop that handles transient RPC failures and accepted spec reloads without data loss.

Recoverable errors trigger reconnect with exponential backoff. Fatal errors cause process exit.
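
As a rough illustration of exponential backoff with a ceiling, the delay before reconnect attempt n can be computed as below. The function name, base delay, and cap are hypothetical; the actual values and any jitter used by the supervisor are not specified here.

```rust
use std::time::Duration;

/// Delay before reconnect attempt `attempt` (0-based): `base * 2^attempt`,
/// capped at `max`. Overflow at any step falls back to the cap.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}
```

With a 1 second base and a 60 second cap, attempts 0, 3, and 6 wait 1, 8, and 60 seconds respectively. A supervisor would typically reset the attempt counter after a successful connection.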

Supervisor behavior, in practice:

derive the effective RPC URL from the latest accepted config snapshot
attempt RPC connection
verify chain genesis hash
publish the fresh RPC handle into shared runtime state
spawn the indexer task
wait for signals, watcher updates, or indexer completion
on accepted spec reload, stop only the current indexer and restart it
on recoverable indexer failure, reconnect and resume
on fatal failure, log and exit

The spec watcher validates the entire updated file before publishing it and rejects changes to name or genesis_hash.

Synthetic Devnet Architecture

The local synthetic stack stays intentionally close to production architecture.

Layers:

runtime/ builds a small Polkadot SDK runtime WASM
polkadot-omni-node runs that runtime locally
src/synthetic_devnet.rs renders the matching index spec
seed_synthetic_runtime writes deterministic chain data
tests and benchmarks validate the public WebSocket API against the real indexer

This is not a mocked shortcut. It exercises the normal RPC, metadata decoding, indexing, persistence, and query surfaces end to end.

For proof-oriented tests, the local node is started in a libp2p-enabled mode instead of instant-seal dev mode so finalized proof verification can run against a more realistic finalized-chain setup.

Invariants

a database path belongs to exactly one chain identity
accepted spec reloads should not take down the public service
the synthetic test path should stay close to production architecture
public API correctness is validated end to end, not only through unit tests