Architecture
Acuity Index is a config-driven event indexer for Substrate chains. It decodes
runtime events with subxt, derives query keys from TOML config, stores index
entries in sled, and serves query access over WebSocket.
Main Components
- src/main.rs: startup, config loading, database initialization, watcher setup, metrics listener, and reconnect supervisor loop
- src/indexer.rs: indexing pipeline, resume logic, live-head tailing, event key derivation, notification fanout
- src/config.rs: TOML schema and mapping resolution
- src/websockets.rs: public API implementation, connection lifecycle, and optional finalized proof inclusion for GetEvents
- src/shared.rs: wire types, sled key layouts, shared runtime state, and finalized-mode gating for proof responses
- src/event_hydration.rs: decoded-event hydration plus finalized System.Events proof fetching
- src/config_gen.rs: live metadata to starter spec generation
- src/metrics.rs: metrics registry and HTTP export
- src/synthetic_devnet.rs: synthetic local chain helpers and shared test types
The repository also includes a checked-in example spec, plus node-backed integration and benchmarking paths. The synthetic runtime has its own dedicated page in Synthetic Devnet.
Startup Sequence
The normal startup path is:
- parse CLI args
- load the required index spec
- validate config and resolve runtime options
- open sled
- verify or initialize genesis_hash
- start long-lived tasks such as WebSocket serving and spec watching
- enter the RPC reconnect and indexing supervisor loop
The long-lived tasks created before entering the supervisor loop are important:
- a bounded subscription dispatcher task
- a single process-lifetime WebSocket listener
- an optional metrics listener serving /metrics
- an index-spec watcher for accepted file changes
The shared runtime state also tracks whether the current run is indexing finalized blocks so WebSocket proof responses stay aligned with the active mode.
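To make the "verify or initialize genesis_hash" step above concrete, here is a minimal sketch against sled. The key name, error type, and use of the default tree (rather than the root tree described below) are assumptions for illustration, not the exact code in src/main.rs.

```rust
use sled::Db;

// Sketch only: the real check lives in src/main.rs and stores the value in the
// root tree; key name and error handling here are illustrative.
fn verify_or_init_genesis_hash(db: &Db, chain_genesis: [u8; 32]) -> anyhow::Result<()> {
    match db.get(b"genesis_hash")? {
        // First run against this database: record the chain identity.
        None => {
            db.insert(b"genesis_hash", &chain_genesis[..])?;
            Ok(())
        }
        // Subsequent runs must be pointed at the same chain.
        Some(stored) if stored.as_ref() == &chain_genesis[..] => Ok(()),
        Some(_) => anyhow::bail!("database belongs to a different chain (genesis_hash mismatch)"),
    }
}
```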
Data Model
The sled database is organized into trees opened by Trees::open in
src/shared.rs:
- root: database-level values such as genesis_hash
- span: indexed block spans for resume and reindex logic
- variant: event references keyed by pallet and variant indices
- index: custom and built-in query keys with (block_number, event_index) suffixes
- events: decoded event JSON keyed by (block_number, event_index)
The two main query surfaces are:
- variant queries via the variant tree
- all other key queries via the index tree
Key::Custom covers every declared query key from [keys] in the TOML spec.
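As a rough illustration of this layout, a minimal wrapper in the style of Trees::open might open the same five trees. Tree names follow the list above; the exact key encodings are not shown, and the root tree may in reality be sled's default tree.

```rust
use sled::{Db, Tree};

// Illustrative stand-in for the struct opened by Trees::open in src/shared.rs.
struct Trees {
    root: Tree,    // database-level values such as genesis_hash
    span: Tree,    // indexed block spans for resume/reindex logic
    variant: Tree, // event references keyed by pallet and variant indices
    index: Tree,   // custom and built-in query keys
    events: Tree,  // decoded event JSON keyed by (block_number, event_index)
}

impl Trees {
    fn open(db: &Db) -> sled::Result<Self> {
        Ok(Self {
            root: db.open_tree("root")?,
            span: db.open_tree("span")?,
            variant: db.open_tree("variant")?,
            index: db.open_tree("index")?,
            events: db.open_tree("events")?,
        })
    }
}
```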
Indexing Flow
The main indexing loop lives in run_indexer in src/indexer.rs.
High-level behavior:
- determine the starting head block
- load previously indexed spans
- resume an existing tail span or index the current head immediately
- run backward backfill and live-head tracking concurrently
If finalized mode is enabled, the same finalized-only setting also governs
whether API callers can receive verifiable proof material for GetEvents.
Per-block indexing follows this shape:
- fetch block hash from RPC
- create a block-scoped subxt view with api.at_block(hash)
- fetch and iterate decoded runtime events
- for each event:
- read pallet name, event name, pallet index, and variant index
- optionally write a variant index record if index_variant is enabled
- decode fields schema-lessly into scale_value::Composite<()>
- derive indexing keys from explicit config
- write event references for each derived key
- store event refs locally and hydrate decoded event payloads from the node when queries or subscriptions need them
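A sketch of the key-write step above, recording one event reference per derived key. The suffix widths and byte order are assumptions about the layout in src/shared.rs, not the exact encoding.

```rust
use sled::Tree;

// Appends a big-endian (block_number, event_index) suffix so all references for
// one derived query key sort by block and event order. Widths are illustrative.
fn write_event_ref(
    index: &Tree,
    derived_key: &[u8],
    block_number: u32,
    event_index: u16,
) -> sled::Result<()> {
    let mut key = Vec::with_capacity(derived_key.len() + 6);
    key.extend_from_slice(derived_key);
    key.extend_from_slice(&block_number.to_be_bytes());
    key.extend_from_slice(&event_index.to_be_bytes());
    // The value can stay empty: the (block_number, event_index) suffix in the key
    // already identifies the event; the decoded payload is hydrated separately.
    index.insert(key, &[])?;
    Ok(())
}
```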
When GetEvents requests include Proofs = true, the WebSocket layer also asks
for one proof object per returned block. That proof is built from the block
header plus state_get_read_proof over the System.Events storage key.
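For orientation, the same proof material can be fetched with a raw JSON-RPC call. This sketch uses a plain jsonrpsee client and a placeholder local node URL; the indexer itself goes through its own RPC handle rather than this path.

```rust
use jsonrpsee::core::client::ClientT;
use jsonrpsee::rpc_params;
use jsonrpsee::ws_client::WsClientBuilder;

// Well-known System.Events storage key: twox128("System") ++ twox128("Events").
const SYSTEM_EVENTS_KEY: &str =
    "0x26aa394eea5630e07c48ae0c9558cef780d41e5e16056765bc8461851072c9d7";

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Placeholder endpoint; point this at the node the indexer is following.
    let client = WsClientBuilder::default().build("ws://127.0.0.1:9944").await?;
    // `None` means "at the current best block"; pass a block hash for a specific block.
    let at: Option<String> = None;
    let proof: serde_json::Value = client
        .request("state_getReadProof", rpc_params![vec![SYSTEM_EVENTS_KEY], at])
        .await?;
    println!("{proof}");
    Ok(())
}
```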
Malformed persisted sled data is handled defensively during decode. Corrupt span records or malformed index keys are skipped with logging instead of panicking.
Historical State Requirement
Indexer::index_block requires historical state to be available for
api.at_block(hash). If the node prunes historical state, the process exits with
an explicit misconfiguration error and instructs the operator to run the node
with --state-pruning archive-canonical.
Concurrency Model
The indexer uses one async loop that multiplexes several inputs with
tokio::select!:
- exit notifications
- new chain head notifications from subxt
- queued live-head indexing futures
- queued backfill indexing futures
- periodic stats logging
Two queues are maintained inside the loop:
- backfill queue for descending historical blocks
- live-head queue for ascending new blocks
queue_depth applies to both queues. Multiple outstanding block-indexing futures
are allowed so the process can catch up against a fast-moving node.
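A simplified sketch of that multiplexing shape, showing only the queue-related branches. The helper names and types are illustrative; exit notifications, the subxt head subscription, orphan handling, and span bookkeeping are omitted.

```rust
use std::time::Duration;

use futures::stream::{FuturesUnordered, StreamExt};
use tokio::sync::mpsc;

// Stand-in for the real per-block indexing future.
async fn index_block(block_number: u32) -> u32 {
    block_number
}

async fn index_loop(mut new_heads: mpsc::Receiver<u32>, mut backfill_next: u32, queue_depth: usize) {
    let mut live = FuturesUnordered::new();
    let mut backfill = FuturesUnordered::new();
    let mut stats = tokio::time::interval(Duration::from_secs(30));

    loop {
        // Keep the backward backfill queue saturated up to queue_depth.
        while backfill.len() < queue_depth && backfill_next > 0 {
            backfill.push(index_block(backfill_next));
            backfill_next -= 1;
        }

        tokio::select! {
            Some(head) = new_heads.recv() => {
                // A newly announced head goes onto the live-head queue.
                if live.len() < queue_depth {
                    live.push(index_block(head));
                }
            }
            Some(block) = live.next(), if !live.is_empty() => {
                // Completions can arrive out of order; contiguity is restored
                // through the orphan maps described below.
                let _ = block;
            }
            Some(block) = backfill.next(), if !backfill.is_empty() => {
                let _ = block;
            }
            _ = stats.tick() => {
                // Periodic stats logging.
            }
        }
    }
}
```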
Because futures can complete out of order, the code uses orphan maps:
- orphans for backfill continuity
- head_orphans for live-head continuity
Blocks only extend active spans once contiguity is satisfied.
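A sketch of that contiguity rule for the live-head (ascending) direction; the backfill case mirrors it in descending order, and the real orphan maps carry richer values than the bare block numbers used here.

```rust
use std::collections::BTreeSet;

// Live-head (ascending) contiguity sketch: a completed block either extends the
// active span or is parked until the blocks before it have completed too.
fn absorb_live_completion(span_end: &mut u32, parked: &mut BTreeSet<u32>, completed: u32) {
    if completed == *span_end + 1 {
        *span_end = completed;
        // Drain parked blocks that have become contiguous with the span.
        while parked.remove(&(*span_end + 1)) {
            *span_end += 1;
        }
    } else {
        parked.insert(completed);
    }
}
```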
Catchup And Rapidly Syncing Nodes
When a node jumps ahead quickly, the indexer does not poll for lag. It reacts to
new announced head blocks, updates latest_seen_head, and fills the live-head
queue immediately up to queue_depth.
This is the main reason --queue-depth is the primary tuning lever for catchup.
The same mechanism keeps both head-following and backward backfill saturated.
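Sketched as a handler with illustrative field names (none of these are the real identifiers in src/indexer.rs), the reaction to an announced head looks roughly like this:

```rust
// Illustrative bookkeeping only; the real indexer keeps this state inside run_indexer.
struct HeadState {
    latest_seen_head: u32,
    next_live_block: u32,
    live_queue_len: usize,
    queue_depth: usize,
}

impl HeadState {
    // Stand-in for pushing an index_block future onto the live-head queue.
    fn queue_block(&mut self, _block: u32) {
        self.live_queue_len += 1;
    }

    fn on_new_head(&mut self, announced: u32) {
        self.latest_seen_head = self.latest_seen_head.max(announced);
        // Fill the live-head queue immediately, up to queue_depth outstanding blocks.
        while self.live_queue_len < self.queue_depth
            && self.next_live_block <= self.latest_seen_head
        {
            let next = self.next_live_block;
            self.queue_block(next);
            self.next_live_block += 1;
        }
    }
}
```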
Span And Resume Semantics
Each stored span means a contiguous start..=end block range has been indexed
for a specific config revision.
Span values persist revision boundary information derived from
IndexSpec.spec_change_blocks.
When loading spans, the indexer may trim or discard stale sections if the active spec introduces new historical revision boundaries.
Important invariants:
- the active in-memory span is not always persisted immediately
- on shutdown or recoverable failure, save_current_span(...) persists progress
- if the upstream block stream closes, the indexer returns a recoverable error so the supervisor loop can reconnect and resume
Changes to index_variant only trigger historical reindexing
when the spec revision advances via spec_change_blocks.
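A minimal sketch of a span record and its persistence. The key/value encoding and the exact shape of the revision information are assumptions rather than the real layout in src/shared.rs.

```rust
use sled::Tree;

// Illustrative span record; the real layout also carries revision boundary
// information derived from IndexSpec.spec_change_blocks.
struct Span {
    start: u32,
    end: u32,
    revision: u32,
}

// Hypothetical stand-in for save_current_span(...): keyed by start block,
// value holds end block and config revision, all big-endian.
fn save_current_span(span_tree: &Tree, span: &Span) -> sled::Result<()> {
    let mut value = Vec::with_capacity(8);
    value.extend_from_slice(&span.end.to_be_bytes());
    value.extend_from_slice(&span.revision.to_be_bytes());
    span_tree.insert(span.start.to_be_bytes(), value)?;
    Ok(())
}
```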
RPC Reconnection And Spec Reload
src/main.rs wraps the indexer in a supervisor loop that handles transient RPC
failures and accepted spec reloads without data loss.
Recoverable errors trigger reconnect with exponential backoff. Fatal errors cause process exit.
Supervisor behavior, in practice:
- derive the effective RPC URL from the latest accepted config snapshot
- attempt RPC connection
- verify chain genesis hash
- publish the fresh RPC handle into shared runtime state
- spawn the indexer task
- wait for signals, watcher updates, or indexer completion
- on accepted spec reload, stop only the current indexer and restart it
- on recoverable indexer failure, reconnect and resume
- on fatal failure, log and exit
The spec watcher validates the entire updated file before publishing it and
rejects changes to name or genesis_hash.
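Reduced to its reconnect behavior, the supervisor has roughly the following shape. Spec-reload handling, signals, and the shared-state plumbing from src/main.rs are omitted, and the backoff bounds are illustrative.

```rust
use std::time::Duration;

// Outcome of one indexer run, as seen by the supervisor (illustrative type).
enum RunOutcome {
    Recoverable(String),
    Fatal(String),
}

// Stand-in for connecting to the node, verifying genesis_hash, and running the
// indexer once until it stops.
async fn run_once(rpc_url: &str) -> RunOutcome {
    let _ = rpc_url;
    RunOutcome::Recoverable("connection closed".into())
}

// Reconnect with exponential backoff on recoverable errors; exit on fatal ones.
async fn supervisor_loop(rpc_url: &str) {
    let mut backoff = Duration::from_secs(1);
    loop {
        match run_once(rpc_url).await {
            RunOutcome::Recoverable(err) => {
                eprintln!("recoverable indexer error: {err}; reconnecting in {backoff:?}");
                tokio::time::sleep(backoff).await;
                backoff = (backoff * 2).min(Duration::from_secs(60));
            }
            RunOutcome::Fatal(err) => {
                eprintln!("fatal error: {err}");
                break;
            }
        }
    }
}
```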
Synthetic Devnet Architecture
The local synthetic stack stays intentionally close to production architecture.
Layers:
- runtime/ builds a small Polkadot SDK runtime WASM
- polkadot-omni-node runs that runtime locally
- src/synthetic_devnet.rs renders the matching index spec
- seed_synthetic_runtime writes deterministic chain data
- tests and benchmarks validate the public WebSocket API against the real indexer
This is not a mocked shortcut. It exercises the normal RPC, metadata decoding, indexing, persistence, and query surfaces end to end.
For proof-oriented tests, the local node is started in a libp2p-enabled mode instead of instant-seal dev mode so finalized proof verification can run against a more realistic finalized-chain setup.
Invariants
- a database path belongs to exactly one chain identity
- accepted spec reloads should not take down the public service
- the synthetic test path should stay close to production architecture
- public API correctness is validated end to end, not only through unit tests