talea

Architecture & design

Why talea is shaped the way it is. This is the rationale document — for the wire contract see the HTTP API reference; for hands-on use see the tutorial.

The problem

A ledger for cryptocurrency operations has to stay correct under three hostile conditions at once:

Unreliable networks. Clients time out, retry, and crash mid-request. A naive ledger double-posts a deposit because the first request succeeded but its response was lost.
Concurrency. Many writers hit the same book at the same time — from multiple server instances. Orderings must stay consistent and auditable after the fact.
Audit requirements. “What was the balance at 14:02 UTC?” must have exactly one answer, derivable from records that cannot be silently rewritten.

Most of talea’s design falls out of refusing to compromise on any of these.

The approach

Event sourcing with projections in the same transaction

Every write is an event in an append-only log. Balances, postings, and registries are projections of that log — but unlike classic eventually-consistent event sourcing, talea maintains the projections in the same database transaction that appends the event.

            one DB transaction
  ┌─────────────────────────────────────┐
  │ append event (seq N)                │
  │ insert postings                     │
  │ upsert balances (+ min_balance gate)│
  │ record idempotency key              │
  └─────────────────────────────────────┘
            commit → NOTIFY subscribers

You get event-sourced auditability without read-your-writes anomalies: the moment a commit returns, every read reflects it.

Trade-off: projections-in-transaction serializes writes per book and makes commits heavier than a bare append. talea accepts the write ceiling and addresses it with group commit (below) rather than giving up immediate consistency.
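
The shape of that single transaction can be sketched with Python's built-in sqlite3 module. This is an illustrative sketch, not talea's schema or code: the table names, the `commit_event` function, and the signed-amount balance column are all assumptions made for the example, and sequence assignment and the idempotency key are left out to keep it short.

```python
import sqlite3

# Hypothetical schema for illustration only; talea's real tables may differ.
SCHEMA = """
CREATE TABLE events   (book TEXT, seq INTEGER, payload TEXT, PRIMARY KEY (book, seq));
CREATE TABLE postings (book TEXT, seq INTEGER, account TEXT, amount INTEGER);
CREATE TABLE balances (book TEXT, account TEXT, balance INTEGER,
                       PRIMARY KEY (book, account));
"""

def open_db():
    # isolation_level=None puts sqlite3 in autocommit mode, so the explicit
    # BEGIN / COMMIT / ROLLBACK statements below define the transaction.
    db = sqlite3.connect(":memory:", isolation_level=None)
    db.executescript(SCHEMA)
    return db

def commit_event(db, book, seq, payload, postings, min_balance=0):
    """Append one event and update its projections in a single transaction."""
    db.execute("BEGIN IMMEDIATE")
    try:
        db.execute("INSERT INTO events VALUES (?, ?, ?)", (book, seq, payload))
        for account, amount in postings:
            db.execute("INSERT INTO postings VALUES (?, ?, ?, ?)",
                       (book, seq, account, amount))
            db.execute(
                "INSERT INTO balances VALUES (?, ?, ?) "
                "ON CONFLICT (book, account) "
                "DO UPDATE SET balance = balance + excluded.balance",
                (book, account, amount),
            )
        # min_balance gate, checked once all postings of the event are applied.
        overdrawn = db.execute(
            "SELECT account FROM balances WHERE book = ? AND balance < ?",
            (book, min_balance),
        ).fetchall()
        if overdrawn:
            raise ValueError(f"min_balance violated: {overdrawn[0][0]}")
        db.execute("COMMIT")
    except Exception:
        # Event, postings, and balance changes all disappear together.
        db.execute("ROLLBACK")
        raise
```

A rejected draft leaves no trace: after depositing 100 to a hypothetical `alice` account, an attempt to withdraw 150 raises, and the event log, postings, and balance are exactly as they were before the attempt.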

Gapless per-book sequences via a single write arbiter

Each book has a dense sequence 1..N with no gaps, ever. On the database backends, the implementation is a per-book counter row that the committing transaction locks and increments. That row lock is the write arbiter: it serializes commits to a book across threads, processes, and server instances, because the database is the single point of coordination. The embedded append-log backend gets the same property from a different arbiter: one in-process writer task per book assigns sequences (single-process by construction, enforced with a directory lock).

Trade-off: one hot book = one lock = a throughput ceiling per book. This is deliberate — gaplessness is an audit property worth more than write parallelism within a book. Books are the scaling unit: different books commit fully in parallel.
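
A minimal sketch of why a counter row inside the commit transaction yields gapless sequences, again using sqlite3 as a stand-in. The `book_counters` table, the `append` function, and the `fail` flag are hypothetical names for this example. SQLite has no row locks, so `BEGIN IMMEDIATE` (a database-wide write lock) plays the arbiter here; on Postgres the increment itself would hold a row-level lock on the counter row until the transaction ends.

```python
import sqlite3

def open_db():
    db = sqlite3.connect(":memory:", isolation_level=None)
    db.executescript("""
        CREATE TABLE book_counters (book TEXT PRIMARY KEY, last_seq INTEGER NOT NULL);
        CREATE TABLE events (book TEXT, seq INTEGER, payload TEXT,
                             PRIMARY KEY (book, seq));
    """)
    return db

def append(db, book, payload, fail=False):
    """Assign the next sequence for `book` and append the event atomically."""
    db.execute("BEGIN IMMEDIATE")  # stand-in for locking the counter row
    try:
        # Create the counter at 1 for a new book, otherwise increment it.
        db.execute(
            "INSERT INTO book_counters VALUES (?, 1) "
            "ON CONFLICT (book) DO UPDATE SET last_seq = last_seq + 1",
            (book,),
        )
        (seq,) = db.execute(
            "SELECT last_seq FROM book_counters WHERE book = ?", (book,)
        ).fetchone()
        db.execute("INSERT INTO events VALUES (?, ?, ?)", (book, seq, payload))
        if fail:
            raise RuntimeError("simulated failure after the sequence was assigned")
        db.execute("COMMIT")
        return seq
    except Exception:
        # The increment rolls back with everything else, so no gap appears.
        db.execute("ROLLBACK")
        raise
```

Because the increment is part of the transaction, a failed commit returns its sequence number: the next successful commit reuses it. A database sequence object would not give this property, since sequence values are not rolled back.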

Idempotency keys make every retry safe

Every transaction carries a caller-supplied idempotency key, unique per book. Replaying a key returns the original commit ("deduplicated": true) instead of posting again. The key is recorded inside the commit transaction, so there is no window where a retry can double-post.

This is the load-bearing decision for the whole system’s failure story: timeouts (408 timeout), queue backpressure (429 overloaded), and admission shedding (503 overloaded) are all safe to answer with “retry the same request”. Overload degrades to “retry later”, never to “maybe applied twice”. The SDK encodes this: its retry policy retries 503, 429, 408, and transport errors automatically.

The CLI never auto-generates keys (--idem is required): a generated key would defeat the purpose, because a retried CLI invocation would generate a different key.
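
The lost-response case can be sketched in a few lines. Everything here is a toy model under stated assumptions: `Book`, `flaky_post`, and `post_with_retry` are hypothetical names, the `seq` field in the response is assumed for illustration, and an in-memory dict stands in for the key being recorded inside the commit transaction. Only the `deduplicated` flag comes from the behavior described above.

```python
from dataclasses import dataclass, field

class LostResponse(Exception):
    """The request was applied, but the caller never saw the answer."""

@dataclass
class Book:
    events: list = field(default_factory=list)
    seq_by_key: dict = field(default_factory=dict)  # idempotency key -> seq

    def post(self, idem_key, payload):
        # Replaying a key returns the original commit instead of posting again.
        if idem_key in self.seq_by_key:
            return {"seq": self.seq_by_key[idem_key], "deduplicated": True}
        seq = len(self.events) + 1
        self.events.append(payload)
        self.seq_by_key[idem_key] = seq  # recorded together with the event
        return {"seq": seq, "deduplicated": False}

def flaky_post(book, idem_key, payload, attempts):
    """Transport that applies the first request but loses its response."""
    attempts.append(idem_key)
    result = book.post(idem_key, payload)
    if len(attempts) == 1:
        raise LostResponse()
    return result

def post_with_retry(send, max_attempts=3):
    """Retry the same request; safe only because the key never changes."""
    for _ in range(max_attempts):
        try:
            return send()
        except LostResponse:
            continue
    raise RuntimeError("gave up after retries")

book = Book()
attempts = []
result = post_with_retry(
    lambda: flaky_post(book, "deposit-001", {"amount": 100}, attempts)
)
```

The caller made two attempts, the book holds one event, and the second response reports the original commit with `deduplicated` set. Had the retry carried a freshly generated key, the deposit would have been posted twice.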

Balances are normal-side-adjusted projections

Raw storage is debits minus credits. Reporting adjusts by the account’s normal side, so a liability holding 100 reads +100 rather than -100. The payoff is uniform constraint semantics: min_balance: 0 means “never overdraw” for every account kind, checked at commit time inside the transaction.
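
As a small worked sketch of the adjustment: the account kinds below follow standard double-entry convention (assets and expenses are debit-normal; liabilities, equity, and revenue are credit-normal) and are an assumption, as are the function names. talea's actual set of account kinds may differ.

```python
# Assumed account kinds, following the usual double-entry convention.
DEBIT_NORMAL = {"asset", "expense"}
CREDIT_NORMAL = {"liability", "equity", "revenue"}

def reported_balance(kind, debits, credits):
    """Adjust the raw debits-minus-credits value by the account's normal side."""
    raw = debits - credits
    if kind in DEBIT_NORMAL:
        return raw
    if kind in CREDIT_NORMAL:
        return -raw
    raise ValueError(f"unknown account kind: {kind}")

def violates_min_balance(kind, debits, credits, min_balance=0):
    """One rule for every kind: the reported balance must not drop below the floor."""
    return reported_balance(kind, debits, credits) < min_balance
```

A liability credited with 100 has a raw value of -100 and a reported balance of +100. Debiting it by 150 would report -50, which the same `min_balance` of 0 rejects, exactly as it would reject an overdrawn asset.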

`as_of` replays by commit time, on the database clock

Point-in-time queries (balance, trial-balance with ?as_of=) replay postings by commit time. Commit timestamps come from the database clock (clock_timestamp() on Postgres), captured under the counter lock — so they are monotonic with respect to seq within a book even when several server instances with skewed clocks share one database. Timestamps are truncated to microseconds because that is what the database stores; what you read back is exactly what was committed.

Trade-off: commit time is not business time. occurred_at exists for business time, but as_of deliberately keys on commit order — the only order the ledger can guarantee was observed.
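
To make the distinction concrete, here is a toy replay over three postings. The record layout, the sample data, and the inclusive comparison (`committed_at <= as_of`) are assumptions for illustration; the text above does not specify whether the boundary is inclusive.

```python
from datetime import datetime, timezone

def utc(hour, minute):
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)

# Hypothetical postings. Note seq 3: business time 13:59, committed at 14:05.
POSTINGS = [
    {"seq": 1, "account": "alice", "amount": 100,
     "committed_at": utc(13, 50), "occurred_at": utc(13, 45)},
    {"seq": 2, "account": "alice", "amount": -30,
     "committed_at": utc(14, 1), "occurred_at": utc(14, 0)},
    {"seq": 3, "account": "alice", "amount": 50,
     "committed_at": utc(14, 5), "occurred_at": utc(13, 59)},
]

def balance_as_of(postings, account, as_of):
    """Replay by commit time; occurred_at plays no part in the answer."""
    return sum(
        p["amount"]
        for p in postings
        if p["account"] == account and p["committed_at"] <= as_of
    )

at_1402 = balance_as_of(POSTINGS, "alice", utc(14, 2))  # 100 - 30
```

The third posting claims a business time before 14:02, but the ledger had not observed it by then, so the 14:02 balance is 70 and stays 70 no matter what is committed later.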

Per-book write router with group commit

Inside one server instance, posts to the same book are routed to a per-book committer task with a bounded queue (TALEA_WRITE_QUEUE_DEPTH, default 256). The committer drains up to TALEA_WRITE_BATCH_MAX (default 64) drafts and commits them in one storage transaction — one counter-lock acquisition amortized across the batch.

posts ──► per-book queue ──► committer ──► group commit (1 txn, N drafts)
              │ full?
              └──► 429 overloaded + Retry-After (caller retries, same key)

A full queue is backpressure, not failure. Validation rejections inside a batch are isolated with savepoints so one bad draft doesn’t poison its batchmates.
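
The queue-and-drain mechanics can be sketched without any concurrency. `BookRouter`, `submit`, and `drain_once` are hypothetical names, the return strings are placeholders for the real HTTP responses, and savepoint isolation is omitted; the sketch only shows the bounded queue and the batch cap.

```python
import queue

class BookRouter:
    """Toy per-book router: bounded queue in front of a batching committer."""

    def __init__(self, depth=256, batch_max=64):
        self._queue = queue.Queue(maxsize=depth)
        self._batch_max = batch_max
        self.batches = []  # each entry stands in for one storage transaction

    def submit(self, draft):
        try:
            self._queue.put_nowait(draft)
        except queue.Full:
            # Backpressure: the caller retries later with the same key.
            return "429 overloaded"
        return "queued"

    def drain_once(self):
        """Take up to batch_max drafts and commit them as one batch."""
        batch = []
        while len(batch) < self._batch_max:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        if batch:
            self.batches.append(batch)
        return len(batch)

router = BookRouter(depth=3, batch_max=2)
outcomes = [router.submit({"idem": f"k{i}"}) for i in range(4)]
first = router.drain_once()
second = router.drain_once()
```

With a depth of 3 and a batch cap of 2, the fourth submission is turned away, and the three queued drafts commit as a batch of two followed by a batch of one. Draining frees capacity, so a retry of the rejected draft would then be accepted.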

Admission control, and why `/health` is inside it

A global in-flight limit (TALEA_MAX_INFLIGHT, default 256) sheds excess load immediately as 503 + Retry-After: 1 instead of queueing toward collapse. /health deliberately sits inside that limit: a 503 from /health is a real load signal. Wire it to load-balancer readiness (stop sending traffic) and not liveness (restart the instance), or saturation will restart healthy instances exactly when you need them.
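
A minimal sketch of shed-don't-queue admission, assuming a semaphore-based limit. `AdmissionGate` and its `(status, headers, body)` tuple are invented for the example and say nothing about how the server actually implements the limit.

```python
import threading

class AdmissionGate:
    """Toy global in-flight limit: excess requests are shed, never queued."""

    def __init__(self, max_inflight=256):
        self._slots = threading.BoundedSemaphore(max_inflight)

    def handle(self, handler):
        # Non-blocking acquire: with no free slot, answer immediately.
        if not self._slots.acquire(blocking=False):
            return 503, {"Retry-After": "1"}, None
        try:
            return 200, {}, handler()
        finally:
            self._slots.release()

gate = AdmissionGate(max_inflight=1)

def health():
    return "ok"

# While one request holds the only slot, a health check arriving
# "during" it goes through the same gate and is shed.
during = gate.handle(lambda: gate.handle(health))
after = gate.handle(health)
```

The nested call models a health probe arriving while the instance is saturated: it gets a 503 like any other request. Once the slot is released the probe succeeds again, which is the behavior a readiness check wants.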

Three stores, one executable contract

Store is the persistence trait; Postgres (production, LISTEN/NOTIFY subscriptions), SQLite (embedded, in-process broadcast), and the append-log store (embedded, group commit with strict ack-after-fsync — see talea-store-log) all implement it. One backend-agnostic conformance suite runs against all three — idempotency, gapless sequences, constraint enforcement, pagination, subscribe catch-up — so “the stores behave identically” is a tested claim, not a hope. The known divergence is documented: the embedded backends’ subscriptions only see commits from the same process.
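
The idea of one suite run against every backend can be illustrated with two deliberately different toy stores. None of this mirrors the real Store trait: `DictStore`, `ListStore`, the `append` and `sequences` methods, and the `(seq, deduplicated)` return shape are all assumptions for the sketch.

```python
class DictStore:
    """Toy backend keyed by book."""

    def __init__(self):
        self._events = {}  # book -> list of (seq, payload)
        self._keys = {}    # (book, idempotency key) -> seq

    def append(self, book, key, payload):
        if (book, key) in self._keys:
            return self._keys[(book, key)], True
        log = self._events.setdefault(book, [])
        seq = len(log) + 1
        log.append((seq, payload))
        self._keys[(book, key)] = seq
        return seq, False

    def sequences(self, book):
        return [seq for seq, _ in self._events.get(book, [])]

class ListStore:
    """Toy backend that keeps one flat list of rows."""

    def __init__(self):
        self._rows = []  # (book, seq, key, payload)

    def append(self, book, key, payload):
        for row_book, seq, row_key, _ in self._rows:
            if (row_book, row_key) == (book, key):
                return seq, True
        seq = 1 + sum(1 for row_book, *_ in self._rows if row_book == book)
        self._rows.append((book, seq, key, payload))
        return seq, False

    def sequences(self, book):
        return [seq for row_book, seq, _, _ in self._rows if row_book == book]

def check_conformance(make_store):
    """Backend-agnostic checks: idempotency and gapless per-book sequences."""
    store = make_store()
    assert store.append("b1", "k1", "x") == (1, False)
    assert store.append("b1", "k1", "x") == (1, True)   # replay is deduplicated
    assert store.append("b1", "k2", "y") == (2, False)
    assert store.append("b2", "k1", "z") == (1, False)  # keys are unique per book
    assert store.sequences("b1") == [1, 2]
    return True

results = [check_conformance(factory) for factory in (DictStore, ListStore)]
```

The suite takes a factory rather than a store, so each run starts from an empty backend and no test can depend on another's leftovers.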

Trait symmetry: `LedgerService` and `TaleaClient`

LedgerApi is implemented twice: by LedgerService (in-process, over a Store) and by TaleaClient (remote, over HTTP). Code written against the trait runs against either — a test proves it. This keeps the HTTP layer honest (it can’t grow semantics the trait doesn’t have) and makes the SDK’s behavior the server’s behavior.

Alternatives considered

Optimistic insert with unique-violation retry instead of the counter-row lock: avoids the lock but produces gaps on conflict and makes cross-instance commit timestamps racy. Rejected — gaplessness is the audit anchor. (A per-book writer actor with the DB lock retained as arbiter was spec’d as the upgrade path and later shipped as the write router.)
Async projections (classic CQRS): higher write throughput, but introduces read-your-writes anomalies and a reconciliation surface. Rejected — a ledger’s reads must be trustworthy immediately.
Server-generated idempotency keys: removes a caller burden but defeats retry safety, since a retry would carry a fresh key. Rejected at every layer including the CLI.

See also

HTTP API reference — the wire contract these decisions produce
How to run on Postgres — the operational consequences (LB config, PgBouncer, pool sizing)
Why the append-log store works this way — how the embedded backend meets the same invariants without a database
README — Design notes and limits — accepted limitations in brief