Event Sourcing
Store the stream of events as the source of truth and derive current state by replaying.
Event sourcing stores the append-only stream of domain events as the source of truth. Instead of saving only "account balance = $50", the system saves the sequence "AccountOpened", "Deposited $100", "Withdrawn $30", and "Withdrawn $20". Current state is a derived view produced by replaying the events in order.
The problem: current state overwrites the story
Current-state systems are easy to query, but every update destroys context unless you build separate audit tables. If a booking changed from PENDING to CONFIRMED to CANCELLED, the final row may not tell you who changed it, which payment event arrived first, or what the system believed at 10:05. Event sourcing makes the history the primary data.
| Model | Source of truth | Strength | Cost |
|---|---|---|---|
| Current-state CRUD | Latest row values | Simple queries and updates | History must be bolted on separately |
| Audit tables | Latest row plus change log | Good compliance trail | Can drift from domain events and often lacks replay semantics |
| Event sourcing | Append-only event stream | Audit, replay, time travel, rebuildable projections | More design and operational complexity |
Event streams: append-only facts per aggregate
Events are usually grouped by aggregate: one account, order, booking, cart, or document. Each stream has a monotonically increasing version. Commands validate against the current aggregate state, then append one or more new events if the command is allowed.
account-42 event stream:
  v1 AccountOpened { ownerId: "u1", currency: "USD" }
  v2 MoneyDeposited { amountCents: 10000 }
  v3 MoneyWithdrawn { amountCents: 3000 }
  v4 MoneyWithdrawn { amountCents: 2000 }
replay(events):
  state = { opened: false, balanceCents: 0 }
  for event in events ordered by version:
    if event.type == "AccountOpened":
      state.opened = true
      state.currency = event.currency
    if event.type == "MoneyDeposited":
      state.balanceCents += event.amountCents
    if event.type == "MoneyWithdrawn":
      state.balanceCents -= event.amountCents
  return state
result: { opened: true, balanceCents: 5000, currency: "USD" }
Optimistic concurrency
Appending usually includes an expected version: "append MoneyWithdrawn only if the stream is still at version 4". If another writer appended version 5 first, the command reloads, checks business rules again, and retries or rejects. This protects invariants without locking the entire system.
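The expected-version check can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a real event store: `InMemoryEventStore`, `ConcurrencyError`, `balance_of`, and `withdraw` are hypothetical names, and a production store would have to perform the version check and the append atomically (for example with a unique constraint on stream id and version).

```python
from dataclasses import dataclass, field


class ConcurrencyError(Exception):
    """Raised when the stream has moved past the version the writer expected."""


@dataclass
class InMemoryEventStore:
    # Maps stream id -> list of events; an event's version is its index + 1.
    streams: dict = field(default_factory=dict)

    def read(self, stream_id):
        return list(self.streams.get(stream_id, []))

    def append(self, stream_id, expected_version, new_events):
        stream = self.streams.setdefault(stream_id, [])
        # Assumption: a real store makes this check-and-append atomic.
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"expected v{expected_version}, stream is at v{len(stream)}"
            )
        stream.extend(new_events)
        return len(stream)  # new stream version


def balance_of(events):
    """Derive the balance by replaying deposits and withdrawals."""
    balance = 0
    for event in events:
        if event["type"] == "MoneyDeposited":
            balance += event["amountCents"]
        elif event["type"] == "MoneyWithdrawn":
            balance -= event["amountCents"]
    return balance


def withdraw(store, stream_id, amount_cents, max_attempts=3):
    """Reload, re-check the business rule, and retry on a version conflict."""
    for _ in range(max_attempts):
        events = store.read(stream_id)
        if balance_of(events) < amount_cents:
            return "rejected"
        try:
            store.append(
                stream_id,
                len(events),  # expected version = what this writer just read
                [{"type": "MoneyWithdrawn", "amountCents": amount_cents}],
            )
            return "accepted"
        except ConcurrencyError:
            continue  # another writer got there first; reload and try again
    return "gave up"
```

A writer that read the stream at version 4 and tries to append after someone else has written version 5 gets a `ConcurrencyError` instead of silently overwriting; `withdraw` then re-evaluates the balance against the newer history.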
Snapshots: replay faster without changing the truth
Replaying ten events is cheap. Replaying ten million events for a hot aggregate on every request is not. A snapshot stores the derived state at a particular event version. To load the aggregate, read the latest snapshot and replay only events after it.
snapshot:
  aggregateId: account-42
  version: 100000
  state: { balanceCents: 918273, status: "OPEN" }
loadAggregate(account-42):
  snapshot = readLatestSnapshot(account-42)
  events = readEventsAfter(account-42, snapshot.version)
  return replayFrom(snapshot.state, events)
- Snapshot every N events: simple and predictable, common for aggregates with long histories.
- Snapshot by cost: create one when replay time exceeds a threshold rather than every fixed count.
- A snapshot is a cache, not the truth: if a snapshot is corrupted, you can rebuild it from the event stream.
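The loading rule above can be sketched in Python as follows. The names `apply`, `load_aggregate`, and `take_snapshot` are illustrative, and the sketch assumes the whole stream is available as a list whose index plus one is the event version; a real store would fetch only the events after the snapshot version instead of slicing.

```python
INITIAL_STATE = {"status": "NEW", "balanceCents": 0}


def apply(state, event):
    """Return the next state after one event, without mutating the input."""
    state = dict(state)
    if event["type"] == "AccountOpened":
        state["status"] = "OPEN"
    elif event["type"] == "MoneyDeposited":
        state["balanceCents"] += event["amountCents"]
    elif event["type"] == "MoneyWithdrawn":
        state["balanceCents"] -= event["amountCents"]
    return state


def load_aggregate(events, snapshot=None):
    """Start from the snapshot if there is one, then replay only newer events."""
    if snapshot is None:
        state, version = INITIAL_STATE, 0
    else:
        state, version = snapshot["state"], snapshot["version"]
    # Slicing stands in for readEventsAfter(aggregateId, snapshot.version).
    for event in events[version:]:
        state = apply(state, event)
        version += 1
    return state, version


def take_snapshot(events):
    """A snapshot is derived purely from events, so it can always be rebuilt."""
    state, version = load_aggregate(events)
    return {"version": version, "state": state}
```

Because `take_snapshot` uses nothing but the events, loading with or without a snapshot must give the same state; that equality is a useful invariant to assert in tests.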
CQRS and read models: write facts, query projections
Event-sourced write models are not optimized for arbitrary queries like "show the last 50 orders for this customer". The common pairing is CQRS: commands append events to the write model, and asynchronous projectors build read models tailored to screens, search, analytics, and APIs.
command API:
  CancelOrder(orderId)
    → validate by replaying order stream
    → append OrderCancelled v12
projectors consume events:
  OrderCancelled
    → update orders_by_customer table
    → remove shipment task
    → update support dashboard
    → publish integration event to Kafka
Those projection updates are often delivered through a log such as Kafka or through an outbox/CDC pipeline. Read models can lag behind the write stream, so the user experience must handle eventual consistency with loading states or read-your-writes shortcuts.
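As a rough illustration of a projector, the sketch below folds order events into an in-memory stand-in for the orders_by_customer table. The class name, the `OrderPlaced` event, and the field names `orderId` and `customerId` are assumptions made for the example; a real projector would write to a database and receive events from a log or outbox.

```python
from collections import defaultdict


class OrdersByCustomerProjection:
    """Query-optimized view: customer id -> {order id: status}."""

    def __init__(self):
        self.rows = defaultdict(dict)
        self.customer_of = {}  # order id -> customer id, learned from OrderPlaced

    def handle(self, event):
        if event["type"] == "OrderPlaced":
            self.customer_of[event["orderId"]] = event["customerId"]
            self.rows[event["customerId"]][event["orderId"]] = "PLACED"
        elif event["type"] == "OrderCancelled":
            customer_id = self.customer_of[event["orderId"]]
            self.rows[customer_id][event["orderId"]] = "CANCELLED"
        # Event types this view does not care about are ignored.

    def orders_for(self, customer_id):
        return dict(self.rows.get(customer_id, {}))


def rebuild(events):
    """Read models are disposable: replay every event into a fresh projection."""
    projection = OrdersByCustomerProjection()
    for event in events:
        projection.handle(event)
    return projection
```

The `rebuild` helper reflects the design choice behind CQRS here: the read model holds no information that is not in the events, so it can be dropped and regenerated when its shape needs to change.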
Benefits: audit, time travel, and debugging
Event sourcing shines where the history is valuable, not just the final value. Financial ledgers, booking systems, inventory movements, workflow engines, source control, collaboration logs, and payment systems all benefit from a trustworthy sequence of facts.
- Audit: every decision can point to the events that caused it, including who initiated commands and when.
- Time travel: replay up to event version 123 to answer what the system believed at that moment.
- Debugging: copy one aggregate stream into a test and reproduce a bug exactly.
- Integration: downstream services can consume the same domain events that created the state.
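Time travel is just replay with a cutoff. The sketch below reuses the account-42 stream from earlier and adds a hypothetical `up_to_version` parameter; stopping at a timestamp instead would work the same way if events carried one.

```python
ACCOUNT_42 = [
    {"type": "AccountOpened", "ownerId": "u1", "currency": "USD"},
    {"type": "MoneyDeposited", "amountCents": 10000},
    {"type": "MoneyWithdrawn", "amountCents": 3000},
    {"type": "MoneyWithdrawn", "amountCents": 2000},
]


def replay(events, up_to_version=None):
    """Rebuild state from events, optionally stopping at a given version."""
    state = {"opened": False, "balanceCents": 0}
    # Versions start at 1, matching v1..v4 in the stream listing.
    for version, event in enumerate(events, start=1):
        if up_to_version is not None and version > up_to_version:
            break
        if event["type"] == "AccountOpened":
            state["opened"] = True
            state["currency"] = event["currency"]
        elif event["type"] == "MoneyDeposited":
            state["balanceCents"] += event["amountCents"]
        elif event["type"] == "MoneyWithdrawn":
            state["balanceCents"] -= event["amountCents"]
    return state


print(replay(ACCOUNT_42, up_to_version=2)["balanceCents"])  # 10000
print(replay(ACCOUNT_42)["balanceCents"])                   # 5000
```

The same function doubles as a debugging aid: paste a problematic stream into a test and step the cutoff forward one version at a time to find the event where state first goes wrong.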
Downsides: schema evolution and operational complexity
Event sourcing is a commitment. Events are long-lived APIs to your own future code. You cannot casually rename fields or reinterpret old facts without migration or upcasting. Teams also have to operate projectors, handle duplicate delivery, monitor lag, and explain eventual consistency to product owners.
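To make upcasting concrete, here is a small sketch under invented assumptions: an old `MoneyDeposited` schema (version 1) stored a whole-dollar `amount`, while current code expects `amountCents` (version 2). The `schemaVersion` field, the `UPCASTERS` registry, and both schemas are hypothetical. Stored events are never modified; they are translated when read.

```python
def upcast_money_deposited_v1(event):
    """Translate the assumed v1 shape (whole dollars) to v2 (cents)."""
    upgraded = {k: v for k, v in event.items() if k != "amount"}
    upgraded["amountCents"] = event["amount"] * 100
    upgraded["schemaVersion"] = 2
    return upgraded


# Registry keyed by (event type, schema version) -> function to the next version.
UPCASTERS = {
    ("MoneyDeposited", 1): upcast_money_deposited_v1,
}


def upcast(event):
    """Apply upcasters step by step until the event is at the latest schema."""
    while True:
        # Assumption: events written before versioning existed count as v1.
        key = (event["type"], event.get("schemaVersion", 1))
        step = UPCASTERS.get(key)
        if step is None:
            return event
        event = step(event)
```

Chaining single-step upcasters keeps each migration small: adding a version 3 later means registering one more function from 2 to 3 rather than rewriting the path from 1.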
| Gotcha | Why it matters | Common mitigation |
|---|---|---|
| Event schema evolution | Old events must still replay years later | Version events and use upcasters |
| Projection lag | Read model may trail the write stream | Expose pending states and monitor consumer lag |
| Idempotency | Projectors may process the same event more than once | Store last processed event id per projector |
| Privacy deletion | Immutable logs conflict with erasure requirements | Encrypt PII separately or store references that can be scrubbed |
| Overuse | CRUD domains become unnecessarily hard | Use it only where history has product or compliance value |
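The idempotency mitigation in the table can be sketched as a projector that remembers the last event it applied. This assumes every event carries a monotonically increasing `position` and that delivery is in order but possibly repeated; `BalanceProjector`, `position`, and `accountId` are names chosen for the example. In a real database the read-model update and the checkpoint would be committed in the same transaction.

```python
SIGNS = {"MoneyDeposited": 1, "MoneyWithdrawn": -1}


class BalanceProjector:
    """Read model of account balances that tolerates redelivered events."""

    def __init__(self):
        self.balances = {}      # account id -> balance in cents
        self.last_position = 0  # checkpoint: highest position applied so far

    def handle(self, event):
        """Apply the event once; return False if it was already processed."""
        if event["position"] <= self.last_position:
            return False  # duplicate delivery, skip without side effects
        sign = SIGNS.get(event["type"])
        if sign is not None:
            account = event["accountId"]
            current = self.balances.get(account, 0)
            self.balances[account] = current + sign * event["amountCents"]
        # Advance the checkpoint even for event types this view ignores.
        self.last_position = event["position"]
        return True
```

Without the checkpoint, a redelivered `MoneyWithdrawn` would be subtracted twice and the read model would drift from the write stream with no error to signal it.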
- Event sourcing stores immutable domain events as the source of truth; current state is derived by replaying them in order.
- Snapshots speed up loading long streams but remain rebuildable cache, not the authoritative record.
- CQRS pairs naturally with event sourcing: append events on the write side and build query-optimized read models asynchronously.
- The biggest benefits are audit, time travel, reproducible debugging, and rebuilding new projections from old facts.
- The biggest costs are schema evolution, projection lag, idempotent consumers, privacy handling, and added mental model complexity.