The Demo: Try to Lose a Message
Every messaging system claims durability. Claims are cheap. Vulkan ships a demo whose entire purpose is to let you attack it — because durability is the one product quality you can verify with your own hands in five minutes, and we’d rather you trust the experiment than the brochure.
go install github.com/agentstax/vulkan/cmd/vulkan@latestvulkan demo --chaosOne command spins up a disposable Postgres (Docker), a stream called
orders, three workers, and starts producing 5,000 messages. The scoreboard
at the bottom tracks three numbers — produced, processed, lost — and the
demo prints suggestions for ways to hurt it. Your goal is to make lost read
anything but zero.
The attacks
Section titled “The attacks”-
Kill a worker mid-message.
Terminal window kill -9 $(pgrep -f vulkan-worker-2)Worker 2 dies holding claimed messages, with no chance to clean up — no shutdown hook, no rollback call, nothing. Watch the scoreboard: the messages it held go back to
readywhen their leases expire, the other workers absorb them,loststays 0.There is no recovery code running. The transaction rollback is the recovery, and the lease is the failover — both properties of the data, not of any process that can die. How leases work →
-
Crash mid-commit, deterministically.
Suspect the
killtiming was lucky? Make the worker betray you on purpose:Terminal window vulkan demo worker --crash-after 3 # os.Exit(1) mid-transaction, every 3rd messageA worker that always dies at the worst possible moment, forever. The stream still drains.
lost: 0. -
Fail messages randomly.
Terminal window vulkan demo worker --fail-rate 0.3 --retries 3Now 30% of handler calls return errors. Watch retries climb with exponential backoff, and watch stubborn messages land in the dead-letter queue — visible, counted, and queryable, not discarded:
SELECT count(*) FROM vulkan.deliveries WHERE status = 'dead';Dead letters aren’t lost messages. They’re messages waiting for you, with their full payload and error history. Redrive them →
-
Go for the database.
Terminal window docker restart vulkan-demo-postgresThe one everyone assumes is cheating. Workers lose their connections, back off, and reattach when Postgres returns; every in-flight message was either committed (done, exactly as recorded) or not (claimable again). Postgres’s crash recovery has been hardened for thirty years — that’s the point of building on it instead of beside it.
-
The cruelest one: the zombie worker.
Terminal window vulkan demo worker --sleep 30s # processing takes longer than the leaseThe worker isn’t dead — just slow. Its lease expires, the reaper re-readies the message, another worker processes it… and then the zombie finishes too. The same message processed twice. The scoreboard calls it out.
This isn’t a Vulkan bug; it’s the honest physics of distributed delivery — the same trade SQS and Pulsar make. Vulkan induces it in the demo so you design idempotent handlers knowing it’s real, instead of discovering it in production. At-least-once, and proud of it →
The finale: three groups, one log
Section titled “The finale: three groups, one log”vulkan demo synthesisThe demo that shows why Vulkan is a platform, not a queue. On a single stream, simultaneously:
- group A retries message #5 three times and dead-letters it,
- group B processes message #5 without incident,
- group C starts from offset 0 and replays the whole history.
Per-group lifecycle, shared immutable log, independent cursors — the combination that normally requires running a queue and a log and glue. Why that’s hard everywhere else →
The scoreboard never lies
Section titled “The scoreboard never lies”The demo’s counters aren’t instrumentation we control — they’re SQL against
the same tables your application would use, and the demo prints every query
it runs so you can run them yourself in psql mid-attack:
SELECT (SELECT count(*) FROM vulkan.events) AS produced, (SELECT count(*) FROM vulkan.deliveries WHERE status = 'done') AS processed, (SELECT count(*) FROM vulkan.deliveries WHERE status = 'dead') AS dead_lettered;-- "lost" is produced minus everything accounted for. It will be zero.Convinced enough to wire it into real code? Quickstart →