From Dual-Write to Outbox: A Log of Idempotent Consumption and Field-Level Encryption
Lost events from dual-write pushed me to a transactional outbox, and the outbox's at-least-once delivery pushed me to idempotent consumption. Then I encrypted the GDPR fields at the field level with x-gdpr-sensitive. Notes from a real project.
When I resolved the invoice number collision in that same e-invoice integration, I described moving a synchronous flow onto RabbitMQ and into async. After that migration, a quieter bug I had missed was still there: some invoices were being written to the database but their event never made it onto the queue. The consumer never saw them, the user said “I issued it,” yet there was no trace of it on the other side. One or two a day, always at night, always during a deploy or a network blip.
Once I laid the picture out, the cause was obvious: the INSERT into the database and the publish to the queue were two separate systems, with no guarantee between them. The write would succeed, and right after it the publish would drop on a network hiccup. The local transaction had committed; the message had never left. This is the dual-write problem: a single unit of work writing to two separate data stores in a non-atomic way.
This post is a log of how I solved that with a transactional outbox, how I then suppressed the at-least-once duplicates that the fix introduced by making consumption idempotent, and finally how I encrypted the personal data flowing through the event at the field level. I wrote up the deep architectural side separately (link below); here I want to tell the “in what order did I actually make these decisions in a real project?” side.
The dual-write problem: two writes, zero guarantee
The crux in one sentence: you cannot stretch a single transaction across two systems. The classic code that writes to MySQL and then publishes to RabbitMQ looks like this:
DB::transaction(function () use ($invoice) {
    $invoice->save();                  // 1) write to MySQL
    $this->publishToRabbit($invoice);  // 2) publish to the queue
});
This code is dangerous because it works most of the time. But there are two distinct failure modes:
- save() succeeds, publishToRabbit() drops on a network error → the invoice is in the database, the event isn’t. Lost event.
- publishToRabbit() succeeds, then the transaction rolls back for some other reason → the event went out but there’s no counterpart in the database. Phantom event.
Wrapping it in DB::transaction doesn’t save you either, because RabbitMQ isn’t part of that transaction; commit/rollback only wraps the MySQL side. The practical way to keep two systems in one atomic step is to reduce the write to a single system.
The fix: write to the database first, let a separate process publish
The transactional outbox pattern is a simple idea: instead of pushing the message straight to the queue, write it as a row into an outbox table inside the same database transaction. The invoice and the event either both exist or both don’t, in the same commit — dual-write collapses into a single write.
DB::transaction(function () use ($invoice) {
    $invoice->save();

    OutboxMessage::create([
        'id'         => (string) Str::uuid(), // the event's idempotency key, stored as a plain string
        'topic'      => 'invoice.issued',
        'payload'    => $this->buildPayload($invoice),
        'status'     => 'pending',
        'created_at' => now(),
    ]);
});
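For reference, the table behind that create() call could look roughly like the migration below. This is a sketch, not the project's actual schema: the column names mirror the array keys above, while the table name outbox_messages (Eloquent's default for an OutboxMessage model) and the composite index are assumptions.

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

// Hypothetical migration: columns follow the OutboxMessage::create() call;
// the table name and the index are assumptions, not taken from the project.
return new class extends Migration
{
    public function up(): void
    {
        Schema::create('outbox_messages', function (Blueprint $table) {
            $table->uuid('id')->primary();                // doubles as the event's idempotency key
            $table->string('topic');                      // e.g. 'invoice.issued'
            $table->json('payload');
            $table->string('status')->default('pending'); // stays 'pending' until the relay publishes it
            $table->timestamp('created_at');
            $table->index(['status', 'created_at']);      // lets the relay scan pending rows in order
        });
    }

    public function down(): void
    {
        Schema::dropIfExists('outbox_messages');
    }
};
```

With a schema like this, the model would presumably also need $incrementing = false, $keyType = 'string', an array cast on payload, mass assignment allowed for these columns, and $timestamps = false, since there is no updated_at column.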
A separate relay (publisher) does the queue push: it reads the pending rows, publishes them to RabbitMQ, and marks them published on success. I covered the basics of using RabbitMQ in an earlier post on RabbitMQ with PHP; the only difference here is that the publish call now lives in this relay, not in the business code.
The key point: in the “I published but crashed before stamping it published” case, the relay publishes the same row again on its next pass. So while the outbox solves dual-write, it does not give you exactly-once delivery; it gives you at-least-once. The message isn’t lost, but it can repeat. That’s the next problem.
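A minimal sketch of one relay pass makes the gap visible. It is framework-free on purpose: it uses plain PDO instead of Eloquent so it can run on its own, and the function name relayPendingMessages, the table name outbox_messages, and the $publish callable (a stand-in for the real RabbitMQ publish call) are all assumptions for illustration.

```php
<?php

/**
 * One pass of a hypothetical polling relay: publish pending rows, then stamp them.
 * $publish is any callable (topic, payload, id) that throws on failure.
 * Returns the number of rows published in this pass.
 */
function relayPendingMessages(PDO $db, callable $publish, int $batchSize = 100): int
{
    $select = $db->prepare(
        "SELECT id, topic, payload FROM outbox_messages
         WHERE status = 'pending' ORDER BY created_at LIMIT :limit"
    );
    $select->bindValue(':limit', $batchSize, PDO::PARAM_INT);
    $select->execute();

    $markPublished = $db->prepare(
        "UPDATE outbox_messages SET status = 'published' WHERE id = :id"
    );

    $published = 0;
    foreach ($select->fetchAll(PDO::FETCH_ASSOC) as $row) {
        // Publish first. If this throws, the row stays 'pending' and is retried next pass.
        $publish($row['topic'], $row['payload'], $row['id']);

        // A crash between the publish above and this UPDATE leaves the row 'pending',
        // so the next pass publishes it again: this is where at-least-once comes from.
        $markPublished->execute([':id' => $row['id']]);
        $published++;
    }

    return $published;
}
```

Swapping the order (stamp first, publish second) would trade duplicates for lost events, which is the problem the outbox was introduced to remove, so publish-then-stamp is the safer ordering.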
Once you have at-least-once: the consumer must be idempotent
Under at-least-once delivery you have to assume every event will arrive at least once, sometimes more than once. The fix is not to try to guarantee that an event is delivered only once; it’s to set up the consumer so that processing the same event twice has the same effect as processing it once. This is the queue-side face of the discipline I described when writing about idempotency in APIs: there the client generated an Idempotency-Key; here the event’s own id is that key.
The consumer writes the id of every event it processes into a processed_messages table, and does the work inside the same transaction:
public function handle(array $event): void
{
    DB::transaction(function () use ($event) {
        // If already processed, exit quietly (the unique constraint guarantees this)
        $inserted = DB::table('processed_messages')->insertOrIgnore([
            'message_id'   => $event['id'],
            'processed_at' => now(),
        ]);

        if ($inserted === 0) {
            return; // it's a repeat; produce no side effects
        }

        $this->applyBusinessEffect($event); // the real work — exactly once
    });
}
The unique constraint on processed_messages.message_id is the heart of it: a second arrival of the same id is silently dropped by insertOrIgnore, and the business effect never runs. Putting the business effect in the same transaction is mandatory — otherwise the “I processed it but crashed before stamping” gap opens, which is exactly what we were running from.
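To watch the constraint do its job, here is a small framework-free sketch under stated assumptions: PDO with in-memory SQLite stands in for Laravel and MySQL, the ledger table is a hypothetical stand-in for applyBusinessEffect(), and the function names are invented for the example. SQLite spells the statement INSERT OR IGNORE; MySQL's equivalent is INSERT IGNORE.

```php
<?php

// Hypothetical store: in-memory SQLite standing in for the consumer's MySQL database.
function makeConsumerStore(): PDO
{
    $db = new PDO('sqlite::memory:');
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    // The PRIMARY KEY on message_id is the unique constraint that rejects repeats.
    $db->exec('CREATE TABLE processed_messages (message_id TEXT PRIMARY KEY, processed_at TEXT NOT NULL)');
    // Stand-in for the business effect: one ledger row per issued invoice.
    $db->exec('CREATE TABLE ledger (invoice_id TEXT NOT NULL, total INTEGER NOT NULL)');
    return $db;
}

// Returns true if the event was applied, false if it was a repeat.
function handleEvent(PDO $db, array $event): bool
{
    $db->beginTransaction();
    try {
        $stamp = $db->prepare(
            'INSERT OR IGNORE INTO processed_messages (message_id, processed_at) VALUES (?, ?)'
        );
        $stamp->execute([$event['id'], gmdate('c')]);

        if ($stamp->rowCount() === 0) {
            $db->commit(); // repeat delivery: nothing was written, nothing else runs
            return false;
        }

        // Stamp and effect commit together, or not at all.
        $effect = $db->prepare('INSERT INTO ledger (invoice_id, total) VALUES (?, ?)');
        $effect->execute([$event['invoice_id'], $event['total']]);

        $db->commit();
        return true;
    } catch (Throwable $e) {
        $db->rollBack(); // the stamp is rolled back too, so a redelivery can retry
        throw $e;
    }
}

$db = makeConsumerStore();
$event = ['id' => 'evt-1', 'invoice_id' => 'INV-1', 'total' => 1200];
$first  = handleEvent($db, $event); // true: first delivery, effect applied
$second = handleEvent($db, $event); // false: duplicate, ledger untouched
```

If the business effect throws, the rollback also removes the stamp, so the redelivered event gets a fresh attempt instead of being wrongly treated as already processed.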
The depth of this pattern lives on sade.dev
Turning the outbox from a working journal entry into a durable system rests on a few decisions I left out of scope here: whether to feed the relay with polling or with CDC (change data capture), how to dispatch pending rows across multiple relays without collisions using SELECT ... FOR UPDATE SKIP LOCKED, how far delivery ordering can be guaranteed, how to prune the processed_messages table, and why exactly-once is usually an illusion. I collected those, with code and diagrams, separately:
Transactional Outbox: Dual-Write, At-Least-Once, and Idempotent Consumption → (sade.dev)
This blog’s lane stays hands-on; I leave the system-level design of the pattern to sade.dev.
And then GDPR: the personal data flowing through the event
Once the outbox settled in, a new question came up: these events carry personal data in their payload — customer name, tax ID, address — and that payload now sits in a durable table (outbox), and on top of that passes through the RabbitMQ broker. So if I left the data as-is, I’d be copying GDPR/KVKK-scoped fields, unencrypted, into more than one place.
I didn’t want to encrypt the whole payload — because I needed to see fields like topic and invoice_id in the clear for the relay and for observability. What I needed was field-level encryption: only the sensitive fields encrypted, the rest in the clear. I marked the sensitive fields in the payload schema with x-gdpr-sensitive and encrypted only those at serialization time:
// The schema declares which fields are sensitive
$schema = [
    'invoice_id'    => ['x-gdpr-sensitive' => false],
    'customer_name' => ['x-gdpr-sensitive' => true],
    'tax_id'        => ['x-gdpr-sensitive' => true],
    'total'         => ['x-gdpr-sensitive' => false],
];

$payload = collect($raw)->map(function ($value, $field) use ($schema) {
    return ($schema[$field]['x-gdpr-sensitive'] ?? false)
        ? Crypt::encryptString((string) $value) // only the sensitive field is encrypted
        : $value;
})->all();
Crypt is Laravel’s encrypter (AES-256-CBC by default), keyed by APP_KEY; on the consumer side the fields are opened with Crypt::decryptString using the same key, so producer and consumer have to share it. In practice this had three benefits:
- No plaintext personal data sits in the outbox or the broker. Even if I accidentally log a table or a queue, tax_id stays encrypted.
- Clear fields stay clear. I can still trace the event by invoice_id and produce metrics from total; encryption doesn’t kill observability.
- The schema documents itself. The x-gdpr-sensitive flag bakes the answer to “which field is personal data?” into the code; you don’t have to guess later during an audit.
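The same schema can drive both directions. The helper below is a hypothetical sketch, not the project's code: mapSensitiveFields is an invented name, and the cipher is passed in as a callable so the function runs without the framework. Inside a Laravel app the callable would wrap Crypt::encryptString on the producer and Crypt::decryptString on the consumer.

```php
<?php

/**
 * Hypothetical helper: applies $transform only to fields the schema flags
 * as x-gdpr-sensitive and passes every other field through untouched.
 */
function mapSensitiveFields(array $payload, array $schema, callable $transform): array
{
    $result = [];
    foreach ($payload as $field => $value) {
        // A field missing from the schema counts as non-sensitive, like the `?? false` above.
        $isSensitive = $schema[$field]['x-gdpr-sensitive'] ?? false;
        $result[$field] = $isSensitive ? $transform((string) $value) : $value;
    }
    return $result;
}

// Assumed wiring inside a Laravel app (not executed here):
//   $wire  = mapSensitiveFields($raw,  $schema, fn (string $v) => Crypt::encryptString($v));
//   $clear = mapSensitiveFields($wire, $schema, fn (string $v) => Crypt::decryptString($v));
```

Two design consequences are worth keeping in mind. Because values are cast to string before encryption, a decrypted field always comes back as a string, so a numeric tax_id needs to be cast again on the consumer side. And because unknown fields default to non-sensitive, a new personal-data field that nobody adds to the schema travels in the clear; a stricter variant could reject fields the schema does not list.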
Key management, rotation, and questions like “how do you search an encrypted field” are beyond the bounds of this log — those, too, are sade.dev’s territory.
Three things I noted in the log
Solving one pattern opens the next. The outbox closed dual-write but put at-least-once duplicates in its place; idempotent consumption closed that. Every guarantee comes at a cost, and before saying “solved” you have to ask “what did I put in its place?” It’s not a one-shot thing; it’s a chain of decisions.
Design the consumer, not the delivery guarantee. For a long time I wanted to chase “let the message arrive exactly once”; but in a distributed system the cheap and honest guarantee is at-least-once. The real work, as in the idempotency post, is designing the repeat to be side-effect-free. If the consumer is idempotent, the looseness of the delivery guarantee stops being a problem.
Think about compliance while designing the payload, not afterwards. Marking the personal data with x-gdpr-sensitive from the start made encryption a natural step of serialization rather than an “add-on.” Had I left it for last, I’d have been trying to round up plaintext data already copied into several places — which is the most expensive job of all.
In the end the three pieces fold into one sentence: write the data to a single system, suppress repeats at the consumer, seal the sensitive field before it leaves. It led to the same door as the lesson from the invoice number collision in the e-invoice integration — sealing state, just in time and correctly, before it opens to the outside world.