Protocol-level integrity guarantees in Kafka

I was recently asked to design a method to meet ITAC (IT Application Controls) standards for critical data flows in our organisation. ITAC are application-level controls concerned mainly with how we ensure the completeness, accuracy and validity of transactions – for example, invoices or trades. Our control set is based on ICFR (Internal Control over Financial Reporting) principles, of which ITAC is one part.

The specific control objective I was asked to look at relates to the risk of loss of integrity of financial data transfers. The focus on the integrity of the data, not on authenticity or non-repudiation, is really important: it means cryptographic solutions aren’t required – Kafka’s native protocol features can satisfy the requirements.

Traditionally, we would compute a client-side checksum over the key elements of the payload (e.g. if this is an invoice, the customer, invoice total, number of line items, etc.) and send it with the message. The receiving end can then recompute the checksum, regardless of any format transformation that happens in the middleware (e.g. converting JSON to XML to feed a legacy system).
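
To make that concrete, here is a minimal sketch of what such a client-side checksum could look like with the Java Kafka client. The invoice fields, the header name and the choice of CRC32 are illustrative assumptions, not part of any ITAC standard; the point is only that the checksum covers business fields and therefore survives format transformations downstream.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

import org.apache.kafka.clients.producer.ProducerRecord;

public class InvoiceChecksum {

    // Compute a checksum over the business-relevant fields only, so it survives
    // format transformations (JSON -> XML etc.) performed by middleware.
    static long businessChecksum(String customerId, String invoiceTotal, int lineItemCount) {
        CRC32 crc = new CRC32();
        String canonical = customerId + "|" + invoiceTotal + "|" + lineItemCount;
        crc.update(canonical.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    static ProducerRecord<String, String> withChecksum(String topic, String key, String payload,
                                                       String customerId, String invoiceTotal,
                                                       int lineItemCount) {
        ProducerRecord<String, String> record = new ProducerRecord<>(topic, key, payload);
        long checksum = businessChecksum(customerId, invoiceTotal, lineItemCount);
        // Carry the checksum as a record header so the receiving side can recompute and compare it.
        record.headers().add("business-checksum",
                Long.toString(checksum).getBytes(StandardCharsets.UTF_8));
        return record;
    }
}
```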

Why this is hard to apply in Kafka

These ITAC standards were developed around “classic” middleware like BizTalk, which operates fundamentally differently from Kafka. BizTalk functions as an ETL (Extract, Transform, Load) process, where data transformation – and the potential for loss – is an expected part of the workflow. Kafka, however, operates more like a database log: an append-only store whose messages are immutable once written.

This architectural difference is significant. Most integrity risks arise from the ETL nature of traditional middleware, but this risk profile doesn’t apply to Kafka’s immutable message model.

How Kafka guarantees integrity

ITAC breaks down integrity guarantees into two core checks:

1. Completeness Controls

Standard Requirement: Implement reconciliation-based checks and interface failure monitoring.

Kafka’s Native Solution: Kafka’s “at least once” delivery guarantee is built directly into the protocol. When both producers and consumers are properly configured (which they are by default), this eliminates the need for additional completeness controls, as we know that messages will be re-presented to the client until the client acknowledges them by committing its offsets.
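
As a rough sketch (topic name, group id and the surrounding wiring are placeholders), these are the producer and consumer settings that underpin at-least-once behaviour in the Java client. The sketch simply makes explicit what the defaults and the standard poll/commit pattern already give us:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.ProducerConfig;

public class AtLeastOnceConfig {

    // Producer: wait for acknowledgement from all in-sync replicas and keep the
    // idempotent retry behaviour, so a record only counts as sent once the broker
    // has durably accepted it.
    static Properties producerProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true"); // avoids duplicates on retry
        return props;
    }

    // Consumer: disable auto-commit and commit manually after processing, so an
    // unacknowledged record is re-presented to the group after a crash or rebalance.
    static Properties consumerProps() {
        Properties props = new Properties();
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "invoice-consumers"); // placeholder group id
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        return props;
    }

    static void consumeAtLeastOnce(KafkaConsumer<String, String> consumer) {
        consumer.subscribe(List.of("invoices")); // placeholder topic name
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> record : records) {
                process(record); // application-specific processing
            }
            consumer.commitSync(); // acknowledge only once processing has succeeded
        }
    }

    static void process(ConsumerRecord<String, String> record) {
        // placeholder for business logic
    }
}
```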

For monitoring – i.e. can we tell whether an application is consistently failing? – OpenTelemetry (OTEL) provides sufficient coverage, though we may need to fine-tune client OTEL configurations. Some clients process massive data volumes in a very short period of time, which can cause them to drop telemetry depending on available memory and throughput capacity, creating random gaps in the telemetry. Implementing sampling that prioritises delivery of error events would address this concern.
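
One possible way to implement that, sketched under the assumption that we control the span export pipeline rather than relying on any particular OTEL feature, is a filtering exporter that always forwards error spans and samples the rest, reducing the volume competing for the client's telemetry capacity. The 10% ratio is purely illustrative.

```java
import java.util.Collection;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.Collectors;

import io.opentelemetry.api.trace.StatusCode;
import io.opentelemetry.sdk.common.CompletableResultCode;
import io.opentelemetry.sdk.trace.data.SpanData;
import io.opentelemetry.sdk.trace.export.SpanExporter;

// Wraps a real exporter: error spans are always forwarded, the rest are sampled.
public class ErrorFirstSpanExporter implements SpanExporter {

    private final SpanExporter delegate;
    private final double okSampleRatio; // e.g. 0.1 keeps roughly 10% of non-error spans

    public ErrorFirstSpanExporter(SpanExporter delegate, double okSampleRatio) {
        this.delegate = delegate;
        this.okSampleRatio = okSampleRatio;
    }

    @Override
    public CompletableResultCode export(Collection<SpanData> spans) {
        List<SpanData> kept = spans.stream()
                .filter(span -> span.getStatus().getStatusCode() == StatusCode.ERROR
                        || ThreadLocalRandom.current().nextDouble() < okSampleRatio)
                .collect(Collectors.toList());
        return delegate.export(kept);
    }

    @Override
    public CompletableResultCode flush() {
        return delegate.flush();
    }

    @Override
    public CompletableResultCode shutdown() {
        return delegate.shutdown();
    }
}
```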

2. Accuracy Controls

Traditional Challenge: There are a number of accuracy risks stemming from ETL middleware’s data transformation capabilities. For example, how do we know the transformation logic is correct, or that it executes with transactional (ACID) guarantees?

Kafka’s Advantage: Messages cannot be altered once sent, and end-to-end accuracy is already embedded in the Kafka protocol through a three-layer approach:

Producer Side:

  • Generates record batches, each carrying a batch header with metadata
  • The batch header includes the record count and a batch-level CRC (CRC-32C checksum) calculated by the producer
  • This anchors data integrity at the source

Broker Side:

  • Multiple mechanisms record and verify the batch CRC, with automatic index and log rebuilding on integrity failures
  • Validation rule: “A message entry is valid if the sum of its size and offset are less than the length of the file AND the CRC32 of the message payload matches the CRC stored with the message”, leading to automatic log truncation to the last valid offset when corruption is detected (illustrated in the sketch after this list)
  • The broker serves entire batches to consumers (there is no selective message retrieval) – so as long as the batch CRC matches the CRC from the producer, we can be very confident that the content is correct
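
As an illustration of that validation rule – a paraphrase of the documented check, not the broker's actual source code – the test amounts to something like:

```java
import java.util.zip.CRC32;

public class LogEntryValidation {

    // Illustrative paraphrase of the rule quoted above: an entry is valid if it fits
    // within the segment file and its stored CRC matches a CRC recomputed over the payload.
    static boolean isValidEntry(long entryOffset, long entrySize, long segmentFileLength,
                                byte[] payload, long storedCrc) {
        boolean fitsInFile = entryOffset + entrySize < segmentFileLength;

        CRC32 crc = new CRC32();
        crc.update(payload);
        boolean crcMatches = crc.getValue() == storedCrc;

        return fitsInFile && crcMatches;
    }
}
```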

Consumer Side:

  • Verifies the CRC of the batches it fetches (the check.crcs consumer setting, enabled by default), detecting on-the-wire or on-disk corruption
  • Because the broker serves the very batch the producer wrote, the CRC computed at the source is the one checked at the destination, completing the end-to-end chain
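
On the consumer side this check is a configuration default rather than code we have to write; the sketch below just makes it explicit with the Java client (bootstrap servers and group id are placeholders):

```java
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class CrcCheckingConsumer {

    static KafkaConsumer<String, String> build() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "itac-demo");               // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // check.crcs is true by default; setting it explicitly documents that the client
        // verifies the batch CRC computed by the producer, closing the end-to-end check.
        props.put(ConsumerConfig.CHECK_CRCS_CONFIG, "true");
        return new KafkaConsumer<>(props);
    }
}
```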

The Bottom Line

Kafka’s architecture inherently satisfies ITAC integrity requirements without additional controls. The protocol’s built-in checksums, delivery guarantees, and immutable message design provide the completeness and accuracy controls that ITAC standards require.

When using Kafka, we should focus on making use of the application semantics it offers – such as eventual consistency and an event-driven architecture – rather than spending our time implementing additional integrity layers: the controls are already there, working at the protocol level.
