HTTP to Kafka (Deprecated)
The HTTP to Kafka origin listens on an HTTP endpoint and writes the contents of all authorized HTTP POST requests directly to Kafka. However, the HTTP to Kafka origin is now deprecated and will be removed in a future release. We recommend using the HTTP Server origin instead, which can use multiple threads to process data from multiple HTTP clients in parallel.
Use the HTTP to Kafka origin to write large volumes of HTTP POST requests immediately to Kafka without additional processing. To perform processing, you can create a separate pipeline with a Kafka Consumer origin that reads from the Kafka topic.
If you need to process data before writing it to Kafka or need to write to a destination system other than Kafka, use the HTTP Server origin.
You can configure multiple HTTP clients to send data to the HTTP to Kafka origin. Complete the necessary prerequisites before you configure the origin. Here is an example of the architecture for using the HTTP to Kafka origin:
When you configure HTTP to Kafka, you specify the listening port, Kafka configuration information, maximum message size, and the application ID. You can also configure SSL/TLS properties, including default transport protocols and cipher suites.
Prerequisites
- Configure HTTP clients to send data to the HTTP to Kafka listening port
  When you configure the origin, you define a listening port number where the origin listens for data.
- Include the application ID in request headers
  When you configure the origin, you define an application ID. All messages sent to the HTTP to Kafka origin must include the application ID in the request header.
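As a rough client-side illustration of both prerequisites, the following Python sketch builds a POST request aimed at the listening port and carrying the application ID in a request header. The header name `X-SDC-APPLICATION-ID`, the host, the port, and the helper names are assumptions made for this example and are not stated above, so verify them against your Data Collector version.

```python
import json
import urllib.request

# Assumption: the header that carries the application ID. Confirm the exact
# name in the documentation for your Data Collector version.
APP_ID_HEADER = "X-SDC-APPLICATION-ID"


def build_post_request(host, port, app_id, payload, use_tls=False):
    """Build (but do not send) an HTTP POST request for the origin.

    host and port identify the origin's listening port; app_id must match
    the application ID configured in the origin.
    """
    scheme = "https" if use_tls else "http"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{scheme}://{host}:{port}/",
        data=body,
        headers={
            APP_ID_HEADER: app_id,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=10.0):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A client might call `send(build_post_request("sdc.example.com", 8000, "myAppId", {"event": "login"}))`, where the host, port, and application ID are placeholders for the values configured in the origin.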
Pipeline Configuration
When you use an HTTP to Kafka origin in a pipeline, connect the origin to a Trash destination.
The HTTP to Kafka origin writes records directly to Kafka. The origin does not pass records to its output port, so you cannot perform additional processing or write the data to other destination systems.
However, since a pipeline requires a destination, you should connect the origin to the Trash destination to satisfy pipeline validation requirements.
A pipeline with the HTTP to Kafka origin should look like this:
Kafka Maximum Message Size
Configure the Kafka maximum message size in the origin in relation to the equivalent Kafka cluster property. The origin property should be equal to or less than the Kafka cluster property.
The HTTP to Kafka origin writes the contents of each HTTP POST request to Kafka as a single message. As a result, the maximum message size configured in the origin determines the maximum size of the HTTP request and limits the size of messages written to Kafka.
To ensure that all messages are written to Kafka, set the origin property to a value equal to or less than the Kafka cluster property. Attempts to write messages larger than the Kafka cluster limit fail, returning an HTTP 500 error to the originating HTTP client.
For example, if the Kafka cluster allows a maximum message size of 2 MB, configure the Maximum Message Size property in the origin to 2 MB or less to avoid HTTP 500 errors for larger messages.
By default, the maximum message size in a Kafka cluster is 1 MB, as defined by the message.max.bytes property.
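The sizing rule can be sketched as a small Python check. This is only an illustration of the reasoning above, not how the origin is implemented: the function names and outcome labels are hypothetical, and 1 MB is treated as 1024 * 1024 bytes for simplicity.

```python
MB = 1024 * 1024  # simplifying assumption for this sketch


def origin_limit_is_safe(origin_max_bytes, cluster_max_bytes):
    """Return True when the origin limit does not exceed the cluster limit."""
    return origin_max_bytes <= cluster_max_bytes


def classify_request(body_size, origin_max_bytes, cluster_max_bytes):
    """Describe what the sizing rule implies for one HTTP POST body.

    The labels are illustrative; only the HTTP 500 outcome for messages
    larger than the cluster limit comes from the description above.
    """
    if body_size > origin_max_bytes:
        return "exceeds_origin_limit"
    if body_size > cluster_max_bytes:
        return "kafka_write_fails_http_500"
    return "written"


# With a 2 MB cluster limit, a 2 MB origin limit is safe, while a 4 MB
# origin limit lets a 3 MB request through to a failing Kafka write.
print(origin_limit_is_safe(2 * MB, 2 * MB))
print(classify_request(3 * MB, 4 * MB, 2 * MB))
```

Keeping the origin limit at or below the cluster limit means the second branch can never be reached, which is the point of the recommendation.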
Kafka Security
You can configure the HTTP to Kafka origin to connect securely to Kafka through SSL/TLS, Kerberos, or both. For more information about the methods and details on how to configure each method, see Security in Kafka Stages.
Configuring an HTTP to Kafka Origin
Configure an HTTP to Kafka origin to write high volumes of HTTP POST requests directly to Kafka.