Configure Kafka as a Streaming Data Source

Streaming platforms such as Kafka provide real-time event data as it is generated. Kafka is commonly used for messaging, telemetry ingestion, activity tracking, and stream processing.

In Data Fusion, Kafka enables continuous ingestion of event-driven data for use cases such as:

Real-time ingestion — Process telemetry or event streams from applications and devices
AI-driven workflows — Support near real-time inference and decision-making
Operational monitoring — Power dashboards and alerting systems with up-to-date data

Before you begin

Streaming data is typically append-only and time-series in nature. Consider how this data will be stored and accessed:

For high-throughput or time-series workloads, map data to entities backed by a key-value store
Ensure the target canonical maps to an entity configured with the Key-Value tag in the Object Model

Generate Kafka API credentials

Before configuring the connector, obtain access credentials for your Kafka cluster.

Request an API key and API secret from your Kafka administrator
For managed services such as Confluent Cloud, follow provider-specific steps to generate credentials
Ensure the credentials have READ permission on the Kafka topic you plan to ingest; listing topics and partitions in Data Fusion also requires DESCRIBE permission

Add a Kafka connector

Open your application in C3 AI Studio
Navigate to Data Fusion
In the Data Sources panel, select Add data source
Select Kafka from the streaming connectors list
Select Next

Configure connector details

Provide the following:

Name — Identifier for the connector
Description — Optional description

Configure authentication

Provide connection details for your Kafka cluster:

Endpoint — Kafka bootstrap server (for example, host:port)
API Key — Kafka API key
API Secret — Kafka API secret
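For orientation, the three fields above correspond roughly to the settings a standard Kafka client would use. The following Python sketch is illustrative only and is not part of Data Fusion: it assumes the cluster accepts the API key and secret as a SASL/PLAIN username and password over TLS (as Confluent Cloud does), and it uses librdkafka-style property names. The function names `parse_endpoint` and `build_client_config` and the sample values are hypothetical.

```python
def parse_endpoint(endpoint: str) -> tuple[str, int]:
    """Split a 'host:port' bootstrap server string and validate the port."""
    host, sep, port_text = endpoint.strip().rpartition(":")
    if not sep or not host or not port_text.isdecimal():
        raise ValueError(f"expected host:port, got {endpoint!r}")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return host, port


def build_client_config(endpoint: str, api_key: str, api_secret: str) -> dict[str, str]:
    """Map the connector fields to librdkafka-style client properties.

    Assumption: the cluster authenticates with SASL/PLAIN over TLS.
    Other clusters may require a different mechanism or protocol.
    """
    host, port = parse_endpoint(endpoint)
    return {
        "bootstrap.servers": f"{host}:{port}",  # Endpoint
        "security.protocol": "SASL_SSL",        # assumed, see docstring
        "sasl.mechanisms": "PLAIN",             # assumed, see docstring
        "sasl.username": api_key,               # API Key
        "sasl.password": api_secret,            # API Secret
    }


if __name__ == "__main__":
    # Placeholder values; never hard-code real secrets.
    config = build_client_config("broker.example.com:9092", "MY_KEY", "MY_SECRET")
    print(config["bootstrap.servers"])
```

Checking the endpoint format before you enter it can catch a missing port or stray whitespace, both of which would otherwise surface only as a validation failure.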

Select Next to proceed.

Select a Kafka data stream

After configuring the connector, define the source collection by selecting the Kafka topic.

Configure source collection

Provide the following:

Name — Identifier for the source collection
Description — Optional description

Select data stream

Broker name — Enter the Kafka topic name

The Broker name field corresponds to the Kafka topic from which data is consumed.

If Data Fusion cannot list the available streams automatically, enter the topic name manually.
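When you type the topic name by hand, a quick check against Kafka's topic naming rules can catch typos before you save. The sketch below applies the rules Kafka itself enforces (letters, digits, `.`, `_`, and `-` only; at most 249 characters; not `.` or `..`). The helper name `topic_name_problems` is hypothetical, and Data Fusion may apply its own validation in addition to these rules.

```python
import re

# Characters Kafka permits in a topic name.
_TOPIC_PATTERN = re.compile(r"[A-Za-z0-9._-]+")
MAX_TOPIC_LENGTH = 249


def topic_name_problems(name: str) -> list[str]:
    """Return a list of problems with a manually entered topic name."""
    problems = []
    if name != name.strip():
        problems.append("leading or trailing whitespace")
    if not name:
        problems.append("name is empty")
    elif name in (".", ".."):
        problems.append("name cannot be '.' or '..'")
    elif not _TOPIC_PATTERN.fullmatch(name):
        problems.append("only letters, digits, '.', '_' and '-' are allowed")
    if len(name) > MAX_TOPIC_LENGTH:
        problems.append(f"name is longer than {MAX_TOPIC_LENGTH} characters")
    return problems


if __name__ == "__main__":
    for candidate in ("device-telemetry.v1", "orders ", ".."):
        print(repr(candidate), topic_name_problems(candidate))
```

An empty result means the name is syntactically valid; it does not confirm that the topic exists in the cluster.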

Preview data (optional)

If access permissions allow, Data Fusion displays a preview of messages from the selected topic. Data is retrieved only after you select a partition in the Partition (preview only) dropdown.

If no preview appears:

Verify that the topic name is spelled correctly; topic names are case-sensitive
Confirm that the API key has READ permission on the topic
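Because topic names are case-sensitive, `Orders` and `orders` are different topics. If your Kafka administrator can give you the list of topic names, a small comparison like the following can reveal a casing mistake. This is a minimal sketch; the function name `find_case_mismatch` and the sample topic names are hypothetical.

```python
from typing import Optional


def find_case_mismatch(entered: str, known_topics: list[str]) -> Optional[str]:
    """Return the real topic name if `entered` differs from it only by case.

    Returns None when `entered` matches exactly or has no
    case-insensitive match in `known_topics`.
    """
    if entered in known_topics:
        return None
    for topic in known_topics:
        if topic.lower() == entered.lower():
            return topic
    return None


if __name__ == "__main__":
    topics = ["orders", "payments"]  # sample list, supplied out of band
    suggestion = find_case_mismatch("Orders", topics)
    if suggestion is not None:
        print(f"Did you mean {suggestion!r}?")
```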

Select Save and Test, then proceed to schema configuration.

Continue the data integration workflow

After saving the Kafka source, continue configuring the data pipeline:

Define the schema — Review and adjust inferred fields from the streaming data
Map to a canonical — Connect the source schema to a target data model
Configure transformations — Apply projections, transformations, or filters as needed

These steps complete the data integration workflow and enable ingestion into your application.

Troubleshoot Kafka connections

Unable to list available streams

If the system displays “Unable to list available streams”:

Enter the Kafka topic manually in the Broker name field
Verify that the Kafka credentials are valid and correctly entered
Confirm that the Kafka credentials have DESCRIBE permission on the topic

No partitions available

If the Partition (preview only) dropdown is empty:

Confirm that the Kafka credentials have DESCRIBE permission
Verify that the topic exists in the Kafka cluster

No preview data displayed

Data is retrieved only after selecting a partition.

If no data appears after selecting a partition:

Confirm that the Kafka credentials have READ permission
Verify that the selected partition contains data
Ensure the topic name and partition are entered correctly

If the request succeeds but returns no data, the UI displays No data available.

If the request fails, an error is displayed, which may indicate:

Invalid credentials
Incorrect topic or partition

Connection fails during setup

If the connector fails to validate:

Verify the endpoint (bootstrap server) is correct
Confirm that the API key and API secret are valid and active
Ensure network access to the Kafka cluster is available
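To separate network problems from credential problems, you can test whether the bootstrap server accepts TCP connections at all. The sketch below uses only the Python standard library; the function name `can_reach` and the sample host are hypothetical. Run it from a machine on the same network path as your environment, since results from a laptop may differ from what the connector sees.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder endpoint; replace with your bootstrap server.
    print(can_reach("broker.example.com", 9092))
```

A `True` result shows only that the port is open. It does not verify TLS settings, credentials, or topic permissions, so a connector can still fail validation after this check passes.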

Key considerations

Kafka topics must exist in the external Kafka system; Data Fusion does not create topics
Topic discovery may not be available depending on permissions
Data preview requires selecting a partition and depends on both access permissions and data availability
