Configure Kafka as a Streaming Data Source

Streaming platforms such as Kafka provide real-time event data as it is generated. Kafka is commonly used for messaging, telemetry ingestion, activity tracking, and stream processing.

In Data Fusion, Kafka enables continuous ingestion of event-driven data for use cases such as:

  • Real-time ingestion — Process telemetry or event streams from applications and devices
  • AI-driven workflows — Support near real-time inference and decision-making
  • Operational monitoring — Power dashboards and alerting systems with up-to-date data

Before you begin

Streaming data is typically append-only and time-series in nature. Consider how this data will be stored and accessed:

  • For high-throughput or time-series workloads, map data to entities backed by a key-value store
  • Ensure the target canonical maps to an entity configured with the Key-Value tag in the Object Model

Generate Kafka API credentials

Before configuring the connector, obtain access credentials for your Kafka cluster.

  • Request an API key and API secret from your Kafka administrator
  • For managed services such as Confluent Cloud, follow provider-specific steps to generate credentials
  • Ensure the credentials have read access to the Kafka topic you plan to ingest

Add a Kafka connector

  • Open your application in C3 AI Studio
  • Navigate to Data Fusion
  • In the Data Sources panel, select Add data source
  • Select Kafka from the streaming connectors list
  • Select Next

Configure connector details

Provide the following:

  • Name — Identifier for the connector
  • Description — Optional description

Configure authentication

Provide connection details for your Kafka cluster:

  • Endpoint — Kafka bootstrap server (for example, host:port)
  • API Key — Kafka API key
  • API Secret — Kafka API secret

Select Next to proceed.
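
To check these values outside Data Fusion, it can help to see how they map onto an ordinary Kafka client configuration. The following Python is a minimal sketch, not part of Data Fusion. It assumes the cluster accepts SASL/PLAIN over TLS with the API key as the username and the API secret as the password, which is how Confluent Cloud API keys are used; other clusters may require a different mechanism. The function names `parse_endpoint` and `build_client_config` are hypothetical, and the property names are the librdkafka-style keys accepted by clients such as confluent-kafka.

```python
def parse_endpoint(endpoint: str) -> tuple[str, int]:
    """Split a 'host:port' bootstrap server string and validate the port."""
    host, sep, port_text = endpoint.strip().rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected host:port, got {endpoint!r}")
    try:
        port = int(port_text)
    except ValueError:
        raise ValueError(f"port is not a number in {endpoint!r}") from None
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range in {endpoint!r}")
    return host, port


def build_client_config(endpoint: str, api_key: str, api_secret: str) -> dict[str, str]:
    """Build a client configuration from the three connector fields.

    Assumption: SASL/PLAIN over TLS. Adjust if your cluster differs.
    """
    host, port = parse_endpoint(endpoint)
    if not api_key or not api_secret:
        raise ValueError("API key and API secret are both required")
    return {
        "bootstrap.servers": f"{host}:{port}",
        "security.protocol": "SASL_SSL",
        "sasl.mechanisms": "PLAIN",
        "sasl.username": api_key,
        "sasl.password": api_secret,
    }
```

A malformed endpoint, such as a host name without a port, raises `ValueError` before any connection is attempted, which separates typing mistakes from genuine authentication failures.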

Select a Kafka data stream

After configuring the connector, define the source collection by selecting the Kafka topic.

Configure source collection

Provide the following:

  • Name — Identifier for the source collection
  • Description — Optional description

Select data stream

  • Broker name — Enter the Kafka topic name

The Broker name field identifies the Kafka topic from which data is consumed. If Data Fusion cannot list the available streams automatically, enter the topic name manually.

Preview data (optional)

If access permissions allow, Data Fusion displays a preview of messages from the selected topic after you select a partition in the Partition (preview only) dropdown.

If no preview appears:

  • Verify that the topic name is spelled correctly; topic names are case-sensitive
  • Confirm that the API key has READ permission on the topic

Select Save and Test, then proceed to schema configuration.

Continue the data integration workflow

After saving the Kafka source, continue configuring the data pipeline:

  • Define the schema — Review and adjust inferred fields from the streaming data
  • Map to a canonical — Connect the source schema to a target data model
  • Configure transformations — Apply projections, transformations, or filters as needed

These steps complete the data integration workflow and enable ingestion into your application.
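
Data Fusion performs these steps through its UI. As a conceptual illustration only, the plain-Python sketch below shows what filtering, projecting, mapping, and transforming a single JSON message amounts to. It does not use any C3 AI API, and the field names in `FIELD_MAP` as well as the function name `to_canonical` are hypothetical.

```python
import json
from typing import Optional

# Hypothetical mapping from source field names to canonical field names.
FIELD_MAP = {"deviceId": "asset_id", "ts": "timestamp", "temp": "temperature"}


def to_canonical(raw_message: bytes) -> Optional[dict]:
    """Turn one JSON-encoded message into a canonical row, or None if filtered out."""
    record = json.loads(raw_message)
    # Filter: drop records that lack any of the required source fields.
    if any(source not in record for source in FIELD_MAP):
        return None
    # Projection and mapping: keep only mapped fields, renamed to canonical names.
    row = {target: record[source] for source, target in FIELD_MAP.items()}
    # Transformation: normalize the temperature reading to a float.
    row["temperature"] = float(row["temperature"])
    return row
```

Fields that are not listed in the mapping are discarded, and a message missing a required field is skipped rather than partially ingested.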

Troubleshoot Kafka connections

Unable to list available streams

If the system displays “Unable to list available streams”:

  • Enter the Kafka topic manually in the Broker name field
  • Verify that the Kafka credentials are valid and correctly entered
  • Confirm that the Kafka credentials have DESCRIBE permission on the topic

No partitions available

If the Partition (preview only) dropdown is empty:

  • Confirm that the Kafka credentials have DESCRIBE permission on the topic
  • Verify that the topic exists in the Kafka cluster
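
To check the two points above independently of Data Fusion, you can ask the cluster for topic metadata using the same credentials. The sketch below is hedged: it assumes a client object that exposes a `list_topics(topic, timeout=...)` method returning metadata with a `topics` mapping, as the `Consumer` class in confluent-kafka does. The helper name `list_partitions` is hypothetical.

```python
def list_partitions(client, topic: str, timeout: float = 10.0) -> list[int]:
    """Return the sorted partition ids of `topic`, or [] if it is not visible.

    `client` is assumed to behave like confluent_kafka.Consumer.
    """
    metadata = client.list_topics(topic, timeout=timeout)
    topic_metadata = metadata.topics.get(topic)
    # A missing entry or a per-topic error means the topic is not visible
    # to these credentials, or does not exist.
    if topic_metadata is None or topic_metadata.error is not None:
        return []
    return sorted(topic_metadata.partitions)
```

An empty result is consistent with either cause listed above, a missing DESCRIBE permission or a topic that does not exist, so it narrows the problem down without distinguishing between the two.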

No preview data displayed

Data is retrieved only after selecting a partition.

If no data appears after selecting a partition:

  • Confirm that the Kafka credentials have READ permission on the topic
  • Verify that the selected partition contains data
  • Ensure the topic name and partition are entered correctly

If the request succeeds but returns no data, the UI displays “No data available”.

If the request fails, an error is displayed, which may indicate:

  • Invalid credentials
  • Incorrect topic or partition

Connection fails during setup

If the connector fails to validate:

  • Verify the endpoint (bootstrap server) is correct
  • Confirm that the API key and API secret are valid and active
  • Ensure network access to the Kafka cluster is available
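
The network check in the last item can be done with a plain TCP connection attempt from a machine on the same network as your application. The following is a minimal sketch using only the Python standard library; the function name `can_reach` is hypothetical. It only confirms that the port accepts TCP connections and says nothing about TLS or credentials.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

If this returns `False` for the bootstrap server, the failure is likely a firewall, routing, or DNS issue rather than an invalid API key or secret.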

Key considerations

  • Kafka topics must exist in the external Kafka system; Data Fusion does not create topics
  • Topic discovery requires DESCRIBE permission on the topic; without it, enter the topic name manually
  • Data preview requires selecting a partition and depends on both access permissions and data availability
