Data Integration Techniques Overview

C3 AI provides options for handling large, complex data efficiently. Use the data integration options provided by the C3 Agentic AI Platform to aggregate data from multiple disparate sources.

  • Declarative pipelines — Use minimal code to create pipelines, and monitor them in real time.

  • Batch and stream processing — Scale solutions to monitor terabytes of data from remote file systems or streaming systems.

  • Data persistence — Transform data to minimize latency and allow for highly optimized experiences for accessing data from applications.

  • Virtualization — Virtualize external data sources without loading data directly onto the platform. See Virtualization.

  • Data visualization — Use the Object Model tab in Data Fusion to view an automatically generated ERD for your application's data. See Manipulate ERD Views with Object Model for more information.

The C3 Agentic AI Platform offers a powerful abstraction layer for your data infrastructure, and supports the polyglot approach to data management taken by many enterprises. The platform provides a common data fabric to accelerate the time-to-value for AI applications without compromising strategic investments across any public or private cloud.

This guide is intended for developers and data engineers who want to ingest or virtualize data on the C3 Agentic AI Platform. The platform supports key data integration patterns, including ETL/ELT, virtualization, and streaming. It includes dozens of out-of-the-box connectors and a simple framework for integrating new sources, transforming data, and loading it for use in an enterprise AI application.

Declarative pipelines

The C3 Agentic AI Platform allows data engineers to build production-ready pipelines with significantly less code. To build a data pipeline with the platform, a data engineer connects to a source system, maps schemas, defines any required transformations, and configures the execution criteria for the pipeline.
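The declarative pattern described above can be sketched in plain Python. Note that the class and field names below (`PipelineSpec`, `schema_map`, `run`, and so on) are illustrative assumptions, not the C3 Agentic AI Platform API: the idea is that a pipeline is declared as data — a source, a schema mapping, transformations, and execution criteria — and a generic runner handles execution.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class PipelineSpec:
    """Hypothetical declarative pipeline description (not the C3 API)."""
    source: Callable[[], list[dict]]           # connects to a source system
    schema_map: dict[str, str]                 # maps source fields to target fields
    transforms: list[Callable[[dict], dict]] = field(default_factory=list)
    batch_size: int = 100                      # execution criterion for the runner

def run(spec: PipelineSpec) -> list[dict]:
    """Generic runner: the engineer declares the spec, the runner executes it."""
    results = []
    for record in spec.source():
        # Apply the declared schema mapping, then each declared transformation.
        mapped = {dst: record[src] for src, dst in spec.schema_map.items()}
        for transform in spec.transforms:
            mapped = transform(mapped)
        results.append(mapped)
    return results

# Example: rename fields and convert a temperature reading to Celsius.
spec = PipelineSpec(
    source=lambda: [{"ts": "2024-01-01", "temp_f": 68.0}],
    schema_map={"ts": "timestamp", "temp_f": "temperature"},
    transforms=[
        lambda r: {**r, "temperature": round((r["temperature"] - 32) * 5 / 9, 1)},
    ],
)
print(run(spec))  # [{'timestamp': '2024-01-01', 'temperature': 20.0}]
```

Because the pipeline is declared as data rather than imperative code, concerns such as scaling, monitoring, and error handling can be implemented once in the runner and applied to every pipeline.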

After the pipeline is declared, no additional code is required for:

  • Ensuring the performance, scalability, and reliability of the pipeline

  • Real-time or historical monitoring of the pipeline and the data that it transforms

  • Contextual error logging and recovery from pipeline failures

  • Tracking data lineage and provenance

Deferring complexity to the platform accelerates time-to-value and reduces the total cost of ownership for operating production data pipelines.

For more information on declaring data pipelines, see Declare Pipelines for File Sources.
