Data Pipeline Architecture
The C3 Agentic AI Platform allows you to build production-ready data pipelines with significantly less code through a metadata-driven approach. All data pipelines developed with the platform leverage a canonical data model, allowing for communication and translation between different data formats.
A canonical data model decouples the schema of a source system from an application's data model, so your application can evolve independently of the systems that supply its data.
In the platform, canonical data models are built using the SourceSystem, SourceCollection, Source, Canonical, and Transform Types.

SourceSystem
A SourceSystem represents any external system, streaming service, data warehouse, or database from which data is imported into the C3 Agentic AI Platform for processing, analysis, or integration. Once imported into the C3 AI environment, the data can be transformed, integrated, and used across applications and analytics tools. A source system can be, for example, an SAP system or a CSV file.
This example uses City.csv as the source file.
City.csv
id,name,state_abbr,state,location
RWC,Redwood City,CA,California,37.4848_-122.2281
SCL,San Carlos,CA,California,37.5072_-122.2605
MPK,Menlo Park,CA,California,37.453_-122.1817
EPA,East Palo Alto,CA,California,37.4688_-122.1411
CHI,Chicago,IL,Illinois,41.881944_-87.627778
EVN,Evanston,IL,Illinois,42.046389_-87.694722
MTG,Morton Grove,IL,Illinois,42.040556_-87.7825
PTL,Portland,OR,Oregon,45.52_-122.681944

SourceCollection
A SourceCollection refers to a logical grouping of data objects, each linked to a specific data source and associated with a common source system, all imported into the C3 Agentic AI Platform. Each entity within the collection contains unique data from different sources but shares a unified processing framework or integration path. This allows for efficient management, transformation, and ingestion of data from multiple sources into a target system.
Creating a source collection helps organize and manage data from various external sources efficiently. The example below shows a source collection consisting of the following entities:
- FixtureSource, linked to SourceFixture
- CitySource, linked to SourceCity
- ManufacturerSource, linked to SourceManufacturer
Each source entity (FixtureSource, CitySource, ManufacturerSource) is associated with the same source system, Canonical. This shared system ensures that data from different sources is processed consistently within a unified framework.
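The shared-source-system claim above can be checked with a short script. This is a minimal sketch in plain Python, not platform code: it gathers the three definitions from the examples in this section into one JSON array and groups the collection names by the source system they reference. The function name group_by_source_system and the DEFINITIONS_JSON constant are hypothetical and exist only for illustration.

```python
import json

# Definitions copied from the FixtureSource.json, CitySource.json, and
# ManufacturerSource.json examples in this section, gathered into one array.
DEFINITIONS_JSON = """
[
  {"name": "FixtureSource", "source": "SourceFixture", "sourceSystem": {"name": "Canonical"}},
  {"name": "CitySource", "source": "SourceCity", "sourceSystem": {"name": "Canonical"}},
  {"name": "ManufacturerSource", "source": "SourceManufacturer", "sourceSystem": {"name": "Canonical"}}
]
"""


def group_by_source_system(definitions):
    """Group source collection names by the source system they reference.

    Hypothetical helper for illustration; it is not part of any platform API.
    """
    groups = {}
    for definition in definitions:
        system_name = definition["sourceSystem"]["name"]
        groups.setdefault(system_name, []).append(definition["name"])
    return groups


groups = group_by_source_system(json.loads(DEFINITIONS_JSON))
print(groups)
```

If a definition pointed at a different source system, it would appear under a second key, which makes an inconsistent grouping easy to spot.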
Examples
An example of FixtureSource.json:
{
  "name": "FixtureSource",
  "source": "SourceFixture",
  "sourceSystem": {
    "name": "Canonical"
  }
}
An example of CitySource.json:
{
  "name": "CitySource",
  "source": "SourceCity",
  "sourceSystem": {
    "name": "Canonical"
  }
}
An example of ManufacturerSource.json:
{
  "name": "ManufacturerSource",
  "source": "SourceManufacturer",
  "sourceSystem": {
    "name": "Canonical"
  }
}

Source
A Source Type models the data objects imported from a source system. In a Source Type, you define the schema of an object in a source collection, which determines how that data is ingested, processed, and integrated within the platform.
Example:
type SourceCity mixes Source {
  /**
   * The city ID
   */
  id: string
  /**
   * The city name
   */
  name: string
  /**
   * State name abbreviation
   */
  state_abbr: string
  /**
   * The state name
   */
  state: string
  /**
   * Latitude and longitude of the city
   */
  location: string
}

Canonical
Canonical is a subclass of Source. Canonical Types form the inbound interface: a contract that specifies the schema of data loaded into the platform. They help future-proof the application data model, which relies on Entity Types to persist data for use in the application. With this pattern, the entity data model can evolve as the application requires new functionality, without changing the integration points to the outside world.
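The idea of a canonical type as a contract can be illustrated outside the platform. The following is a minimal sketch in plain Python and does not use any C3 API. The contract CANONICAL_CITY_CONTRACT and the helper contract_violations are hypothetical, and the field names are borrowed from the SourceCity example purely for illustration.

```python
# Hypothetical contract: field name -> expected Python type.
# The field names reuse those of the SourceCity example; a real canonical
# type may define a different schema.
CANONICAL_CITY_CONTRACT = {
    "id": str,
    "name": str,
    "state_abbr": str,
    "state": str,
    "location": str,
}


def contract_violations(record, contract=CANONICAL_CITY_CONTRACT):
    """Return a list of human-readable problems; an empty list means the record conforms."""
    problems = []
    for field, expected_type in contract.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    return problems


conforming = {
    "id": "RWC",
    "name": "Redwood City",
    "state_abbr": "CA",
    "state": "California",
    "location": "37.4848_-122.2281",
}
incomplete = {"id": "SCL", "name": "San Carlos", "state_abbr": "CA"}

print(contract_violations(conforming))
print(contract_violations(incomplete))
```

Because inbound data is checked only against the contract, whatever model sits behind it can change without the senders of the data needing to change what they send.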
Transform
Transforms are defined between Source, Canonical, and Entity Types to re-map schemas and apply data transformations of varying complexity. For example, a simple transform might re-map fields to normalize the contents of a flat file. A more complex transform might return a specific value only if a field matches a defined regular expression. The platform offers a rich library of expressions out of the box and supports custom JavaScript transformations when required.
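The two kinds of transform described above, field re-mapping and a regular-expression rule, can be sketched in plain Python. This is an illustration of the concept only and is not the platform's transform syntax or expression library. The target field names (stateCode, latitude, longitude) and the function transform_city are hypothetical; the input rows are taken from City.csv.

```python
import csv
import io
import re

# Two rows from the City.csv example, including its header line.
CITY_CSV = """id,name,state_abbr,state,location
RWC,Redwood City,CA,California,37.4848_-122.2281
CHI,Chicago,IL,Illinois,41.881944_-87.627778
"""

# Matches "<latitude>_<longitude>", the format of the location column.
LOCATION_PATTERN = re.compile(r"^(-?\d+(?:\.\d+)?)_(-?\d+(?:\.\d+)?)$")


def transform_city(row):
    """Re-map one source row to a hypothetical entity-shaped dict."""
    match = LOCATION_PATTERN.match(row["location"])
    if match:
        latitude, longitude = float(match.group(1)), float(match.group(2))
    else:
        # The regular-expression rule: coordinates are set only when the
        # field matches the expected pattern.
        latitude, longitude = None, None
    return {
        "id": row["id"],
        "name": row["name"],
        "stateCode": row["state_abbr"],  # simple field re-mapping
        "latitude": latitude,
        "longitude": longitude,
    }


cities = [transform_city(row) for row in csv.DictReader(io.StringIO(CITY_CSV))]
for city in cities:
    print(city)
```

Keeping the mapping in one function mirrors the role of a transform: the source schema and the target schema meet in a single place, so a change on either side touches only that mapping.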
C3 AI applications are responsible for supplying the canonical types and entity types that define the application data model. For each implementation of a C3 AI application, the appropriate source systems, collections, sources, and transforms must be defined to enable data integration.