Data Sharing

The C3 Agentic AI Platform provides the ability to share data across applications and environments within the same cluster.

The C3 Agentic AI Platform ensures data is segregated for each application. This happens by default, so you are not required to perform extra steps when implementing the data model of your application. You can modify the default behavior for data sharing between applications and environments.

To enable data sharing, the C3 Agentic AI Platform introduces the concept of a Db.Domain to help manage how shared data is organized.

A Db.Domain is a logical grouping of data. It refers to a collection of interrelated data pertaining to a common purpose, object, or concept. A Db.Domain may span one or more application packages. However, a Db.Domain can only be deployed in a single application.

For more information about the file naming conventions and where files in a package belong, see Package Management Overview.

Db.Domain metadata

A Db.Domain is defined using metadata and must be authored in the package that implements the domain or data to be shared. Type definitions that are members of the Db.Domain reference the Db.Domain name in their implementation. The JSON file below contains a single key/value pair representing the name of the Db.Domain, datalake. The JSON file must be stored in the metadata folder of a package:

<pkg>/metadata/Db.Domain/*.json.

JSON
{
    "name": "datalake"
}

The corresponding Type definition is below and must be in the src folder of your package.

Type
@db(domain='datalake')
entity type DataLakeRepository mixes Obj schema name 'DTLKRPST' {
   fieldID:       string
}

The @db annotation indicates the Db.Domain that the Type is a member of.

Db.Domain configuration

Applications that consume data from a Db.Domain must declare their dependency on the Db.Domain in two ways:

  1. By defining a dependency on the Db.Domain application package.
  2. By authoring a Db.Domain.Config and either seeding it into the application package or setting the Db.Domain configuration after the application is deployed. When defining the Db.Domain.Config, set the name key to the name of the Db.Domain and the app key to the name or the identifier of the application that hosts the Db.Domain, as shown below:
JSON
{
    "name": "datalake",
    "app": "datalake"
}

If the Db.Domain application and the consuming application are deployed in the same environment, the Db.Domain.Config can specify the application name only (for example, datalake). If the Db.Domain application resides in a different environment, the fully qualified application identifier must be specified in the form {cluster}-{env}-{app}.

To set the Db.Domain configuration, run the command:

Db.Domain.Config.forName('datalake').setConfigValue('app', '{cluster}-{env}-{app}')

In the above command, datalake is the name of the Db.Domain and {cluster}-{env}-{app} is the fully qualified application identifier.
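
As a small illustration of the naming rule above, the following sketch chooses the value for the app key. The helper name resolveDomainApp and the shape of its arguments are hypothetical and are not part of the platform API. The sketch only encodes the rule that a consumer in the same environment can use the bare application name, while a consumer in a different environment needs {cluster}-{env}-{app}.

```javascript
// Hypothetical helper (not a C3 API): picks the value for the "app" key
// of a Db.Domain.Config based on where the hosting application runs.
function resolveDomainApp(host, consumer) {
  const sameEnv = host.cluster === consumer.cluster && host.env === consumer.env;
  // Same environment: the application name alone is enough.
  if (sameEnv) {
    return host.app;
  }
  // Different environment: use the fully qualified identifier.
  return `${host.cluster}-${host.env}-${host.app}`;
}

// Cluster and environment names below are placeholders.
const host = { cluster: "mycluster", env: "shared", app: "datalake" };
console.log(resolveDomainApp(host, { cluster: "mycluster", env: "shared" })); // datalake
console.log(resolveDomainApp(host, { cluster: "mycluster", env: "dev" }));    // mycluster-shared-datalake
```

The returned string is the value you would pass as the second argument of setConfigValue('app', ...).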

Recommendations for data sharing and management

In the C3 Agentic AI Platform, data-shared Types are designed to synchronize and share data across multiple instances. On the other hand, non-data-shared Types contain data that is specific to a single instance and is not synchronized across environments.

  • Enforce read-only data-shared Types: Roles assigned to consuming applications should contain read-only permissions, because writes are not supported.

  • Avoid join queries on data-shared Types with non-data-shared references: Performing join queries between a data-shared Type and a non-data-shared Type can result in data inconsistencies. This is because the non-data-shared Type's data may not be in sync with the data-shared Type, leading to unreliable query results.

  • Partition join queries: You can fetch data-shared Types from non-data-shared Types if no filtering is applied. However, fetching or accessing non-data-shared Types from data-shared Types within a join query should be avoided. To address this, an alternative solution is to partition the join query into two separate queries. First, query the data-shared Type and then, use the result to query the non-data-shared Type. This approach ensures that the data remains consistent and reliable.

  • Restrict modifications of data-shared Types across packages: Do not remix data-shared Types. To maintain data consistency across the system, data-shared Types within a package should remain unmodified in other packages. Instead, consider expanding the common data-shareable data model in the data lake by adding fields and relationships as a superset.

  • Develop applications with data lake setup: Ensure that applications are developed with a data lake configuration to support efficient data sharing.

  • Enable data sharing in non-production environments: In non-production environments, data sharing must be enabled to allow data ingestion once, enabling team members to work with a complete dataset. This also applies to multiple applications or multiple instances of the same application.

  • Enable data sharing in production environments: In production environments, data sharing must be enabled to allow applications with a common data model to load data once, optimizing data usage and efficiency. This is true for multiple and different applications.
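
The recommendation above to partition join queries can be sketched in plain JavaScript. The in-memory arrays stand in for a data-shared Type and a non-data-shared Type. The function names, record shapes, and values are illustrative assumptions, and a real application would issue two separate fetch calls instead of array filters.

```javascript
// Stand-in data: sharedMeasurements plays the data-shared Type,
// localTurbines plays the non-data-shared Type. Values are made up.
const sharedMeasurements = [
  { turbineId: "T1", activePower: 1500 },
  { turbineId: "T2", activePower: 300 },
  { turbineId: "T3", activePower: 1800 },
];
const localTurbines = [
  { turbineId: "T1", manufacturer: "Acme" },
  { turbineId: "T2", manufacturer: "Acme" },
  { turbineId: "T3", manufacturer: "Globex" },
];

// Step 1: query only the data-shared Type.
function queryShared(minPower) {
  return sharedMeasurements.filter((m) => m.activePower >= minPower);
}

// Step 2: use the ids from step 1 to query only the non-data-shared Type.
function queryLocal(turbineIds) {
  const wanted = new Set(turbineIds);
  return localTurbines.filter((t) => wanted.has(t.turbineId));
}

// Two separate queries instead of one join across the sharing boundary.
function turbinesAbovePower(minPower) {
  const ids = queryShared(minPower).map((m) => m.turbineId);
  return queryLocal(ids);
}

console.log(turbinesAbovePower(1000).map((t) => t.turbineId)); // [ 'T1', 'T3' ]
```

Because each step touches only one side of the boundary, neither query depends on the shared and non-shared data being joined in a single operation.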

Limitations and considerations

  • Incompatibility exists between shared and non-shared data in certain scenarios. For example, in C3 AI Demand Forecasting, updating the Type definition of DemandForecastSubject with a stored calculation that references demand information (for example, link SalesOrderLine) can cause issues. This scenario is incompatible with data sharing because the stored calculation would attempt to join a shared table (like SalesOrderLine) with a non-shared table (DemandForecastSubject).

  • The C3 Agentic AI Platform’s data sharing architecture currently prevents certain operations, such as filtering on a non-shared Type when it involves data from shared domains.

  • Data sharing is designed to facilitate efficient resource use, particularly when there are multiple non-production environments. However, joins across shared and non-shared Db.Domains are currently not supported, which can affect complex data access logic. This limitation may require rethinking the use of data sharing in certain scenarios, particularly when planning intricate data operations across domains.

  • If you have a fixed data model, data sharing works well for development. If you have a data model that's in flux and being edited by multiple developers, data sharing is not recommended.

  • Do not remix data-shared Types in other packages to add fields and relationships.

  • Avoid loading the same data with a common data model multiple times for different applications.

  • Do not share Types without proper validation.

  • Do not fetch or access non-data-shared Types from data-shared Types.

Ann.Db

The @db annotation has a field called domain that indicates the Db.Domain the Type is a member of.

Sharing data between applications

In the following example there are two applications, datalake and WindTurbine. The datalake application manages measurement, event, and vibration data coming from various historians. The WindTurbine application manages asset, asset hierarchy, and work order data, and has a dependency on the datalake application.

The C3 Agentic AI Platform does not support joins between shared and non-shared tables, including joins across Db.Domain boundaries. Stored calculations can appear to imply a join of shared and non-shared tables, but they are refreshed rather than evaluated through SQL joins, so no such join takes place. The C3 Agentic AI Platform's design ensures that stored calculations run during an earlier stage of the workflow, without errors related to data sharing constraints.

Datalake application

Measurement data, modeled by the WindTurbineMeasurement Type, resides in the datalake Db.Domain and is deployed in the datalake application.

Type
@db(domain='datalake',
    datastore='kv',
    persistenceOrder='timestamp',
    partitionKeyField='turbineId')
entity type WindTurbineMeasurement {
  turbineId:              string
  gearOilTemperature:     double
  generatorRotationSpeed: double
  timestamp:              datetime
  activePower:            double
}

Db.Domain metadata, in the <pkg>/metadata/Db.Domain/ folder:

JSON
{
    "name": "datalake"
}

The datalake application produces data.

WindTurbine application

The WindTurbine application consumes measurement data from the datalake application, so a dependency on datalake must be defined in the WindTurbine.c3pkg.json package file, as shown below.

See Package Management Overview documentation for more information about dependencies and the folder structure within a package.

JSON
{
    "name" : "WindTurbine",
    "version": "1.0.0",
    "dependencies": {
        "datalake": "*"
    }
}

The WindTurbine application must be configured to reference the datalake application and Db.Domain in a Db.Domain.Config configuration file as shown below in the <pkg>/config/Db.Domain.Config/ folder.

JSON
{
    "name": "datalake",
    "app": "datalake"
}
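
The two declarations a consuming application needs can be sanity-checked before deployment. The sketch below is only an illustration: checkConsumerSetup is a hypothetical helper, not a platform API, and it assumes you have already parsed the package file and the Db.Domain.Config file into plain objects.

```javascript
// Hypothetical pre-deployment check (not a C3 API). It takes the parsed
// contents of the consuming package's *.c3pkg.json and its Db.Domain.Config.
function checkConsumerSetup(pkg, domainConfig, domainPackageName) {
  const problems = [];
  // 1. The package must declare a dependency on the Db.Domain package.
  const deps = pkg.dependencies || {};
  if (!(domainPackageName in deps)) {
    problems.push(`missing dependency on "${domainPackageName}"`);
  }
  // 2. The Db.Domain.Config needs both the domain name and the hosting app.
  for (const key of ["name", "app"]) {
    if (typeof domainConfig[key] !== "string" || domainConfig[key] === "") {
      problems.push(`Db.Domain.Config is missing "${key}"`);
    }
  }
  return problems;
}

const pkg = { name: "WindTurbine", version: "1.0.0", dependencies: { datalake: "*" } };
const config = { name: "datalake", app: "datalake" };
console.log(checkConsumerSetup(pkg, config, "datalake")); // []
```

An empty array means both declarations are present. The check does not verify that the hosting application exists or that access has been granted.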

With the above configurations set and applications deployed, the WindTurbine application is almost ready to access data from the datalake application. However, before data access can occur, you must first authorize the WindTurbine application.

Authorization

Authorization for data sharing uses the C3 Agentic AI Platform's role-based access control security framework. When sharing data, the Db.Domain owner is responsible for granting access and permissions to client applications. Access is granted to an application using the identity of the application. To grant an application access to a Db.Domain, execute the command below as an environment administrator in the Db.Domain application.

JavaScript
C3.app().allowAccess("<client_app_id>", [<roles to assign to the client app>]);

It is recommended that application-specific roles be used to grant access to client applications. This ensures that access is narrowly scoped to the data that client applications require and the operations they need to perform on the data.

It is recommended that the role below be used as the basis for application-specific roles. The role below provides read-only access to client applications requiring access to Measurement and MeasurementSeries data (illustrative example only). The permissions listed below are required to access ingested and normalized time series data and header records.

JSON
{
    "id" : "Measurements.ReadOnly.Role",
    "permissions" : [
      "allow:Cassandra::get",
      "allow:Cassandra::config",
      "allow:Cluster::env",
      "allow:ContentValue::readObjs",
      "allow:DataPartitionBucketValue::readObjs",
      "allow:Db.Domain::config",
      "allow:Db.Domain::hasArbitraryFolderHierarchy",
      "allow:Db.Domain::isValidMetadataJson",
      "allow:Db.Domain::metadataFolder",
      "allow:Env::app",
      "allow:KvDataPartition::canBucket",
      "allow:Measurement:read:",
      "allow:MeasurementSeries::fetchObjStream",
      "allow:MeasurementSeries::fetchNormalizedData",
      "allow:MeasurementSeries:read:",
      "allow:SqlKvStore::get",
      "allow:SqlKvStore::config"
    ]
}
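
When reviewing whether a role is narrowly scoped, it can help to list the Types it touches. The following sketch is an assumption-laden illustration: typesGranted is a hypothetical helper, not a platform API, and it assumes that each permission string is colon-separated with the Type name in the second position, which is inferred only from the strings shown in the role above.

```javascript
// Hypothetical review helper (not a C3 API). It assumes each permission
// string is colon-separated, with the Type name in the second position.
function typesGranted(role) {
  const types = new Set();
  for (const permission of role.permissions) {
    const parts = permission.split(":");
    if (parts[0] === "allow" && parts[1]) {
      types.add(parts[1]);
    }
  }
  // Sorted, de-duplicated list of Type names referenced by the role.
  return [...types].sort();
}

// A shortened copy of the role above, for illustration.
const role = {
  id: "Measurements.ReadOnly.Role",
  permissions: [
    "allow:Db.Domain::config",
    "allow:Measurement:read:",
    "allow:MeasurementSeries::fetchNormalizedData",
    "allow:MeasurementSeries:read:",
  ],
};
console.log(typesGranted(role)); // [ 'Db.Domain', 'Measurement', 'MeasurementSeries' ]
```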

More information about roles and permissions can be found in the C3 AI Security Guide.

Example application with package structure

This example highlights the package structure of a working WindTurbine application. The main folder structure includes two packages, assets and measurements. The measurements package is a dependency of the assets package.

The assets package consumes the data, and the measurements package produces the data.

Assets

  • assets/assets.c3pkg.json : The package structure file.
  • assets/config/Db.Domain.Config/measurements.json : Configuration data.
  • assets/src/WindTurbine.c3typ : The main WindTurbine *.c3typ file.

The folder structure for the assets package:

Text
assets (folder)
|
|--- config
|      |
|      |--- Db.Domain.Config
|      |     |
|      |     |-- measurements.json
|      |
|--- src
|      |
|      |-- WindTurbine.c3typ
|
|--- assets.c3pkg.json   

assets/assets.c3pkg.json

JSON
{
   "name" : "assets",
   "description" : "Demo workshop package",
   "version": "1.0.0",
   "dependencies": {
        "measurements": "*"
    }
}

assets/config/Db.Domain.Config/measurements.json is the Db.Domain.Config configuration file.

JSON
{
    "name": "measurements",
    "app": "measurements"
}

assets/src/WindTurbine.c3typ is the Type file.

Type
entity type WindTurbine schema name 'WNDT' {

  turbineId:    string
  location:     string
  power:        int
  manufacturer: string

  measurements: [WindTurbineMeasurement] (turbineId, turbineId)
}

Measurements

The dependency for the above assets package is below in a folder called measurements.

  • measurements/measurements.c3pkg.json : The package structure file for the dependency.
  • measurements/config/Db.Domain.Config/measurements.json : Configuration file.
  • measurements/metadata/Db.Domain/Measurements.json : Domain file.
  • measurements/metadata/Role/Measurements.ReadOnly.Role.json : Role required to read measurement data. This role will be granted to applications that require read-only access to measurement data.
  • measurements/src/WindTurbineEvent.c3typ : *.c3typ file
  • measurements/src/WindTurbineMeasurement.c3typ : *.c3typ file
  • measurements/src/WindTurbineVibration.c3typ : *.c3typ file

The folder structure for the measurements package:

Text
measurements (folder)
|
|--- config
|      |
|      |--- Db.Domain.Config
|      |     |
|      |     |-- measurements.json
|      |
|--- metadata
|      |
|      |--- Db.Domain
|      |     |
|      |     |-- Measurements.json
|      |
|      |--- Role
|      |     |
|      |     |-- Measurements.ReadOnly.Role.json
|      |
|--- src
|      |
|      |-- WindTurbineEvent.c3typ
|      |-- WindTurbineMeasurement.c3typ
|      |-- WindTurbineVibration.c3typ
|
|--- measurements.c3pkg.json   

measurements/measurements.c3pkg.json

JSON
{
    "name" : "measurements",
    "description" : "Demo workshop package"
}

measurements/config/Db.Domain.Config/measurements.json

JSON
{
    "name": "measurements",
    "app": "measurements"
}

measurements/metadata/Db.Domain/Measurements.json

JSON
{
    "name": "measurements"
}

measurements/metadata/Role/Measurements.ReadOnly.Role.json

JSON
{
    "id" : "Measurements.ReadOnly.Role",
    "permissions" : [
      "allow:Cassandra::get",
      "allow:Cassandra::config",
      "allow:Cluster::env",
      "allow:ContentValue::readObjs",
      "allow:DataPartitionBucketValue::readObjs",
      "allow:Db.Domain::config",
      "allow:Db.Domain::hasArbitraryFolderHierarchy",
      "allow:Db.Domain::isValidMetadataJson",
      "allow:Db.Domain::metadataFolder",
      "allow:Env::app",
      "allow:KvDataPartition::canBucket",
      "allow:SqlKvStore::get",
      "allow:SqlKvStore::config",
      "allow:WindTurbineMeasurement:read:",
      "allow:WindTurbineMeasurementSeries::fetchNormalizedData",
      "allow:WindTurbineMeasurementSeries::fetchObjStream"
    ]
}

measurements/src/WindTurbineEvent.c3typ

Type
entity type WindTurbineEvent mixes Obj schema name 'EVENT' {

  start:      datetime
  turbineId:  string
  end:        datetime
  event_code: string
}

The Type in measurements/src/WindTurbineMeasurement.c3typ uses the @db annotation to indicate the Db.Domain it is a member of, in this case measurements.

Type
@db(domain='measurements')
entity type WindTurbineMeasurement mixes Obj schema name 'WT_BLB_MSRMNT' {

  gearOilTemperature:     double
  turbineId:              string
  generatorRotationSpeed: double
  timestamp:              datetime
  activePower:            double
}

measurements/src/WindTurbineVibration.c3typ

Type
@db(domain='measurements')
entity type WindTurbineVibration mixes Obj schema name 'WINDTURBINEVIBRATION' {

  turbineId:        string
  energyNear2xRpm:  double
  energyNear1xRpm:  double
  measurement_date: datetime
}

Local App JVM Execution for Db.Domain

In certain deployment scenarios, calling the Db.Domain application adds network latency or sends too many calls through the leaders for that application. In these cases, you can distribute the load by executing Db.Domain calls in the consuming application itself.

The Db.Domain.Config supports a callAppInLocalJvm setting that, when enabled, executes method calls locally using C3.callInLocalJvmApp instead of remoting to the target application over the network.

Configuration

To enable local JVM execution for a Db.Domain, add "callAppInLocalJvm": true to the Db.Domain.Config:

JSON
{
    "name": "datalake",
    "app": "datalake",
    "callAppInLocalJvm": true
}

How it works

When callAppInLocalJvm is enabled:

  1. Method calls on types in the configured Db.Domain perform a lightweight context switch to the target app
  2. The call executes within the same JVM, avoiding network serialization overhead
  3. Results are still serialized/deserialized to maintain proper data isolation
  4. The original context is restored after the call completes
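
The four steps above can be pictured with a small conceptual model. This is not the platform's implementation and does not use C3.callInLocalJvmApp; the function callInLocalApp and the context object are hypothetical, and JSON is used only as a stand-in for whatever serialization the platform applies to results.

```javascript
// Conceptual model only; this is not the platform's implementation.
// `context` is a hypothetical mutable object holding the active app name.
// Assumes the method returns a JSON-serializable value.
function callInLocalApp(context, targetApp, method) {
  const original = context.app;
  context.app = targetApp; // 1. switch context to the target app
  try {
    const result = method(); // 2. run in-process, with no network hop
    // 3. round-trip the result so the caller never shares object references
    return JSON.parse(JSON.stringify(result));
  } finally {
    context.app = original; // 4. restore the caller's context, even on error
  }
}

const context = { app: "WindTurbine" };
const stored = { turbineId: "T1", activePower: 1500 };
const result = callInLocalApp(context, "datalake", () => stored);
console.log(context.app);       // WindTurbine
console.log(result === stored); // false
```

The finally block reflects step 4: the original context is restored whether the call succeeds or throws.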

When to use

This feature is appropriate when:

  • Network latency is a concern and apps share the same database

Prerequisites

  • Proper authorization must still be configured between apps

Limitations

  • The metadata of the target app and the consumer app may differ
  • There are currently no guards to prevent data corruption if metadata is conflicting
  • This setting affects only application remoting, not datastore configuration
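
Because no guards exist for conflicting metadata, you may want to compare the two applications' view of a shared Type before enabling the setting. The sketch below is one hypothetical way to do that. findFieldConflicts is not a platform API, and the input shape (a plain object mapping field names to value type names) is an assumption made for illustration.

```javascript
// Hypothetical check (not a C3 API). Each argument maps a field name to its
// declared value type for one Type, as seen by one application.
function findFieldConflicts(consumerFields, targetFields) {
  const conflicts = [];
  const names = new Set([...Object.keys(consumerFields), ...Object.keys(targetFields)]);
  for (const name of names) {
    // A field conflicts when it is missing on one side or typed differently.
    if (consumerFields[name] !== targetFields[name]) {
      conflicts.push({
        field: name,
        consumer: consumerFields[name] ?? null,
        target: targetFields[name] ?? null,
      });
    }
  }
  return conflicts;
}

// Illustrative field maps; the mismatches are made up.
const consumerView = { turbineId: "string", activePower: "double" };
const targetView = { turbineId: "string", activePower: "int", timestamp: "datetime" };
console.log(findFieldConflicts(consumerView, targetView).map((c) => c.field));
// [ 'activePower', 'timestamp' ]
```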
