Data folder structure

The data/ folder stores instances of Persistable Types. You can load these instances manually during development with package.upsertAllSeed(), and the folder can also hold instances intended for deployment. The resource/ folder holds sample files, tutorial resources, and other data assets that don't fit into the more structured folders (seed, metadata, data, or config). The C3 Agentic AI Platform processes Persistable type instances during provisioning, while resource files remain available as package resources. This topic explains what belongs in the data/ folder and how different file types are handled.

Understand what belongs in data/

The data folder stores instances of Persistable Types that are not SeedData, Metadata, or Config. It commonly contains:

Persistable type instances: Entity instances that undergo standard database persistence (reference data, lookup tables, test datasets)
Development data: Temporary or experimental data used during development

For other types of files such as tutorial resources, sample datasets, static assets, and example files, use the resource/ folder instead.

This separation keeps data/ distinct from the seed/, metadata/, config/, and resource/ folders, each of which has its own purpose.

Examples:

Text

data/
  Customer/                          # Persistable type instances
    customer-001.json
    customer-002.json
  
  Order/                             # Persistable type instances
    order-001.json
    order-002.json

Common use cases

Persistable type instances

When storing Persistable type instances in data/, these requirements apply:

Must be Persistable entity types with database persistence
Must NOT mix SeedData, Metadata, or Config
The data folder serves as the default category for Persistable types that don't fit other specialized categories

The C3 Agentic AI Platform deploys these instances to the database using standard persistence operations during provisioning.

JSON

data/Customer/customer-001.json
{
  "id": "customer-001",
  "name": "Acme Corporation",
  "industry": "Manufacturing",
  "region": "North America",
  "active": true
}

Tutorial, example data, and static resources

Tutorial packages should use the resource/ folder to store sample datasets, demonstration CSVs, and static assets such as images and PDFs. These files don't necessarily map to Types but serve as resources for documentation and learning.

Text

resource/
  histogram_sharks.csv
  scatter_plot.csv
  penguinsMissingCategorical.csv
  geospatial_data.csv
  tutorials/
    sample_png.png
    sample_pdf.pdf
    documentation_asset.jpg

These files remain available as package resources but aren't processed for database deployment.

File formats

The C3 Agentic AI Platform supports the same file formats for data as it does for seed data:

JSON files (`*.json`)

JSON is the standard format for data instances. Each file contains a single instance or an array of instances.

JSON

data/Customer/customer-001.json
{
  "id": "customer-001",
  "name": "Acme Corporation",
  "industry": "Manufacturing",
  "region": "North America",
  "active": true
}

Single instance: The id field identifies the instance in the database.

Multiple instances: Use an array to define multiple instances in one file.

JSON

data/Customer/customers.json
[
  {
    "id": "customer-001",
    "name": "Acme Corporation",
    "industry": "Manufacturing"
  },
  {
    "id": "customer-002",
    "name": "Tech Solutions Inc",
    "industry": "Technology"
  }
]
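Because a JSON data file can hold either one object or an array, code that reads these files usually normalizes both shapes first. The following is a minimal sketch of that idea, not the platform's loader; parseInstances is a hypothetical helper name.

```javascript
// Hypothetical helper: returns an array of instances whether the
// file content is a single JSON object or a JSON array of objects.
function parseInstances(jsonText) {
  const parsed = JSON.parse(jsonText);
  return Array.isArray(parsed) ? parsed : [parsed];
}

const single = parseInstances(
  '{"id": "customer-001", "name": "Acme Corporation"}'
);
const many = parseInstances(
  '[{"id": "customer-001"}, {"id": "customer-002"}]'
);

console.log(single.length); // 1
console.log(many.length); // 2
```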

CSV files (`*.csv`)

CSV format is efficient for bulk data instances. Each row after the header creates one database record.

CSV

data/Product/products.csv
id,name,category,price
prod-001,Widget A,Hardware,29.99
prod-002,Widget B,Hardware,39.99
prod-003,Software License,Software,199.00

This CSV file creates three Product instances in the database.
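The row-to-instance mapping can be illustrated with a small parser. This is a simplified sketch under stated assumptions: csvToInstances is a hypothetical helper, it does not handle quoted fields or embedded commas, and it leaves every value as a string, whereas the platform converts values according to the Type definition.

```javascript
// Hypothetical helper: one plain object per data row, keyed by header name.
// Assumes simple CSV with no quoted fields; values stay as strings.
function csvToInstances(csvText) {
  const [headerLine, ...rows] = csvText.trim().split(/\r?\n/);
  const fields = headerLine.split(",");
  return rows
    .filter((row) => row.trim() !== "")
    .map((row) => {
      const values = row.split(",");
      return Object.fromEntries(fields.map((f, i) => [f, values[i]]));
    });
}

const productsCsv = [
  "id,name,category,price",
  "prod-001,Widget A,Hardware,29.99",
  "prod-002,Widget B,Hardware,39.99",
  "prod-003,Software License,Software,199.00",
].join("\n");

const products = csvToInstances(productsCsv);
console.log(products.length); // 3
console.log(products[0].id); // prod-001
```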

How data/ relates to other folders

Several folders depend on or interact with the content in your data/ folder.

Source folder (src/): The data folder references Types declared in src/. When you change a Type definition, the C3 Agentic AI Platform revalidates data against the new Type structure during deployment. See Source Folder.
Seed folder (seed/): Both data and seed folders can contain instances for database persistence, and both are collected during automatic provisioning when the package fingerprint changes. However, they serve different purposes. The seed folder stores instances of Types that mix SeedData, which provide initial data with special upgrade-preservation features and dual-ownership semantics. The data folder stores instances of regular Persistable types, which undergo standard persistence without the sophisticated field-level tracking and user modification preservation that SeedData provides. Use seed for configuration and defaults that need to survive upgrades with user customizations intact. Use data for regular entity instances and test data. See Seed Folder.
Test folder (test/): Test data in test/data/ provides entity instances for tests. The C3 Agentic AI Platform deploys test data in test mode, allowing you to provide test-specific data without affecting production. See Test Folder.

These interactions make the data/ folder useful for populating your package with entity instances.

Structural rules

The C3 Agentic AI Platform enforces structural rules in the data/ folder to discover instances, validate them, and deploy them reliably.

Path conventions

The directory structure determines which Type the data belongs to.

Standard path structure:

Text

/packageName/data/TypeName/instanceName.ext

The directory name (second-to-last path segment) identifies the Type. For example:

data/Customer/customer-001.json: Type is Customer.
data/Product/products.csv: Type is Product.
data/Location/warehouses.json: Type is Location.
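The second-to-last-segment rule is straightforward to express in code. The sketch below is illustrative only; typeNameFromPath is a hypothetical helper, not a platform API.

```javascript
// Hypothetical helper: the Type name is the directory that directly
// contains the file, that is, the second-to-last path segment.
function typeNameFromPath(filePath) {
  const segments = filePath.split("/").filter((s) => s.length > 0);
  if (segments.length < 2) {
    throw new Error(`Path has no Type directory: ${filePath}`);
  }
  return segments[segments.length - 2];
}

console.log(typeNameFromPath("data/Customer/customer-001.json")); // Customer
console.log(typeNameFromPath("/packageName/data/Product/products.csv")); // Product
```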

File naming

Instance filenames should reflect the content but don't need to match the id field. The C3 Agentic AI Platform uses the id field from the instance data to identify records in the database.

Example:

Text

data/Customer/acme-corp.json
{
  "id": "customer-001",  // ID used in database
  "name": "Acme Corporation"
  ...
}

The filename acme-corp.json describes the content, while the id field "customer-001" identifies the database record.

ID requirements

The C3 Agentic AI Platform enforces different ID requirements based on whether data is for production or testing:

Production data: ID is required in all instances. Missing IDs generate errors.

JSON

// Error: Missing required ID
{
  "name": "Acme Corporation"
  // Missing required "id" field
}

Test data (in test/data/): ID is optional. If omitted, the platform generates a UUID automatically.

JSON

// Valid in test/data/ - auto-generated ID
{
  "name": "Test Customer"
  // ID will be auto-generated
}
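The two ID rules can be modeled as a single function that takes a flag for test data. This is a sketch under assumptions: resolveId and pseudoUuid are hypothetical names, and the UUID generator here uses Math.random purely for illustration, which is not how the platform generates IDs.

```javascript
// Illustrative, non-cryptographic UUID-v4-shaped string (assumption:
// the real platform uses its own generator).
function pseudoUuid() {
  return "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx".replace(/[xy]/g, (c) => {
    const r = Math.floor(Math.random() * 16);
    const v = c === "x" ? r : (r & 0x3) | 0x8;
    return v.toString(16);
  });
}

// Hypothetical helper: production data must carry an id; test data
// without one receives a generated id.
function resolveId(instance, isTestData) {
  if (instance.id != null) {
    return instance;
  }
  if (!isTestData) {
    throw new Error('Missing required "id" field');
  }
  return { ...instance, id: pseudoUuid() };
}

const testCustomer = resolveId({ name: "Test Customer" }, true);
console.log(typeof testCustomer.id); // string
```

Calling resolveId({ name: "Acme Corporation" }, false) throws, which mirrors the production error case above.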

Validation rules

The C3 Agentic AI Platform validates data files before deployment and reports issues through Pkg.Issue.

You can implement custom validation rules by defining an optional validateSeedData() method on your type.

Below are examples of built-in validation rules:

Type category mismatch: Files must be in the correct category folder based on their Type.

Text

// Error: Type mixing SeedData in data/ folder
data/CronJob/job.json  // CronJob mixes SeedData, belongs in seed/

Required fields: Missing required fields generate errors.

JSON

// Error: Missing required field
{
  "id": "customer-001"
  // Missing required fields for Customer type
}

Invalid fields: Fields not declared in the Type generate errors.

JSON

// Error: Invalid field
{
  "id": "customer-001",
  "name": "Acme Corporation",
  "invalidField": true  // Type does not declare this field
}

Type mismatches: Field values must match the declared Type.

JSON

// Error: Type mismatch
{
  "id": "customer-001",
  "active": "should be boolean"  // Field expects boolean
}
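The three field-level rules (required fields, invalid fields, type mismatches) can be sketched as a small validator. Everything here is an assumption made for illustration: the customerSchema shape, the choice of name as a required field, and the validateInstance helper are hypothetical and do not reflect how the platform reports issues through Pkg.Issue.

```javascript
// Hypothetical description of the Customer fields used in this topic.
const customerSchema = {
  id: { type: "string", required: true },
  name: { type: "string", required: true },
  industry: { type: "string", required: false },
  region: { type: "string", required: false },
  active: { type: "boolean", required: false },
};

// Returns a list of human-readable issues; an empty list means valid.
function validateInstance(instance, schema) {
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  const issues = [];
  for (const [field, rule] of Object.entries(schema)) {
    if (rule.required && !has(instance, field)) {
      issues.push(`Missing required field: ${field}`);
    }
  }
  for (const [field, value] of Object.entries(instance)) {
    if (!has(schema, field)) {
      issues.push(`Invalid field: ${field}`);
    } else if (typeof value !== schema[field].type) {
      issues.push(`Type mismatch: ${field} expects ${schema[field].type}`);
    }
  }
  return issues;
}

console.log(validateInstance({ id: "customer-001" }, customerSchema));
console.log(
  validateInstance(
    { id: "customer-001", name: "Acme Corporation", active: "should be boolean" },
    customerSchema
  )
);
```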

How data is deployed

The C3 Agentic AI Platform handles data folder contents differently based on file type.

What gets deployed vs what remains as resources

Deployed to database (during provisioning):

JSON files in Type-specific directories (for example, data/Customer/*.json).
CSV files in Type-specific directories with valid serialization formats.
Only files representing Persistable Type instances.

Remains as package resources (not deployed):

All files in the resource/ folder.
Tutorial files (PDFs, PNGs, sample CSVs without Type mapping).
Static assets and documentation resources.

Deployment timing and process

The C3 Agentic AI Platform deploys Persistable Type instances during package provisioning using the upsertAllSeed() operation, which collects from both seed/** and data/**.

Initial deployment: When you provision a package, the C3 Agentic AI Platform:

Checks if the package fingerprint has changed since last deployment
If changed, collects all data files from data/** (along with seed/**)
Filters for valid Persistable Type instances based on:
- File location follows Type directory pattern.
- File has a supported serialization format (JSON, CSV).
- Type is Persistable and not mixing SeedData, Metadata, or Config.
Deploys each valid instance to the database using upsert operations.
Validates constraints against Type definitions.
Updates the deployment state with fingerprint and any issues
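The filtering step above can be sketched as a predicate over a file path and some Type information. This is a rough model, not the platform's logic: isDeployable, the typeInfo map, and its { persistable, mixins } shape are hypothetical, and paths are assumed to be relative to the package root.

```javascript
const SUPPORTED_EXTENSIONS = ["json", "csv"];
const EXCLUDED_MIXINS = ["SeedData", "Metadata", "Config"];

// typeInfo is a hypothetical Map: Type name -> { persistable, mixins }.
function isDeployable(filePath, typeInfo) {
  // The file must sit directly inside data/<TypeName>/.
  const match = /^data\/([^/]+)\/[^/]+\.([^./]+)$/.exec(filePath);
  if (!match) return false;
  const [, typeName, extension] = match;
  if (!SUPPORTED_EXTENSIONS.includes(extension.toLowerCase())) return false;
  const info = typeInfo.get(typeName);
  if (!info || !info.persistable) return false;
  // Types mixing SeedData, Metadata, or Config belong in other folders.
  return !info.mixins.some((m) => EXCLUDED_MIXINS.includes(m));
}

const typeInfo = new Map([
  ["Customer", { persistable: true, mixins: [] }],
  ["CronJob", { persistable: true, mixins: ["SeedData"] }],
]);

console.log(isDeployable("data/Customer/customer-001.json", typeInfo)); // true
console.log(isDeployable("data/CronJob/job.json", typeInfo)); // false
```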

Reprovisioning: When you reprovision a package, the C3 Agentic AI Platform:

Compares the current package fingerprint with the stored fingerprint
Deploys data only if the fingerprint changed or deployment never completed
Updates or creates database records using standard upsert semantics
Does not track user modifications like SeedData does
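The fingerprint gate and the lack of user-modification tracking can be modeled in memory. This sketch rests on several assumptions: the state and pkg object shapes and the reprovision function are hypothetical, a Map stands in for the database, and an upsert is treated as create-or-replace, which simplifies the platform's actual upsert semantics.

```javascript
// state: { fingerprint, completed } recorded by the previous run.
// pkg: { fingerprint, instances: [{ typeName, instance }] }.
// database: a Map standing in for the real database.
function reprovision(state, pkg, database) {
  const upToDate = state.completed && state.fingerprint === pkg.fingerprint;
  if (upToDate) {
    return { deployed: false, state };
  }
  for (const { typeName, instance } of pkg.instances) {
    // Create-or-replace by Type and id; no record of user edits is kept.
    database.set(`${typeName}/${instance.id}`, { ...instance });
  }
  return {
    deployed: true,
    state: { fingerprint: pkg.fingerprint, completed: true },
  };
}

const database = new Map();
const acme = { id: "customer-001", name: "Acme Corporation" };
const v1 = {
  fingerprint: "fp-1",
  instances: [{ typeName: "Customer", instance: acme }],
};

const first = reprovision({ fingerprint: null, completed: false }, v1, database);
const second = reprovision(first.state, v1, database);

// In this model, a user edit is overwritten once the fingerprint changes.
database.set("Customer/customer-001", { ...acme, name: "Edited by user" });
const third = reprovision(second.state, { ...v1, fingerprint: "fp-2" }, database);

console.log(first.deployed, second.deployed, third.deployed); // true false true
console.log(database.get("Customer/customer-001").name); // Acme Corporation
```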

Merge behavior

When multiple files define instances with the same ID, the platform merges them, similar to how it merges seed data:

JSON

// data/Customer/base.json
{
  "id": "customer-001",
  "name": "Acme Corporation",
  "industry": "Manufacturing"
}

// data/Customer/extended.json
{
  "id": "customer-001",
  "region": "North America"
}

// Result after merge:
{
  "id": "customer-001",
  "name": "Acme Corporation",
  "industry": "Manufacturing",
  "region": "North America"
}
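A merge of this kind can be sketched by grouping instances on id and combining their fields. The mergeById helper is hypothetical. The example above does not show what happens when two files set the same field to different values, so this sketch assumes that the later file wins; treat that as an assumption, not documented behavior.

```javascript
// Hypothetical helper: combines instances that share an id.
// Assumption: on conflicting fields, the later instance wins.
function mergeById(instances) {
  const merged = new Map();
  for (const instance of instances) {
    const existing = merged.get(instance.id) || {};
    merged.set(instance.id, { ...existing, ...instance });
  }
  return [...merged.values()];
}

const base = {
  id: "customer-001",
  name: "Acme Corporation",
  industry: "Manufacturing",
};
const extended = { id: "customer-001", region: "North America" };

const result = mergeById([base, extended]);
console.log(result.length); // 1
console.log(result[0].region); // North America
```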

Differences from SeedData

Data instances undergo standard persistence without the special features of SeedData:

Feature	SeedData (seed/)	Data (data/)
Type requirement	Must mix SeedData	Must be Persistable (but NOT SeedData, Metadata, or Config)
User modification tracking	Tracks `userUpdatedFields`	No tracking
Upgrade preservation	Preserves user customizations during upgrades	Standard upsert behavior
Permissions	Controlled by `@seed` annotation	Standard entity permissions
Dual-ownership	System + user ownership model	Single ownership
Field-level updates	Granular field-level preservation	Standard field updates
Special admin methods	`createSeedData()`, `updateSeedData()`, etc.	Standard CRUD methods
Hidden state	Supports soft delete with `hidden` flag	Standard delete operations
Common use cases	Configuration, cron jobs, system defaults	Reference data, test data, entity instances

Runtime operations

The C3 Agentic AI Platform provides standard entity APIs for accessing, modifying, and removing data instances in the database.

Access instances

Retrieve instances from the database using standard entity methods.

JavaScript

// Access by id
const customer = Customer.forId("customer-001");

// Fetch with filters
const spec = FetchSpec.make()
  .withFilter("active == true");
const customers = Customer.fetch(spec);

// Fetch all instances
const allCustomers = Customer.fetch();

Create instances

Users can create new instances using standard entity operations.

JavaScript

// Create a new instance and persist it
const newCustomer = Customer.make()
  .withId("customer-003")
  .withName("New Company")
  .withIndustry("Services")
  .withActive(true)
  .upsert();

Update instances

Update instances using standard upsert() or update() methods.

JavaScript

// Update an existing instance
const customer = Customer.forId("customer-001");
customer.withRegion("Europe")
  .withActive(false)
  .upsert();

Remove instances

Remove instances using standard delete operations.

JavaScript

// Delete an instance
const customer = Customer.forId("customer-001");
customer.remove();  // Permanently deletes from database
