
Demystifying SeedData: Architectural Concepts and Design Patterns

The Fundamental Problem SeedData Solves

At its core, SeedData addresses a fundamental challenge in enterprise software: How do you provide default application data while allowing users to customize it, all while ensuring seamless application updates?

This is not a trivial problem. Traditional approaches often lead to:

  • Data loss during application updates
  • Configuration drift between environments
  • User frustration when customizations disappear
  • Deployment complexity managing different data states

SeedData solves this through a dual-ownership model: each record carries both system-provided and user-provided values, and merge logic preserves the user's values when the system's values change.

Core Architectural Concepts

1. The Dual-Ownership Model

SeedData splits data ownership into two distinct but interconnected layers:

System Layer (Provisioner)

  • Owned by the application package
  • Represents "canonical" or "default" state
  • Updated during application deployments
  • Immutable by users (by default)

User Layer (Customizer)

  • Owned by human users
  • Represents customizations and overrides
  • Preserved across application updates
  • Can override system values (when permitted)

This dual-ownership model is fundamentally different from traditional approaches that treat data as either "system" or "user" owned. Instead, SeedData recognizes that enterprise data often exists in a hybrid state where users customize system-provided defaults.

2. Field-Level Granularity

Unlike systems that treat entire records as owned by either system or user, SeedData operates at field-level granularity. This means:

  • A single record can have some fields owned by the system and others by the user
  • Users can customize specific aspects while preserving system defaults for others
  • The system can update non-user-modified fields while preserving user customizations

This granular approach enables surgical updates where application upgrades can enhance system-provided data without destroying user customizations.
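Field-level ownership can be pictured with a small in-memory model. This is a minimal sketch, not the C3 AI implementation: the SeedRecord class, its methods, and the sample field names are hypothetical, and only the idea of a per-record list of user-updated fields comes from this page.

```python
from dataclasses import dataclass, field


@dataclass
class SeedRecord:
    """Hypothetical stand-in for a seeded record (not the C3 AI type)."""
    values: dict
    user_updated_fields: set = field(default_factory=set)

    def user_update(self, name, value):
        # A user edit changes the value and marks that one field as user-owned.
        self.values[name] = value
        self.user_updated_fields.add(name)

    def owner_of(self, name):
        # Ownership is decided per field, not per record.
        return "user" if name in self.user_updated_fields else "system"


record = SeedRecord(values={"label": "Default dashboard", "refreshSeconds": 60})
record.user_update("label", "Plant 7 dashboard")

print(record.owner_of("label"))           # user
print(record.owner_of("refreshSeconds"))  # system
```

After the single edit, the same record has one user-owned field and one system-owned field, which is the situation the bullets above describe.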

3. Temporal Preservation Strategy

SeedData implements a sophisticated temporal preservation strategy that maintains the history of ownership changes:

  • Creation time: Records when each field was first modified by a user
  • Modification tracking: Maintains a list of user-updated fields
  • Ownership inheritance: New fields inherit system ownership unless explicitly user-modified

This temporal approach enables the system to make intelligent decisions about which updates to apply and which to preserve.

Design Patterns

1. The Annotation-Driven Behavior Pattern

SeedData uses annotations as behavioral contracts rather than just metadata. The @seed annotation defines how the data behaves, not only what it contains:

  • userUpdatable: Establishes the contract for user modification
  • userRemovable: Defines removal behavior
  • notUserUpdatable: Creates field-level exceptions to type-level rules

This pattern enables declarative data governance where behavior is defined at design time rather than implemented at runtime.

2. The Merge-First Architecture

SeedData implements a merge-first architecture where all updates are treated as merges rather than replacements:

  • Package merging: Dependent packages can override seed data from dependencies
  • Test merging: Test data merges with production data in test environments
  • Update merging: Application updates merge with existing user customizations

This merge-first approach is designed so that transitions between states do not discard existing data or user customizations.
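Package merging, the first case above, can be sketched as a layered dictionary merge. The function name, record IDs, and the choice to merge field by field (rather than replace whole records) are assumptions made for illustration; the page does not specify the exact merge granularity between packages.

```python
def merge_seed_layers(*layers):
    """Merge seed layers left to right; later layers override earlier ones per field."""
    merged = {}
    for layer in layers:
        for record_id, fields in layer.items():
            # Build a new dict each time so the input layers are left untouched.
            merged[record_id] = {**merged.get(record_id, {}), **fields}
    return merged


base_package = {"unit-temp": {"name": "Temperature", "symbol": "C"}}
dependent_package = {
    "unit-temp": {"symbol": "F"},           # overrides one field
    "unit-pressure": {"name": "Pressure"},  # adds a record
}

merged = merge_seed_layers(base_package, dependent_package)
print(merged["unit-temp"])  # {'name': 'Temperature', 'symbol': 'F'}
```

The dependent package changes only the field it names, and the name supplied by the base package survives the merge.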

3. The Dependency Graph Pattern

SeedDataDeps implements a dependency graph pattern that enables efficient traversal of related data:

  • Forward dependencies: What does this seed data depend on?
  • Backward dependencies: What depends on this seed data?
  • Cached traversal: Pre-computed dependency relationships for performance

This pattern enables intelligent data management where changes to one piece of data can trigger appropriate updates to related data.
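The three ideas above (forward edges, backward edges, cached traversal) fit in a few lines. The class below is a hypothetical model written for this page; it is not the SeedDataDeps API, and its method names are assumptions.

```python
from collections import defaultdict
from functools import lru_cache


class SeedDependencyGraph:
    """Hypothetical model of seed dependencies (not SeedDataDeps itself)."""

    def __init__(self, edges):
        # edges: iterable of (record, depends_on) pairs
        self.forward = defaultdict(set)   # what does this record depend on?
        self.backward = defaultdict(set)  # what depends on this record?
        for record, depends_on in edges:
            self.forward[record].add(depends_on)
            self.backward[depends_on].add(record)
        # Cache transitive lookups so repeated queries do not re-walk the graph.
        self.all_dependents = lru_cache(maxsize=None)(self._all_dependents)

    def _all_dependents(self, record):
        found = set()
        stack = [record]
        while stack:
            for dependent in self.backward.get(stack.pop(), ()):
                if dependent not in found:
                    found.add(dependent)
                    stack.append(dependent)
        return frozenset(found)


graph = SeedDependencyGraph([("dashboard", "chart"), ("chart", "metric")])
print(sorted(graph.all_dependents("metric")))  # ['chart', 'dashboard']
```

A change to "metric" would therefore flag both "chart" and "dashboard" for review, which is the kind of follow-on update the paragraph above refers to.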

Key Methods: The SeedData API

SeedData provides a unique set of methods that embody its dual-ownership philosophy. These methods operate in a provisioner context rather than a user context, enabling system-level operations while respecting user customizations.

Administrative Methods: The "As Authorizer" Pattern

SeedData introduces a novel concept: administrative methods that operate with system privileges while maintaining user data integrity.

createSeedData() and createSeedDataBatch()

  • Creates records as the provisioner, not as the current user
  • Establishes system ownership from the start
  • Enables bulk provisioning operations
  • Sets the foundation for future user customization

updateSeedData() and updateSeedDataBatch()

  • Updates records as the provisioner, bypassing user restrictions
  • Respects existing user modifications (user-modified fields are preserved)
  • Enables system-level data evolution
  • Implements the merge-first philosophy

removeSeedData() and removeSeedDataBatch()

  • Performs true deletion rather than user-style hiding
  • Used for cleanup operations and data lifecycle management
  • Distinguishes between "user removed" (hidden) and "system removed" (deleted)

User Modification Management

clearUserUpdates() and clearUserUpdatesBatch()

  • The "reset to defaults" functionality
  • Removes user customizations while preserving system data
  • Enables users to "start over" with system defaults
  • Critical for troubleshooting and configuration management

These methods provide a reversible customization model: users can experiment with customizations knowing they can always return to system defaults.
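The reset behavior can be illustrated with plain dictionaries. This is a sketch of the concept only: the helper name, the record layout, and the handling of fields that have no system default are assumptions, and the real clearUserUpdates() signature is not shown on this page.

```python
def clear_user_updates(record, seed_defaults):
    """Return a copy of record with user customizations dropped (hypothetical helper)."""
    restored = dict(record["values"])
    for name in record["user_updated_fields"]:
        if name in seed_defaults:
            restored[name] = seed_defaults[name]  # fall back to the system value
        else:
            restored.pop(name, None)  # assumption: no system default, so drop it
    return {"values": restored, "user_updated_fields": set()}


seed_defaults = {"label": "Default dashboard", "refreshSeconds": 60}
customized = {
    "values": {"label": "Plant 7 dashboard", "refreshSeconds": 60},
    "user_updated_fields": {"label"},
}

reset = clear_user_updates(customized, seed_defaults)
print(reset["values"]["label"])  # Default dashboard
```

Returning a new record instead of mutating the input keeps the customized version available to the caller, which makes the reset easy to inspect before it is committed.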

Metadata and Discovery

seedPath()

  • Returns the canonical path to the seed file that created this instance
  • Enables traceability from runtime data back to source files
  • Critical for debugging and understanding data provenance
  • Supports the "living document" mental model

dependencies()

  • Returns the SeedDataDeps instance for dependency management
  • Enables efficient traversal of related data
  • Supports the dependency graph pattern
  • Critical for understanding data relationships

Custom Validation

validateSeedData()

  • Optional method for custom validation logic
  • Enables domain-specific validation beyond type constraints
  • Runs in a controlled context with appropriate permissions
  • Supports the fail-safe error handling philosophy

The Hidden State Management

doUnremove()

  • Restores hidden records to active state
  • Implements the temporal preservation strategy
  • Enables "undo" functionality for user removals
  • Critical for the state machine model

The Annotation System: Behavioral Contracts

The @seed annotation system represents a declarative approach to data governance where behavior is defined at design time rather than implemented at runtime.

The Contract-Based Design

@seed(userUpdatable=true)

  • Establishes a contract that users can modify this type
  • Creates expectations for both users and system behavior
  • Enables predictable customization patterns
  • Implements the social contract model

@seed(userRemovable=true)

  • Defines the removal contract for user operations
  • Controls whether users can "hide" system-provided data
  • Balances user agency with system integrity
  • Enables controlled data lifecycle management

@seed(notUserCreatable=true)

  • Establishes creation restrictions at the type level
  • Prevents users from creating instances of system-managed types
  • Maintains data consistency and prevents configuration drift
  • Implements the canonical defaults pattern

Field-Level Exception Handling

@seed(notUserUpdatable=true) on fields

  • Creates field-level exceptions to type-level rules
  • Enables granular control over user customization
  • Protects critical system fields while allowing other customization
  • Implements the field-level granularity concept

The Inheritance Model

The annotation system implements a sophisticated inheritance model:

  • Type-level annotations establish the base contract
  • Field-level annotations create exceptions to the base contract
  • Default behavior follows the principle of least restriction
  • Explicit overrides provide precise control when needed

This inheritance model enables flexible data governance where developers can establish broad policies while allowing specific exceptions where needed.
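One way to read the inheritance rules is as a two-step check: the type-level contract first, then any field-level exception. The SeedPolicy class below is a hypothetical Python model of those rules, not the @seed annotation itself. Its defaults are set to False on the assumption that system data is immutable by users unless the type opts in, as stated for the system layer earlier on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeedPolicy:
    """Hypothetical stand-in for the options carried by @seed on a type."""
    user_updatable: bool = False
    user_removable: bool = False
    # Names of fields marked notUserUpdatable (field-level exceptions).
    locked_fields: frozenset = frozenset()

    def can_user_update(self, field_name):
        # Type-level contract first, then the field-level exception.
        return self.user_updatable and field_name not in self.locked_fields


policy = SeedPolicy(user_updatable=True, locked_fields=frozenset({"id"}))
print(policy.can_user_update("label"))  # True
print(policy.can_user_update("id"))     # False
```

A type that never opts in, SeedPolicy(), rejects every user update in this model, regardless of which fields are locked.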

Conceptual Models

1. The State Machine Model

SeedData records exist in a state machine with well-defined transitions:

[System Created] → [User Modified] → [System Updated] → [User Modified]
       ↓                ↓                   ↓                ↓
   [Hidden]         [Hidden]           [Hidden]         [Hidden]
       ↓                ↓                   ↓                ↓
   [Deleted]        [Deleted]          [Deleted]        [Deleted]

Each transition has specific rules about what data is preserved and what is updated.
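The arrows in the diagram can be transcribed into a transition table. The enum and function names below are hypothetical, and the table covers only the transitions drawn in the diagram; restoring a hidden record with doUnremove() is left out because the diagram does not show which active state it returns to.

```python
from enum import Enum


class SeedState(Enum):
    SYSTEM_CREATED = "System Created"
    USER_MODIFIED = "User Modified"
    SYSTEM_UPDATED = "System Updated"
    HIDDEN = "Hidden"
    DELETED = "Deleted"


# Allowed transitions, transcribed from the arrows in the diagram.
TRANSITIONS = {
    SeedState.SYSTEM_CREATED: {SeedState.USER_MODIFIED, SeedState.HIDDEN},
    SeedState.USER_MODIFIED: {SeedState.SYSTEM_UPDATED, SeedState.HIDDEN},
    SeedState.SYSTEM_UPDATED: {SeedState.USER_MODIFIED, SeedState.HIDDEN},
    SeedState.HIDDEN: {SeedState.DELETED},
    SeedState.DELETED: set(),
}


def transition(current, target):
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"{current.value} -> {target.value} is not allowed")
    return target


state = transition(SeedState.SYSTEM_CREATED, SeedState.USER_MODIFIED)
state = transition(state, SeedState.HIDDEN)
print(state.value)  # Hidden
```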

2. The Layered Data Model

SeedData implements a layered data model where data exists in multiple layers:

Layer 1: Base Data (from seed files)

  • Original system-provided values
  • Never changes during user operations

Layer 2: User Modifications (userUpdatedFields)

  • Fields that users have customized
  • Preserved across system updates

Layer 3: System Updates (new deployments)

  • Updates to non-user-modified fields
  • Merged with existing user modifications

Layer 4: Metadata (hidden, userOwned flags)

  • System-managed state information
  • Controls behavior and visibility

3. The Conflict Resolution Model

When system updates conflict with user modifications, SeedData uses a user-preference model:

  • User-modified fields: User values always win
  • System-only fields: System updates always apply
  • New fields: System values are applied unless user explicitly modifies them

This model ensures user agency while maintaining system integrity.
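The three rules above reduce to a single check per incoming field. The function below is an illustrative sketch with hypothetical names and sample values; it models the rules as stated here and makes no claim about how the platform implements them.

```python
def resolve_update(current, user_updated_fields, incoming):
    """Apply a system update using the three rules above (illustrative only)."""
    resolved = dict(current)
    for name, system_value in incoming.items():
        if name in user_updated_fields:
            continue  # user-modified field: the user value wins
        resolved[name] = system_value  # system-only or new field: system value applies
    return resolved


current = {"label": "Plant 7 dashboard", "refreshSeconds": 60}
incoming = {"label": "Default dashboard v2", "refreshSeconds": 30, "theme": "dark"}

resolved = resolve_update(current, {"label"}, incoming)
print(resolved)
# {'label': 'Plant 7 dashboard', 'refreshSeconds': 30, 'theme': 'dark'}
```

The user's label is kept, the untouched refresh interval follows the system, and the new theme field arrives with its system value.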

Advanced Concepts

1. The Provisioning Pipeline

SeedData provisions data through a pipeline with four stages:

Discovery Stage

  • Scans package files for seed data
  • Validates file formats and structure
  • Builds dependency order

Processing Stage

  • Deserializes data into objects
  • Validates against type definitions
  • Handles merging and conflicts

Deployment Stage

  • Applies upsert operations
  • Tracks user modifications
  • Updates dependency caches

Validation Stage

  • Performs constraint checking
  • Runs custom validation logic
  • Reports issues and errors
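The "builds dependency order" step of the discovery stage amounts to a topological sort: a record is provisioned only after everything it references. The sketch below uses Python's standard graphlib module on made-up record names; how the platform actually computes the order is not described on this page.

```python
from graphlib import TopologicalSorter

# Hypothetical seed records mapped to the records they reference.
depends_on = {
    "dashboard": {"chart"},
    "chart": {"metric"},
    "metric": set(),
}

# static_order() yields each record after everything it depends on.
provision_order = list(TopologicalSorter(depends_on).static_order())
print(provision_order)  # ['metric', 'chart', 'dashboard']
```

TopologicalSorter raises graphlib.CycleError when the references form a cycle, which is a natural place for a pipeline to report a structural problem in the seed files.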

2. The Caching Strategy

SeedDataCache implements a multi-level caching strategy:

Level 1: Object Cache

  • Caches individual seed data objects
  • Enables fast lookups by ID

Level 2: Type Cache

  • Caches all objects of a specific type
  • Enables bulk operations and queries

Level 3: Dependency Cache

  • Caches dependency relationships
  • Enables efficient graph traversal

This multi-level approach optimizes for different access patterns while maintaining consistency.
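As a rough picture of the three levels, the sketch below keeps one dictionary per level and writes to all of them in a single call. The class, its method names, and its key layout are assumptions for illustration and do not describe the internals of SeedDataCache.

```python
class SeedCacheSketch:
    """Hypothetical three-level cache; names and layout are assumptions."""

    def __init__(self):
        self.objects = {}       # level 1: (type name, id) -> object
        self.by_type = {}       # level 2: type name -> {id: object}
        self.dependencies = {}  # level 3: (type name, id) -> keys it depends on

    def put(self, type_name, record_id, obj, depends_on=()):
        key = (type_name, record_id)
        # Write all levels together so they cannot disagree.
        self.objects[key] = obj
        self.by_type.setdefault(type_name, {})[record_id] = obj
        self.dependencies[key] = frozenset(depends_on)

    def get(self, type_name, record_id):
        return self.objects.get((type_name, record_id))

    def all_of_type(self, type_name):
        return list(self.by_type.get(type_name, {}).values())


cache = SeedCacheSketch()
cache.put("Unit", "temp", {"symbol": "C"})
cache.put("Metric", "avgTemp", {"unit": "temp"}, depends_on=[("Unit", "temp")])

print(cache.get("Unit", "temp"))    # {'symbol': 'C'}
print(cache.all_of_type("Metric"))  # [{'unit': 'temp'}]
```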

3. The Error Handling Philosophy

SeedData implements a fail-safe error handling philosophy:

  • Non-blocking errors: Warnings don't stop deployment
  • Critical error isolation: Critical errors are isolated to specific records
  • Comprehensive reporting: All issues are tracked and reported
  • Graceful degradation: System continues operating with partial data

This philosophy keeps deployments resilient even when some seed data has issues.
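Error isolation and comprehensive reporting can be sketched as a loop that records a failure and moves on. The function names and the record layout below are hypothetical, and the validation rule is a stand-in for whatever checks a real deployment runs.

```python
def provision_all(records, validate):
    """Provision every record, isolating failures instead of aborting (sketch)."""
    provisioned, issues = [], []
    for record in records:
        try:
            validate(record)
        except ValueError as error:
            # Isolate the failure to this record and keep going with the rest.
            issues.append((record["id"], str(error)))
            continue
        provisioned.append(record["id"])
    return provisioned, issues


def require_name(record):
    # Stand-in validation rule: every record needs a non-empty name.
    if not record.get("name"):
        raise ValueError("name is required")


done, issues = provision_all(
    [{"id": "a", "name": "Alpha"}, {"id": "b"}, {"id": "c", "name": "Gamma"}],
    require_name,
)
print(done)    # ['a', 'c']
print(issues)  # [('b', 'name is required')]
```

The bad record is reported with its ID and reason while the two valid records are still provisioned, which corresponds to the graceful degradation bullet above.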

Mental Models for Understanding SeedData

1. The "Living Document" Model

Think of SeedData as a living document that evolves over time:

  • Initial version: Created from seed files
  • User annotations: Users add their notes and modifications
  • System revisions: System updates add new content
  • Preserved annotations: User notes are preserved across revisions

This model helps understand why user modifications are preserved even when the underlying system data changes.

2. The "Configuration Inheritance" Model

SeedData implements configuration inheritance similar to CSS or object-oriented programming:

  • Base configuration: From seed files (like CSS base styles)
  • User overrides: User customizations (like CSS overrides)
  • System updates: New base configurations (like CSS updates)
  • Preserved overrides: User overrides persist (like CSS specificity)

This model explains why some fields can be updated while others are preserved.

3. The "Version Control" Model

SeedData operates like a distributed version control system:

  • System commits: Application deployments create new versions
  • User branches: User modifications create local branches
  • Merge operations: Updates merge user branches with system commits
  • Conflict resolution: User changes take precedence in conflicts

This model helps understand the merge-first architecture and conflict resolution.

Key Insights

1. SeedData is About Relationships, Not Just Data

The real power of SeedData lies not in storing data, but in managing the relationships between system-provided defaults and user customizations. It's a system for negotiating ownership of data between different stakeholders.

2. SeedData Enables Continuous Evolution

Unlike traditional configuration management, SeedData enables continuous evolution where applications can improve their default data while preserving user customizations. This enables agile application development without breaking user workflows.

3. SeedData Implements Social Contracts

The annotation system in SeedData implements social contracts between application developers and users:

  • userUpdatable: "Users can customize this"
  • userRemovable: "Users can remove this"
  • notUserCreatable: "Only the system can create this"

These contracts enable predictable behavior and clear expectations.

4. SeedData Solves the "Configuration Drift" Problem

Traditional systems suffer from configuration drift where environments diverge over time. SeedData prevents this by:

  • Canonical defaults: System always provides authoritative defaults
  • Controlled customization: Users can only modify permitted aspects
  • Automatic reconciliation: Updates merge with customizations

This ensures environment consistency while allowing necessary customization.

Conclusion: The SeedData Philosophy

SeedData represents a philosophical shift in how enterprise software handles data:

From: "Data is either system-owned or user-owned"
To: "Data exists in a collaborative space between system and users"

From: "Updates replace existing data"
To: "Updates merge with existing customizations"

From: "Configuration is static"
To: "Configuration evolves continuously"

From: "Users adapt to software"
To: "Software adapts to users"

This philosophy enables human-centered software that respects user agency while maintaining system integrity, a fundamental requirement for successful enterprise applications.

The genius of SeedData lies not in its technical implementation, but in its conceptual framework for thinking about data ownership, user agency, and system evolution. It provides a model for building software that grows with its users rather than against them.
