Feature Set Snapshot

Data scientists and data engineers can see different values for the same Feature Set at the same timestamp, depending on when they evaluate it. This happens when the underlying source data that feeds the Feature Set (such as database entries) changes. The Feature Set itself can also be modified or removed.

C3 AI lets you persist a snapshot of a Feature Set's data at a given point in time, along with its metadata. This is most useful when you need to inspect the exact data that was used to train a specific ML model, or to reuse that same training data.

The C3 AI Feature Set Snapshot provides the following features:

  • Immutability: A snapshot cannot be altered. You cannot add subjects to it or update the data of any subject in it.

  • Reproducibility: You can reproduce the Feature Set exactly as it was when it was materialized to the Feature Store.

  • Auditability: You can inspect the snapshot metadata to see which Feature Set, which data, and which subjects it used.
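The three properties above can be illustrated with a toy, in-memory model. This is only a sketch of the idea and not how C3 AI implements snapshots: `Snapshot`, `take_snapshot`, the `turbineFeatures` name, and the sample values are all hypothetical and exist only for this example.

```python
import copy
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a persisted Feature Set snapshot."""
    snapshot_id: str
    feature_set_name: str   # auditability: which Feature Set was used
    subjects: tuple         # auditability: which subjects were included
    data: Mapping           # read-only view of the copied values


def take_snapshot(snapshot_id, feature_set_name, source):
    """Deep-copy the current values so later source changes cannot leak in."""
    frozen = {
        subject: MappingProxyType(copy.deepcopy(values))
        for subject, values in source.items()
    }
    return Snapshot(snapshot_id, feature_set_name,
                    tuple(sorted(frozen)), MappingProxyType(frozen))


# Live source data feeding the (hypothetical) Feature Set.
source = {"TURBINE-1": {"2021-01-01": 3.2}}
snap = take_snapshot("windTurbine-snapshot-1", "turbineFeatures", source)

# The source changes after the snapshot was taken.
source["TURBINE-1"]["2021-01-01"] = 9.9
source["TURBINE-2"] = {"2021-01-01": 1.0}

# Reproducibility: the snapshot still holds the values captured earlier.
print(snap.data["TURBINE-1"]["2021-01-01"])
print(snap.subjects)

# Immutability: updating a subject's data is rejected.
try:
    snap.data["TURBINE-1"]["2021-01-01"] = 0.0
except TypeError as err:
    print("update rejected:", err)
```

In this sketch, evaluating the live source after the change returns different values, while reading from the snapshot keeps returning what was captured.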

Use the Feature Set APIs

This section covers how to use the Feature.Set APIs.

After you create or retrieve a featureSet and define a subjectFilter or a list of subjects, call featureSet.createSnapshot.

Python
job = featureSet.createSnapshot(
    subjectFilter="id == 'TURBINE-1'",
    batchSize=100,
    snapshotId='windTurbine-snapshot-1')

Note: To run the snapshot job immediately, call job.waitForCompletion() on the job that createSnapshot returns.

To read from the snapshot, pass its snapshotId to evalFeatureSetBatch.

Python
# filter and fs1 must already be defined; dates are given as ISO 8601 strings.
WindTurbine.evalFeatureSetBatch(
    subjectFilter=filter, featureSet=fs1,
    start='2021-01-01', end='2021-06-01', snapshotId='snapshot1')

To delete a snapshot, use the deleteSnapshot API.

Python
featureSet.deleteSnapshot(snapshotId=existingSnapshotId, confirm=True)

Additional notes

  • snapshotId must be unique for a given subject type.

  • You cannot call createSnapshot with a snapshotId that already exists. A snapshot is immutable once created.

  • You cannot update an existing snapshot. To change it, delete the snapshot and then create a new one with the updated snapshot metadata.

  • snapshotId is different from the id of the snapshot. You specify the snapshotId, and the id is generated at runtime by prepending the subjectType to the given snapshotId. The id field distinguishes snapshots that use the same snapshotId for different subject types.
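The rules in these notes can be sketched with a toy registry. This is an illustration under stated assumptions, not the C3 AI implementation: `SnapshotRegistry` and its methods are hypothetical, `SolarPanel` is a made-up second subject type, and the `_` separator in the generated id is invented, since the notes only say that the subject type is prepended.

```python
class SnapshotRegistry:
    """Toy in-memory registry; a hypothetical stand-in, not the C3 AI API."""

    def __init__(self):
        self._snapshots = {}

    @staticmethod
    def make_id(subject_type, snapshot_id):
        # Assumption: the "_" separator is invented for this example.
        return f"{subject_type}_{snapshot_id}"

    def create(self, subject_type, snapshot_id, metadata):
        full_id = self.make_id(subject_type, snapshot_id)
        if full_id in self._snapshots:
            # Mirrors the rule that an existing snapshotId cannot be reused.
            raise ValueError(f"snapshot {full_id!r} already exists")
        self._snapshots[full_id] = dict(metadata)
        return full_id

    def delete(self, subject_type, snapshot_id):
        del self._snapshots[self.make_id(subject_type, snapshot_id)]

    def get(self, subject_type, snapshot_id):
        return self._snapshots[self.make_id(subject_type, snapshot_id)]


registry = SnapshotRegistry()

# The same snapshotId under two subject types yields two distinct ids.
wind_id = registry.create("WindTurbine", "snapshot1", {"batchSize": 100})
solar_id = registry.create("SolarPanel", "snapshot1", {"batchSize": 100})
print(wind_id, solar_id)

# Reusing a snapshotId for the same subject type is rejected.
try:
    registry.create("WindTurbine", "snapshot1", {"batchSize": 50})
except ValueError as err:
    print("create rejected:", err)

# There is no update operation: delete first, then create again.
registry.delete("WindTurbine", "snapshot1")
registry.create("WindTurbine", "snapshot1", {"batchSize": 50})
print(registry.get("WindTurbine", "snapshot1"))
```

Keying the registry on the generated id rather than on the user-supplied snapshotId is what lets two subject types share a snapshotId without colliding.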
