Metrics and Features

This tutorial shows examples of using metrics and features together, including the following tasks:

  1. Retrieving metric data using the new eval() API.
  2. Creating features from metrics, creating feature sets referencing such features, and publishing the features or feature sets.
  3. Defining features and feature sets directly as seed data and loading them into the application.

Metrics, features, and feature sets used

The table below lists the features and metrics used in this tutorial:

| Field | Metric | How metric is created | Feature | How feature is created |
| --- | --- | --- | --- | --- |
| activePower | ActivePowerAvg | In feature | activePowerAvgFeature | API |
| generatorRotationSpeed | GeneratorRotationSpeedAvg | Metadata | generatorRotationSpeedAvgFeature | API |
| gearOilTemperature | GearOilTemperatureAvg | Metadata | gearOilTemperatureAvgFeature | API |
| activePower | ActivePowerDiff | In feature | activePowerDiffFeature | Seed |
| generatorRotationSpeed | GeneratorRotationSpeedDiff | Metadata | generatorRotationSpeedDiffFeature | Seed |
| gearOilTemperature | GearOilTemperatureDiff | Metadata | gearOilTemperatureDiffFeature | Seed |

Here are the feature sets used in this tutorial.

| Feature Set | How created | Features |
| --- | --- | --- |
| metricsFeatureSet1 | API | activePowerAvgFeature, generatorRotationSpeedAvgFeature, gearOilTemperatureAvgFeature |
| metricsFeatureSet2 | Seed | activePowerDiffFeature, generatorRotationSpeedDiffFeature, gearOilTemperatureDiffFeature |

Alternative approach

You can also create features from Python functions using Lambda Feature Sets. This approach is flexible and lets you define features using Pandas.

See Create, Materialize, and Evaluate Features in the C3 AI Feature Store Using Lambda Feature Sets for more information.

Review the source data

We first take a look at the three entities we will be using: WindTurbine, WindTurbineMeasurementSeries, and WindTurbineMeasurement.

Python
c3.WindTurbine.eval()
| ID | name | location | power | manufacturer |
| --- | --- | --- | --- | --- |
| demo_TURBINE-1 | TURBINE-1 | York | 150 | demo_Siemens |
| demo_TURBINE-2 | TURBINE-2 | York | 50 | demo_Siemens |
Python
c3.WindTurbineMeasurementSeries.eval()
| ID | windTurbine | treatment |
| --- | --- | --- |
| demo_TURBINE-1_series | demo_TURBINE-1 | rate |
| demo_TURBINE-2_series | demo_TURBINE-2 | rate |
Python
c3.WindTurbineMeasurement.eval().head()
| start | isEstimated | parent | ID | gearOilTemperature | generatorRotationSpeed | activePower |
| --- | --- | --- | --- | --- | --- | --- |
| 2022-01-01 00:45:16 | False | demo_TURBINE-1_series | demo_TURBINE-1_series#1 | 36.4 | 1758 | 5018 |
| 2022-01-01 02:27:19 | False | demo_TURBINE-1_series | demo_TURBINE-1_series#3 | 46.8 | 2446 | 5474 |
| 2022-01-01 05:42:09 | False | demo_TURBINE-1_series | demo_TURBINE-1_series#7 | 31.9 | 2184 | 4989 |
| 2022-01-01 08:47:58 | False | demo_TURBINE-1_series | demo_TURBINE-1_series#6 | 33.8 | 1846 | 4568 |
| 2022-01-01 11:21:47 | False | demo_TURBINE-1_series | demo_TURBINE-1_series#9 | 14.5 | 2068 | 5688 |

1. Use the eval call to retrieve metric data

Given a FeatureEvaluatable Type like WindTurbine, we can use the eval() method to retrieve fields, metrics, and features.

Here, we show how to use it to retrieve metric data. We can define a metric dynamically by placing its definition in the overrideMetrics parameter of eval(), and then refer to the metric by name in the projection parameter.

Python
metric_1 = c3.SimpleMetric.make({
    'id': 'WindTurbine_ActivePowerAvg',
    'name': 'ActivePowerAvg',
    'srcType': 'WindTurbine',
    'path': 'turbineMeasurements',
    'expression': "window('AVG', avg(normalized.data.activePower), -23, 24, 1)"
})

eval_result = c3.WindTurbine.eval(filter="name=='TURBINE-1'", projection='ActivePowerAvg',
                                  interval='HOUR', start='2022-01-02', end='2022-01-03',
                                  overrideMetrics=[metric_1])

eval_result.head()
| subject | timestamp | ActivePowerAvg |
| --- | --- | --- |
| demo_TURBINE-1 | 2022-01-02 00:00:00 | 5036.65 |
| demo_TURBINE-1 | 2022-01-02 01:00:00 | 5042.61 |
| demo_TURBINE-1 | 2022-01-02 02:00:00 | 5039.07 |
| demo_TURBINE-1 | 2022-01-02 03:00:00 | 5030.53 |
| demo_TURBINE-1 | 2022-01-02 04:00:00 | 5009.82 |
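
To make the window expression more concrete, here is a minimal pure-Python sketch of a trailing average over hourly values. It assumes that the arguments -23, 24, 1 describe a window that starts 23 intervals before the current one and spans 24 intervals. That reading, the helper name trailing_average, and the sample values are assumptions for illustration, not the platform's implementation.

```python
from statistics import fmean

def trailing_average(values, size=24):
    """Average each point with up to size - 1 preceding points."""
    averages = []
    for i in range(len(values)):
        start = max(0, i - size + 1)      # clamp at the start of the series
        window = values[start:i + 1]      # current point plus preceding ones
        averages.append(fmean(window))
    return averages

# Hypothetical hourly readings (not taken from the tutorial data set).
hourly = [10.0, 20.0, 30.0, 40.0]
print(trailing_average(hourly, size=3))   # [10.0, 15.0, 20.0, 30.0]
```

With size=24 and hourly data, each output value would summarize the current hour and the 23 hours before it, which matches the "rolling average over 24 hours" description used later in this tutorial.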

We next query for the two metrics that we have included in the package metadata. Since the metrics are already defined, we just refer to them by name in the projection parameter.

Python
eval_result = c3.WindTurbine.eval(projection='GeneratorRotationSpeedAvg, GearOilTemperatureAvg', 
                                  filter="name=='TURBINE-1'", interval='HOUR', start='2022-01-02', 
                                  end='2022-01-03')
eval_result.head()
| subject | timestamp | GeneratorRotationSpeedAvg | GearOilTemperatureAvg |
| --- | --- | --- | --- |
| demo_TURBINE-1 | 2022-01-02 00:00:00 | 2351.48 | 26.8538 |
| demo_TURBINE-1 | 2022-01-02 01:00:00 | 2373.14 | 25.6913 |
| demo_TURBINE-1 | 2022-01-02 02:00:00 | 2380.48 | 24.3122 |
| demo_TURBINE-1 | 2022-01-02 03:00:00 | 2374.39 | 22.7872 |
| demo_TURBINE-1 | 2022-01-02 04:00:00 | 2360.98 | 21.633 |

2. Create features from metrics using APIs

Now, let's create some features on top of metrics. Although we were able to directly eval() the metrics, defining a Feature will enable caching of the data in the feature store and make it available to use as an input to MlModels.

We can use Feature.fromMetric() and either reference the metric by name or provide a definition dynamically. We then call create() to save the feature to the database.

Create a feature from a dynamically defined metric

For our first feature, we define the metric ActivePowerAvg dynamically and pass it in the metric field of LegacyMetric.

Python
ActivePowerAvg = c3.SimpleMetric.make({
    'id': 'WindTurbine_ActivePowerAvg',
    'name': 'ActivePowerAvg',
    'srcType': 'WindTurbine',
    'path': 'turbineMeasurements',
    'expression': "window('AVG', avg(normalized.data.activePower), -23, 24, 1)"
})
ActivePowerAvg
Text
{
  "type" : "SimpleMetric",
  "name" : "ActivePowerAvg",
  "expression" : "window('AVG', avg(normalized.data.activePower), -23, 24, 1)",
  "id" : "WindTurbine_ActivePowerAvg",
  "srcType" : "WindTurbine",
  "path" : "turbineMeasurements"
}
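
The Avg metrics in this tutorial share one shape and differ only in the measurement field. As a sketch, the dictionary passed to c3.SimpleMetric.make() could be built by a small helper. The helper name avg_metric_spec is hypothetical, and the id and name patterns are inferred from the ActivePowerAvg example above; the metrics defined in metadata may be declared differently.

```python
def avg_metric_spec(field, src_type="WindTurbine", path="turbineMeasurements"):
    """Build a SimpleMetric-style dict for a 24-hour rolling average of `field`."""
    name = field[0].upper() + field[1:] + "Avg"   # activePower -> ActivePowerAvg
    return {
        "id": f"{src_type}_{name}",
        "name": name,
        "srcType": src_type,
        "path": path,
        "expression": f"window('AVG', avg(normalized.data.{field}), -23, 24, 1)",
    }

spec = avg_metric_spec("activePower")
print(spec["id"])          # WindTurbine_ActivePowerAvg
print(spec["expression"])  # window('AVG', avg(normalized.data.activePower), -23, 24, 1)
```

The resulting dictionary has the same keys and values as the literal used in this tutorial, so in a C3 AI session it should be usable wherever that literal is.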
Python
f1 = c3.Feature.fromMetric(subjectType=c3.WindTurbine,
                           legacy=c3.LegacyMetric(metric=ActivePowerAvg, interval='HOUR'),
                           name="activePowerAvgFeature")
f1 = f1.withDescription("Rolling average over 24 hours, created via API")
f1 = f1.create()
f1 = f1.get()
f1
Text
{
  "type" : "Feature",
  "id" : "WindTurbine#activePowerAvgFeature",
  "name" : "activePowerAvgFeature",
  "meta" : {
    "appCode" : 1767973528752887867,
    "env" : "c3",
    "app" : "wt58",
    "created" : "2023-06-06T21:09:36Z",
    "createdBy" : "BA",
    "updated" : "2023-06-06T21:09:36Z",
    "updatedBy" : "BA",
    "timestamp" : "2023-06-06T21:09:36Z",
    "fetchInclude" : "[]",
    "fetchType" : "Feature"
  },
  "version" : 1,
  "subjectType" : "WindTurbine",
  "description" : "Rolling average over 24 hours, created via API",
  "_data" : {
    "type" : "Data.Lazy",
    "lazies" : {
      "0" : {
        "type" : "Data.Lazy",
        "this" : "WindTurbine",
        "action" : "eval",
        "args" : {
          "spec" : {
            "type" : "EvalSpec",
            "projection" : "ActivePowerAvg",
            "interval" : "HOUR",
            "timeZone" : {
              "name" : "NONE"
            },
            "overrideMetrics" : [ {
              "type" : "SimpleMetric",
              "name" : "ActivePowerAvg",
              "expression" : "window('AVG', avg(normalized.data.activePower), -23, 24, 1)",
              "id" : "WindTurbine_ActivePowerAvg",
              "srcType" : "WindTurbine",
              "path" : "turbineMeasurements"
            } ],
            "actualize" : true
          }
        }
      }
    }
  }
}

We can call the toPySrc() method on the feature definition to see the equivalent Python code that generates the feature.

Python
print(f1.toPySrc())
Text
tmp_df0 = WindTurbine.eval(projection='ActivePowerAvg', interval='HOUR', timeZone={
  "type" : "TimeZone",
  "name" : "NONE"
}, overrideMetrics=[{
  "type" : "SimpleMetric",
  "name" : "ActivePowerAvg",
  "expression" : "window('AVG', avg(normalized.data.activePower), -23, 24, 1)",
  "id" : "WindTurbine_ActivePowerAvg",
  "srcType" : "WindTurbine",
  "path" : "turbineMeasurements"
}], materialize=True)
ret = tmp_df0

Create features from a metric defined using metadata

For the next two features, we will reference by name metrics defined in the application metadata.

Python
f2 = c3.Feature.fromMetric(subjectType=c3.WindTurbine, 
                           legacy=c3.LegacyMetric(metric="GeneratorRotationSpeedAvg", interval='HOUR'), 
                           name="generatorRotationSpeedAvgFeature")

f2 = f2.withDescription("Rolling average over 24 hours, created via API")
f2 = f2.create()
f2 = f2.get()
f2
Text
{
  "type" : "Feature",
  "id" : "WindTurbine#generatorRotationSpeedAvgFeature",
  "name" : "generatorRotationSpeedAvgFeature",
  "meta" : {
    "appCode" : 1767973528752887867,
    "env" : "c3",
    "app" : "wt58",
    "created" : "2023-06-06T21:09:36Z",
    "createdBy" : "BA",
    "updated" : "2023-06-06T21:09:36Z",
    "updatedBy" : "BA",
    "timestamp" : "2023-06-06T21:09:36Z",
    "fetchInclude" : "[]",
    "fetchType" : "Feature"
  },
  "version" : 1,
  "subjectType" : "WindTurbine",
  "description" : "Rolling average over 24 hours, created via API",
  "_data" : {
    "type" : "Data.Lazy",
    "lazies" : {
      "0" : {
        "type" : "Data.Lazy",
        "this" : "WindTurbine",
        "action" : "eval",
        "args" : {
          "spec" : {
            "type" : "EvalSpec",
            "projection" : "GeneratorRotationSpeedAvg",
            "interval" : "HOUR",
            "timeZone" : {
              "name" : "NONE"
            },
            "actualize" : true
          }
        }
      }
    }
  }
}

We can view the source code of this feature as well. Note that it does not include the definition of the metric, because the metric is referenced by name.

Python
print(f2.toPySrc())
Text
tmp_df0 = WindTurbine.eval(projection='GeneratorRotationSpeedAvg', interval='HOUR', timeZone={
  "type" : "TimeZone",
  "name" : "NONE"
}, materialize=True)
ret = tmp_df0
Python
f3 = c3.Feature.fromMetric(subjectType=c3.WindTurbine, 
                           legacy=c3.LegacyMetric(metric="GearOilTemperatureAvg", interval='HOUR'), 
                           name="gearOilTemperatureAvgFeature")

f3 = f3.withDescription("Rolling average over 24 hours, created via API")
f3 = f3.create()
f3 = f3.get()
f3
Text
{
  "type" : "Feature",
  "id" : "WindTurbine#gearOilTemperatureAvgFeature",
  "name" : "gearOilTemperatureAvgFeature",
  "meta" : {
    "appCode" : 1767973528752887867,
    "env" : "c3",
    "app" : "wt58",
    "created" : "2023-06-06T21:09:36Z",
    "createdBy" : "BA",
    "updated" : "2023-06-06T21:09:36Z",
    "updatedBy" : "BA",
    "timestamp" : "2023-06-06T21:09:36Z",
    "fetchInclude" : "[]",
    "fetchType" : "Feature"
  },
  "version" : 1,
  "subjectType" : "WindTurbine",
  "description" : "Rolling average over 24 hours, created via API",
  "_data" : {
    "type" : "Data.Lazy",
    "lazies" : {
      "0" : {
        "type" : "Data.Lazy",
        "this" : "WindTurbine",
        "action" : "eval",
        "args" : {
          "spec" : {
            "type" : "EvalSpec",
            "projection" : "GearOilTemperatureAvg",
            "interval" : "HOUR",
            "timeZone" : {
              "name" : "NONE"
            },
            "actualize" : true
          }
        }
      }
    }
  }
}

Materialize and eval the features

Before we can read our feature data, we must materialize the features first. In production, this will be handled by a cron job.

Note: By default, materialization for metric-based features goes back only five years, because metrics require a time range to evaluate. If your data goes back more than five years, set materializeTimeRange on each feature when you create it.

For example, when creating activePowerAvgFeature, we can set the feature to only materialize data from the year 2012 as follows:

Python
f1 = c3.Feature.fromMetric(subjectType=c3.WindTurbine,
                           legacy=c3.LegacyMetric(metric=ActivePowerAvg, interval='HOUR'),
                           name="activePowerAvgFeature")
f1 = f1.withMaterializeTimeRange(c3.Pair.ofStr('dateTime("2012-01-01")', 'dateTime("2013-01-01")'))
f1 = f1.withDescription("Rolling average over 24 hours, created via API")
f1 = f1.create()
Python
f1.materialize(sync=True)
f2.materialize(sync=True)
f3.materialize(sync=True)

We can now use eval() to retrieve our features.

Python
projection = 'activePowerAvgFeature, generatorRotationSpeedAvgFeature, gearOilTemperatureAvgFeature'
eval_result = c3.WindTurbine.eval(projection=projection, filter="name=='TURBINE-1'", 
                                  start='2022-01-02', end='2022-01-03', interval='HOUR')

eval_result.head()
| subject | timestamp | activePowerAvgFeature | generatorRotationSpeedAvgFeature | gearOilTemperatureAvgFeature |
| --- | --- | --- | --- | --- |
| demo_TURBINE-1 | 2022-01-02 00:00:00 | 5036.65 | 2351.48 | 26.8538 |
| demo_TURBINE-1 | 2022-01-02 01:00:00 | 5042.61 | 2373.14 | 25.6913 |
| demo_TURBINE-1 | 2022-01-02 02:00:00 | 5039.07 | 2380.48 | 24.3122 |
| demo_TURBINE-1 | 2022-01-02 03:00:00 | 5030.53 | 2374.39 | 22.7872 |
| demo_TURBINE-1 | 2022-01-02 04:00:00 | 5009.82 | 2360.98 | 21.633 |
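
These values match the metric results retrieved in step 1, which is expected because each feature wraps the corresponding metric. A quick way to check this is sketched here, with the first two rows printed in this tutorial copied in by hand. In a live session you would compare the two eval() results instead; the function name rows_match and the tolerance are illustrative choices.

```python
import math

# Rows copied from the metric outputs shown in step 1:
# (timestamp, ActivePowerAvg, GeneratorRotationSpeedAvg, GearOilTemperatureAvg)
metric_rows = [
    ("2022-01-02 00:00:00", 5036.65, 2351.48, 26.8538),
    ("2022-01-02 01:00:00", 5042.61, 2373.14, 25.6913),
]
# Same timestamps, copied from the feature eval() output.
feature_rows = [
    ("2022-01-02 00:00:00", 5036.65, 2351.48, 26.8538),
    ("2022-01-02 01:00:00", 5042.61, 2373.14, 25.6913),
]

def rows_match(a, b, rel_tol=1e-6):
    """True if both row lists have equal timestamps and numerically close values."""
    if len(a) != len(b):
        return False
    for row_a, row_b in zip(a, b):
        if row_a[0] != row_b[0]:
            return False
        if not all(math.isclose(x, y, rel_tol=rel_tol)
                   for x, y in zip(row_a[1:], row_b[1:])):
            return False
    return True

print(rows_match(metric_rows, feature_rows))  # True
```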

We can also use feature-specific APIs, like evalFeaturesBatch().

Python
feature_list = ['activePowerAvgFeature', 'generatorRotationSpeedAvgFeature', 'gearOilTemperatureAvgFeature']
eval_result = c3.WindTurbine.evalFeaturesBatch(features=feature_list, filter="name=='TURBINE-1'",
                                               start='2022-01-02', end='2022-01-03')
eval_result.head()
| subject | timestamp | activePowerAvgFeature | generatorRotationSpeedAvgFeature | gearOilTemperatureAvgFeature |
| --- | --- | --- | --- | --- |
| demo_TURBINE-1 | 2022-01-02 00:00:00 | 5036.65 | 2351.48 | 26.8538 |
| demo_TURBINE-1 | 2022-01-02 01:00:00 | 5042.61 | 2373.14 | 25.6913 |
| demo_TURBINE-1 | 2022-01-02 02:00:00 | 5039.07 | 2380.48 | 24.3122 |
| demo_TURBINE-1 | 2022-01-02 03:00:00 | 5030.53 | 2374.39 | 22.7872 |
| demo_TURBINE-1 | 2022-01-02 04:00:00 | 5009.82 | 2360.98 | 21.633 |

Create a feature set

Now, we can build a feature set from these features.

Python
feature_set = c3.Feature.Set(name='metricsFeatureSet1',
                             id='WindTurbine#metricsFeatureSet1',
                             subjectType=c3.WindTurbine,
                             features=['activePowerAvgFeature',
                                       'generatorRotationSpeedAvgFeature',
                                       'gearOilTemperatureAvgFeature'])
feature_set = feature_set.withInterval("HOUR")  # a feature set has an associated interval to be evaluated
feature_set = feature_set.withDescription("Rolling average features, created via API")
feature_set = feature_set.create()
feature_set = feature_set.get()
feature_set
Text
{
  "type" : "Feature.Set",
  "id" : "WindTurbine#metricsFeatureSet1",
  "name" : "metricsFeatureSet1",
  "meta" : {
    "appCode" : 1767973528752887867,
    "env" : "c3",
    "app" : "wt58",
    "created" : "2023-06-06T21:09:40Z",
    "createdBy" : "BA",
    "updated" : "2023-06-06T21:09:40Z",
    "updatedBy" : "BA",
    "timestamp" : "2023-06-06T21:09:40Z",
    "fetchInclude" : "[]",
    "fetchType" : "Feature.Set"
  },
  "version" : 1,
  "subjectType" : "WindTurbine",
  "description" : "Rolling average features, created via API",
  "interval" : "HOUR",
  "features" : [ "activePowerAvgFeature", "generatorRotationSpeedAvgFeature", "gearOilTemperatureAvgFeature" ]
}

We can retrieve data from this feature set using evalFeatureSetBatch.

Python
eval_fs_result = c3.WindTurbine.evalFeatureSetBatch(filter="name=='TURBINE-1'", featureSet=feature_set, 
                                                    start='2022-01-02', end='2022-01-03')

eval_fs_result.head()

Publish features and feature sets as seed data

We can now call the publish() method on each feature and feature set to have them written as seed data to the package.

Note: Publishing seed data is only supported if your application was created through VS Code.

Python
f1.publish()
Text
{
  "type" : "Feature",
  "id" : "WindTurbine#activePowerAvgFeature",
  "name" : "activePowerAvgFeature",
  "meta" : {
    "appCode" : 1767973528752887867,
    "env" : "c3",
    "app" : "wt58",
    "created" : "2023-06-06T21:09:36Z",
    "createdBy" : "BA",
    "updated" : "2023-06-06T21:09:36Z",
    "updatedBy" : "BA",
    "timestamp" : "2023-06-06T21:09:36Z",
    "fetchInclude" : "[]",
    "fetchType" : "Feature"
  },
  "version" : 1,
  "subjectType" : "WindTurbine",
  "description" : "Rolling average over 24 hours, created via API",
  "_data" : {
    "type" : "Data.Lazy",
    "lazies" : {
      "0" : {
        "type" : "Data.Lazy",
        "this" : "WindTurbine",
        "action" : "eval",
        "args" : {
          "spec" : {
            "type" : "EvalSpec",
            "projection" : "ActivePowerAvg",
            "interval" : "HOUR",
            "timeZone" : {
              "name" : "NONE"
            },
            "overrideMetrics" : [ {
              "type" : "SimpleMetric",
              "name" : "ActivePowerAvg",
              "expression" : "window('AVG', avg(normalized.data.activePower), -23, 24, 1)",
              "id" : "WindTurbine_ActivePowerAvg",
              "srcType" : "WindTurbine",
              "path" : "turbineMeasurements"
            } ],
            "actualize" : true
          }
        }
      }
    }
  }
}

The published feature has been written under the root directory of the package at seed/Feature/WindTurbine#activePowerAvgFeature.json.

Python
f2.publish()
Text
{
  "type" : "Feature",
  "id" : "WindTurbine#generatorRotationSpeedAvgFeature",
  "name" : "generatorRotationSpeedAvgFeature",
  "meta" : {
    "appCode" : 1767973528752887867,
    "env" : "c3",
    "app" : "wt58",
    "created" : "2023-06-06T21:09:36Z",
    "createdBy" : "BA",
    "updated" : "2023-06-06T21:09:36Z",
    "updatedBy" : "BA",
    "timestamp" : "2023-06-06T21:09:36Z",
    "fetchInclude" : "[]",
    "fetchType" : "Feature"
  },
  "version" : 1,
  "subjectType" : "WindTurbine",
  "description" : "Rolling average over 24 hours, created via API",
  "_data" : {
    "type" : "Data.Lazy",
    "lazies" : {
      "0" : {
        "type" : "Data.Lazy",
        "this" : "WindTurbine",
        "action" : "eval",
        "args" : {
          "spec" : {
            "type" : "EvalSpec",
            "projection" : "GeneratorRotationSpeedAvg",
            "interval" : "HOUR",
            "timeZone" : {
              "name" : "NONE"
            },
            "actualize" : true
          }
        }
      }
    }
  }
}
Python
f3.publish()
Text
{
  "type" : "Feature",
  "id" : "WindTurbine#gearOilTemperatureAvgFeature",
  "name" : "gearOilTemperatureAvgFeature",
  "meta" : {
    "appCode" : 1767973528752887867,
    "env" : "c3",
    "app" : "wt58",
    "created" : "2023-06-06T21:09:36Z",
    "createdBy" : "BA",
    "updated" : "2023-06-06T21:09:36Z",
    "updatedBy" : "BA",
    "timestamp" : "2023-06-06T21:09:36Z",
    "fetchInclude" : "[]",
    "fetchType" : "Feature"
  },
  "version" : 1,
  "subjectType" : "WindTurbine",
  "description" : "Rolling average over 24 hours, created via API",
  "_data" : {
    "type" : "Data.Lazy",
    "lazies" : {
      "0" : {
        "type" : "Data.Lazy",
        "this" : "WindTurbine",
        "action" : "eval",
        "args" : {
          "spec" : {
            "type" : "EvalSpec",
            "projection" : "GearOilTemperatureAvg",
            "interval" : "HOUR",
            "timeZone" : {
              "name" : "NONE"
            },
            "actualize" : true
          }
        }
      }
    }
  }
}

We can also publish the feature set. This will be written under the package at seed/Feature.Set/WindTurbine#metricsFeatureSet1.json.

Python
feature_set.publish()
Text
{
  "type" : "Feature.Set",
  "id" : "WindTurbine#metricsFeatureSet1",
  "name" : "metricsFeatureSet1",
  "meta" : {
    "appCode" : 1767973528752887867,
    "env" : "c3",
    "app" : "wt58",
    "created" : "2023-06-06T21:09:40Z",
    "createdBy" : "BA",
    "updated" : "2023-06-06T21:09:40Z",
    "updatedBy" : "BA",
    "timestamp" : "2023-06-06T21:09:40Z",
    "fetchInclude" : "[]",
    "fetchType" : "Feature.Set"
  },
  "version" : 1,
  "subjectType" : "WindTurbine",
  "description" : "Rolling average features, created via API",
  "interval" : "HOUR",
  "features" : [ "activePowerAvgFeature", "generatorRotationSpeedAvgFeature", "gearOilTemperatureAvgFeature" ]
}

Review our work so far

Now, let's look at what we've created.

Python
c3.Feature.eval(projection='id,name,description')
| subject | name | description |
| --- | --- | --- |
| Asset#activePower_diff | activePower_diff | None |
| Asset#activePower_median_deviation | activePower_median_deviation | None |
| Asset#activePower_rolling_mean | activePower_rolling_mean | None |
| Asset#activePower_rolling_std | activePower_rolling_std | None |
| Asset#gearOilTemperature_diff | gearOilTemperature_diff | None |
| Asset#gearOilTemperature_median_deviation | gearOilTemperature_median_deviation | None |
| Asset#gearOilTemperature_rolling_mean | gearOilTemperature_rolling_mean | None |
| Asset#gearOilTemperature_rolling_std | gearOilTemperature_rolling_std | None |
| WindTurbine#activePowerAvgFeature | activePowerAvgFeature | Rolling average over 24 hours, created via API |
| WindTurbine#activePowerDiffFeature | activePowerDiffFeature | Metric-backed feature, seeded in application. |
| WindTurbine#activePower_diff | activePower_diff | None |
| WindTurbine#activePower_median_deviation | activePower_median_deviation | None |
| WindTurbine#activePower_rolling_mean | activePower_rolling_mean | None |
| WindTurbine#activePower_rolling_std | activePower_rolling_std | None |
| WindTurbine#gearOilTemperatureAvgFeature | gearOilTemperatureAvgFeature | Rolling average over 24 hours, created via API |
| WindTurbine#gearOilTemperatureDiffFeature | gearOilTemperatureDiffFeature | Metric-backed feature, seeded in application. ... |
| WindTurbine#gearOilTemperature_diff | gearOilTemperature_diff | None |
| WindTurbine#gearOilTemperature_median_deviation | gearOilTemperature_median_deviation | None |
| WindTurbine#gearOilTemperature_rolling_mean | gearOilTemperature_rolling_mean | None |
| WindTurbine#gearOilTemperature_rolling_std | gearOilTemperature_rolling_std | None |
| WindTurbine#generatorRotationSpeedAvgFeature | generatorRotationSpeedAvgFeature | Rolling average over 24 hours, created via API |
| WindTurbine#generatorRotationSpeedDiffFeature | generatorRotationSpeedDiffFeature | Metric-backed feature, seeded in application. ... |
| WindTurbine#willFailNextDayFeature | willFailNextDayFeature | A seeded feature that can be used as a label. ... |
Python
c3.Feature.Set.eval(projection='id,name,description')
| subject | name | description |
| --- | --- | --- |
| Asset#windTurbineFeatures | windTurbineFeatures | None |
| WindTurbine#labelFeatureSet | labelFeatureSet | Feature is True if a failure should be predict... |
| WindTurbine#metricsFeatureSet1 | metricsFeatureSet1 | Rolling average features, created via API |
| WindTurbine#metricsFeatureSet2 | metricsFeatureSet2 | These are metrics-backed features, created via... |
| WindTurbine#windTurbineFeatures | windTurbineFeatures | None |
| WindTurbine#windTurbineFeaturesCustom | windTurbineFeaturesCustom | None |

3. Define features and feature sets as seed

We have defined several features and one feature set as seed data. We will view the feature metadata, materialize the features and feature set, and then retrieve data.

Seed data format

Here is an example of a feature with the metric defined inline. In the seed/Feature/WindTurbine#activePowerDiffFeature.json file, we have:

JSON
{
  "type": "Feature",
  "id": "WindTurbine#activePowerDiffFeature",
  "name": "activePowerDiffFeature",
  "subjectType": "WindTurbine",
  "legacy": {
    "metric": {
      "type": "SimpleMetric",
      "id": "WindTurbine_ActivePowerDiff",
      "name": "ActivePowerDiff",
      "srcType": "WindTurbine",
      "path": "turbineMeasurements",
      "expression": "window(\"MAX\", rollingDiff(avg(normalized.data.activePower)), -23, 24, 1)"
    },
    "interval": "HOUR"
  },
  "description": "max difference, seeded in application"
}
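
As a rough illustration of what this expression computes, the sketch below takes the difference between consecutive hourly values and then the maximum of those differences over a trailing window. It assumes that rollingDiff() yields each value minus the previous one and that the window arguments mean the same as in the averaging metric. Both are assumptions about the expression engine, and the function names and sample values are hypothetical.

```python
def rolling_diff(values):
    """Difference between each value and the one before it (the first has none)."""
    return [b - a for a, b in zip(values, values[1:])]

def trailing_max(values, size=24):
    """Maximum over each point and up to size - 1 preceding points."""
    return [max(values[max(0, i - size + 1):i + 1]) for i in range(len(values))]

# Hypothetical hourly averages of activePower (not tutorial data).
hourly_avg = [5000, 5100, 5050, 5600, 5580]
diffs = rolling_diff(hourly_avg)        # [100, -50, 550, -20]
print(trailing_max(diffs, size=3))      # [100, 100, 550, 550]
```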

Here is an example of a feature that references, by name, a metric defined in metadata. In the seed/Feature/WindTurbine#gearOilTemperatureDiffFeature.json file, we have:

JSON
{
  "type": "Feature",
  "id": "WindTurbine#gearOilTemperatureDiffFeature",
  "name": "gearOilTemperatureDiffFeature",
  "subjectType": "WindTurbine",
  "legacy": {
    "metric": "GearOilTemperatureDiff",
    "interval": "HOUR"
  },
  "description": "max difference, seeded in application"
}

Here is another feature referencing a metric in the seed/Feature/WindTurbine#generatorRotationSpeedDiffFeature.json file:

JSON
{
  "type": "Feature",
  "id": "WindTurbine#generatorRotationSpeedDiffFeature",
  "name": "generatorRotationSpeedDiffFeature",
  "subjectType": "WindTurbine",
  "legacy": {
    "metric": "GeneratorRotationSpeedDiff",
    "interval": "HOUR"
  },
  "description": "max difference, seeded in application"
}

Here is a feature set in the seed/Feature.Set/WindTurbine#metricsFeatureSet2.json file:

JSON
{
  "type": "Feature.Set",
  "id": "WindTurbine#metricsFeatureSet2",
  "name": "metricsFeatureSet2",
  "subjectType": "WindTurbine",
  "description": "Max difference features, created via seed",
  "interval": "HOUR",
  "features": [
    "activePowerDiffFeature",
    "generatorRotationSpeedDiffFeature",
    "gearOilTemperatureDiffFeature"
  ]
}
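
Because these seed files follow a regular pattern, they could also be generated with a short script. The sketch below builds a by-name feature definition and a feature set like the ones shown above as Python dictionaries and derives their file paths. The function names and the idea of generating the files this way are illustrative assumptions, not a C3 AI tool.

```python
import json

def feature_seed(subject_type, feature_name, metric_name, description, interval="HOUR"):
    """Return (relative path, definition) for a feature that references a metric by name."""
    feature_id = f"{subject_type}#{feature_name}"
    definition = {
        "type": "Feature",
        "id": feature_id,
        "name": feature_name,
        "subjectType": subject_type,
        "legacy": {"metric": metric_name, "interval": interval},
        "description": description,
    }
    return f"seed/Feature/{feature_id}.json", definition

def feature_set_seed(subject_type, set_name, feature_names, description, interval="HOUR"):
    """Return (relative path, definition) for a feature set."""
    set_id = f"{subject_type}#{set_name}"
    definition = {
        "type": "Feature.Set",
        "id": set_id,
        "name": set_name,
        "subjectType": subject_type,
        "description": description,
        "interval": interval,
        "features": list(feature_names),
    }
    return f"seed/Feature.Set/{set_id}.json", definition

path, feature = feature_seed("WindTurbine", "gearOilTemperatureDiffFeature",
                             "GearOilTemperatureDiff",
                             "max difference, seeded in application")
print(path)                            # seed/Feature/WindTurbine#gearOilTemperatureDiffFeature.json
print(json.dumps(feature, indent=2))   # same fields as the JSON file shown above
```

Writing each definition with json.dump() to its path under the package root would then reproduce the layout described in this section.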

Note: If you want to seed a Feature with a specific materialization time range, you can add that to the JSON as follows:

JSON
{
  "type": "Feature",
  ...
  "materializeTimeRange": {
    "type": "Pair<string, string>",
    "fst": "dateTime(\"2012-01-01\")",
    "snd": "dateTime(\"2013-01-01\")"
  }
}  
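
Both values in the pair are dateTime(...) expressions stored as strings, so the inner quotes must be escaped in JSON. As a sketch, a helper like the following could build this fragment from Python date objects so that the quoting stays consistent. The name materialize_time_range and the start-before-end check are hypothetical additions for illustration.

```python
import json
from datetime import date

def materialize_time_range(start, end):
    """Build the materializeTimeRange fragment from two dates (start before end)."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    return {
        "type": "Pair<string, string>",
        "fst": f'dateTime("{start.isoformat()}")',
        "snd": f'dateTime("{end.isoformat()}")',
    }

fragment = materialize_time_range(date(2012, 1, 1), date(2013, 1, 1))
# json.dumps escapes the inner double quotes, as in the seed file above.
print(json.dumps({"materializeTimeRange": fragment}, indent=2))
```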

Now, we can verify that the new features and feature set are in our application.

Python
c3.Feature.eval(projection='id,name,description', filter="contains(description, 'seed')")
| subject | name | description |
| --- | --- | --- |
| WindTurbine#activePowerDiffFeature | activePowerDiffFeature | Metric-backed feature, seeded in application. |
| WindTurbine#gearOilTemperatureDiffFeature | gearOilTemperatureDiffFeature | Metric-backed feature, seeded in application. ... |
| WindTurbine#generatorRotationSpeedDiffFeature | generatorRotationSpeedDiffFeature | Metric-backed feature, seeded in application. ... |
| WindTurbine#willFailNextDayFeature | willFailNextDayFeature | A seeded feature that can be used as a label. ... |
Python
c3.Feature.Set.eval(projection='id,name,description', filter="contains(description, 'seed')")
| subject | name | description |
| --- | --- | --- |
| WindTurbine#labelFeatureSet | labelFeatureSet | Feature is True if a failure should be predict... |
| WindTurbine#metricsFeatureSet2 | metricsFeatureSet2 | These are metrics-backed features, created via... |

Materialize the new features

We still need to explicitly materialize our features to see the data. In production, this will be done through a cron job.

Python
for f_id in ['WindTurbine#activePowerDiffFeature',
             'WindTurbine#gearOilTemperatureDiffFeature',
             'WindTurbine#generatorRotationSpeedDiffFeature']:
    print(f"Materializing {f_id}...", end='')
    f = c3.Feature.make({'id':f_id, 'subjectType':c3.WindTurbine}).get()
    f.materialize(sync=True)
    print("successful.")
Text
Materializing WindTurbine#activePowerDiffFeature...successful.
Materializing WindTurbine#gearOilTemperatureDiffFeature...successful.
Materializing WindTurbine#generatorRotationSpeedDiffFeature...successful.

We also materialize the feature set.

Python
fs_metrics2 = c3.Feature.Set(id="WindTurbine#metricsFeatureSet2").get()
fs_metrics2.materialize(sync=True)

Retrieve the data from the new features

Python
projection = 'activePowerDiffFeature, generatorRotationSpeedDiffFeature, gearOilTemperatureDiffFeature'
eval_result = c3.WindTurbine.eval(projection=projection, filter="id=='demo_TURBINE-1'", 
                                  start='2022-01-02', end='2022-01-03', interval='HOUR')

eval_result.head()
| subject | timestamp | activePowerDiffFeature | generatorRotationSpeedDiffFeature | gearOilTemperatureDiffFeature |
| --- | --- | --- | --- | --- |
| demo_TURBINE-1 | 2022-01-02 00:00:00 | 560 | 701 | 10.3333 |
| demo_TURBINE-1 | 2022-01-02 01:00:00 | 560 | 701 | 10.3333 |
| demo_TURBINE-1 | 2022-01-02 02:00:00 | 560 | 701 | 10.3333 |
| demo_TURBINE-1 | 2022-01-02 03:00:00 | 560 | 701 | 10.3333 |
| demo_TURBINE-1 | 2022-01-02 04:00:00 | 560 | 701 | 10.3333 |

We can also reference our feature set to retrieve data using evalFeatureSetBatch.

Python
fs_metrics2 = c3.Feature.Set.make({'id':'WindTurbine#metricsFeatureSet2'})
fs_eval_result = c3.WindTurbine.evalFeatureSetBatch(filter="id=='demo_TURBINE-1'", featureSet=fs_metrics2, 
                                                    start='2022-01-02', end='2022-01-03')

fs_eval_result.head()
| subject | timestamp | activePowerDiffFeature | gearOilTemperatureDiffFeature | generatorRotationSpeedDiffFeature |
| --- | --- | --- | --- | --- |
| demo_TURBINE-1 | 2022-01-02 00:00:00 | 560 | 10.3333 | 701 |
| demo_TURBINE-1 | 2022-01-02 01:00:00 | 560 | 10.3333 | 701 |
| demo_TURBINE-1 | 2022-01-02 02:00:00 | 560 | 10.3333 | 701 |
| demo_TURBINE-1 | 2022-01-02 03:00:00 | 560 | 10.3333 | 701 |
| demo_TURBINE-1 | 2022-01-02 04:00:00 | 560 | 10.3333 | 701 |

Clean up

Finally, we clean up the artifacts we created.

Note: Features and feature sets are seed data. If you want to delete the feature and feature set definitions, you need to call removeSeedData() instead of remove() or removeAll().

Python
c3.Feature(id='WindTurbine#activePowerAvgFeature').removeSeedData()
c3.Feature(id='WindTurbine#generatorRotationSpeedAvgFeature').removeSeedData()
c3.Feature(id='WindTurbine#gearOilTemperatureAvgFeature').removeSeedData()
c3.Feature.Set(id='WindTurbine#metricsFeatureSet1').removeSeedData()
Text
True
