Metrics and Features
This tutorial shows examples of using metrics and features together, including the following tasks:
- Retrieving metric data using the new eval() API.
- Creating features from metrics, creating feature sets referencing such features, and publishing the features or feature sets.
- Defining features and feature sets directly as seed data and loading them into the application.
Metrics, features, and feature sets used
The table below lists the features and metrics used in this tutorial:
| Field | Metric | How created | Feature | How created |
|---|---|---|---|---|
| activePower | ActivePowerAvg | In feature | activePowerAvgFeature | API |
| generatorRotationSpeed | GeneratorRotationSpeedAvg | Metadata | generatorRotationSpeedAvgFeature | API |
| gearOilTemperature | GearOilTemperatureAvg | Metadata | gearOilTemperatureAvgFeature | API |
| activePower | ActivePowerDiff | In feature | activePowerDiffFeature | Seed |
| generatorRotationSpeed | GeneratorRotationSpeedDiff | Metadata | generatorRotationSpeedDiffFeature | Seed |
| gearOilTemperature | GearOilTemperatureDiff | Metadata | gearOilTemperatureDiffFeature | Seed |
Here are the feature sets used in this tutorial.
| Feature Set | How created | Features |
|---|---|---|
| metricsFeatureSet1 | API | activePowerAvgFeature, generatorRotationSpeedAvgFeature, gearOilTemperatureAvgFeature |
| metricsFeatureSet2 | Seed | activePowerDiffFeature, generatorRotationSpeedDiffFeature, gearOilTemperatureDiffFeature |
Alternative approach
You can also create features from Python functions using Lambda Feature Sets. This approach is very flexible and allows you to define features using Pandas.
See Create, Materialize, and Evaluate Features in the C3 AI Feature Store Using Lambda Feature Sets for more information.
Review the source data
We first take a look at the three entities we will be using: WindTurbine, WindTurbineMeasurementSeries, and WindTurbineMeasurement.
c3.WindTurbine.eval()
| ID | name | location | power | manufacturer |
|---|---|---|---|---|
| demo_TURBINE-1 | TURBINE-1 | York | 150 | demo_Siemens |
| demo_TURBINE-2 | TURBINE-2 | York | 50 | demo_Siemens |
c3.WindTurbineMeasurementSeries.eval()
| ID | windTurbine | treatment |
|---|---|---|
| demo_TURBINE-1_series | demo_TURBINE-1 | rate |
| demo_TURBINE-2_series | demo_TURBINE-2 | rate |
c3.WindTurbineMeasurement.eval().head()
| start | isEstimated | parent | ID | gearOilTemperature | generatorRotationSpeed | activePower |
|---|---|---|---|---|---|---|
| 2022-01-01 00:45:16 | False | demo_TURBINE-1_series | demo_TURBINE-1_series#1 | 36.4 | 1758 | 5018 |
| 2022-01-01 02:27:19 | False | demo_TURBINE-1_series | demo_TURBINE-1_series#3 | 46.8 | 2446 | 5474 |
| 2022-01-01 05:42:09 | False | demo_TURBINE-1_series | demo_TURBINE-1_series#7 | 31.9 | 2184 | 4989 |
| 2022-01-01 08:47:58 | False | demo_TURBINE-1_series | demo_TURBINE-1_series#6 | 33.8 | 1846 | 4568 |
| 2022-01-01 11:21:47 | False | demo_TURBINE-1_series | demo_TURBINE-1_series#9 | 14.5 | 2068 | 5688 |
1. Use the eval call to retrieve metric data
Given a FeatureEvaluatable Type like WindTurbine, we can use the eval() method to retrieve fields, metrics, and features.
Here, we show how it can be used to retrieve metric data. We can dynamically create a metric definition to evaluate by placing the definition in the overrideMetrics parameter of eval(). We can then refer to the metric by name in the projection parameter.
metric_1 = c3.SimpleMetric.make({
    'id': 'WindTurbine_ActivePowerAvg',
    'name': 'ActivePowerAvg',
    'srcType': 'WindTurbine',
    'path': 'turbineMeasurements',
    'expression': "window('AVG', avg(normalized.data.activePower), -23, 24, 1)"
})
eval_result = c3.WindTurbine.eval(filter="name=='TURBINE-1'", projection='ActivePowerAvg',
                                  interval='HOUR', start='2022-01-02', end='2022-01-03',
                                  overrideMetrics=[metric_1])
eval_result.head()
| subject | timestamp | ActivePowerAvg |
|---|---|---|
| demo_TURBINE-1 | 2022-01-02 00:00:00 | 5036.65 |
| demo_TURBINE-1 | 2022-01-02 01:00:00 | 5042.61 |
| demo_TURBINE-1 | 2022-01-02 02:00:00 | 5039.07 |
| demo_TURBINE-1 | 2022-01-02 03:00:00 | 5030.53 |
| demo_TURBINE-1 | 2022-01-02 04:00:00 | 5009.82 |
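To build intuition for the expression used in ActivePowerAvg, the sketch below computes a trailing average in plain Python. It assumes that window('AVG', x, -23, 24, 1) averages, for each hour, the 24 hourly values ending at that hour. That reading is an assumption, and the platform's handling of window edges and missing data may differ. The name trailing_window_avg is hypothetical and not part of the C3 API.

```python
from statistics import fmean


def trailing_window_avg(values, size=24):
    """Average each value with up to size - 1 preceding values.

    Assumption: windows at the start of the series are shortened rather
    than padded; the platform may treat edges differently.
    """
    averages = []
    for i in range(len(values)):
        start = max(0, i - size + 1)  # clamp the window at the series start
        averages.append(fmean(values[start:i + 1]))
    return averages


# Illustrative values only, not taken from the tutorial data.
hourly_power = [5000, 5100, 4900, 5200]
print(trailing_window_avg(hourly_power, size=2))
```

With size=24 and hourly input, each output value summarizes the preceding day, which is why a rolling average changes smoothly from one hour to the next.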
We next query for the two metrics that we have included in the package metadata. Since the metrics are already defined, we just refer to them by name in the projection parameter.
eval_result = c3.WindTurbine.eval(projection='GeneratorRotationSpeedAvg, GearOilTemperatureAvg',
                                  filter="name=='TURBINE-1'", interval='HOUR', start='2022-01-02',
                                  end='2022-01-03')
eval_result.head()
| subject | timestamp | GeneratorRotationSpeedAvg | GearOilTemperatureAvg |
|---|---|---|---|
| demo_TURBINE-1 | 2022-01-02 00:00:00 | 2351.48 | 26.8538 |
| demo_TURBINE-1 | 2022-01-02 01:00:00 | 2373.14 | 25.6913 |
| demo_TURBINE-1 | 2022-01-02 02:00:00 | 2380.48 | 24.3122 |
| demo_TURBINE-1 | 2022-01-02 03:00:00 | 2374.39 | 22.7872 |
| demo_TURBINE-1 | 2022-01-02 04:00:00 | 2360.98 | 21.633 |
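The start, end, and interval arguments together determine which timestamps appear in the result. As a rough illustration, the following plain-Python sketch lists the hourly timestamps between the two dates used above. It assumes the end date is exclusive, which the tutorial does not confirm (only the first five rows are shown). The helper hourly_timestamps is hypothetical and not part of the C3 API.

```python
from datetime import datetime, timedelta


def hourly_timestamps(start, end):
    """Return hourly datetimes from start (inclusive) to end (exclusive)."""
    current = datetime.fromisoformat(start)
    stop = datetime.fromisoformat(end)
    stamps = []
    while current < stop:
        stamps.append(current)
        current += timedelta(hours=1)
    return stamps


stamps = hourly_timestamps('2022-01-02', '2022-01-03')
print(len(stamps), stamps[0], stamps[-1])
```

Under that assumption, a one-day range at HOUR interval yields 24 rows per subject, and head() displays the first five.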
2. Create features from metrics using APIs
Now, let's create some features on top of metrics. Although we were able to directly eval() the metrics, defining a Feature will enable caching of the data in the feature store and make it available to use as an input to MlModels.
We can use Feature.fromMetric() and reference either the metric name or provide a definition dynamically. We then call create() to save to the database.
Create a feature from a dynamically defined metric
For our first feature, we dynamically define a metric ActivePowerAvg in the metric field of LegacyMetric.
ActivePowerAvg = c3.SimpleMetric.make({
    'id': 'WindTurbine_ActivePowerAvg',
    'name': 'ActivePowerAvg',
    'srcType': 'WindTurbine',
    'path': 'turbineMeasurements',
    'expression': "window('AVG', avg(normalized.data.activePower), -23, 24, 1)"
})
ActivePowerAvg
{
"type" : "SimpleMetric",
"name" : "ActivePowerAvg",
"expression" : "window('AVG', avg(normalized.data.activePower), -23, 24, 1)",
"id" : "WindTurbine_ActivePowerAvg",
"srcType" : "WindTurbine",
"path" : "turbineMeasurements"
}
f1 = c3.Feature.fromMetric(subjectType=c3.WindTurbine,
                           legacy=c3.LegacyMetric(metric=ActivePowerAvg, interval='HOUR'),
                           name="activePowerAvgFeature")
f1 = f1.withDescription("Rolling average over 24 hours, created via API")
f1 = f1.create()
f1 = f1.get()
f1
{
"type" : "Feature",
"id" : "WindTurbine#activePowerAvgFeature",
"name" : "activePowerAvgFeature",
"meta" : {
"appCode" : 1767973528752887867,
"env" : "c3",
"app" : "wt58",
"created" : "2023-06-06T21:09:36Z",
"createdBy" : "BA",
"updated" : "2023-06-06T21:09:36Z",
"updatedBy" : "BA",
"timestamp" : "2023-06-06T21:09:36Z",
"fetchInclude" : "[]",
"fetchType" : "Feature"
},
"version" : 1,
"subjectType" : "WindTurbine",
"description" : "Rolling average over 24 hours, created via API",
"_data" : {
"type" : "Data.Lazy",
"lazies" : {
"0" : {
"type" : "Data.Lazy",
"this" : "WindTurbine",
"action" : "eval",
"args" : {
"spec" : {
"type" : "EvalSpec",
"projection" : "ActivePowerAvg",
"interval" : "HOUR",
"timeZone" : {
"name" : "NONE"
},
"overrideMetrics" : [ {
"type" : "SimpleMetric",
"name" : "ActivePowerAvg",
"expression" : "window('AVG', avg(normalized.data.activePower), -23, 24, 1)",
"id" : "WindTurbine_ActivePowerAvg",
"srcType" : "WindTurbine",
"path" : "turbineMeasurements"
} ],
"actualize" : true
}
}
}
}
}
}
We can use the toPySrc() method on the feature definition to see the equivalent Python code that generates the feature.
print(f1.toPySrc())
tmp_df0 = WindTurbine.eval(projection='ActivePowerAvg', interval='HOUR', timeZone={
"type" : "TimeZone",
"name" : "NONE"
}, overrideMetrics=[{
"type" : "SimpleMetric",
"name" : "ActivePowerAvg",
"expression" : "window('AVG', avg(normalized.data.activePower), -23, 24, 1)",
"id" : "WindTurbine_ActivePowerAvg",
"srcType" : "WindTurbine",
"path" : "turbineMeasurements"
}], materialize=True)
ret = tmp_df0
Create features from a metric defined using metadata
For the next two features, we reference metrics defined in the application metadata by name.
f2 = c3.Feature.fromMetric(subjectType=c3.WindTurbine,
                           legacy=c3.LegacyMetric(metric="GeneratorRotationSpeedAvg", interval='HOUR'),
                           name="generatorRotationSpeedAvgFeature")
f2 = f2.withDescription("Rolling average over 24 hours, created via API")
f2 = f2.create()
f2 = f2.get()
f2
{
"type" : "Feature",
"id" : "WindTurbine#generatorRotationSpeedAvgFeature",
"name" : "generatorRotationSpeedAvgFeature",
"meta" : {
"appCode" : 1767973528752887867,
"env" : "c3",
"app" : "wt58",
"created" : "2023-06-06T21:09:36Z",
"createdBy" : "BA",
"updated" : "2023-06-06T21:09:36Z",
"updatedBy" : "BA",
"timestamp" : "2023-06-06T21:09:36Z",
"fetchInclude" : "[]",
"fetchType" : "Feature"
},
"version" : 1,
"subjectType" : "WindTurbine",
"description" : "Rolling average over 24 hours, created via API",
"_data" : {
"type" : "Data.Lazy",
"lazies" : {
"0" : {
"type" : "Data.Lazy",
"this" : "WindTurbine",
"action" : "eval",
"args" : {
"spec" : {
"type" : "EvalSpec",
"projection" : "GeneratorRotationSpeedAvg",
"interval" : "HOUR",
"timeZone" : {
"name" : "NONE"
},
"actualize" : true
}
}
}
}
}
}
We can view the source code of this feature as well. Note that it does not include the definition of the metric.
print(f2.toPySrc())
tmp_df0 = WindTurbine.eval(projection='GeneratorRotationSpeedAvg', interval='HOUR', timeZone={
"type" : "TimeZone",
"name" : "NONE"
}, materialize=True)
ret = tmp_df0
f3 = c3.Feature.fromMetric(subjectType=c3.WindTurbine,
                           legacy=c3.LegacyMetric(metric="GearOilTemperatureAvg", interval='HOUR'),
                           name="gearOilTemperatureAvgFeature")
f3 = f3.withDescription("Rolling average over 24 hours, created via API")
f3 = f3.create()
f3 = f3.get()
f3
{
"type" : "Feature",
"id" : "WindTurbine#gearOilTemperatureAvgFeature",
"name" : "gearOilTemperatureAvgFeature",
"meta" : {
"appCode" : 1767973528752887867,
"env" : "c3",
"app" : "wt58",
"created" : "2023-06-06T21:09:36Z",
"createdBy" : "BA",
"updated" : "2023-06-06T21:09:36Z",
"updatedBy" : "BA",
"timestamp" : "2023-06-06T21:09:36Z",
"fetchInclude" : "[]",
"fetchType" : "Feature"
},
"version" : 1,
"subjectType" : "WindTurbine",
"description" : "Rolling average over 24 hours, created via API",
"_data" : {
"type" : "Data.Lazy",
"lazies" : {
"0" : {
"type" : "Data.Lazy",
"this" : "WindTurbine",
"action" : "eval",
"args" : {
"spec" : {
"type" : "EvalSpec",
"projection" : "GearOilTemperatureAvg",
"interval" : "HOUR",
"timeZone" : {
"name" : "NONE"
},
"actualize" : true
}
}
}
}
}
}
Materialize and eval the features
Before we can read our feature data, we must materialize the features. In production, this will be handled by a cron job.
Note: Materialization for metric-based features goes back only five years by default. This is because metrics require a time range to evaluate. If you have data that goes back more than five years, you can address this by setting materializeTimeRange on each feature when you create it.
For example, when creating activePowerAvgFeature, we can restrict materialization to data from the year 2012 as follows:
f1 = c3.Feature.fromMetric(subjectType=c3.WindTurbine,
                           legacy=c3.LegacyMetric(metric=ActivePowerAvg, interval='HOUR'),
                           name="activePowerAvgFeature")
f1 = f1.withMaterializeTimeRange(c3.Pair.ofStr('dateTime("2012-01-01")', 'dateTime("2013-01-01")'))
f1 = f1.withDescription("Rolling average over 24 hours, created via API")
f1 = f1.create()

f1.materialize(sync=True)
f2.materialize(sync=True)
f3.materialize(sync=True)
We can now use eval() to retrieve our features.
projection = 'activePowerAvgFeature, generatorRotationSpeedAvgFeature, gearOilTemperatureAvgFeature'
eval_result = c3.WindTurbine.eval(projection=projection, filter="name=='TURBINE-1'",
                                  start='2022-01-02', end='2022-01-03', interval='HOUR')
eval_result.head()
| subject | timestamp | activePowerAvgFeature | generatorRotationSpeedAvgFeature | gearOilTemperatureAvgFeature |
|---|---|---|---|---|
| demo_TURBINE-1 | 2022-01-02 00:00:00 | 5036.65 | 2351.48 | 26.8538 |
| demo_TURBINE-1 | 2022-01-02 01:00:00 | 5042.61 | 2373.14 | 25.6913 |
| demo_TURBINE-1 | 2022-01-02 02:00:00 | 5039.07 | 2380.48 | 24.3122 |
| demo_TURBINE-1 | 2022-01-02 03:00:00 | 5030.53 | 2374.39 | 22.7872 |
| demo_TURBINE-1 | 2022-01-02 04:00:00 | 5009.82 | 2360.98 | 21.633 |
We can also use feature-specific APIs, like evalFeaturesBatch().
feature_list = ['activePowerAvgFeature', 'generatorRotationSpeedAvgFeature', 'gearOilTemperatureAvgFeature']
eval_result = c3.WindTurbine.evalFeaturesBatch(features=feature_list, filter="name=='TURBINE-1'",
                                               start='2022-01-02', end='2022-01-03')
eval_result.head()
| subject | timestamp | activePowerAvgFeature | generatorRotationSpeedAvgFeature | gearOilTemperatureAvgFeature |
|---|---|---|---|---|
| demo_TURBINE-1 | 2022-01-02 00:00:00 | 5036.65 | 2351.48 | 26.8538 |
| demo_TURBINE-1 | 2022-01-02 01:00:00 | 5042.61 | 2373.14 | 25.6913 |
| demo_TURBINE-1 | 2022-01-02 02:00:00 | 5039.07 | 2380.48 | 24.3122 |
| demo_TURBINE-1 | 2022-01-02 03:00:00 | 5030.53 | 2374.39 | 22.7872 |
| demo_TURBINE-1 | 2022-01-02 04:00:00 | 5009.82 | 2360.98 | 21.633 |
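In the two calls above, eval() receives the feature names as one comma-separated projection string, while evalFeaturesBatch() receives them as a list. A minimal sketch of converting between the two forms in plain Python follows. The helper names to_projection and from_projection are hypothetical and not part of the C3 API.

```python
def to_projection(feature_names):
    """Join feature names into a comma-separated projection string."""
    return ", ".join(feature_names)


def from_projection(projection):
    """Split a projection string into a list of names, ignoring extra spaces."""
    return [name.strip() for name in projection.split(",") if name.strip()]


feature_list = ['activePowerAvgFeature',
                'generatorRotationSpeedAvgFeature',
                'gearOilTemperatureAvgFeature']
projection = to_projection(feature_list)
print(projection)
print(from_projection(projection) == feature_list)
```

Keeping a single list as the source of truth and deriving the projection string from it avoids the two forms drifting apart.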
Create a feature set
Now, we can build a feature set from these features.
feature_set = c3.Feature.Set(name='metricsFeatureSet1',
                             id='WindTurbine#metricsFeatureSet1',
                             subjectType=c3.WindTurbine,
                             features=['activePowerAvgFeature',
                                       'generatorRotationSpeedAvgFeature',
                                       'gearOilTemperatureAvgFeature'])
feature_set = feature_set.withInterval("HOUR")  # a feature set has an associated interval to be evaluated
feature_set = feature_set.withDescription("Rolling average features, created via API")
feature_set = feature_set.create()
feature_set = feature_set.get()
feature_set
{
"type" : "Feature.Set",
"id" : "WindTurbine#metricsFeatureSet1",
"name" : "metricsFeatureSet1",
"meta" : {
"appCode" : 1767973528752887867,
"env" : "c3",
"app" : "wt58",
"created" : "2023-06-06T21:09:40Z",
"createdBy" : "BA",
"updated" : "2023-06-06T21:09:40Z",
"updatedBy" : "BA",
"timestamp" : "2023-06-06T21:09:40Z",
"fetchInclude" : "[]",
"fetchType" : "Feature.Set"
},
"version" : 1,
"subjectType" : "WindTurbine",
"description" : "Rolling average features, created via API",
"interval" : "HOUR",
"features" : [ "activePowerAvgFeature", "generatorRotationSpeedAvgFeature", "gearOilTemperatureAvgFeature" ]
}
We can retrieve data from this feature set using evalFeatureSetBatch.
eval_fs_result = c3.WindTurbine.evalFeatureSetBatch(filter="name=='TURBINE-1'", featureSet=feature_set,
                                                    start='2022-01-02', end='2022-01-03')
eval_fs_result.head()
Publish features and feature sets as seed data
We can now call the publish() method on each feature and feature set to have them written as seed data to the package.
Note: Publishing seed data is only supported if your application was created through VS Code.
f1.publish()
{
"type" : "Feature",
"id" : "WindTurbine#activePowerAvgFeature",
"name" : "activePowerAvgFeature",
"meta" : {
"appCode" : 1767973528752887867,
"env" : "c3",
"app" : "wt58",
"created" : "2023-06-06T21:09:36Z",
"createdBy" : "BA",
"updated" : "2023-06-06T21:09:36Z",
"updatedBy" : "BA",
"timestamp" : "2023-06-06T21:09:36Z",
"fetchInclude" : "[]",
"fetchType" : "Feature"
},
"version" : 1,
"subjectType" : "WindTurbine",
"description" : "Rolling average over 24 hours, created via API",
"_data" : {
"type" : "Data.Lazy",
"lazies" : {
"0" : {
"type" : "Data.Lazy",
"this" : "WindTurbine",
"action" : "eval",
"args" : {
"spec" : {
"type" : "EvalSpec",
"projection" : "ActivePowerAvg",
"interval" : "HOUR",
"timeZone" : {
"name" : "NONE"
},
"overrideMetrics" : [ {
"type" : "SimpleMetric",
"name" : "ActivePowerAvg",
"expression" : "window('AVG', avg(normalized.data.activePower), -23, 24, 1)",
"id" : "WindTurbine_ActivePowerAvg",
"srcType" : "WindTurbine",
"path" : "turbineMeasurements"
} ],
"actualize" : true
}
}
}
}
}
}
The published feature has been written under the root directory of the package at seed/Feature/WindTurbine#activePowerAvgFeature.json.
f2.publish()
{
"type" : "Feature",
"id" : "WindTurbine#generatorRotationSpeedAvgFeature",
"name" : "generatorRotationSpeedAvgFeature",
"meta" : {
"appCode" : 1767973528752887867,
"env" : "c3",
"app" : "wt58",
"created" : "2023-06-06T21:09:36Z",
"createdBy" : "BA",
"updated" : "2023-06-06T21:09:36Z",
"updatedBy" : "BA",
"timestamp" : "2023-06-06T21:09:36Z",
"fetchInclude" : "[]",
"fetchType" : "Feature"
},
"version" : 1,
"subjectType" : "WindTurbine",
"description" : "Rolling average over 24 hours, created via API",
"_data" : {
"type" : "Data.Lazy",
"lazies" : {
"0" : {
"type" : "Data.Lazy",
"this" : "WindTurbine",
"action" : "eval",
"args" : {
"spec" : {
"type" : "EvalSpec",
"projection" : "GeneratorRotationSpeedAvg",
"interval" : "HOUR",
"timeZone" : {
"name" : "NONE"
},
"actualize" : true
}
}
}
}
}
}
f3.publish()
{
"type" : "Feature",
"id" : "WindTurbine#gearOilTemperatureAvgFeature",
"name" : "gearOilTemperatureAvgFeature",
"meta" : {
"appCode" : 1767973528752887867,
"env" : "c3",
"app" : "wt58",
"created" : "2023-06-06T21:09:36Z",
"createdBy" : "BA",
"updated" : "2023-06-06T21:09:36Z",
"updatedBy" : "BA",
"timestamp" : "2023-06-06T21:09:36Z",
"fetchInclude" : "[]",
"fetchType" : "Feature"
},
"version" : 1,
"subjectType" : "WindTurbine",
"description" : "Rolling average over 24 hours, created via API",
"_data" : {
"type" : "Data.Lazy",
"lazies" : {
"0" : {
"type" : "Data.Lazy",
"this" : "WindTurbine",
"action" : "eval",
"args" : {
"spec" : {
"type" : "EvalSpec",
"projection" : "GearOilTemperatureAvg",
"interval" : "HOUR",
"timeZone" : {
"name" : "NONE"
},
"actualize" : true
}
}
}
}
}
}
We can also publish the feature set. This will be written under the package at seed/Feature.Set/WindTurbine#metricsFeatureSet1.json.
feature_set.publish()
{
"type" : "Feature.Set",
"id" : "WindTurbine#metricsFeatureSet1",
"name" : "metricsFeatureSet1",
"meta" : {
"appCode" : 1767973528752887867,
"env" : "c3",
"app" : "wt58",
"created" : "2023-06-06T21:09:40Z",
"createdBy" : "BA",
"updated" : "2023-06-06T21:09:40Z",
"updatedBy" : "BA",
"timestamp" : "2023-06-06T21:09:40Z",
"fetchInclude" : "[]",
"fetchType" : "Feature.Set"
},
"version" : 1,
"subjectType" : "WindTurbine",
"description" : "Rolling average features, created via API",
"interval" : "HOUR",
"features" : [ "activePowerAvgFeature", "generatorRotationSpeedAvgFeature", "gearOilTemperatureAvgFeature" ]
}
Review our work so far
Now, let's look at what we've created.
c3.Feature.eval(projection='id,name,description')
| subject | name | description |
|---|---|---|
| Asset#activePower_diff | activePower_diff | None |
| Asset#activePower_median_deviation | activePower_median_deviation | None |
| Asset#activePower_rolling_mean | activePower_rolling_mean | None |
| Asset#activePower_rolling_std | activePower_rolling_std | None |
| Asset#gearOilTemperature_diff | gearOilTemperature_diff | None |
| Asset#gearOilTemperature_median_deviation | gearOilTemperature_median_deviation | None |
| Asset#gearOilTemperature_rolling_mean | gearOilTemperature_rolling_mean | None |
| Asset#gearOilTemperature_rolling_std | gearOilTemperature_rolling_std | None |
| WindTurbine#activePowerAvgFeature | activePowerAvgFeature | Rolling average over 24 hours, created via API |
| WindTurbine#activePowerDiffFeature | activePowerDiffFeature | Metric-backed feature, seeded in application. |
| WindTurbine#activePower_diff | activePower_diff | None |
| WindTurbine#activePower_median_deviation | activePower_median_deviation | None |
| WindTurbine#activePower_rolling_mean | activePower_rolling_mean | None |
| WindTurbine#activePower_rolling_std | activePower_rolling_std | None |
| WindTurbine#gearOilTemperatureAvgFeature | gearOilTemperatureAvgFeature | Rolling average over 24 hours, created via API |
| WindTurbine#gearOilTemperatureDiffFeature | gearOilTemperatureDiffFeature | Metric-backed feature, seeded in application. ... |
| WindTurbine#gearOilTemperature_diff | gearOilTemperature_diff | None |
| WindTurbine#gearOilTemperature_median_deviation | gearOilTemperature_median_deviation | None |
| WindTurbine#gearOilTemperature_rolling_mean | gearOilTemperature_rolling_mean | None |
| WindTurbine#gearOilTemperature_rolling_std | gearOilTemperature_rolling_std | None |
| WindTurbine#generatorRotationSpeedAvgFeature | generatorRotationSpeedAvgFeature | Rolling average over 24 hours, created via API |
| WindTurbine#generatorRotationSpeedDiffFeature | generatorRotationSpeedDiffFeature | Metric-backed feature, seeded in application. ... |
| WindTurbine#willFailNextDayFeature | willFailNextDayFeature | A seeded feature that can be used as a label. ... |
c3.Feature.Set.eval(projection='id,name,description')
| subject | name | description |
|---|---|---|
| Asset#windTurbineFeatures | windTurbineFeatures | None |
| WindTurbine#labelFeatureSet | labelFeatureSet | Feature is True if a failure should be predict... |
| WindTurbine#metricsFeatureSet1 | metricsFeatureSet1 | Rolling average features, created via API |
| WindTurbine#metricsFeatureSet2 | metricsFeatureSet2 | These are metrics-backed features, created via... |
| WindTurbine#windTurbineFeatures | windTurbineFeatures | None |
| WindTurbine#windTurbineFeaturesCustom | windTurbineFeaturesCustom | None |
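The ids in the listings above appear to follow a SubjectType#name pattern, for example WindTurbine#activePowerAvgFeature. Assuming that pattern holds (it is inferred from the output, not stated as a rule), the following plain-Python sketch groups feature names by subject type. The helper group_by_subject_type is hypothetical and not part of the C3 API.

```python
from collections import defaultdict


def group_by_subject_type(feature_ids):
    """Group feature names by the subject type that precedes '#' in each id."""
    grouped = defaultdict(list)
    for feature_id in feature_ids:
        subject_type, sep, name = feature_id.partition("#")
        if not sep:
            # The id does not match the assumed SubjectType#name pattern.
            raise ValueError(f"unexpected id format: {feature_id!r}")
        grouped[subject_type].append(name)
    return dict(grouped)


# A few ids copied from the listing above.
ids = [
    "Asset#activePower_diff",
    "WindTurbine#activePowerAvgFeature",
    "WindTurbine#activePowerDiffFeature",
]
print(group_by_subject_type(ids))
```

A grouping like this makes it easier to see which features belong to WindTurbine and which to other subject types that happen to share the application.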
3. Define features and feature sets as seed
We have defined several features and one feature set as seed data. We will view the feature metadata, materialize the features and feature set, and then retrieve data.
Seed data format
Here is an example of a feature with the metric defined inline. In the seed/Feature/WindTurbine#activePowerDiffFeature.json file, we have:
{
  "type": "Feature",
  "id": "WindTurbine#activePowerDiffFeature",
  "name": "activePowerDiffFeature",
  "subjectType": "WindTurbine",
  "legacy": {
    "metric": {
      "type": "SimpleMetric",
      "id": "WindTurbine_ActivePowerDiff",
      "name": "ActivePowerDiff",
      "srcType": "WindTurbine",
      "path": "turbineMeasurements",
      "expression": "window(\"MAX\", rollingDiff(avg(normalized.data.activePower)), -23, 24, 1)"
    },
    "interval": "HOUR"
  },
  "description": "max difference, seeded in application"
}
Here is an example of a feature that references, by name, a metric defined in metadata. In the seed/Feature/WindTurbine#gearOilTemperatureDiffFeature.json file, we have:
{
  "type": "Feature",
  "id": "WindTurbine#gearOilTemperatureDiffFeature",
  "name": "gearOilTemperatureDiffFeature",
  "subjectType": "WindTurbine",
  "legacy": {
    "metric": "GearOilTemperatureDiff",
    "interval": "HOUR"
  },
  "description": "max difference, seeded in application"
}
Here is another feature referencing a metric in the seed/Feature/WindTurbine#generatorRotationSpeedDiffFeature.json file:
{
  "type": "Feature",
  "id": "WindTurbine#generatorRotationSpeedDiffFeature",
  "name": "generatorRotationSpeedDiffFeature",
  "subjectType": "WindTurbine",
  "legacy": {
    "metric": "GeneratorRotationSpeedDiff",
    "interval": "HOUR"
  },
  "description": "max difference, seeded in application"
}
Here is a feature set in the seed/Feature.Set/WindTurbine#metricsFeatureSet2.json file:
{
  "type": "Feature.Set",
  "id": "WindTurbine#metricsFeatureSet2",
  "name": "metricsFeatureSet2",
  "subjectType": "WindTurbine",
  "description": "Max difference features, created via seed",
  "interval": "HOUR",
  "features": [
    "activePowerDiffFeature",
    "generatorRotationSpeedDiffFeature",
    "gearOilTemperatureDiffFeature"
  ]
}
Note: If you want to seed a Feature with a specific materialization time range, you can add that to the JSON as follows:
{
  "type": "Feature",
  ...
  "materializeTimeRange": {
    "type": "Pair<string, string>",
    "fst": "dateTime(\"2012-01-01\")",
    "snd": "dateTime(\"2013-01-01\")"
  }
}
Now, we can verify that the new features and feature set are in our application.
c3.Feature.eval(projection='id,name,description', filter="contains(description, 'seed')")
| subject | name | description |
|---|---|---|
| WindTurbine#activePowerDiffFeature | activePowerDiffFeature | Metric-backed feature, seeded in application. |
| WindTurbine#gearOilTemperatureDiffFeature | gearOilTemperatureDiffFeature | Metric-backed feature, seeded in application. ... |
| WindTurbine#generatorRotationSpeedDiffFeature | generatorRotationSpeedDiffFeature | Metric-backed feature, seeded in application. ... |
| WindTurbine#willFailNextDayFeature | willFailNextDayFeature | A seeded feature that can be used as a label. ... |
c3.Feature.Set.eval(projection='id,name,description', filter="contains(description, 'seed')")
| subject | name | description |
|---|---|---|
| WindTurbine#labelFeatureSet | labelFeatureSet | Feature is True if a failure should be predict... |
| WindTurbine#metricsFeatureSet2 | metricsFeatureSet2 | These are metrics-backed features, created via... |
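Seed files like the ones shown earlier in this section can also be generated with a short script. The sketch below builds the JSON for a feature that references a metric by name, optionally adds a materializeTimeRange, and writes it to seed/Feature/ under a package root. The field layout mirrors the examples in this section. The helper names build_feature_seed and write_feature_seed are hypothetical, the package root here is a temporary directory, and whether the platform loads a generated file exactly as written has not been verified.

```python
import json
import tempfile
from pathlib import Path


def build_feature_seed(subject_type, feature_name, metric_name,
                       interval="HOUR", description=None, time_range=None):
    """Build a seed dict for a feature that references a metric by name."""
    seed = {
        "type": "Feature",
        "id": f"{subject_type}#{feature_name}",
        "name": feature_name,
        "subjectType": subject_type,
        "legacy": {"metric": metric_name, "interval": interval},
    }
    if description is not None:
        seed["description"] = description
    if time_range is not None:
        start, end = time_range
        # Same structure as the materializeTimeRange example in this section.
        seed["materializeTimeRange"] = {
            "type": "Pair<string, string>",
            "fst": f'dateTime("{start}")',
            "snd": f'dateTime("{end}")',
        }
    return seed


def write_feature_seed(package_root, seed):
    """Write the seed dict to <package_root>/seed/Feature/<id>.json."""
    path = Path(package_root) / "seed" / "Feature" / f"{seed['id']}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(seed, indent=2), encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as package_root:
    seed = build_feature_seed("WindTurbine", "gearOilTemperatureDiffFeature",
                              "GearOilTemperatureDiff",
                              description="max difference, seeded in application",
                              time_range=("2012-01-01", "2013-01-01"))
    print(write_feature_seed(package_root, seed).name)
```

Generating the files from one function keeps the id, name, and file name consistent with each other, which is easy to get wrong when editing JSON by hand.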
Materialize the new features
We still need to explicitly materialize our features to see the data. In production, this will be done through a cron job.
for f_id in ['WindTurbine#activePowerDiffFeature',
             'WindTurbine#gearOilTemperatureDiffFeature',
             'WindTurbine#generatorRotationSpeedDiffFeature']:
    print(f"Materializing {f_id}...", end='')
    f = c3.Feature.make({'id': f_id, 'subjectType': c3.WindTurbine}).get()
    f.materialize(sync=True)
    print("successful.")
Materializing WindTurbine#activePowerDiffFeature...successful.
Materializing WindTurbine#gearOilTemperatureDiffFeature...successful.
Materializing WindTurbine#generatorRotationSpeedDiffFeature...successful.
We also materialize the feature set.
fs_metrics2 = c3.Feature.Set(id="WindTurbine#metricsFeatureSet2").get()
fs_metrics2.materialize(sync=True)
Retrieve the data from the new features
projection = 'activePowerDiffFeature, generatorRotationSpeedDiffFeature, gearOilTemperatureDiffFeature'
eval_result = c3.WindTurbine.eval(projection=projection, filter="id=='demo_TURBINE-1'",
                                  start='2022-01-02', end='2022-01-03', interval='HOUR')
eval_result.head()
| subject | timestamp | activePowerDiffFeature | generatorRotationSpeedDiffFeature | gearOilTemperatureDiffFeature |
|---|---|---|---|---|
| demo_TURBINE-1 | 2022-01-02 00:00:00 | 560 | 701 | 10.3333 |
| demo_TURBINE-1 | 2022-01-02 01:00:00 | 560 | 701 | 10.3333 |
| demo_TURBINE-1 | 2022-01-02 02:00:00 | 560 | 701 | 10.3333 |
| demo_TURBINE-1 | 2022-01-02 03:00:00 | 560 | 701 | 10.3333 |
| demo_TURBINE-1 | 2022-01-02 04:00:00 | 560 | 701 | 10.3333 |
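The seeded ActivePowerDiff metric uses the expression window("MAX", rollingDiff(avg(...)), -23, 24, 1). As a rough plain-Python illustration, the sketch below assumes that rollingDiff yields the difference between consecutive values and that the window takes the maximum over the trailing values. Both readings are assumptions that this tutorial does not confirm, and the helper names are hypothetical rather than part of the C3 API.

```python
def consecutive_diffs(values):
    """Return the difference between each value and the one before it."""
    return [b - a for a, b in zip(values, values[1:])]


def trailing_window_max(values, size=24):
    """Return, for each position, the maximum over up to size trailing values."""
    return [max(values[max(0, i - size + 1):i + 1]) for i in range(len(values))]


# Illustrative values only, not taken from the tutorial data.
hourly_power = [5000, 5560, 5400, 5450, 5300]
diffs = consecutive_diffs(hourly_power)
print(diffs)
print(trailing_window_max(diffs, size=3))
```

Under these assumptions, one large jump stays the window maximum for as long as it remains inside the window, which may be why the same value repeats across consecutive hours in the table above.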
We can also reference our feature set to retrieve data using evalFeatureSetBatch.
fs_metrics2 = c3.Feature.Set.make({'id': 'WindTurbine#metricsFeatureSet2'})
fs_eval_result = c3.WindTurbine.evalFeatureSetBatch(filter="id=='demo_TURBINE-1'", featureSet=fs_metrics2,
                                                    start='2022-01-02', end='2022-01-03')
fs_eval_result.head()
| subject | timestamp | activePowerDiffFeature | gearOilTemperatureDiffFeature | generatorRotationSpeedDiffFeature |
|---|---|---|---|---|
| demo_TURBINE-1 | 2022-01-02 00:00:00 | 560 | 10.3333 | 701 |
| demo_TURBINE-1 | 2022-01-02 01:00:00 | 560 | 10.3333 | 701 |
| demo_TURBINE-1 | 2022-01-02 02:00:00 | 560 | 10.3333 | 701 |
| demo_TURBINE-1 | 2022-01-02 03:00:00 | 560 | 10.3333 | 701 |
| demo_TURBINE-1 | 2022-01-02 04:00:00 | 560 | 10.3333 | 701 |
Clean up
We clean up the artifacts we created.
Note: Features and feature sets are seed data. If you want to delete the feature and feature set definitions, you need to call removeSeedData() instead of remove() or removeAll().
c3.Feature(id='WindTurbine#activePowerAvgFeature').removeSeedData()
c3.Feature(id='WindTurbine#generatorRotationSpeedAvgFeature').removeSeedData()
c3.Feature(id='WindTurbine#gearOilTemperatureAvgFeature').removeSeedData()
c3.Feature.Set(id='WindTurbine#metricsFeatureSet1').removeSeedData()
True