Model Time Series Data
In the Create an Entity tutorial, you created a simple persistable Type to represent entities in an object model.
Time series data is a common data type in application development.
A time series is a set of data values that are associated with discrete timestamps. Examples of time series data include:
- Temperature measurements from a city over the last month.
- GDP of a country over the last decade.
- Number of documented earthquakes over the last 50 years.
In this tutorial, you create new Types to model the time series temperature measurements recorded from cities in your object model.
The following Entity Relationship Diagram (ERD) shows how these new Types connect with the existing City Type in your object model.
Follow the steps in this tutorial to learn how to implement these Types in your object model.
Header and Data point Types
Raw time series data exists in various formats due to differences in recording and structure. Some examples of variations that occur in time series data include:
- Different units of measurement
- Different time zones used by different sensors
- Different data frequencies
- Overlapping and duplicate data
- Out of order data
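To make two of these variations concrete, the following Python sketch converts readings from two hypothetical sensors into a common representation (UTC timestamps, degrees Celsius). It is illustrative only and does not use any C3 Agentic AI Platform API; the names `RawReading` and `to_common` are assumptions introduced for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RawReading:
    """A hypothetical raw sensor reading (not a C3 Type)."""
    timestamp: datetime  # timezone-aware, in the sensor's local zone
    value: float
    unit: str            # "C" or "F"


def to_common(reading: RawReading) -> tuple[datetime, float]:
    """Return (UTC timestamp, temperature in degrees Celsius)."""
    utc_time = reading.timestamp.astimezone(timezone.utc)
    if reading.unit == "F":
        celsius = (reading.value - 32.0) * 5.0 / 9.0
    elif reading.unit == "C":
        celsius = reading.value
    else:
        raise ValueError(f"unsupported unit: {reading.unit}")
    return utc_time, celsius


# Two sensors reporting the same instant in different zones and units.
utc_minus_5 = timezone(timedelta(hours=-5))
a = RawReading(datetime(2024, 1, 1, 7, 0, tzinfo=utc_minus_5), 50.0, "F")
b = RawReading(datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc), 10.0, "C")

common_a = to_common(a)
common_b = to_common(b)
```

After conversion, both readings describe the same instant and the same temperature, so they can be compared or deduplicated directly.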
The C3 Agentic AI Platform uses Header and Data point Types to structure time series data for processing and analysis.
Header Types
The header Type contains metadata that describes the attributes of the time series data stored in the data point Type and includes a reference to the corresponding data point Type.
Headers are persisted in relational data stores, such as PostgreSQL.
To implement a header Type, use the appropriate time series header Type for your use case. To learn more about the different header Types available in the C3 Agentic AI Platform, see Time Series Mixin Types.
The following code block displays an example of a time series header Type using the TimedDataHeader mixin Type:
/**
 * MyTimeSeriesHeaderTypeName.c3typ
 * A series of measurements taken from a single {@link MyType}.
 */
entity type MyTimeSeriesHeaderTypeName mixes TimedDataHeader<MyTimedDataPointType> {
    /**
     * The associated {@link MyType} for these measurements
     */
    fieldName: !MyType
}
In the case of the global data tracking application, each city records point-based temperature data at regular intervals. Since these values are recorded at specific points in time, mix in TimedDataHeader.
Create a time series header Type called CityMeasurementHeader. It should contain the following components:
- Mix in TimedDataHeader
- Have an associated data point Type CityMeasurementDataPoint
- Have a reference field to a specific City
The following code block displays the completed CityMeasurementHeader Type:
/**
 * CityMeasurementHeader.c3typ
 * A series of temperature measurements from a single city.
 */
entity type CityMeasurementHeader mixes TimedDataHeader<CityMeasurementDataPoint> {
    /**
     * The associated city for these measurements
     */
    city: !City
}
Data point Types
Data point Types are key-value pairs that represent the individual data points in a time series.
Each row in the data point Type represents a single measurement, stored in a non-relational key-value store.
To implement a data point Type, mix in the relevant C3 Type based on the expected shape of the incoming data. Key factors to evaluate when choosing a modeling option include:
- Is the incoming time series data interval or point based?
- Is data ingestion high or low frequency?
- Does the data need to be normalized?
To learn more about how to choose the correct mixin Type for your data point Type, see Time Series Mixin Types.
Data point Types specify how time series data is stored by specifying attributes such as:
- Database type and structure
- Data treatment
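As a rough mental model of the "database type and structure" attribute, the following Python sketch groups data points by a parent key and keeps each group ordered by start time. This is an assumption about the intent of partitioning and ordered persistence, not a description of how the platform's key-value store is implemented; `store`, `put`, and `scan` are hypothetical names.

```python
from bisect import insort
from collections import defaultdict
from datetime import datetime

# partition key (parent header id) -> list of (start, value), kept sorted by start
store: dict[str, list[tuple[datetime, float]]] = defaultdict(list)


def put(parent_id: str, start: datetime, value: float) -> None:
    """Insert a data point into its parent's partition, keeping order by start."""
    insort(store[parent_id], (start, value))


def scan(parent_id: str, begin: datetime, end: datetime) -> list[tuple[datetime, float]]:
    """Return the points of one partition with begin <= start < end."""
    return [p for p in store[parent_id] if begin <= p[0] < end]


# Points arrive out of order and for different parents.
put("series-a", datetime(2024, 1, 1, 2), 11.0)
put("series-b", datetime(2024, 1, 1, 0), 20.0)
put("series-a", datetime(2024, 1, 1, 0), 10.0)
put("series-a", datetime(2024, 1, 1, 1), 10.5)

first_two = scan("series-a", datetime(2024, 1, 1, 0), datetime(2024, 1, 1, 2))
```

Keeping each series in its own partition, ordered by time, means a time-range read only touches one group and never needs to re-sort.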
The following code block displays an example of a data point Type using the TimedDataPoint mixin Type:
/**
 * MyTimedDataPointType.c3typ
 * A single measurement taken from a single {@link MyType}.
 */
@db(datastore='kv',
    partitionKeyField='parent',
    persistenceOrder='start')
entity type MyTimedDataPointType mixes TimedDataPoint<MyTimeSeriesHeaderTypeName> {
    @ts(treatment='avg')
    fieldName: fieldType
}
In the case of the global data tracking application, each city records point-based temperature data at regular intervals. Since these values are recorded at specific points in time, mix in TimedDataPoint.
Create a time series data point Type called CityMeasurementDataPoint. It should contain the following components:
- Mix in TimedDataPoint
- Use a key-value database to store data
- Persist records in order of the start field
- Have a reference field to a specific CityMeasurementHeader
The following code block displays the CityMeasurementDataPoint Type without any fields. You add fields and treatments in the next section.
/**
 * CityMeasurementDataPoint.c3typ
 * A single temperature measurement from a single city
 */
@db(datastore='kv',
    partitionKeyField='parent',
    persistenceOrder='start')
entity type CityMeasurementDataPoint mixes TimedDataPoint<CityMeasurementHeader> {
    // Add fields to this Type in the next section
}
Normalization
Normalization transforms raw time series data into a structured format, stored in a C3 AI data store for faster access and analysis.
The Normalization Engine uses a preprocessing pipeline to identify and correct issues in raw data. Examples of issues the Normalization Engine targets in incoming data include:
- Overlapping or duplicate data
- Gaps in data
- Inconsistent formatting or units
- Inconsistent time intervals
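A minimal Python sketch of the kind of cleanup described above: it drops exact duplicates, restores chronological order, and reports gaps larger than an expected interval. It illustrates the idea only and is not the Normalization Engine's actual pipeline; `clean` and `find_gaps` are hypothetical names.

```python
from datetime import datetime, timedelta

Point = tuple[datetime, float]


def clean(points: list[Point]) -> list[Point]:
    """Remove exact duplicate points and sort the rest by timestamp."""
    return sorted(set(points))


def find_gaps(points: list[Point], expected: timedelta) -> list[tuple[datetime, datetime]]:
    """Return (previous, next) timestamp pairs that are further apart than expected."""
    return [
        (prev[0], nxt[0])
        for prev, nxt in zip(points, points[1:])
        if nxt[0] - prev[0] > expected
    ]


raw = [
    (datetime(2024, 1, 1, 3), 12.0),
    (datetime(2024, 1, 1, 0), 10.0),
    (datetime(2024, 1, 1, 1), 11.0),
    (datetime(2024, 1, 1, 0), 10.0),  # duplicate of the second entry
]

cleaned = clean(raw)
gaps = find_gaps(cleaned, expected=timedelta(hours=1))
```

Here `cleaned` holds three points in chronological order, and `gaps` contains the single interval between 01:00 and 03:00 where an hourly reading is missing.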
To implement normalization, specify the treatment the Normalization Engine must apply to each field. A treatment defines how a field's values are aggregated during normalization.
Use the @ts annotation in a field declaration to apply a treatment to a data field.
The following code block displays an example of a time series Type specifying the avg treatment for a field:
/**
 * An example Type with data treatments for some fields
 */
@db(datastore='kv',
    partitionKeyField='parent',
    persistenceOrder='start')
entity type MyType mixes TimedDataPoint<MyMeasurementHeader> {
    @ts(treatment='avg')
    myField: double
}
To learn more about the time series treatments available, see Time Series Data Treatments.
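To illustrate what an averaging treatment means in principle, the following Python sketch buckets point readings into clock hours and averages each bucket. How the platform's avg treatment handles interval boundaries, weighting, and missing data is not covered in this tutorial, so treat the bucketing rule below as an assumption; `hourly_average` is a hypothetical helper.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean


def hourly_average(points: list[tuple[datetime, float]]) -> dict[datetime, float]:
    """Average all values whose timestamp falls in the same clock hour."""
    buckets: dict[datetime, list[float]] = defaultdict(list)
    for ts, value in points:
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(value)
    return {hour: fmean(values) for hour, values in sorted(buckets.items())}


readings = [
    (datetime(2024, 1, 1, 0, 10), 10.0),
    (datetime(2024, 1, 1, 0, 40), 12.0),
    (datetime(2024, 1, 1, 1, 5), 14.0),
]

averaged = hourly_average(readings)
```

The two readings in the first hour collapse to one value (11.0), while the single reading in the second hour is kept as is (14.0).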
Complete the CityMeasurementDataPoint Type by adding the temperature field, which stores the temperature recorded by a city and uses the avg treatment:
/**
 * CityMeasurementDataPoint.c3typ
 * A single temperature measurement from a single city
 */
@db(datastore='kv',
    partitionKeyField='parent',
    persistenceOrder='start')
entity type CityMeasurementDataPoint mixes TimedDataPoint<CityMeasurementHeader> schema name "CITYMEASUREMENTDATAPOINT" {
    // The measured temperature in degrees Celsius
    @ts(treatment='avg')
    temperature: double
}
Update the City Type
Add a collection field to the City Type to reference the time series measurements created in this tutorial.
The following code block shows the updated City Type with the measurementSeries field:
/**
 * City.c3typ
 * A single city within a country.
 */
entity type City {
    /**
     * The name of this city.
     */
    name: !string
    /**
     * The country in which this city is located.
     */
    country: !Country
    /**
     * The collection of CountryCapitalHistory for this city.
     */
    @db(order='descending(start)')
    capitalHistory: [CountryCapitalHistory](to)
    /**
     * The country for which this city currently serves as capital (if any).
     */
    currentCountry: Country stored calc capitalHistory[0].(end == null).from
    /**
     * The collection of CityMeasurementHeader for this city.
     */
    @db(order='descending(start)')
    measurementSeries: [CityMeasurementHeader](city)
}
This field allows you to access all temperature measurement series associated with this city.
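Conceptually, a collection field that names a reference field on another Type, such as (city), behaves like a reverse lookup: it gathers every header whose city reference points back at this city. The Python sketch below models that idea with plain dataclasses. It is not C3 SDK code, and the class and function names are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class City:
    name: str


@dataclass(frozen=True)
class MeasurementHeader:
    id: str
    city: City  # reference back to the owning city


def measurement_series(city: City, headers: list[MeasurementHeader]) -> list[MeasurementHeader]:
    """Return every header whose `city` reference points at the given city."""
    return [h for h in headers if h.city == city]


paris = City("Paris")
lyon = City("Lyon")
headers = [
    MeasurementHeader("paris-temp-1", paris),
    MeasurementHeader("lyon-temp-1", lyon),
    MeasurementHeader("paris-temp-2", paris),
]

paris_series = measurement_series(paris, headers)
```

The collection is derived from the references held by the headers, so the city itself does not need to store a separate list of its series.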