Fetch and Filter Data From the Database - Tutorial
After you turn a Type into an Entity Type (also known as Persistable), you can persist data in a database for later retrieval.
This article covers how to fetch data from the database using the Python SDK.
Prerequisities
To run this notebook, you need to create your application with a rootPkg of windTurbine or include windTurbine in your list of package dependencies. This tutorial creates new Types in your root package.
Data modeling
Let's assume you've defined a Type to keep track of information about wind turbines like their power and manufacturer:
entity type WindTurbine schema name 'WNDT' {
/**
* The ID of the wind turbine.
*/
turbineId: string
/**
* The location of the wind turbine.
*/
location: string
/**
* Nominal power of the wind turbine (e.g., 150 mega watts).
*/
power: int
/**
* The manufacturer of the wind turbine.
*/
manufacturer: string
}Also, let's also assume you've added some wind turbines to the database:
In [2]:
# Create wind turbines
wts = [
c3.WindTurbine(id="oBxDEs", turbineId="wt1", location="US", power=151, manufacturer="Philips"),
c3.WindTurbine(turbineId="wt2", location="US", power=162, manufacturer="Philips"),
c3.WindTurbine(turbineId="wt3", location="CA", power=148, manufacturer="Philips"),
c3.WindTurbine(turbineId="wt4", location="EN", power=152, manufacturer="GE"),
c3.WindTurbine(turbineId="wt5", location="CA", power=154, manufacturer="GE")
]
# Insert multiple wind turbines in batch in the database. Wind turbines without pre-assigned
# ID will get an automatically generated one
wts = c3.WindTurbine.createBatch(wts).objs
wts[2].get()Out [2]:
{
"type" : "WindTurbine",
"id" : "24a66773-7f85-40c6-901b-1ea492497652",
"meta" : {
"appCode" : 1756674330237804048,
"env" : "c3",
"app" : "wt",
"created" : "2023-02-09T01:02:14Z",
"createdBy" : "BA",
"updated" : "2023-02-09T01:02:14Z",
"updatedBy" : "BA",
"timestamp" : "2023-02-09T01:02:14Z",
"fetchInclude" : "[]",
"fetchType" : "WindTurbine"
},
"version" : 1,
"turbineId" : "wt3",
"location" : "CA",
"power" : 148,
"manufacturer" : "Philips"
}Find a single instance by ID
If you know the unique ID of an instance, you can fetch it from the database using Persistable#get:
In [3]:
wt = c3.WindTurbine(id="oBxDEs").get()
wtOut [3]:
{
"type" : "WindTurbine",
"id" : "oBxDEs",
"meta" : {
"appCode" : 1756674330237804048,
"env" : "c3",
"app" : "wt",
"created" : "2023-02-09T01:02:14Z",
"createdBy" : "BA",
"updated" : "2023-02-09T01:02:14Z",
"updatedBy" : "BA",
"timestamp" : "2023-02-09T01:02:14Z",
"fetchInclude" : "[]",
"fetchType" : "WindTurbine"
},
"version" : 1,
"turbineId" : "wt1",
"location" : "US",
"power" : 151,
"manufacturer" : "Philips"
}Find all instances that match criteria
Another common use case is to find all instances that match certain criteria. Sometimes you want to just display that list to the user, other times you want to do some data manipulation. Regardless of your use case, the first step is to fetch data from the database.
You can achieve this with the Persistable#fetch method:
In [4]:
# CA is a string literal, so it needs to have quotes
efficient_US_wind_turbine = c3.WindTurbine.fetch(filter = "location == 'US' && power > 150")
# The return value is an object with the results, and some metadata about the result set
print(f'There are {efficient_US_wind_turbine.count} efficient US wind turbines')
# There's 2 efficient US wind turbines
# And the actual results are acessible in the .objs field
list([wt.turbineId for wt in efficient_US_wind_turbine.objs])There are 2 efficient US wind turbines
Out [4]:
['wt2', 'wt1']With the filter parameter, you can filter data using comparison operators, as well as expression engine functions. Learn more using help(c3.FetchFilterSpec).
Besides filter criteria, you can also specify which fields to include, how to sort the results, and other settings for your data fetch:
In [5]:
c3.WindTurbine.fetch(
# Filter criteria
filter = "intersects(location, ['US', 'CA']) && power > 150",
# Which fields to return
include = "[turbineId, location, power]",
# How to sort results
order = "manufacturer, ascending(power)",
# Maximum results to return
limit = 10
)Out [5]:
| ID | turbineId | location | power | meta.fetchType | meta.fetchInclude | version |
|---|---|---|---|---|---|---|
| 87de81e8-c8c5-4c76-8292-1916863a2c94 | wt5 | CA | 154 | WindTurbine | [turbineId,location,power,id,version] | 1 |
| oBxDEs | wt1 | US | 151 | WindTurbine | [turbineId,location,power,id,version] | 1 |
| 7bb3fc12-4e63-40ed-9301-f9db0d11d779 | wt2 | US | 162 | WindTurbine | [turbineId,location,power,id,version] | 1 |
As you can tell, the filter criteria can get quite complex. Notice how when filtering for locations, you need to quote the string US and CA. To prevent mistakes, and keep things organized, use the c3.Filter Type:
In [6]:
c3.WindTurbine.fetch(
# Filter criteria
filter = c3.Filter().intersects('location', ['US', 'CA']).and_().gt("power", 150),
# Which fields to return
include = "[turbineId, location, power]",
# How to sort results
order = "manufacturer, ascending(power)",
# Maximum results to return
limit = 10
)Out [6]:
| ID | turbineId | location | power | meta.fetchType | meta.fetchInclude | version |
|---|---|---|---|---|---|---|
| 87de81e8-c8c5-4c76-8292-1916863a2c94 | wt5 | CA | 154 | WindTurbine | [turbineId,location,power,id,version] | 1 |
| oBxDEs | wt1 | US | 151 | WindTurbine | [turbineId,location,power,id,version] | 1 |
| 7bb3fc12-4e63-40ed-9301-f9db0d11d779 | wt2 | US | 162 | WindTurbine | [turbineId,location,power,id,version] | 1 |
The keyword and is reserved in Python, so you must use the and_() method.
Check if any instances match criteria
Sometimes you just want to check if there's any instance that matches given criteria without actually fetching the instances from the database. Use Persistable#exists to get a Boolean, or Persistable#fetchCount to get how many instances match the filtering criteria:
In [7]:
# Check if there's at least one wind turbine that complies with the criteria
c3.WindTurbine.exists(filter = c3.Filter().eq("location", "US").and_().gt("power", 160))
# TrueOut [7]:
TrueIn [8]:
# Or count how many comply with the criteria
c3.WindTurbine.fetchCount(filter = c3.Filter().eq("location", "US").and_().gt("power", 160))
# 1Out [8]:
1Fetch and transform data
The .fetch() method allows you to query data and filter by relevant criteria, but it doesn't offer a way to transform the results. To transform data, you have to iterate through the results and process them with your own logic.
If you need to query and transform the data, use the Persistable#eval method:
In [9]:
c3.WindTurbine.eval(
# Filter criteria, optional
filter = "location != 'EN' && power > 150",
# For each instance that matches the criteria, return
# ID of the instance
# manufacturer name in upper case
# power, rounded to the nearest integer
projection = "id, upperCase(manufacturer), round(power)"
)Out [9]:
| subject | upperCase_manufacturer | round_power |
|---|---|---|
| 7bb3fc12-4e63-40ed-9301-f9db0d11d779 | PHILIPS | 162 |
| 87de81e8-c8c5-4c76-8292-1916863a2c94 | GE | 154 |
| oBxDEs | PHILIPS | 151 |
Aggregate data
In [10]:
manufacturerSummary = c3.WindTurbine.eval(
# Filter criteria
filter = "intersects(location, ['US', 'CA'])",
# Aggregate data by manufacturer
group = "manufacturer",
# Return manufacturer and aggregate data about it
projection = "manufacturer, min(power), max(power), avg(power)"
)
# The return value is an object with the results, and some metadata about the result set
print(f'There are {manufacturerSummary.shape[0]} manufacturers with efficient wind turbines')
# There are 2 manufacturers with efficient wind turbines
manufacturerSummaryThere are 2 manufacturers with efficient wind turbinesOut [10]:
| manufacturer | min_power | max_power | avg_power |
|---|---|---|---|
| GE | 154 | 154 | 154 |
| Philips | 148 | 162 | 154 |
Clean up
In [11]:
c3.WindTurbine.removeBatch(wts)Out [11]:
{
"type" : "ObjList<mixing WindTurbine>",
"stats" : {
"createdObjCount" : 0,
"updatedObjCount" : 0,
"removedObjCount" : 5,
"failedObjCount" : 0,
"objCount" : 5,
"dbWait" : 0.0,
"time" : 0.01,
"dbTime" : 0.0,
"objsPerSecond" : 500.0,
"dbObjsPerSecond" : 0.0
}
}Fetch Data From Reference and Collection Fields (Python)
Once you relate two Entity Types using a reference field, you are ready to persist relational data for later retrieval. This also means your data spans several Types, so when fetching data you need to specify which Types to fetch from.
By default, the Persistable#fetch method only fetches data for fields that are directly stored in the underlying database table associated with the Entity Type. This means that only fields with a primitive Type and stored calculated fields are fetched. This ensures that queries are performant.
Collections and non-stored calculated fields are not fetched by default since that requires fetching data that spans multiple Types or computing values in real time. To fetch those fields, you need to mention them explicitly in your query.
Data Modeling
Let's assume you've defined two Types to keep track of facilities and wind turbines powering those facilities. The definition of wind turbines looks like this:
/* WindTurbine.c3typ */
entity type WindTurbine schema name 'WNDT' {
/**
* The ID of the wind turbine.
*/
turbineId: string
/**
* The location of the wind turbine.
*/
location: string
/**
* Nominal power of the wind turbine (e.g., 150 mega watts).
*/
power: int
/**
* The manufacturer of the wind turbine.
*/
manufacturer: string
/**
* for which facility this wind turbine is providing power
*/
facility: Facility
}And the definition of facility looks like:
/* Facility.c3typ */
entity type Facility schema name "FCLT" {
address: !string
windTurbines: [WindTurbine](facility)
}Also, let's also assume you've added some data to the database:
In [12]:
# Create multiple wind turbines
wts = [
c3.WindTurbine(id="wt1", location="US", manufacturer="Philips", power=150, facility='fc1'),
c3.WindTurbine(id="wt2", location="US", manufacturer="Philips", power=150, facility='fc3'),
c3.WindTurbine(id="wt3", location="CA", manufacturer="Philips", power=150, facility='fc2'),
c3.WindTurbine(id="wt4", location="CA", manufacturer="GE", power=150, facility='fc1'),
c3.WindTurbine(id="wt5", location="US", manufacturer="GE", power=150, facility='fc3'),
c3.WindTurbine(id="wt6", location="US", manufacturer="GE", power=150, facility='fc2'),
]
# Create multiple facilities in memory
facilities = [
c3.Facility(id="fc1", address="1300 Seaport Blvd"),
c3.Facility(id="fc2", address="1400 Seaport Blvd"),
c3.Facility(id="fc3", address="1500 Seaport Blvd"),
c3.Facility(id="fc4", address="1500 Seaport Blvd")
]
# Persist wind turbines and facilities in the database:
wts = c3.WindTurbine.createBatch(wts).objs
facilities = c3.Facility.createBatch(facilities).objsFetch all persisted fields
By default, if you don't specify any fields, only fields with primitive data types and stored calculated fields are fetched:
In [13]:
c3.WindTurbine.fetch()
# Same as
c3.WindTurbine.fetch(include = "[this]")Out [13]:
| ID | location | power | manufacturer | facility.id | meta | version |
|---|---|---|---|---|---|---|
| wt1 | US | 150 | Philips | fc1 | {…} | 1 |
| wt2 | US | 150 | Philips | fc3 | {…} | 1 |
| wt3 | CA | 150 | Philips | fc2 | {…} | 1 |
| wt4 | CA | 150 | GE | fc1 | {…} | 1 |
| wt5 | US | 150 | GE | fc3 | {…} | 1 |
| wt6 | US | 150 | GE | fc2 | {…} | 1 |
Fetch specific fields
To specify which fields to fetch, use the include parameter:
In [14]:
# Include the list of fields to fetch
c3.WindTurbine.fetch(include = "[id, manufacturer, power]")Out [14]:
| ID | power | manufacturer | meta.fetchType | meta.fetchInclude | version |
|---|---|---|---|---|---|
| wt1 | 150 | Philips | WindTurbine | [id,manufacturer,power,version] | 1 |
| wt2 | 150 | Philips | WindTurbine | [id,manufacturer,power,version] | 1 |
| wt3 | 150 | Philips | WindTurbine | [id,manufacturer,power,version] | 1 |
| wt4 | 150 | GE | WindTurbine | [id,manufacturer,power,version] | 1 |
| wt5 | 150 | GE | WindTurbine | [id,manufacturer,power,version] | 1 |
| wt6 | 150 | GE | WindTurbine | [id,manufacturer,power,version] | 1 |
Fetch and filter reference fields
When a Type references another, you can fetch information for both of them by listing which fields of the referenced Type you want to fetch:
In [15]:
# Similar to
# SELECT WindTurbine.id, WindTurbine.manufacturer, Facility.id, Facility.address
# FROM WindTurbine LEFT JOIN Facility ON WindTurbine.facility = Facility.id
# WHERE facility.address LIKE '1400%'
c3.WindTurbine.fetch(
include = "[id, manufacturer, facility.id, facility.address]",
filter = "startsWith(facility.address, '1400')"
)
Out [15]:
| ID | manufacturer | facility.address | facility.id | meta.fetchType | meta.fetchInclude | version |
|---|---|---|---|---|---|---|
| wt3 | Philips | 1400 Seaport Blvd | fc2 | WindTurbine | [id,manufacturer,{facility:[id,address]},version] | 1 |
| wt6 | GE | 1400 Seaport Blvd | fc2 | WindTurbine | [id,manufacturer,{facility:[id,address]},version] | 1 |
In [16]:
# Similar to
# SELECT WindTurbine.*, Facility.*
# FROM WindTurbine LEFT JOIN Facility ON WindTurbine.facility = Facility.id
# WHERE facility.address LIKE '1400%'
c3.WindTurbine.fetch(
include = "[this, facility.this]",
filter = "startsWith(facility.address, '1400')"
)Out [16]:
| ID | location | power | manufacturer | facility | meta | version |
|---|---|---|---|---|---|---|
| wt3 | CA | 150 | Philips | fc2 | {…} | 1 |
| wt6 | US | 150 | GE | fc2 | {…} | 1 |
The data model in this example is simple, but you can include as many reference fields in your query as needed.
Fetch and filter collection fields
Both collection and reference fields are not fetched by default. You need to list them explicitly in your query:
In [17]:
# Similar to
# SELECT Facility.address, WindTurbine
# FROM Facility LEFT JOIN WindTurbine ON Facility.id = WindTurbine.facility
# WHERE WindTurbine.manufacturer = 'GE'
c3.Facility.fetch(
include = "[address, windTurbines]",
filter = "windTurbines.manufacturer == 'GE'"
)Out [17]:
| ID | address | windTurbines | meta.fetchType | meta.fetchInclude | version |
|---|---|---|---|---|---|
| fc1 | 1300 Seaport Blvd | [2] | Facility | [address,{windTurbines:[id]},id,version] | 1 |
| fc2 | 1400 Seaport Blvd | [2] | Facility | [address,{windTurbines:[id]},id,version] | 1 |
| fc3 | 1500 Seaport Blvd | [2] | Facility | [address,{windTurbines:[id]},id,version] | 1 |
As you noticed, the fetch call above returns a list of facilities with at least one wind turbine's manufacturer equal to 'GE'. In the returned result, it still returns all the wind turbines associated with the facility. If you want to filter the results within the windTurbine field, you can use a dot filter in the include spec:
In [18]:
c3.Facility.fetch(
include = "[address, windTurbines.(manufacturer=='GE')]",
filter = "windTurbines.manufacturer == 'GE'"
)Out [18]:
| ID | address | windTurbines | meta.fetchType | meta.fetchInclude | version |
|---|---|---|---|---|---|
| fc1 | 1300 Seaport Blvd | [1] | Facility | [address,{windTurbines:[id]}.(manufacturer == 'GE'),id,version] | 1 |
| fc2 | 1400 Seaport Blvd | [1] | Facility | [address,{windTurbines:[id]}.(manufacturer == 'GE'),id,version] | 1 |
| fc3 | 1500 Seaport Blvd | [1] | Facility | [address,{windTurbines:[id]}.(manufacturer == 'GE'),id,version] | 1 |
Again, the data model in this example is simple, but you can include as many collection fields in your query as needed.
You can also include both reference and collection fields in your query to be able to fetch data between any two Types that have a relationship.
Fetch only instances with a relationship
As you noticed, Facility.fetch({include: [address, windTurbines.this]}) fetches all facilities, no matter if they have wind turbines to provide power or not. To fetch only facilities that have at least one wind turbine associated, use the filter parameter:
In [19]:
# Similar to an inner join
c3.Facility.fetch(
include = "[address, windTurbines.this]",
filter = "exists(windTurbines.id)"
)Out [19]:
| ID | address | windTurbines | meta.fetchType | meta.fetchInclude | version |
|---|---|---|---|---|---|
| fc1 | 1300 Seaport Blvd | [2] | Facility | [address,{windTurbines:[]},id,version] | 1 |
| fc2 | 1400 Seaport Blvd | [2] | Facility | [address,{windTurbines:[]},id,version] | 1 |
| fc3 | 1500 Seaport Blvd | [2] | Facility | [address,{windTurbines:[]},id,version] | 1 |
Clean up
In [20]:
c3.WindTurbine.removeBatch(wts)
c3.Facility.removeBatch(facilities)Out [20]:
{
"type" : "ObjList<mixing Facility>",
"stats" : {
"createdObjCount" : 0,
"updatedObjCount" : 0,
"removedObjCount" : 4,
"failedObjCount" : 0,
"objCount" : 4,
"dbWait" : 0.0,
"time" : 0.012,
"dbTime" : 0.0,
"objsPerSecond" : 333.3333333333333,
"dbObjsPerSecond" : 0.0
}
}