C3 AI Data Science and Machine Learning Introduction
C3 AI Data Science and Machine Learning is designed for a variety of use cases. Use C3 AI Data Science to extract meaningful insights for business and technology.
- Use Python — Interact with the Type System using Python, and access the entire Python ecosystem of libraries.
- Data — Access a single, unified, data interface for all data operations. This data interface is called C3 AI Datasets, and is available through the Data C3 Type.
- Feature Store — Access a centralized repository of materialized (pre-computed) feature data.
- Model Development — Create machine learning workflows of arbitrary complexity.
- Model Deployment — Use the Model Deployment Framework (MDF) to train, deploy, and manage the life cycle of machine learning models.
- Model Registry — Add, update, version, load, and find models in the Model Registry.
This guide is built primarily for data scientists who intend to build, train, and operate machine learning models and pipelines on the C3 Agentic AI Platform. You may need some experience working with machine learning and using Python. Additionally, it may be helpful to have previously used libraries like TensorFlow, Keras, pandas, NumPy, and scikit-learn.
This guide walks you through all the essential steps to train and deploy multiple classification models to predict failure in a segmented fleet of wind turbines.
Use Python
C3 AI supports any Python library that is available through a conda channel (public or private). This support is provided by the C3 AI Python SDK, so you can use Python at C3 AI much as you would anywhere else.
Use Python at C3 AI to:
- Work with C3 AI Datasets
- Access feature engineering
- Build, train, and deploy models
- Define methods on Types that are implemented in Python
For more information on using Python at C3 AI, see Use Python with C3 Agentic AI Platform Overview.
Data
The C3 Agentic AI Platform offers a single, unified, data interface for all data operations. This data interface is called C3 AI Datasets, and is available through the Data C3 Type.
- Pandas-like APIs — Data provides APIs that are similar to pandas, enabling users to perform data loading, data exploration, and feature engineering using familiar APIs.
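This guide does not list the method names of the Data Type, so the sketch below uses plain pandas rather than the C3 AI API. It only illustrates the kind of load, explore, and feature-engineering workflow that a pandas-like interface is modeled on. The column names and values are hypothetical sample data.

```python
import pandas as pd

# Hypothetical wind-turbine sensor readings (stand-in for data loaded from a real source).
readings = pd.DataFrame(
    {
        "turbine_id": ["T1", "T1", "T1", "T2", "T2", "T2"],
        "gearbox_temp_c": [61.0, 63.5, 70.2, 58.1, 57.9, 58.4],
    }
)

# Data exploration: summary statistics per turbine.
summary = readings.groupby("turbine_id")["gearbox_temp_c"].agg(["mean", "max"])

# Feature engineering: temperature change since the previous reading of the same turbine.
# The first reading of each turbine has no predecessor, so its delta is NaN.
readings["temp_delta"] = readings.groupby("turbine_id")["gearbox_temp_c"].diff()

print(summary)
print(readings)
```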
For more information on using the Data Type, see Work with C3 AI's Data Interface.
Feature Store
Features are data inputs to a machine learning model. Typically, features involve transformations from source data. Use C3 AI Feature Store to maintain a repository of pre-computed feature data.
- Share and discover features across teams.
- Reuse named features in both training and prediction/inference contexts.
- See a point-in-time view of multiple features (for example, see the most recent data defined in each feature at a specific point in time).
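To make the point-in-time idea concrete, here is a minimal plain-Python sketch. It is not the C3 AI Feature Store API: the `FEATURE_HISTORY` structure, the feature names, and the `point_in_time_view` helper are all hypothetical, chosen only to show how "the most recent value of each feature at a given time" can be computed.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical feature histories: feature name -> list of (timestamp, value),
# each list sorted by timestamp.
FEATURE_HISTORY = {
    "avg_wind_speed": [
        (datetime(2024, 1, 1), 7.2),
        (datetime(2024, 1, 3), 9.8),
    ],
    "gearbox_temp": [
        (datetime(2024, 1, 2), 61.0),
        (datetime(2024, 1, 4), 70.5),
    ],
}


def point_in_time_view(history, as_of):
    """Return the latest value of each feature recorded at or before `as_of`."""
    view = {}
    for name, rows in history.items():
        timestamps = [ts for ts, _ in rows]
        # Index of the first row strictly after `as_of`;
        # the row just before it is the latest one visible at that time.
        idx = bisect_right(timestamps, as_of)
        view[name] = rows[idx - 1][1] if idx > 0 else None
    return view


snapshot = point_in_time_view(FEATURE_HISTORY, datetime(2024, 1, 3, 12))
print(snapshot)
```

Values recorded after the requested time are never returned, which is what keeps training data free of information that was not yet available when a prediction would have been made.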
For more information on using C3 AI Feature Store, see C3 AI Feature Store Overview.
Model Development
Most production machine learning systems rely on a complex series of data transformation and training operations to serve their final predictions. C3 AI Model Development allows you to share and reuse implementations across projects, and removes the need to write glue code that stitches components together.
C3 AI Model Development is integrated with C3 AI Model Deployment and C3 AI Model Registry, allowing you to deploy and share models seamlessly. Additionally, use Model Development to scale and optimize performance by parallelizing pipeline operations and optimizing data transfer between nodes of the pipeline's directed acyclic graph (DAG) during execution.
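As a rough illustration of how independent nodes of a pipeline DAG can run in parallel, the following sketch uses only the Python standard library. It does not use or represent the C3 AI Model Development API. The step names, the `DEPENDENCIES` mapping, and the `run_step` placeholder are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
DEPENDENCIES = {
    "load": set(),
    "scale": {"load"},
    "encode": {"load"},
    "train": {"scale", "encode"},
}


def run_step(name, inputs):
    """Placeholder step: records which upstream results it received."""
    return f"{name}({', '.join(sorted(inputs))})"


def run_pipeline(dependencies):
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    results = {}
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            # Every step returned by get_ready() has all of its dependencies
            # finished, so the whole batch can be submitted concurrently.
            ready = list(sorter.get_ready())
            futures = {
                name: pool.submit(
                    run_step, name, [results[dep] for dep in dependencies[name]]
                )
                for name in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
    return results


results = run_pipeline(DEPENDENCIES)
print(results["train"])
```

Here `scale` and `encode` depend only on `load`, so they are submitted in the same batch. For simplicity, this sketch waits for each batch to finish before starting the next; a production scheduler would typically start a step as soon as its own dependencies complete.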
For more information on C3 AI Model Development, see Model Development on the C3 Agentic AI Platform Overview.
Model Deployment
The Model Deployment Framework (MDF) helps data scientists and application developers train, deploy, and manage the life cycle of machine learning models. When an application requests a prediction, MDF routes the request to the correct model and persists the output.
Model deployment supports the flexible configuration of both simple single-model deployments (for example, one model serving all predictions) and complex multi-model deployments. This enables data scientists and application developers to:
- Serve models for predictions.
- Handle the life cycle of models: deployment, retraining, and retirement.
- Route prediction requests to the correct model.
- Persist predictions in a consistent way to ensure traceability back to the model.
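The routing and persistence responsibilities listed above can be pictured with a small plain-Python sketch. This is not the MDF API. The `Router` and `PredictionRecord` classes, the segment names, and the lambda "models" are hypothetical stand-ins, assuming a multi-model deployment with one model per fleet segment.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class PredictionRecord:
    """One persisted prediction, traceable back to the model that produced it."""
    subject_id: str
    model_id: str
    value: float
    created_at: datetime


@dataclass
class Router:
    """Hypothetical router: picks the live model for a subject and logs every prediction."""
    segment_of: dict   # subject id -> segment name
    live_models: dict  # segment name -> (model id, predict function)
    log: list = field(default_factory=list)

    def predict(self, subject_id, features):
        segment = self.segment_of[subject_id]
        model_id, predict_fn = self.live_models[segment]
        record = PredictionRecord(
            subject_id, model_id, predict_fn(features), datetime.now(timezone.utc)
        )
        self.log.append(record)  # persist alongside the model id for traceability
        return record


router = Router(
    segment_of={"T1": "offshore", "T2": "onshore"},
    live_models={
        "offshore": ("offshore-v2", lambda x: 0.9 if x["vibration"] > 5 else 0.1),
        "onshore": ("onshore-v1", lambda x: 0.7 if x["vibration"] > 8 else 0.2),
    },
)
first = router.predict("T1", {"vibration": 6.0})
second = router.predict("T2", {"vibration": 6.0})
print(first.model_id, first.value)
print(second.model_id, second.value)
```

Retraining or retiring a model then amounts to swapping the entry in `live_models`, while earlier records in `log` still point at the model version that produced them.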
For more information on C3 AI Model Deployment, see Model Deployment Overview.
Model Registry
The C3 Agentic AI Platform provides you with robust ML model deployment and health checks. Use the C3 Agentic AI Platform to ensure that you can reproduce, deploy, and scale models from training. You can also share models across applications, with capabilities to register, load, and search for existing registered models from other users.
You can add, update, version, load, and find models using C3 AI Studio or C3 AI-provided APIs.
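The following in-memory sketch shows what the add, version, load, and find operations mean in practice. It is purely illustrative and does not reflect the C3 AI Model Registry API or its storage. The `InMemoryRegistry` class, its method names, and the tag-based search are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisteredModel:
    name: str
    version: int
    artifact: object
    tags: tuple = ()


class InMemoryRegistry:
    """Hypothetical registry that keeps every version of every model in memory."""

    def __init__(self):
        self._models = {}  # model name -> list of versions, oldest first

    def register(self, name, artifact, tags=()):
        """Add a new model, or a new version of an existing one."""
        versions = self._models.setdefault(name, [])
        entry = RegisteredModel(name, len(versions) + 1, artifact, tuple(tags))
        versions.append(entry)
        return entry

    def load(self, name, version=None):
        """Return a specific version, or the latest when `version` is None."""
        versions = self._models[name]
        return versions[-1] if version is None else versions[version - 1]

    def find(self, tag):
        """Return the latest version of every model whose tags include `tag`."""
        return [v[-1] for v in self._models.values() if tag in v[-1].tags]


registry = InMemoryRegistry()
registry.register("turbine-failure", {"threshold": 0.5}, tags=["wind"])
registry.register("turbine-failure", {"threshold": 0.4}, tags=["wind"])
registry.register("churn", {"threshold": 0.3}, tags=["retail"])

latest = registry.load("turbine-failure")
print(latest.version, latest.artifact)
```

Registering under an existing name creates a new version instead of overwriting the old one, so earlier versions stay loadable for reproducibility.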
For more information on C3 AI Model Registry, see Overview of Model Registry.