Work With File Systems
To load data from files into the C3 Agentic AI Platform, the platform must have access to the file system where the files are stored. The platform comes with a configured out-of-the-box local file system, and has connectors to popular cloud file systems, such as Amazon S3, Azure Blob storage, and Google Cloud Storage.
C3 AI provides both a low-code and a high-code solution for connecting to file systems:
- Data Fusion in C3 Studio
- The FileSystem Type
Data Fusion is in Beta. Please contact your C3 AI representative to enable this feature.
This topic covers the use of the FileSystem Type, which abstracts remote object stores and network drives into standard file systems. To access a remote file system, the first step is to connect the file system to the C3 Agentic AI Platform by following one of the guides above.
For cloud-based deployments of C3 AI, the object storage of the cloud provider is used as the default file system. For example, if the C3 Agentic AI Platform is deployed on Google Cloud, the default file system is a Google Cloud Storage bucket. For on-premises or edge deployments, the C3 Agentic AI Platform uses the local file system c3fs by default.
After you've connected a file system, you can instantiate it using the following:
// Access the default file system
FileSystem.inst();
// Access the local file system
FileSystem.c3();
// Access the S3 file system
FileSystem.s3();
// Access the Azure Blob file system
FileSystem.azure();
// Access the Google Cloud Storage file system
FileSystem.gcs();
FileSystem mounts
To interact with the contents of a file system in a C3 AI Application, you need a file system mount that points to a directory that contains the file. When you create an environment or application, the C3 Agentic AI Platform creates a set of file system mounts that are private to your environment/application. These mounts are subdirectories within the default bucket or object storage of the cluster, but they are protected from being viewed or modified by other applications.
View mounts
You can view all mounts on an application with the following command:
// Checking the mounts of the default file system
FileSystem.inst().mounts();
This returns the mounts associated with the default file system. An example of the returned mounts is displayed below:
{
  "artifact": "gcs://c3--platform/artifacts/",
  "system": "gcs://c3--platform/av85/windturbine/system/",
  "attachment": "gcs://c3--platform/av85/windturbine/attachment/",
  "data-load": "gcs://c3--platform/av85/windturbine/dl/",
  "key-value": "gcs://c3--platform/av85/windturbine/kv/",
  "telemetry": "gcs://c3--platform/av85/windturbine/telemetry/",
  "datasets": "gcs://c3--platform/datasets/",
  "etl": "gcs://c3--platform/av85/windturbine/etl/",
  "/": "gcs://c3--platform/av85/windturbine/fs/"
}
In the above result, there are two shared mounts, artifact and datasets, which allow files to be shared across applications. The other mounts on an application always have the following format (note the <env_name> and <app_name> in the paths):
<cloud_provider>://<bucket_name>/<env_name>/<app_name>/<mount_name>
The platform can read and write to any child directory of a mounted file path, provided that the configured credential has the corresponding privilege on the cloud provider's side. To learn how to work with files, see Work with Files.
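To make the application-scoped path format concrete, here is a minimal sketch that splits such a URL into its parts. The function parseAppMountUrl and the field names it returns are hypothetical and are not part of the FileSystem Type. It assumes the URL has exactly the five components shown in the format, so shared mounts such as artifact and datasets return null.

```javascript
// Hypothetical helper for illustration only; not part of the FileSystem API.
// Splits an application-scoped mount URL of the form
//   <cloud_provider>://<bucket_name>/<env_name>/<app_name>/<mount_name>
// into named parts. Returns null when the URL does not have exactly this shape.
function parseAppMountUrl(url) {
  const pattern = /^([a-z0-9]+):\/\/([^/]+)\/([^/]+)\/([^/]+)\/([^/]+)\/?$/;
  const match = pattern.exec(url);
  if (match === null) {
    return null;
  }
  // mountDir is the last path segment, for example "fs" for the "/" mount.
  const [, cloudProvider, bucketName, envName, appName, mountDir] = match;
  return { cloudProvider, bucketName, envName, appName, mountDir };
}

// Example with the default mount path from the listing above.
const parts = parseAppMountUrl("gcs://c3--platform/av85/windturbine/fs/");
console.log(parts);
```

On the platform, the input could come from a call such as FileSystem.inst().mountUrl(), assuming it returns the URL as a string.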
Default mount
Many of these file system mounts are used to implement the platform's core interactions with the default file system (see the FileSystemMount Type for more information), but the / mount is used as the 'root' or default mount path for the connected file system. You can see the path by running the following:
// Checking the default mount of the default file system
FileSystem.inst().mountUrl();
This returns the default mount path gcs://c3--platform/av85/windturbine/fs/ from the example above. Alternatively, you can see the default mount path by accessing the / element from the list of mounts:
// Get the `/` mount path from the list of all mounts
FileSystem.inst().mounts().get('/');
Set new mounts
You can also add mount paths to new file systems. This is useful when you have an existing remote file system whose contents you would like to access from the C3 Agentic AI Platform. The platform allows mount paths to be set at the application, environment, or cluster level.
Overriding a mount path at the environment or cluster level impacts all applications running in that environment or cluster. Use this with caution. Only a Cluster Administrator (ClusterAdmin) is able to set a mount at the cluster level.
The naming convention for mounts on the C3 Agentic AI Platform is kebab-case. Use the following code example to add a new file system mount path or update an existing file system mount path.
var mountName = "<my-mount-name>"; // e.g. 'my-new-mount'
var path = "<my_mount_path>"; // e.g. 'gcs://my-bucket/my-new-mount-path/'
var overrideLevel = ConfigOverride.APP; // change to ConfigOverride.ENV or ConfigOverride.CLUSTER with caution
// Create or update a mount path for the default file system
FileSystem.inst().setMount(mountName, path, overrideLevel);
FileSystem.inst().mounts().get(mountName);
This returns the mount path that you just created. You can also see your new mount when you run FileSystem.inst().mounts().
If you are setting a mount to a non-default file system (e.g. attaching an Azure Blob or S3 bucket to a GCP-hosted cluster), ensure that you use the appropriate file system in the above commands, such as FileSystem.azure() or FileSystem.s3() instead of FileSystem.inst().
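Because mount names follow kebab-case and the target file system varies, it can help to wrap both concerns in one small function. The sketch below is illustrative only: isKebabCase and setMountChecked are hypothetical names, the regular expression is one reasonable reading of kebab-case (lowercase letters and digits separated by single hyphens), and the fs argument is assumed to expose setMount and mounts().get as in the examples above.

```javascript
// Hypothetical helpers for illustration only; not part of the FileSystem API.

// One reading of kebab-case: lowercase alphanumeric words joined by single hyphens.
const KEBAB_CASE = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

function isKebabCase(name) {
  return KEBAB_CASE.test(name);
}

// `fs` is any object exposing setMount(name, path, level) and mounts().get(name).
// Throws before touching the file system if the name is not kebab-case,
// otherwise sets the mount and returns the path read back from the mounts.
function setMountChecked(fs, mountName, path, overrideLevel) {
  if (!isKebabCase(mountName)) {
    throw new Error("Mount name is not kebab-case: " + mountName);
  }
  fs.setMount(mountName, path, overrideLevel);
  return fs.mounts().get(mountName);
}
```

On the platform, a call could look like setMountChecked(FileSystem.azure(), 'my-new-mount', '<my_mount_path>', ConfigOverride.APP). The root mount named / is a platform-created special case that this check would reject.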
Common issues when setting a mount
In general, when you configure a new mount from within an application, you cannot set a mount to a path that belongs to another application, or to other paths outside your application on the same bucket. For example, assume a user is working in an application whose default mount is gcs://c3--platform/av85/windturbine/fs/. If they try to set a mount to gcs://c3--platform/c3/c3/ (to access files from the c3/c3 application), they run into the following error:
// Run in av85/windturbine
FileSystem.inst().setMount('c3-mount', 'gcs://c3--platform/c3/c3/', ConfigOverride.APP);
// Fails with the following error
> Uncaught Error: Error invoking Java method GcsFileSystem#setMount:
> Not permitted to add gcs://c3--platform/c3/c3/ url to mount c3-mount
If the goal is to share files between applications, it is recommended to use the datasets mount.
Change the default file system
The default file system can be changed to point to a different file system. You can check the default file system configured with the following:
// Checking the scheme of the default file system
FileSystem.inst().scheme();
After connecting to the remote file system, a Cluster Administrator (ClusterAdmin) can change the default remote file system with the following commands:
// Make Amazon S3 the default file system
FileSystem.s3().makeDefault();
// Make an Azure Blob storage account the default file system
FileSystem.azure().makeDefault();
// Make Google Cloud Storage the default file system
FileSystem.gcs().makeDefault();
Changing the default file system is not a requirement to interact with the file system. It is always possible to instantiate the S3 file system with FileSystem.s3() or the Azure Blob file system with FileSystem.azure(). However, the default instance provides a consistent way for applications to instantiate relevant file systems if another file system is required in an application workflow.
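As a sketch of how the scheme check and makeDefault() could be combined, the function below switches the default only when the target differs from the current default. The name makeDefaultIfDifferent is hypothetical. The sketch assumes that scheme() is also available on non-default instances such as FileSystem.s3() and that it returns a value that can be compared with ===. Neither assumption is confirmed here.

```javascript
// Hypothetical helper for illustration only; not part of the FileSystem API.
// `current` and `target` are objects exposing scheme() and makeDefault().
// Returns true if the default was changed, false if it already matched.
function makeDefaultIfDifferent(current, target) {
  if (current.scheme() === target.scheme()) {
    return false; // target is already the default; nothing to do
  }
  target.makeDefault();
  return true;
}
```

On the platform, a Cluster Administrator could call makeDefaultIfDifferent(FileSystem.inst(), FileSystem.s3()) under the assumptions above.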