
Google Drive File System Connector

Overview

The Google Drive FileSystem integration enables applications to access Google Drive through the C3 FileSystem abstraction. Once configured, developers can interact with Google Drive using standard filesystem APIs to list, read, and optionally write files.

The integration supports two access modes:

  • Read/write access using a Google Shared Drive
  • Read-only access to files explicitly shared with a service account

At a high level, the integration follows this architecture:

Mount → FileSourceSystem → Credentials → FileSystem operations

Each layer depends on the previous one. All components must be configured consistently for the integration to function correctly.

Prerequisites

Before configuring the Google Drive FileSystem, ensure the following prerequisites are met.

Google Service Account

  1. Create a service account in the Google Cloud Console.
  2. Generate a JSON key for the service account.
  3. Capture the following values from the key:
    • secretKey (private key)
    • accountId (client ID)
    • accountEmail
    • accessKey (private key ID)
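
The mapping from the downloaded key file to these four values can be sketched as follows. The left-hand field names (private_key, client_id, client_email, private_key_id) are the ones Google writes into a service account JSON key; the helper names and the key path are hypothetical and not part of the platform API.

```python
import json

# Google key-file field -> credential field named in this guide.
KEY_FIELD_MAP = {
    "private_key": "secretKey",
    "client_id": "accountId",
    "client_email": "accountEmail",
    "private_key_id": "accessKey",
}


def credential_fields_from_key(key: dict) -> dict:
    """Pick the four values this guide needs out of a parsed service account key."""
    missing = [field for field in KEY_FIELD_MAP if not key.get(field)]
    if missing:
        raise ValueError(f"service account key is missing: {', '.join(missing)}")
    return {target: key[source] for source, target in KEY_FIELD_MAP.items()}


def load_credential_fields(key_path: str) -> dict:
    """Read a key file from disk (hypothetical path) and extract the values."""
    with open(key_path, encoding="utf-8") as handle:
        return credential_fields_from_key(json.load(handle))
```

Raising on a missing field surfaces an incomplete key during setup rather than on the first Google Drive API call.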

Shared Drive (Required for Read/Write Mode)

  1. Create a Google Shared Drive (not a regular shared folder).
  2. Grant the service account Organizer permission on the Shared Drive.

Shared Drive ID

  • Retrieve the Shared Drive ID.
    This ID is required to enable write access.

What Is a Google Shared Drive?

A Google Shared Drive (formerly known as a Team Drive) is a Google Drive construct designed for team-owned content rather than user-owned content. Files in a Shared Drive are owned by the organization, not by an individual user.

Key characteristics

  • Access is granted at the drive level, not by sharing individual files or folders.
  • Files persist even if individual users or service accounts are removed.
  • Shared Drives support fine-grained permission roles:
    • Viewer
    • Commenter
    • Contributor
    • Content Manager
    • Organizer (shown as Manager in the Google Drive UI)

Important
A Shared Drive is not the same as a regular folder that is shared with another user or service account.
Sharing a folder does not provide the same ownership model or write guarantees required by this integration.

For the Google Drive FileSystem integration

  • Shared Drives are required for read/write access.
  • The service account must be granted Organizer or Contributor permission on the Shared Drive.

Retrieve the Shared Drive ID

The Shared Drive ID uniquely identifies the Shared Drive and is required to enable read/write access.

You can retrieve the Shared Drive ID using one of the following methods.

Option 1: Google Drive UI

  1. Open Google Drive in a browser.
  2. Navigate to Shared drives and select the target Shared Drive.
  3. Copy the URL from the browser address bar.

The URL will look similar to:

https://drive.google.com/drive/folders/<id>

In this example, the Shared Drive ID is:

Text
<id>
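
If you copy many such URLs, extracting the ID can be scripted. The following is a minimal sketch that assumes the URL has the folders/<id> shape shown above; the function name is hypothetical.

```python
from urllib.parse import urlparse


def shared_drive_id_from_url(url: str) -> str:
    """Return the path segment that follows 'folders' in a Drive URL."""
    # urlparse drops the query string, so "?usp=..." suffixes are ignored.
    segments = [s for s in urlparse(url).path.split("/") if s]
    if "folders" not in segments:
        raise ValueError(f"no 'folders' segment in URL: {url}")
    index = segments.index("folders")
    if index + 1 >= len(segments):
        raise ValueError(f"no ID after 'folders' in URL: {url}")
    return segments[index + 1]
```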

Option 2: Google Drive API (Optional)

If you are programmatically managing drives, you can retrieve Shared Drive IDs using the Google Drive API (drives.list). This approach is optional and typically not required for initial setup.
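
A minimal sketch of the API approach is shown here. It assumes the google-api-python-client and google-auth packages are installed and that the service account has been added to the Shared Drive; the key path and helper names are hypothetical.

```python
def list_shared_drives(service) -> list:
    """Collect (id, name) pairs from every page of drives.list."""
    drives, page_token = [], None
    while True:
        response = service.drives().list(pageSize=100, pageToken=page_token).execute()
        drives.extend((d["id"], d["name"]) for d in response.get("drives", []))
        page_token = response.get("nextPageToken")
        if not page_token:
            return drives


def build_drive_service(key_path: str):
    """Build a Drive v3 client from a service account key file."""
    # Imported lazily so the pagination helper above has no hard dependency.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    credentials = service_account.Credentials.from_service_account_file(
        key_path, scopes=["https://www.googleapis.com/auth/drive.readonly"]
    )
    return build("drive", "v3", credentials=credentials)
```

Calling list_shared_drives(build_drive_service("<path-to-key>.json")) returns the drives visible to the service account; the ID of the target drive is the value to use as driveId.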

Step 1: Configure the Google Drive Mount

The Google Drive FileSystem requires a default mount to resolve gdrive:// URLs. This mount defines how Google Drive paths are routed inside the platform and must be configured before any FileSourceSystem or filesystem operations can succeed.

Configure the Default Mount (Current Requirement)

At present, the default Google Drive mount cannot be configured using the FileSystem APIs. Instead, configure the default mount programmatically using the GoogleDriveFileSystem configuration API to map the root path (/) to the desired Google Drive location.

Example

Python
config = c3.GoogleDriveFileSystem.inst().config()
config.setMount("/", "gdrive://TestDrive1/")

This configuration tells the FileSystem which Google Drive resource should be treated as the root when resolving file paths. In this example:

  • TestDrive1 becomes the resourceName used in all gdrive:// URLs.
  • Any path of the form gdrive://TestDrive1/<path> is routed to Google Drive through this mount.

Note
You can verify this configuration by calling the List Files operation described in the final step of this procedure.

This verification can be performed only after the Google Drive credentials have been successfully configured, as the FileSystem requires valid credentials to authenticate and list files from the mounted drive.

Common Failure Cases

  • The mount does not exist.
  • The mount name does not match the FileSourceSystem resource name specified in the root URL.
  • The gdrive:// URL uses a resource name that does not correspond to the configured mount.

If any of these occur, filesystem operations will fail because the URL cannot be resolved.
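
The resource-name mismatch can be caught before any filesystem call is made. The sketch below only compares URL strings and does not call the platform; the function names are hypothetical, and the exact, case-sensitive comparison is an assumption.

```python
from urllib.parse import urlparse


def resource_name(url: str) -> str:
    """Return the resource name of a gdrive:// URL, e.g. 'TestDrive1'."""
    parsed = urlparse(url)
    if parsed.scheme != "gdrive" or not parsed.netloc:
        raise ValueError(f"not a gdrive:// URL with a resource name: {url}")
    # netloc keeps the original casing of the resource name.
    return parsed.netloc


def resolves_through_mount(file_url: str, mount_url: str) -> bool:
    """True when file_url uses the same resource name as the mounted root."""
    return resource_name(file_url) == resource_name(mount_url)
```

For example, resolves_through_mount("gdrive://TestDrive/a.txt", "gdrive://TestDrive1/") is False, which corresponds to the third failure case listed above.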

Step 2: Define the FileSourceSystem (Metadata)

The Google Drive FileSourceSystem must be authored via metadata and must exist before runtime. Create a JSON file in the /metadata/FileSourceSystem directory, similar to the example below.

Metadata Example

JSON
{
  "type": "FileSourceSystem",
  "name": "testgdrive1",
  "rootUrlOverride": "gdrive://TestDrive1/"
}

Required Fields

  • rootUrlOverride
    Must match the mounted gdrive:// root and bind the FileSourceSystem to the mount.

Important Notes

  • The FileSourceSystem must be created via metadata (for example, in the application repository).
  • Runtime creation or upsert of the FileSourceSystem is not supported for this workflow.
  • If the name or rootUrlOverride does not align with the configured mount, the connector will not function.
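
Because misalignment only surfaces at runtime, it can help to check the metadata file before deploying it. The following is a minimal sketch of such a check; the function name is hypothetical, and treating a trailing slash as insignificant is an assumption.

```python
import json


def check_file_source_system(metadata_text: str, mount_url: str) -> list:
    """Return a list of problems found in a FileSourceSystem metadata document."""
    meta = json.loads(metadata_text)
    problems = []
    if meta.get("type") != "FileSourceSystem":
        problems.append("type must be 'FileSourceSystem'")
    if not meta.get("name"):
        problems.append("name is missing")
    root = meta.get("rootUrlOverride")
    if not root:
        problems.append("rootUrlOverride is missing")
    elif root.rstrip("/") != mount_url.rstrip("/"):
        problems.append(f"rootUrlOverride {root!r} does not match mount {mount_url!r}")
    return problems
```

An empty list means the metadata is consistent with the mount target passed in, such as the "gdrive://TestDrive1/" value used in Step 1.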

Step 3: Configure Drive Access Mode

The behavior of the Google Drive FileSystem depends on how driveId is configured.

Read/Write Mode (Shared Drive)

Recommended and required for write-capable access

Text
driveId = "<SharedDriveId>"

When driveId is set to a Google Shared Drive ID:

  • The filesystem operates within the Shared Drive namespace.
  • The service account must have Organizer or Contributor permission on the Shared Drive.
  • Read, write, create, move, copy, and delete operations are supported.
  • This is the only validated write-capable configuration.
  • This is the recommended enterprise setup.

Read-Only Mode (No Shared Drive)

Text
driveId = null

When driveId is set to null:

  • The filesystem does not target a specific Shared Drive.
  • It can access files and folders explicitly shared with the service account.
  • Listing and reading files works as expected.
  • Write operations are not supported by design.
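
The two modes can be summarized in a small guard that application code calls before attempting a write. This is an illustrative sketch only: the function names are hypothetical, and treating an empty string like null is an assumption.

```python
from typing import Optional


def access_mode(drive_id: Optional[str]) -> str:
    """Map a driveId value to the access mode described in this step."""
    # A Shared Drive ID enables read/write; null (None) means read-only.
    return "read-write" if drive_id else "read-only"


def ensure_writable(drive_id: Optional[str]) -> None:
    """Fail early instead of letting a write fail against a read-only setup."""
    if access_mode(drive_id) != "read-write":
        raise PermissionError("write access requires driveId to be a Shared Drive ID")
```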

Step 4: Configure Credentials

Define Credential Parameters

Before constructing GoogleDriveCredentials, define the Drive context and service account credentials.

Define driveId

Text
driveId = "<SharedDriveId>"

  • If you are using a Google Shared Drive, set driveId to the Shared Drive ID (this can be copied from the Drive URL).
  • If you are not using a Shared Drive and only need read-only access to files explicitly shared with the service account, set driveId to null (None in Python).

Define json_creds

Python
json_creds = {
    "type": "GoogleDriveCredentials",
    "secretKey": "<PRIVATE_KEY>",
    "accountId": "<CLIENT_ID>",
    "accountEmail": "<SERVICE_ACCOUNT_EMAIL>",
    "accessKey": "<PRIVATE_KEY_ID>"
}

These values are obtained from the Google Cloud service account key associated with the Google Drive integration.

Construct GoogleDriveCredentials

Python
creds = c3.GoogleDriveCredentials.make(json_creds).withDriveId(driveId)

Required credential fields

  • secretKey – Service account private key
  • accountId – Client ID
  • accountEmail – Service account email
  • accessKey – Private key ID

Attach Credentials to the FileSourceSystem

Python
fs = c3.FileSourceSystem.forName("testgdrive1")
fs.setCredentials(creds)

Once attached, all filesystem operations routed through this FileSourceSystem authenticate using these credentials. Credential validation occurs lazily on the first Google Drive API call.

Step 5: Verify the Integration

After configuring the mount, the FileSourceSystem, and the credentials, run:

Python
c3.GoogleDriveFileSystem.listFiles()

A successful call confirms that:

  • The mount exists.
  • Credentials are valid.
  • The configured Drive context is accessible.

Performance Considerations

When working with folders or drives that contain a large number of files, specify a limit when calling listFiles() to avoid long-running operations.

For example:

Python
c3.GoogleDriveFileSystem.listFiles(None, 10)

Verify Read/Write Behavior

Python
c3.File.make("gdrive://TestDrive1/example.txt").writeString("Hello")
c3.File.make("gdrive://TestDrive1/example.txt").readString()

Expected Behavior

  • Shared Drive mode: both write and read operations succeed.
  • Read-only mode (driveId = null): read operations succeed, write operations fail.
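
The check above can be wrapped in a helper that reports which operations succeeded. This is a minimal sketch: it uses only the File.make, writeString, and readString calls shown in this step, takes the c3 object as a parameter, and the helper name is hypothetical.

```python
def verify_read_write(c3, url: str, payload: str = "Hello") -> dict:
    """Attempt a write and then a read at url; report which operations worked."""
    result = {"write": False, "read": False}
    try:
        c3.File.make(url).writeString(payload)
        result["write"] = True
    except Exception as error:  # read-only mode is expected to land here
        result["write_error"] = str(error)
    try:
        c3.File.make(url).readString()
        result["read"] = True
    except Exception as error:
        result["read_error"] = str(error)
    return result
```

In read-only mode the read succeeds only if a file already exists at the given URL and is shared with the service account, so point the helper at a known file in that case.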

Operational Notes and Guardrails

  • Credential validation occurs lazily on the first Google Drive API call.
  • Most configuration errors surface during listFiles() or file operations.
  • The mount name and rootUrlOverride must always align.
  • Do not hard-code private keys in source code; use secure secret management.
  • Shared Drive mode is required for any workflow that needs write access.
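
One way to keep private keys out of source code is to read the credential values from environment variables populated by a secret manager. The sketch below assumes that approach; the variable names and the helper name are hypothetical.

```python
import os

# Hypothetical environment variable -> credential field from Step 4.
ENV_TO_FIELD = {
    "GDRIVE_PRIVATE_KEY": "secretKey",
    "GDRIVE_CLIENT_ID": "accountId",
    "GDRIVE_ACCOUNT_EMAIL": "accountEmail",
    "GDRIVE_PRIVATE_KEY_ID": "accessKey",
}


def credentials_from_env(environ=os.environ) -> dict:
    """Build the json_creds dictionary from environment variables."""
    missing = [name for name in ENV_TO_FIELD if not environ.get(name)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    creds = {"type": "GoogleDriveCredentials"}
    for name, field in ENV_TO_FIELD.items():
        creds[field] = environ[name]
    # Assumption: some secret stores escape newlines in PEM keys as a literal "\n".
    creds["secretKey"] = creds["secretKey"].replace("\\n", "\n")
    return creds
```

The result can be passed to c3.GoogleDriveCredentials.make(...) in place of the literal json_creds dictionary shown in Step 4.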

Accessing Google-native File Formats in Data Integration

Google-native formats (such as Google Docs, Sheets, and Slides) cannot be read directly via the Google Drive API. To make these files usable in Data Integration (DI) pipelines, they must first be exported (transcoded) to a supported format such as TXT, CSV, or PDF.

This section explains the limitations and required conversion mappings when working with Google-native content.

Note
This step is necessary because the Google Drive API does not natively support reading the content of Docs, Sheets, or Slides. Exporting to a standard format is required to process the file in DI workflows.

Supported Transcoding Mappings

You must export Google-native file formats to a supported format before using them in Data Integration pipelines.

Google-native format    Supported export formats
Google Docs             Plain text, PDF
Google Sheets           CSV
Google Slides           PDF

For unsupported formats, export the file to PDF.
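
The mappings above can be expressed as a lookup keyed by MIME type. The sketch below uses the MIME types Google Drive assigns to Docs, Sheets, and Slides; whether the platform's MediaType constants resolve to these exact strings is an assumption, and the function name is hypothetical.

```python
GOOGLE_NATIVE_PREFIX = "application/vnd.google-apps."
GOOGLE_DOCS = GOOGLE_NATIVE_PREFIX + "document"
GOOGLE_SHEETS = GOOGLE_NATIVE_PREFIX + "spreadsheet"
GOOGLE_SLIDES = GOOGLE_NATIVE_PREFIX + "presentation"

# Google-native MIME type -> export MIME types listed in this section.
EXPORT_TARGETS = {
    GOOGLE_DOCS: ["text/plain", "application/pdf"],
    GOOGLE_SHEETS: ["text/csv"],
    GOOGLE_SLIDES: ["application/pdf"],
}


def export_targets(content_type: str) -> list:
    """Return the export formats to try for a file's content type."""
    if not content_type.startswith(GOOGLE_NATIVE_PREFIX):
        return []  # not a Google-native file, so no export is needed
    # Google-native formats without a listed mapping fall back to PDF.
    return EXPORT_TARGETS.get(content_type, ["application/pdf"])
```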

Note
Writing content is supported only for Google Docs. Writing to Google Sheets or Google Slides is not supported.

Create a Google Docs File

This example shows how to create a Google Docs file and highlights the limitations of reading Google-native formats.

Python
path = "google_test/testDoc"
testfile = fs.makeFile(path).withContentType(c3.MediaType.GOOGLE_DOCS)
createdFile = fs.createFile(testfile)
file = fs.openFile(createdFile)
file

While content can be written to a Google Docs file using writeString(), attempting to read the file directly fails by design. To access the contents, the file must first be transcoded into a supported format, such as plain text or PDF.

Python
file.writeString("C3 rocks")
try:
    file.read()
except Exception:
    # Reading a Google-native file directly is expected to fail.
    print("File could not be read")

File could not be read

Because Google Docs cannot be read directly, the file is first transcoded into plain text, after which its contents can be accessed using standard file read operations.

Transcode the File

The following example shows how to read the contents of a Google Docs file by exporting it to a supported format.

Python
file = file.transcode(
    c3.MediaType.PLAIN_TEXT,
    file.contentType
)

# Evaluate both expressions so the file metadata and its contents are shown together.
fs.openFile(file), file.readString()

The output shows the metadata for the newly exported plain-text file created during transcoding, followed by the file’s contents. The GoogleDriveFile object confirms that the original Google Docs file has been converted into a text/plain file (testDoc.txt) under the same parent directory, and readString() returns the extracted text content ("C3 rocks").

Text
({
  "type" : "GoogleDriveFile",
  "contentLength" : 11,
  "contentType" : "text/plain",
  "lastModified" : "2025-06-25T00:09:25Z",
  "lastModifiedBy" : "testaccount@testproject-461317.iam.gserviceaccount.com",
  "contentMD5" : "001b17fbcaf316a88ea59646d14f820e",
  "contentSHA1" : "f91ae838dc3743df96d8e78169ead0a4864cf391",
  "hasMetadata" : true,
  "url" : "gdrive://TestDrive/google_test/testDoc.txt",
  "name" : "testDoc.txt",
  "parentId" : "1jFZUI9C0a4aI-1T8HMrLzprIGP4_M9z8",
  "fileId" : "1uoBCz6dewFmgs5R78pEzUYYd5uITlb_0"
},
"C3 rocks")

Transcoding Behavior and Notes

When transcoding Google-native files (Docs, Sheets, Slides), keep the following in mind:

  • Transcoding creates a new file. The original file is not modified.
  • Original files remain unless explicitly deleted. Use options such as withDeleteOriginal(true) if you want to remove the source file.
  • PDF reads return binary data. To extract text from PDFs, use a PDF processing library.

These behaviors are expected when working with Google-native formats.
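
Since PDF exports are binary, it can be useful to check the bytes before trying to decode them as text. The sketch below relies on the fact that PDF files begin with the marker %PDF-; the function names are hypothetical, and UTF-8 is assumed for text exports.

```python
from typing import Optional


def is_pdf(data: bytes) -> bool:
    """True when the bytes start with the PDF header marker."""
    return data.startswith(b"%PDF-")


def text_or_none(data: bytes) -> Optional[str]:
    """Decode exported bytes as UTF-8 text, or return None for PDF content."""
    if is_pdf(data):
        return None  # hand the bytes to a PDF processing library instead
    return data.decode("utf-8")
```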

See Also

The following resources provide additional background on Google service account credentials and Google Shared Drives, which are required to understand and correctly configure the Google Drive FileSystem integration.

Create credentials for Google Workspace APIs

Describes how to create service account credentials, generate JSON keys, and configure authentication for Google Drive APIs.
https://developers.google.com/workspace/guides/create-credentials

Learn about Google Shared Drives

Explains how Shared Drives work, how they differ from shared folders, and how permissions and ownership are managed.
https://support.google.com/a/users/answer/12380484?hl=en

How file access works in shared drives (Google Workspace productivity guide)

Provides an overview of how permissions, roles, and ownership work in Google Shared Drives, including the differences between Manager, Content manager, Contributor, Commenter, and Viewer roles. Use this guide to understand how access to files and folders is governed within a Shared Drive and how permission changes affect read and write behavior.
https://support.google.com/a/users/answer/12380484?hl=en
