File System Configuration Example — Amazon S3
Data Fusion supports a wide range of connectors for integrating data from file systems, databases, and streaming sources.
This documentation includes detailed configuration examples for a selected subset of commonly used connectors to illustrate the connector setup workflow. Not every connector is documented as a separate topic. Instead, the examples demonstrate the common configuration patterns and concepts that apply across connector types.
All connectors follow a consistent creation and configuration experience in Data Fusion 8.10, using the unified connector workflow and canvas-based representation.
The S3 connector enables Data Fusion to access data stored in Amazon S3 buckets and use it as a source for ingestion pipelines.
In Data Fusion 8.10, connectors are created through a unified connector configuration workflow and are represented directly on the Data Fusion canvas as nodes.
Connector Representation
Once created, the S3 connector:
- Appears as a Canonical node (FileSourceSystem) on the canvas
- Serves as the entry point for file-based ingestion workflows
- Can be connected to a Source Collection to define ingestion logic
Connection Configuration Model
S3 connections are configured using a structured connector form with two main sections:
Connector Information
Defines metadata for the connector:
- Name — Unique identifier used within the application
- Description — Optional context for the connector
Authentication and Access Configuration
Defines how Data Fusion connects to S3:
rootUrlOverride
Specifies the S3 bucket or path
- Must start with /
- Does not include s3://
region
AWS region where the bucket resides
authMethod
Determines how authentication is handled
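The two rootUrlOverride rules above can be checked before you fill in the form. The following is a minimal sketch in Python, not part of Data Fusion: the helper names check_root_url_override and from_s3_uri are hypothetical and exist only for illustration.

```python
def check_root_url_override(value: str) -> list[str]:
    """Return a list of problems with a rootUrlOverride value (empty if none)."""
    problems = []
    if "://" in value:
        # The documented rule: the value does not include s3://
        problems.append("must not include a URI scheme such as s3://")
    elif not value.startswith("/"):
        # The documented rule: the value must start with /
        problems.append("must start with /")
    return problems


def from_s3_uri(uri: str) -> str:
    """Convert an s3://bucket/path URI into the /bucket/path form."""
    prefix = "s3://"
    if uri.startswith(prefix):
        uri = uri[len(prefix):]
    # Guarantee exactly one leading slash.
    return "/" + uri.lstrip("/")
```

For example, from_s3_uri("s3://my-bucket/path") yields /my-bucket/path, which passes check_root_url_override with no problems reported.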
Authentication Behavior
Authentication fields vary depending on the selected authMethod:
IAM-based methods (for example, IRSA / Pod Identity)
- No access key or secret key required
- Credentials are resolved through the execution environment
Credential-based methods (if available in your environment)
- May require access key and secret key
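The way required fields depend on the authentication method can be sketched as a small lookup. This is an illustration only, under stated assumptions: the category labels "iam" and "credentials" and the field names accessKey and secretKey are hypothetical, and the actual authMethod options and field names are whatever the connector form shows in your environment.

```python
# Hypothetical category labels; the real authMethod options come from the form.
IAM_BASED = "iam"
CREDENTIAL_BASED = "credentials"


def missing_auth_fields(category, form):
    """Return the names of required fields that are absent or empty in form."""
    required = ["rootUrlOverride", "region", "authMethod"]
    if category == CREDENTIAL_BASED:
        # Hypothetical field names for the access key and secret key.
        required += ["accessKey", "secretKey"]
    # IAM-based methods add nothing: credentials are resolved by the
    # execution environment rather than entered in the form.
    return [name for name in required if not form.get(name)]
```

With a form that holds only rootUrlOverride, region, and authMethod, the IAM-based category reports nothing missing, while the credential-based category reports the two key fields.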
Validation and Persistence
- Connection validation is performed using Save and Test
- A successful validation creates the connector
- The connector becomes immediately available for pipeline configuration
Connector Management
After creation, the S3 connector can be managed directly from the canvas:
- View and update configuration
- Modify connection settings
- Delete the connector
All interactions are performed through the connector node and its associated menu.
Add an S3 Connector in Data Fusion
Configure an Amazon S3 connector in Data Fusion by selecting the S3 data connector, providing connection details, and validating access using the Save and Test workflow.
Prerequisites
Before starting, ensure you have the following:
- A C3 environment running version 8.8 or later
- A running C3 application
- Amazon S3 credentials
- CSV-formatted files
Data Fusion does not currently support unstructured files for data integration.
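Because only structured files are supported, it can help to sanity-check a file before uploading it to the bucket. The sketch below uses Python's standard csv module and is not a Data Fusion feature: the function name looks_like_csv is hypothetical, and the check (every non-empty row has the same number of columns) is a rough heuristic, not the validation Data Fusion itself performs.

```python
import csv
import io


def looks_like_csv(text):
    """Return True if every non-empty row has the same number of columns."""
    reader = csv.reader(io.StringIO(text))
    # csv.reader yields an empty list for blank lines; skip those.
    counts = {len(row) for row in reader if row}
    # Exactly one distinct column count means the rows are consistent.
    # Empty input produces no counts and is rejected.
    return len(counts) == 1
```

A file whose rows have differing column counts, or an empty file, fails this check and is worth inspecting before you configure ingestion.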
Steps
- Open your application in C3 AI Studio.
- Navigate to Data Fusion.
- On the canvas, select Add Data Source.
- In the Select a Data Connector panel:
- Under File systems, select S3 by Amazon
- In the Configure Connector screen, enter the following:
Connector Information
- Name — Enter a unique name
- Description — (Optional) Add details
Configure Authentication
rootUrlOverride — Enter the S3 bucket path (for example, /my-bucket/path)
region — Select the AWS region
authMethod — Choose the authentication method
- Select Save and Test.
- Verify that the connection is successful.
Result
- The S3 connector is created
- A Canonical node (FileSourceSystem) appears on the Data Fusion canvas
- The connector is ready to be linked to a Source Collection
Next Steps
- Connect the S3 node to a Source Collection to define ingestion
- Configure downstream pipeline components
See also
- Understand the Source System
- Configure the Source System and Source Collection
- Configure the Source Schema
- Understand Change Data Capture (CDC) in Data Fusion
- Configure Change Data Capture (CDC) for SQL Source Collection
- Add and Configure a Transform for a DI Pipeline
- Map Source Fields to Target Fields
- Configure Runtime Parameters and Trigger a DI Pipeline
- Confirm Data Fusion Pipeline Run Completion
- Work With File Systems