Understanding the Data Fusion Canvas
The Data Fusion canvas provides a visual representation of how data ingestion pipelines are composed, configured, and connected. Rather than showing a static diagram, the canvas dynamically reflects the current state of your pipeline configuration and guides you through valid next steps.
This topic explains how the canvas behaves, what different node states mean, and how to interpret what you see as you build or modify pipelines.
How the Canvas Represents Pipelines
Each pipeline is represented as a graph of nodes connected from left to right, starting with a Source System and ending with one or more targets. Nodes represent logical configuration units such as Source Systems, Source Collections, Sources, transforms, and entities.
The canvas is not limited to a single fixed pipeline shape. You can configure any number of sources, transforms, and targets, and the canvas updates automatically to reflect valid configurations and next actions.
Configured and Unconfigured Nodes
The canvas distinguishes between configured nodes and unconfigured nodes:
- Configured nodes represent objects that already exist in metadata (for example, a configured Source System or Source Collection).
- Unconfigured nodes act as placeholders that guide you toward the next valid step in the pipeline.
For incomplete pipelines, the canvas operates in a guided mode that helps you complete the required configuration steps in the correct order:
- It highlights the next required unconfigured node so you know exactly where to continue.
- It prevents you from selecting invalid or out-of-sequence pipeline paths by enabling only the next valid step.
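The guided behavior can be pictured as a small ordering rule. The following is a minimal sketch, not the product's implementation: the stage order, the `PathState` class, and both function names are assumptions made for illustration, based only on the node types named in this topic.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Hypothetical stage order for a single pipeline path. The real canvas may
# track additional stages or branch differently.
STAGE_ORDER = ["Source System", "Source Collection", "Source", "Transform", "Entity"]


@dataclass
class PathState:
    # Stages that already exist in metadata for this path (assumed model).
    configured: Set[str] = field(default_factory=set)


def next_required_stage(state: PathState) -> Optional[str]:
    """Return the first unconfigured stage in order, or None if the path is complete."""
    for stage in STAGE_ORDER:
        if stage not in state.configured:
            return stage
    return None


def enabled_placeholders(state: PathState) -> List[str]:
    """Only the next valid step is enabled; later placeholders stay disabled."""
    nxt = next_required_stage(state)
    return [] if nxt is None else [nxt]
```

Under these assumptions, a path with only a Source System configured would highlight and enable the Source Collection placeholder and nothing after it.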
Node Visibility and Duplication
The canvas reflects logical relationships rather than enforcing a single visual instance per object.
A Source Collection or other node that participates in multiple pipelines appears as a separate node instance in each pipeline path.
This duplication helps preserve clarity and avoids visually entangling unrelated pipelines.
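As a rough illustration of this duplication rule, the sketch below expands each pipeline path into its own node instances, so a shared object yields one instance per path. The `NodeInstance` class, the `expand_instances` function, and the sample names are hypothetical and are not part of the product.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, List


@dataclass(frozen=True)
class NodeInstance:
    instance_id: int   # unique per drawn node (assumed)
    object_name: str   # the underlying metadata object
    pipeline: str      # the pipeline path this instance belongs to


def expand_instances(pipelines: Dict[str, List[str]]) -> List[NodeInstance]:
    """Create one node instance per (pipeline, object) pair.

    An object that appears in several pipelines gets a separate instance in
    each path instead of a single shared node.
    """
    ids = count(1)
    return [
        NodeInstance(next(ids), name, pipeline_name)
        for pipeline_name, path in pipelines.items()
        for name in path
    ]


# Example data: "orders_collection" participates in two pipelines.
example = {
    "pipeline_a": ["crm_system", "orders_collection", "OrderEntity"],
    "pipeline_b": ["crm_system", "orders_collection", "InvoiceEntity"],
}
instances = expand_instances(example)
shared = [n for n in instances if n.object_name == "orders_collection"]
```

Here `shared` holds two distinct instances of the same object, one for each pipeline path.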
Additionally, nodes with no remaining connections are still displayed to maintain visibility into defined components. The exceptions are Source and Canonical types, and Entity types that are not used as transform targets, which are not displayed once they have no connections.
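One way to read the visibility rule for disconnected nodes is as a simple predicate. This is a sketch under assumptions: the `is_displayed` function, its parameters, and the type strings are hypothetical, and the product may apply further conditions.

```python
def is_displayed(node_type: str, connection_count: int, is_transform_target: bool = False) -> bool:
    """Decide whether a node is shown, following one reading of the rule above."""
    if connection_count > 0:
        # Connected nodes are part of a pipeline path and are shown.
        return True
    if node_type in {"Source", "Canonical"}:
        # Disconnected Source and Canonical types are not shown.
        return False
    if node_type == "Entity" and not is_transform_target:
        # Disconnected Entity types that no transform targets are not shown.
        return False
    # All other disconnected nodes remain visible.
    return True
```

For example, a Source Collection with no connections would still be displayed, while a disconnected Canonical type would not.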
Editing and Naming Nodes
Every configurable node supports:
- An instance name, when relevant to the underlying object.
This allows meaningful naming without breaking references or requiring code-level edits.
Source System Behavior on the Canvas
Source Systems are always added and managed directly from the canvas.
A configured Source System cannot be changed to a different source type (for example, SQL → File or File → Cloud Message).
Switching to a different Source System of the same type is allowed, provided the target system is valid and error-free.
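The switching rule can be summarized as a type check plus a validity check. The sketch below is illustrative only: the `SourceSystem` class, its fields, and `can_switch` are assumed names, and the product may represent validity differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceSystem:
    name: str
    source_type: str          # for example "SQL", "File", or "Cloud Message"
    is_valid: bool = True     # assumed flag
    has_errors: bool = False  # assumed flag


def can_switch(current: SourceSystem, candidate: SourceSystem) -> bool:
    """Allow a switch only to a valid, error-free system of the same type."""
    return (
        candidate.source_type == current.source_type
        and candidate.is_valid
        and not candidate.has_errors
    )


current = SourceSystem("warehouse_db", "SQL")
same_type = SourceSystem("reporting_db", "SQL")
other_type = SourceSystem("landing_files", "File")
broken = SourceSystem("legacy_db", "SQL", has_errors=True)
```

With these sample objects, only `same_type` passes the check; `other_type` fails on type and `broken` fails on errors.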
Interactions include:
- Left-click: Opens the Connector Properties modal, where you can switch to another valid Source System of the same type.
- Context menu: Opens a list with options to delete the connector or view associated errors.
Adding Source Collections
Once a Source System is configured, the canvas presents options to add Source Collections appropriate to that system:
Structured Source Collections
- File: Select a file path
- SQL: Preview and select tables (optional)
- Cloud Message: Select a data stream and configure schema
Unstructured Source Collections
- Available only for File Source Systems
- Selecting this option displays a template unstructured pipeline
Existing Source Collections can also be selected during the add flow, as long as they are valid and not already bound to another Source System.
After adding a Source Collection, the canvas continues to display the next applicable unconfigured node to guide pipeline completion.
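The add flow described above combines two checks: which collection kinds a Source System type offers, and which existing collections are eligible. The following sketch models both. All names (`COLLECTION_KINDS`, `SourceCollection`, `selectable_existing`) are hypothetical, and treating a collection already bound to the same system as selectable is an assumption drawn from the wording "another Source System".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Collection kinds offered per Source System type, as described in this topic.
COLLECTION_KINDS: Dict[str, List[str]] = {
    "File": ["Structured", "Unstructured"],
    "SQL": ["Structured"],
    "Cloud Message": ["Structured"],
}


@dataclass
class SourceCollection:
    name: str
    is_valid: bool
    bound_system: Optional[str] = None  # name of the bound Source System, if any


def selectable_existing(collections: List[SourceCollection], system_name: str) -> List[SourceCollection]:
    """Keep collections that are valid and not bound to a different Source System."""
    return [
        c for c in collections
        if c.is_valid and c.bound_system in (None, system_name)
    ]


candidates = [
    SourceCollection("orders", is_valid=True),
    SourceCollection("invoices", is_valid=True, bound_system="other_system"),
    SourceCollection("drafts", is_valid=False),
]
eligible = selectable_existing(candidates, "my_system")
```

In this example only `orders` is eligible: `invoices` is bound elsewhere and `drafts` is invalid.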
Source Collection Interaction Patterns
Each Source Collection type has tailored canvas behavior:
File Source Collections
- Left-click opens a file explorer.
- The context menu provides options to execute pipelines, edit properties, view run status/history, view data integration status, or delete the collection.
SQL Source Collections
- Left-click opens the Preview tab.
- The context menu options vary depending on whether CDC (change data capture) is configured.
Cloud Message Source Collections
- Left-click opens the Preview tab and provides an option to edit the credentials.
- The context menu provides options to edit properties, view run status, view data integration status, or delete the collection.
Execution modals for Source Collections expose run-specific controls such as filtering, reprocessing, clearing targets, and status management.
Source, Transform, and Target Nodes
After a Source Collection is configured, the canvas guides schema and transformation setup.
Source / Canonical Nodes
- Represent schema definitions
- Support remixing when sourced from dependency packages
Transform Nodes
- Include Projection and Transformer transforms
- Filters are configured inline within transforms rather than as separate nodes
- Targets can be existing entities or newly defined via a lightweight editor
Entity Nodes
- Represent ingestion targets
- Support previewing data, editing properties, clearing data, and navigating to the object model
Error States and Rendering Rules
The canvas reflects package validity:
- Nodes defined in metadata but containing package errors are not rendered until errors are resolved.
- Once a node has been rendered, later errors do not remove it from the canvas, but certain actions may be disabled.
- Context menus surface error details to help diagnose configuration issues.
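The first two rendering rules are stateful: whether a node with errors is shown depends on whether it was rendered before. The sketch below models that behavior under assumptions. The `CanvasRenderState` class and the status strings are invented for illustration, and how long the product remembers a rendered node is not specified in this topic.

```python
from typing import Dict, Set


class CanvasRenderState:
    """Tracks which nodes have been rendered at least once (assumed model)."""

    def __init__(self) -> None:
        self._rendered: Set[str] = set()

    def update(self, nodes: Dict[str, bool]) -> Dict[str, str]:
        """Map each node name to a display status.

        `nodes` maps a node name to True when the node has package errors.
        """
        result: Dict[str, str] = {}
        for name, has_errors in nodes.items():
            if name in self._rendered:
                # Already rendered: stays on the canvas, but some actions
                # may be disabled while errors are present.
                result[name] = "rendered_restricted" if has_errors else "rendered"
            elif has_errors:
                # Never rendered and has errors: not shown until resolved.
                result[name] = "hidden"
            else:
                self._rendered.add(name)
                result[name] = "rendered"
        return result


canvas = CanvasRenderState()
first = canvas.update({"orders": True})    # errors before first render
second = canvas.update({"orders": False})  # errors resolved
third = canvas.update({"orders": True})    # errors return later
```

In this model, `orders` moves from hidden, to rendered, to rendered with restricted actions, and is never removed once it has appeared.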
Key Takeaway
The Data Fusion canvas is designed to be state-aware, guided, and flexible. It reflects what is configured, highlights what remains to be done, and adapts dynamically as pipelines evolve. Understanding these behaviors helps you navigate the canvas confidently, interpret what you see correctly, and build pipelines efficiently without second-guessing the UI.