Split Object to Columns

Disassemble a column of type Object, consisting of multiple key/value pairs, into separate columns. This node performs the opposite role of the Assemble Object node.

Objects in Visual Notebooks are equivalent to:

  • JavaScript objects, as defined using JavaScript Object Notation (JSON) syntax
  • Python dictionaries

Data analysis often involves semi-structured data provided as JSON or XML input files. Such data may not fit cleanly into a flat tabular structure and often requires wrangling more complex "Object" columns. Visual Notebooks provides several transformations for manipulating Objects, including creating Objects, extracting fields from Objects, and disassembling Objects.
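To make the equivalence with Python dictionaries concrete, the following minimal sketch parses a JSON object with the standard library. The record, its field names, and its values are hypothetical and used only for illustration; this is plain Python, not Visual Notebooks code.

```python
import json

# Hypothetical semi-structured record; names and values are illustrative only.
raw = '{"Name": {"First": "Ada", "Last": "Lovelace"}, "Tags": ["vip", "early"]}'

# A JSON object is parsed into a Python dict.
record = json.loads(raw)

# Nested JSON objects become nested dicts; JSON arrays become lists.
print(type(record).__name__)
print(type(record["Name"]).__name__)
print(type(record["Tags"]).__name__)
```

The nested "Name" value is the kind of structure that does not fit a flat table without further splitting.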

Configuration

Configuration sidebar

Field: Description
Name: Name of the node. A user-specified node name, displayed in the canvas and in the dataframe as a tab.
Column: Select single column of type Object. Select the input column to be disassembled into multiple columns in the output dataframe.
Output column prefix: Desired prefix for output column names. Enter a prefix to help identify which columns were generated in the output dataframe. Output columns will be named "" for each key/value pair in the object schema. When a prefix is not provided, output columns will be named "" for each key/value pair in the object schema.
allLevels: Degree of disassembly of key/value pairs. Specify whether to disassemble only the first level of key/value pairs or disassemble all levels recursively. This setting has no effect if the object schema has only a single level of key/value pairs. When toggled on, all levels of key/value pairs are extracted, and output columns do not contain Objects. When toggled off, only the first level of key/value pairs is disassembled, and keys containing nested objects create new columns of type Object.
Drop Original Column(s): Toggle indicating whether the selected input column should be dropped from the output. Leave the switch on to delete the Object column from which output columns are extracted; the column information is preserved in the new output columns. Toggle the switch off to keep the selected column.
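The effect of the prefix and Drop Original Column(s) settings can be approximated in plain Python. This is a rough sketch, not the node's implementation: the function name `split_object_column`, its parameters, and the sample values are hypothetical, and the naming rule ("<prefix>_<key>", falling back to "<column>_<key>") is an assumption, since this page does not spell out the exact pattern.

```python
def split_object_column(rows, column, prefix=None, drop_original=True):
    """Split a dict-valued column into one column per first-level key.

    Assumption: output names are "<prefix>_<key>", or "<column>_<key>"
    when no prefix is given. The node's actual naming may differ.
    """
    label = prefix if prefix else column
    result = []
    for row in rows:
        # Copy the row, optionally leaving out the original Object column.
        new_row = {
            name: value
            for name, value in row.items()
            if not (drop_original and name == column)
        }
        # Add one new column per first-level key of the Object.
        for key, value in (row.get(column) or {}).items():
            new_row[f"{label}_{key}"] = value
        result.append(new_row)
    return result


# Hypothetical input row; the key names follow the example later on this page.
rows = [
    {"id": 1, "Customer": {"First_Name": "Ada", "Last_Name": "Lovelace", "Suffix": None}}
]

dropped = split_object_column(rows, "Customer", prefix="cust")
kept = split_object_column(rows, "Customer", drop_original=False)

print(list(dropped[0]))
print(list(kept[0]))
```

With `drop_original=True` the "Customer" column disappears but its values survive in the new columns; with `drop_original=False` both the original Object and the split columns are present.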

Node Inputs/Outputs

Input: A Visual Notebooks dataframe
Output: A dataframe with at least one Object column

Figure 1: Example dataframe output

Examples

The data shown in Figure 2 is used in the following example. We start with a JSON file comprising a single category, or level, of customer details. When loaded using the JSON input node, the file creates a single column of Objects corresponding to customer details. The Object column contains three keys that we wish to separate into individual columns: First_Name, Last_Name, and Suffix.

Figure 2: Example input data for single level object

  1. Connect a Split Object to Columns node to an existing node. In this case, it is connected to a JSON node with the example data provided.
  2. In the JSON node under Mode, select "Singleline."
  3. In the Split Object to Columns node, click "Run."

After running the node, the split Object appears in the second through fourth columns, as shown in Figure 3.

Figure 3: Disaggregating a single level object

Now, let's analyze a JSON file comprising multiple categories, or levels, of customer details. When loaded using the JSON input node, the file creates a single Object column with a nested structure representing the various categories of customer information. This data is shown in Figure 3.

Figure 3: Example input data for multilevel object

Once the file is loaded, we can see that there are three top-level key/value pairs, with each value containing another Object. The top-level pairs are:

  • Name
  • Location
  • Contact

We first demonstrate how to split each of these top-level pairs into individual columns.

  1. Connect a Split Object to Columns node to a JSON node with the example data provided.
  2. Toggle allLevels off and click "Run."

The resulting columns are Objects, with the Location column containing further nesting. This is shown in Figure 4.

Figure 4: Using "Split Object to Columns" to disassemble one level of a multilevel Object
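A first-level split of a nested Object can be pictured with a small plain-Python sketch. The record below is hypothetical: only the three top-level keys (Name, Location, Contact) and the Location, Current, City path come from this page, while the other inner keys, all values, and the "Member_<key>" naming are assumptions for illustration.

```python
# Hypothetical record with three top-level keys, each holding an Object.
member = {
    "Name": {"First": "Ada", "Last": "Lovelace"},
    "Location": {"Current": {"City": "London"}},
    "Contact": {"Email": "ada@example.com"},
}

# First-level split only: each top-level key becomes a column,
# and the values are carried over without being unpacked further.
columns = {f"Member_{key}": value for key, value in member.items()}

# Columns whose values are still Objects (dicts).
still_objects = [name for name, value in columns.items() if isinstance(value, dict)]

# Columns whose Objects themselves contain further nested Objects.
further_nested = [
    name
    for name, value in columns.items()
    if isinstance(value, dict) and any(isinstance(v, dict) for v in value.values())
]

print(still_objects)
print(further_nested)
```

In this sketch all three resulting columns still hold Objects, and only the Location column contains another level of nesting.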

Toggle allLevels on and click "Run." This setting recursively splits all levels of the initial Object structure until all resulting columns are primitives or Arrays. In this example, nine new columns are generated from the original "Member" column as shown in Figure 5.

The column names of deeply nested fields are autogenerated by prepending the key name at each level of the original structure. In this example, the current city field is nested in the original object as part of the following hierarchy:

  • Member
    • Location
      • Current
        • City

The resulting column name for this field is thus "Member_Location_Current_City".
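The naming scheme for nested fields can be illustrated with a short recursive flattening sketch in plain Python. The function `flatten` and the sample values are hypothetical, and the underscore separator is inferred from the column name quoted above; this is not the node's actual implementation.

```python
def flatten(obj, prefix):
    """Recursively flatten nested dicts into a single-level dict.

    Each output name is the chain of keys joined by underscores.
    Lists and primitive values are kept as leaf values.
    """
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}_{key}"
        if isinstance(value, dict):
            # Descend into the nested Object, carrying the name built so far.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical nested Object; only the Location > Current > City path
# mirrors the hierarchy described on this page.
member = {
    "Location": {"Current": {"City": "London"}, "Previous": {"City": "Paris"}},
    "Contact": {"Phones": ["555-0100"]},
}

flat = flatten(member, "Member")
for name in flat:
    print(name)
```

Because recursion stops at anything that is not a dict, array values such as the phone list remain intact in a single column.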

On complex and deeply nested Objects, the Split Object to Columns node may generate hundreds if not thousands of columns, many of which may not be important. In such cases, consider using the Extract Field from Object node to extract only the most important fields at any level of the original Object structure.
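The difference between extracting one field and flattening everything can be sketched in plain Python. The helper `extract_field` below is a hypothetical analogue of targeted extraction, not the Extract Field from Object node itself, and the sample record is made up for illustration.

```python
def extract_field(obj, path, default=None):
    """Follow a sequence of keys into nested dicts.

    Returns `default` if any key along the path is missing or if an
    intermediate value is not a dict.
    """
    current = obj
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Hypothetical nested Object with more fields than we care about.
member = {
    "Name": {"First": "Ada", "Last": "Lovelace"},
    "Location": {"Current": {"City": "London", "Postcode": "N1"}},
}

# Pull out a single nested value instead of generating a column per leaf.
city = extract_field(member, ["Location", "Current", "City"])
missing = extract_field(member, ["Location", "Previous", "City"])

print(city)
print(missing)
```

Only the requested value is produced, so the output stays narrow no matter how many other fields the Object contains.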

Figure 5: Using "Split Object to Columns" to disassemble all levels of a multilevel Object
