PCA

Fit a principal component analysis (PCA) model to your data. PCA is a popular dimensionality reduction technique for analyzing datasets that contain a large number of features.

Configuration

Field: Description
Name (default=none): A user-specified node name displayed in the canvas, both on the node and in the dataframe as a tab.
Training Features (default=default): The columns to use in the PCA. Select the columns that the PCA will represent in fewer dimensions.
Target number of dimensions (Required): The target number of dimensions for the PCA. The columns selected in Training Features will be represented in this specified number of dimensions.

Node Inputs/Outputs

Input: A Visual Notebooks dataframe (typically)
Output: A PCA model

Figure 1: Example output

Examples

The dataframe shown below is used in this example. It contains data on four features of over 300 penguins. We would like to use PCA to reduce the number of dimensions from four to two.

Figure 2: Example input data

  1. Connect the PCA node to an existing node.
  2. Select bill_length_mm, bill_depth_mm, flipper_length_mm, and body_mass_g for the Training Features field.
  3. Input 2 for the Target number of dimensions field.
  4. Select Run to perform the PCA.

Now you have a trained PCA model that you can use with a PCA Transform node to reduce the number of dimensions from four to two.
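
For readers who want to see what fitting a PCA and then applying it involves, here is a minimal pure-Python sketch of the same two-stage workflow. It is not the Visual Notebooks implementation. The function names `fit_pca` and `transform`, the use of power iteration with deflation, and the choice to center the columns without scaling them are all assumptions made for illustration. The sample rows are hypothetical placeholder values, not the rows of the example dataframe.

```python
import math


def fit_pca(rows, n_components, iters=500):
    """Fit a PCA (hypothetical helper): return column means and principal axes."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d) of the centered columns.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for _ in range(n_components):
        v = [1.0 / math.sqrt(d)] * d  # deterministic starting vector
        # Power iteration: repeated multiplication converges to the
        # eigenvector with the largest remaining eigenvalue.
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigval = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                     for a in range(d))
        components.append(v)
        # Deflation: remove the variance already explained by this axis.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return means, components


def transform(rows, means, components):
    """Project rows onto the fitted axes (analogous to a separate transform step)."""
    return [[sum((r[j] - means[j]) * comp[j] for j in range(len(means)))
             for comp in components]
            for r in rows]


# Hypothetical placeholder rows in the column order
# bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g.
rows = [
    [39.0, 18.5, 181.0, 3700.0],
    [46.0, 17.5, 195.0, 3500.0],
    [50.0, 15.0, 220.0, 5600.0],
    [38.0, 21.0, 190.0, 3900.0],
    [45.0, 14.5, 210.0, 5100.0],
    [49.5, 18.0, 196.0, 3800.0],
]

means, components = fit_pca(rows, n_components=2)  # four features down to two
reduced = transform(rows, means, components)

for row in reduced:
    print([round(x, 3) for x in row])
```

Because this sketch only centers the columns, a feature measured on a much larger scale (here, body mass in grams) dominates the first component. Whether the PCA node scales features before fitting is not stated on this page, so check that before comparing results.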

Example dataframe with default settings
