S3 Node

Import data from CSV files stored in the Amazon Web Services (AWS) S3 object store.

Prerequisites

You must add an AWS access key and secret key to use the S3 node. Follow the steps below to generate these keys and add them as a credential. For more information, see the AWS documentation.

  1. Log in to AWS
  2. Select your account name or number in the top right corner
  3. Select Security credentials
  4. Select the Access keys (access key ID and secret access key) section
  5. Select Create New Access Key
  6. Select Download Key File
  7. Drag an S3 node onto the Visual Notebooks workspace
  8. Select the gear icon beside the Credential field
  9. Select the plus sign in the upper right corner
  10. Enter a name for the credential
  11. Enter the access key and AWS secret key found in the downloaded file
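
The downloaded key file is a small CSV containing the two values entered in the final step. A minimal sketch of pulling them out with Python's standard csv module; the column headers ("Access key ID", "Secret access key") and sample values are assumptions about the file's layout, not guaranteed by AWS:

```python
import csv
import io

def read_aws_key_file(text):
    """Parse the text of a downloaded AWS key file into
    (access_key_id, secret_key). The header names below are an
    assumption about the file's layout."""
    row = next(csv.DictReader(io.StringIO(text)))
    return row["Access key ID"], row["Secret access key"]

# Hypothetical file contents for illustration only -- not real credentials.
sample = "Access key ID,Secret access key\nAKIAEXAMPLE,wJalrEXAMPLEsecret\n"
access_key, secret_key = read_aws_key_file(sample)
```

Whatever the exact layout, the two values it contains are what the credential dialog expects.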

Configuration

Name (Default: S3)
  A user-specified node name displayed in the workspace.

Credential (Required)
  The information needed to access S3 data. Select a saved credential from the dropdown menu. Select the gear icon to add a new credential or delete existing credentials.

Path (Optional)
  The S3 file to upload. Select the bucket and the desired file from the pop-up menu.

Nullify Entries with Incompatible Format (Default: Off)
  Discards bad values. Toggle this switch on to find values whose data type (string, integer, decimal, Boolean, etc.) differs from the data type found in the rest of the column. Entries with a different data type are changed to null values.

Delimiter (Default: Comma)
  The character that separates values. Set the delimiter to comma, pipe, colon, semicolon, tab, or space. Only change this field if the uploaded file uses nonstandard formatting.

Quote (Default: ")
  The character that encloses values so that delimiters inside them are ignored. Set the quote to any character. Only change this field if the uploaded file uses nonstandard formatting.

Number of rows to use in schema inference (Default: 100)
  The number of rows used to determine each column's data type. Set this value to any valid whole number. Visual Notebooks reads the specified number of rows, starting with the first row of the file, and uses them to determine each column's data type.

Has Header (Default: On)
  Whether header data is used as column names. Toggle this switch on if the uploaded file has an initial header row of column names. Toggle it off to use numerical column names ("_c0", "_c1", etc.) instead.

Leave all columns as StringType (no schema inference) (Default: Off)
  Reads every column as the string data type. Toggle this switch on to read all columns as strings and upload all values.
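
The parsing settings above can be sketched in plain Python. The function below is an illustration, not Visual Notebooks' actual implementation: it splits on a configurable delimiter and quote character, infers each column's type from the first N rows, and optionally nullifies cells whose type disagrees with the column. The majority-vote inference rule is an assumption:

```python
import csv
import io
from collections import Counter

def infer_type(value):
    """Classify one CSV cell as 'boolean', 'integer', 'decimal', or 'string'."""
    if value.lower() in ("true", "false"):
        return "boolean"
    try:
        int(value)
        return "integer"
    except ValueError:
        pass
    try:
        float(value)
        return "decimal"
    except ValueError:
        return "string"

def load_csv(text, delimiter=",", quotechar='"', inference_rows=100,
             nullify_incompatible=False):
    """Parse CSV text: split on `delimiter`, honor `quotechar`, infer each
    column's type from the first `inference_rows` data rows (majority vote,
    an assumed rule), and optionally null out incompatible cells.
    Values are kept as text; only typing and nullification are shown."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter,
                           quotechar=quotechar))
    header, data = rows[0], rows[1:]
    schema = []
    for col in range(len(header)):
        votes = Counter(infer_type(row[col]) for row in data[:inference_rows])
        schema.append(votes.most_common(1)[0][0])
    if nullify_incompatible:
        # Mirror "Nullify Entries with Incompatible Format": a cell whose
        # type differs from its column's inferred type becomes null.
        data = [[cell if infer_type(cell) == ctype else None
                 for cell, ctype in zip(row, schema)] for row in data]
    return header, schema, data

header, schema, data = load_csv("a,b\n1,x\n2,y\noops,z\n",
                                nullify_incompatible=True)
```

In this run, column "a" is inferred as integer from its first rows, so the stray "oops" cell is replaced with a null value, while column "b" stays string.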

Node Inputs/Outputs

Input: None
Output: Visual Notebooks returns a table, called a dataframe, that contains all uploaded data. Columns are labeled and include a symbol that specifies the data type of that column.

Figure 1: Example dataframe output

Examples

  • Select the file to upload using the "Path" field.
  • Select "Run" to create a dataframe with the default settings.
    • Notice that columns are labeled and include a symbol that specifies the data type of that column. Various data types are present in the data and are accurately represented in the dataframe.

Figure 2: Example dataframe with default settings
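
The default settings assume comma delimiters and double-quote characters; a file that uses a nonstandard separator, such as pipes, needs the Delimiter field changed before it parses correctly. A sketch of the difference using Python's standard csv module (an illustration, not what Visual Notebooks runs internally):

```python
import csv
import io

# A hypothetical pipe-delimited file.
pipe_text = "name|score\nAda|90\nLin|85\n"

# With the default comma delimiter, each line stays one unsplit field.
default_rows = list(csv.reader(io.StringIO(pipe_text)))

# Declaring the pipe delimiter splits fields correctly, just as setting
# the node's Delimiter field to "pipe" would.
pipe_rows = list(csv.reader(io.StringIO(pipe_text), delimiter="|"))
```

Misparsed files usually show up as a single column whose values still contain the real delimiter, which is the cue to adjust the field.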

  • Toggle the "Leave all columns as StringType (no schema inference)" switch on to upload all data as strings.
    • The "A" symbol beside each column indicates the string data type.

Figure 3: Example dataframe with all data uploaded as strings
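
With schema inference skipped, no cell can be judged incompatible, so every value is kept verbatim as text. A minimal stand-alone sketch of that behavior (an illustration; the real node's internals are not public):

```python
import csv
import io

def load_as_strings(text):
    """Read CSV keeping every column as a string: no schema inference,
    so no value is ever nullified as incompatible."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    # Every column gets the string type (the "A" symbol in the UI).
    schema = ["string"] * len(header)
    return header, schema, data

header, schema, data = load_as_strings("id,flag\n1,true\noops,false\n")
```

Here "oops" survives as the literal string, where a run with inference and the nullify switch on could have turned it into a null value.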
