S3 Node
Import data from CSV files stored in the Amazon Web Services (AWS) S3 object store.
Prerequisites
You must add an AWS access key and secret key to use the S3 node. Follow the steps below to generate these keys. For more information, see the AWS documentation.
- Log in to AWS
- Select your account name or number in the top right corner
- Select Security credentials

- Select the Access keys (access key ID and secret access key) section

- Select Create New Access Key
- Select Download Key File
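The downloaded key file stores each credential as a `KEY=VALUE` pair on its own line. As a rough illustration, the pairs can be read into a dictionary with a few lines of Python; the exact file layout and the sample values below are assumptions, not output from AWS.

```python
# Minimal sketch of parsing the downloaded key file, assuming one
# KEY=VALUE pair per line. The sample keys below are fake placeholders.
def parse_key_file(text):
    """Return a dict of the KEY=VALUE pairs found in the key file."""
    creds = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" in line:
            name, _, value = line.partition("=")
            creds[name.strip()] = value.strip()
    return creds

sample = "AWSAccessKeyId=AKIAEXAMPLEKEY\nAWSSecretKey=example/secretkey"
creds = parse_key_file(sample)
print(creds["AWSAccessKeyId"])  # AKIAEXAMPLEKEY
```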

Then, add the keys as a credential in Visual Notebooks:
- Drag an S3 node onto the Visual Notebooks workspace
- Select the gear icon beside the Credential field

- Select the plus sign in the upper right corner

- Enter a name for the credential
- Enter the access key and AWS secret key found within the downloaded file

Configuration
| Field | Description |
|---|---|
| Name (Default: S3) | A user-specified node name displayed in the workspace. |
| Credential (Required) | The information needed to access S3 data. Select a saved credential from the dropdown menu, or select the gear icon to add a new credential or delete existing credentials. |
| Path (Optional) | The S3 file to upload. Select the bucket and the desired file from the pop-up menu. |
| Nullify Entries with Incompatible Format (Default: Off) | Discard bad values. Toggle this switch on to find values whose data type (string, integer, decimal, Boolean, etc.) differs from the data type of the rest of the column. Entries with a different data type are changed to null values. |
| Delimiter (Default: Comma) | The character that separates values. Set the delimiter to comma, pipe, colon, semicolon, tab, or space. Change this field only if the uploaded file uses nonstandard formatting. |
| Quote (Default: ") | The character that surrounds values in which delimiters should be ignored. Set the quote to any character; delimiters inside quotes are ignored. Change this field only if the uploaded file uses nonstandard formatting. |
| Number of rows to use in schema inference (Default: 100) | The number of rows used to determine each column's data type. Set this value to any valid whole number. Visual Notebooks reads the specified number of rows, starting with the first row of the file, and uses them to determine each column's data type. |
| Has Header (Default: On) | Header data to be used as column names. Toggle the "Has Header" switch on if the uploaded file has an initial header row of column names. Toggle the switch off to use numerical column names ("_c0", "_c1", etc.) instead. |
| Leave all columns as StringType (no schema inference) (Default: Off) | Read every column as a string. Toggle this switch on to read all columns as strings and upload all values. |
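To show how the delimiter, quote, header, and schema-inference options above interact, here is a simplified Python sketch of a CSV reader that infers one data type per column from the first N rows. This is an illustration built on the standard `csv` module, not Visual Notebooks' actual implementation; the function names and sample data are assumptions.

```python
# Illustrative sketch of delimiter/quote/header handling and N-row schema
# inference, in the spirit of the Configuration options above. Hypothetical
# helper names; not Visual Notebooks' code.
import csv
import io

def infer_type(value):
    """Classify a single value as boolean, integer, decimal, or string."""
    if value.lower() in ("true", "false"):
        return "boolean"
    try:
        int(value)
        return "integer"
    except ValueError:
        pass
    try:
        float(value)
        return "decimal"
    except ValueError:
        return "string"

def infer_schema(text, delimiter=",", quotechar='"', has_header=True, n_rows=100):
    """Infer (column name, type) pairs from the first n_rows of data."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=quotechar))
    if has_header:
        header = rows.pop(0)
    else:
        # Without a header row, fall back to numerical names: _c0, _c1, ...
        header = [f"_c{i}" for i in range(len(rows[0]))]
    schema = []
    for i, name in enumerate(header):
        seen = {infer_type(row[i]) for row in rows[:n_rows]}
        # A column with mixed types falls back to string.
        schema.append((name, seen.pop() if len(seen) == 1 else "string"))
    return schema

sample = 'id,price,note\n1,9.99,"a, quoted note"\n2,4.50,plain\n'
print(infer_schema(sample))
# [('id', 'integer'), ('price', 'decimal'), ('note', 'string')]
```

Note how the quote character keeps the comma inside `"a, quoted note"` from being treated as a delimiter, and how only the first `n_rows` rows drive the type decision, mirroring the "Number of rows to use in schema inference" field.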
Node Inputs/Outputs
| Input | None |
|---|---|
| Output | Visual Notebooks returns a table, called a dataframe, that contains all uploaded data. Columns are labeled and include a symbol that specifies the data type of that column. |

Figure 1: Example dataframe output
Examples
- Select the file to upload using the "Path" field.
- Select "Run" to create a dataframe with the default settings.
- Notice that columns are labeled and include a symbol that specifies the data type of that column. Various data types are present in the data and are accurately represented in the dataframe.

Figure 2: Example dataframe with default settings
- Toggle the "Leave all columns as StringType (no schema inference)" switch on to upload all data as strings.
- The "A" symbol beside each column indicates the string data type.

Figure 3: Example dataframe with all data uploaded as strings
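The difference between the two examples above can be sketched with pandas as a stand-in for Visual Notebooks' reader; this is an assumed analogy, not the tool's implementation, and the sample data is invented.

```python
# Hypothetical pandas analogy for the two examples above: default schema
# inference versus reading every column as a string.
import io
import pandas as pd

csv_data = "id,price,in_stock\n1,9.99,True\n2,4.50,False\n"

# Default behaviour: a type is inferred for each column.
inferred = pd.read_csv(io.StringIO(csv_data))

# "StringType" toggle on: every column is read as a string.
as_strings = pd.read_csv(io.StringIO(csv_data), dtype=str)

print(inferred.dtypes.tolist())    # id: int64, price: float64, in_stock: bool
print(as_strings.dtypes.tolist())  # object (string) for every column
```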