
Data Lakehouse in C3 AI Studio

Data lakehouses store large amounts of raw data at a low cost. You can store a large volume of tabular data with C3 Data Lake and explore it using C3 Data Spark and through the C3 Data Lake Tables API.

You can see your data lakehouse tables in C3 AI Studio. You can also add, filter, and edit metadata for tables in C3 AI Studio.

To access more granular functionality, use Jupyter. For more information on how to use Data Lake functionality in Jupyter, see the Data Lake Overview tutorial notebook.

Page sections

The Data Lakehouse page is split into two sections:

  • Tables: A list of data lakehouse tables created in your application, and controls for creating more.
  • SQL Editor: An interface for querying tables in your data lakehouse.

Access Data Lakehouse in C3 AI Studio

To see tables in your data lakehouse:

  1. Select your application from the C3 AI Studio homepage or application list.
  2. Select the Data Lakehouse tab in the Data section.

To access the SQL Editor, select the SQL Editor tab.

SQL editor

The SQL Editor is an interface for exploring data in an application’s data lakehouse. The tab contains the following sections for exploring your selected catalog's data:

Filter panel

The filter panel contains dropdown menus for selecting a target table or group of tables:

  • Catalog: A catalog is a collection of tables (analogous but not identical to a database).
    • The C3 Agentic AI Platform provides a default catalog (datasets) that contains all your tables.
    • Use the Catalog dropdown menu to select a catalog and narrow results to a subset of the tables in your data lakehouse.
  • Namespace: A custom namespace associated with the catalog. Use this to narrow your table selection.
  • Preview schema: A list of tables filtered by your Catalog and Namespace selections. Select a table from the Preview schema menu to view its data schema in a modal.

Query builder

The Query builder section is used to create SQL queries for tables in the selected catalog. It consists of:

  • Generative AI input: A text input for generating a SQL query using natural language. The generated query is printed in the query editor, where users can edit and run it.
  • Query editor: A text interface where users can edit an AI-generated query, or write their own. The query editor has an autocomplete feature, which suggests table names or SQL syntax. The query editor includes a button for executing a query.

SQL query results

The SQL Query Results panel shows the result of the most recent query run in the editor. The exact output (rows and columns) depends on the nature of the query. The panel header contains the following controls for interacting with query results:

  • Download the results as a CSV file.
  • Write the results to the C3 File system.
  • Save the results as a new table in the application’s data lakehouse. You can then use the new table in further data exploration.

Workflows

Create new table

Upload data in C3 AI Studio to populate a new table. Your uploaded files must be in CSV format.

  1. On the Data Lakehouse tab, select Create new table.
  2. Select Browse, and choose an appropriate CSV file. After your upload completes, you can verify the following information:
    • Files: Upload multiple files to populate your new table. Check if all your files are uploaded correctly.
    • Schema: Verify your schemas are populated correctly.
    • Target preview: Check if your data is accurate.
    • File settings: Files are scanned automatically, but you can specify different delimiters if your target preview looks incorrect.
  3. Select Next after you've verified your file upload. Enter a Table name and Description for your new table.
  4. Select Upload to start processing your new table.
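
If the target preview looks incorrect, it can help to check a file's delimiter locally before changing the File settings. The following is a minimal sketch that uses only the Python standard library; it is not part of C3 AI Studio, and the function names (`detect_delimiter`, `preview_rows`) and the sample data are hypothetical.

```python
import csv
import io


def detect_delimiter(sample: str, candidates: str = ",;\t|") -> str:
    """Guess the delimiter of a CSV sample, falling back to a comma."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # The sniffer could not decide; assume a standard comma-separated file.
        return ","


def preview_rows(text: str, delimiter: str, limit: int = 5) -> list[list[str]]:
    """Parse the first few rows, similar in spirit to a target preview."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [row for _, row in zip(range(limit), reader)]


# Hypothetical semicolon-delimited sample standing in for an uploaded file.
sample = "id;name;country\n1;Acme;US\n2;Globex;DE\n"
delimiter = detect_delimiter(sample)
rows = preview_rows(sample, delimiter)
print(delimiter, rows[0])
```

If the detected delimiter is not a comma, that is the value to try in File settings.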

Filter data lake tables

You can filter the table list to show only the tables relevant to your use case. Possible filters include:

  • Table name
  • Owner
  • Creation date
  • Last updated date

You can add as many filters as necessary.

  1. On the Data Lakehouse page, select Add filter.
  2. From the dropdown menu, choose your filter field.
  3. Select an appropriate operator and enter a value.
    1. For date fields, ">" matches dates after the value you enter, and "<" matches dates before it.
  4. Select Apply to add the filter.
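
One way to read the date operators is as plain comparisons against the value you enter. The sketch below illustrates that reading on hypothetical table metadata; the field names and dates are invented for illustration. Whether C3 AI Studio includes the boundary date itself is not stated on this page, so the strict comparison here is an assumption.

```python
from datetime import date
import operator

# Hypothetical table metadata; names and dates are illustrative only.
tables = [
    {"name": "manufacturers", "created": date(2024, 1, 15)},
    {"name": "shipments", "created": date(2024, 6, 1)},
    {"name": "sensors", "created": date(2025, 2, 10)},
]

# ">" keeps later dates, "<" keeps earlier dates (strict comparison assumed).
OPERATORS = {">": operator.gt, "<": operator.lt}


def filter_by_date(rows, field, op, value):
    """Return the names of rows whose date field satisfies the operator."""
    compare = OPERATORS[op]
    return [row["name"] for row in rows if compare(row[field], value)]


after = filter_by_date(tables, "created", ">", date(2024, 6, 1))
before = filter_by_date(tables, "created", "<", date(2024, 6, 1))
print(after, before)
```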

Query a table with the SQL Editor

  1. Navigate to the SQL Editor tab in your data lakehouse.
  2. Select a source Catalog from the menu. If you have not created any custom catalogs, you can leave this option unchanged.
  3. Select a target Namespace. If you have not created any custom namespaces, you can leave this option unchanged.
  4. Create a query. You have two options for doing so:
    • Enter a natural language query into the generative AI input, and select Generate SQL. For example:

      Write a small query to describe the Manufacturers table.
    • If you are comfortable with SQL, write a SQL query in the text input below the generative AI input field.

  5. Select Run SQL.

Your query results are shown in the SQL Query Results panel.
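
To try out a query before running it in the SQL Editor, you can test it against a small local copy of your data. The sketch below uses an in-memory SQLite database as a stand-in. The `Manufacturers` columns and rows are hypothetical, and the SQL dialect of your data lakehouse may differ from SQLite, so treat this as an illustration of the query shape only.

```python
import sqlite3

# In-memory SQLite database as a local stand-in for a lakehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Manufacturers (id INTEGER, name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO Manufacturers VALUES (?, ?, ?)",
    [(1, "Acme", "US"), (2, "Globex", "DE"), (3, "Initech", "US")],
)

# A small summary query of the kind you might write or generate in the editor.
query = """
SELECT country, COUNT(*) AS manufacturer_count
FROM Manufacturers
GROUP BY country
ORDER BY manufacturer_count DESC, country
"""
results = conn.execute(query).fetchall()
conn.close()

for country, count in results:
    print(country, count)
```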
