Custom Scala Function
Use Visual Notebooks to transform data using a custom Scala lambda function. For more information about Scala lambda functions, view the Scala documentation.
Configuration
| Field | Description |
|---|---|
| Name (default=none) | A user-specified node name displayed in the workspace |
| Columns (Required) | Input columns: Specify the input columns for the Scala expression. |
| Derivation Code (default=default) | Custom Scala expression: Optionally, specify a Scala lambda function to act on the data specified by Columns. |
| Output Column Name (Required) | Name of the output column: Specify a column name for the transformed data. |
| Derived Type (Required) | Output column type: Specify the data type for the derived column of data. |
Node Inputs/Outputs
| Input | A Visual Notebooks dataframe |
|---|---|
| Output | A Visual Notebooks dataframe with a column transformed using a custom Scala lambda function |

Figure 1: Example dataframe output
Examples
A lambda function (also known as an anonymous function) is used in Scala and other programming languages to define a function inline, without declaring a named function. This produces simple, compact code that is easy for developers to write and maintain. Lambda functions can also be passed as arguments to higher-order functions if needed. For more information about lambda functions, view the Scala documentation.
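As a minimal sketch of the idea in plain Scala, outside Visual Notebooks, the snippet below binds a lambda to a value and passes an equivalent lambda to the higher-order function `map`. The names `LambdaBasics`, `twice`, and `doubled` are illustrative only and are not part of the product.

```scala
object LambdaBasics {
  // A lambda (anonymous function) bound to a value; no named method is declared.
  val twice: Int => Int = n => n * 2

  // An equivalent lambda passed directly to the higher-order function `map`.
  val doubled: List[Int] = List(1, 2, 3).map(n => n * 2)

  def main(args: Array[String]): Unit = {
    println(twice(21)) // 42
    println(doubled)   // List(2, 4, 6)
  }
}
```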
In the examples presented here, we use Scala lambda functions to perform string manipulations and to manipulate numeric data. To begin, input the data into Visual Notebooks as follows:
- Drag a CSV node onto the canvas.
- Select the CSV node.
- Upload the sample data.
- Select Run.
The sample data used in this example consists of (fictitious) prices of milk and eggs in selected cities in California, Texas, and New York, recorded every 5 years. This data is shown in Figure 2.

Figure 2: Input data showing prices of milk and eggs over time in NY, CA, and TX
For the first example, use a Custom Scala Function node to combine city and state information from two different columns, by using these steps:
- Drag a Custom Scala Function node onto the canvas and connect it to the existing CSV node.
- Select the node.
- For Columns, select city and state.
- For Derivation Code, enter `(x: String, y: String) => {x + ", " + y}`.
- For Output Column Name, enter `City_state`.
- For Derived Type, select String.
- Select Run.
Since you selected city and state as your columns, the lambda function binds x to city and y to state. The strings are combined using the + operator, and the string literal `", "` is placed between them to insert a comma and a space.

Figure 3: City and state information combined
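The city and state lambda can also be tried in plain Scala to see what it returns. This is a rough sketch only: the object name, the `rows` list, and the Austin row are hypothetical, and how Visual Notebooks feeds column values into the lambda internally is an assumption here.

```scala
object CityStateExample {
  // The Derivation Code from the first example, bound to a value.
  val combine: (String, String) => String =
    (x: String, y: String) => {x + ", " + y}

  // Hypothetical (city, state) rows standing in for the two selected columns.
  val rows: List[(String, String)] = List(("Anaheim", "CA"), ("Austin", "TX"))

  // Apply the lambda to each row: x receives the city, y receives the state.
  val cityState: List[String] =
    rows.map { case (city, state) => combine(city, state) }

  def main(args: Array[String]): Unit =
    cityState.foreach(println) // prints "Anaheim, CA" then "Austin, TX"
}
```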
For the next example, add the prices of eggs and milk, by performing these steps:
- For Columns, clear the existing selections and select egg_price and milk_price.
- For Derivation Code, enter `(x: Double, y: Double) => {x + y}`.
- For Output Column Name, enter `Total`.
- For Derived Type, select Decimal.
- Select Run.
The new column, Total, indicates that in Anaheim, CA, in 1980, the price of eggs and milk combined was $3.20.

Figure 4: Prices of milk and eggs, and their combined total
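The addition lambda behaves like any two-argument Scala function. The sketch below applies it to hypothetical prices, which are not the values from the sample data and were chosen so that the sum is exact in floating point. The names `TotalExample`, `eggPrice`, and `milkPrice` are illustrative.

```scala
object TotalExample {
  // The Derivation Code from the second example, bound to a value.
  val total: (Double, Double) => Double =
    (x: Double, y: Double) => {x + y}

  // Hypothetical egg and milk prices (not taken from the sample data).
  val eggPrice: Double = 1.50
  val milkPrice: Double = 1.75

  def main(args: Array[String]): Unit =
    println(total(eggPrice, milkPrice)) // 3.25
}
```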
For the final example, modify the previous example by adding 5% tax to the price of milk and eggs:
- Modify Derivation Code to `(x: Double, y: Double) => 1.05 * {x + y}`.
- Select Run.
The total for Anaheim, CA, in 1980, has increased from $3.20 (Figure 4) to $3.36 (Figure 5).

Figure 5: Total price with tax
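In the tax lambda, `{x + y}` is a Scala block expression, so it groups the addition the same way parentheses would. The sketch below checks that equivalence in plain Scala and rounds the result to cents. The helper `toCents`, the object name, and the 1.50 and 1.70 inputs are hypothetical; the inputs were chosen only because they add up to the $3.20 total mentioned above. Whether Visual Notebooks rounds displayed values this way is an assumption.

```scala
object TaxExample {
  // The Derivation Code from the final example; {x + y} is a block expression.
  val withTax: (Double, Double) => Double =
    (x: Double, y: Double) => 1.05 * {x + y}

  // Equivalent form using ordinary parentheses.
  val withTaxParens: (Double, Double) => Double =
    (x, y) => 1.05 * (x + y)

  // Hypothetical helper: round a Double to two decimal places (half up).
  def toCents(amount: Double): BigDecimal =
    BigDecimal(amount).setScale(2, BigDecimal.RoundingMode.HALF_UP)

  def main(args: Array[String]): Unit = {
    // Hypothetical prices that sum to the 3.20 total discussed above.
    val taxed = withTax(1.50, 1.70)
    println(toCents(taxed)) // 3.36
  }
}
```

Rounding through `BigDecimal` is one way to avoid showing binary floating-point noise in a currency value; other approaches, such as storing prices as integer cents, would work as well.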