Lambda Expressions in Power Query - Part 1 - Introduction

By Lukas Hillesheim, 30 August, 2022
Blog Article & Copyright by Lukas Hillesheim
Part 1 of this series shows how lambda expressions work within Power Query iterators. Lambda expressions placed inside an Power Query function will be executed repeatedly. With each call, the iterator passes different data to the lambda expression.
Like SQL, Python, DAX etc., Power Query also offers a set of iterator functions. Often the function names suggest that there is a loop hiding inside, sometimes not. And in some cases it's even hard to understand the specific function as an iterator.

1. Iterators in Power Query

Function name with obvious reference to a loop. The name of an iterator function, such as "List.Accumulate", makes it clear that you pass a list to the function and receive a single item back. It is obvious that a loop is run on the input list to compute the result.
List: { 10, 50, 30 } Result: 90
The business logic within the function could be thougt of as follows:
Screenshot 001 - Iteration obviously inside
Figure 1 - Iteration obviously inside
Hidden iteration. For other Power Query functions it's not so obvious that they encapsulate an iteration. For example, the Power Query function "Table.SplitColumn" is able to derive multiple columns from the value of a single column. What would the iteration consist of?
Example - table where a column is to be split:
Screenshot 002 - Table with column [Person] to be split
Figure 2 - Table with column [Person] to be split
The string "Judy, Munich" in line 1 can be split using ", " as a separator. Before the scalar values, "Judy" and "Munich", can be placed in new columns, the target columns must already exist. The first step before splitting the string must therefore be to change the table schema by adding two new columns. The last argument of the "Table.SplitColumn" function instructs the engine to create columns "PersonName" and "CityName":
Screenshot 003 - PowerQuery code with 'Table.SplitColumn'
Figure 3 - PowerQuery code & Table.SplitColumn
Step 1 - Modifying the schema. Adding two columns:
Screenshot 004 - Inside 'Table.SplitColumn'. Step 1: Schema modification
Figure 4 - Inside 'Table.SplitColumn'. Step 1: Schema modification. Add column(s)
Step 2 - Filling the new columns.

After the two columns are ready to accept new values, iteration comes into play. The business logic expressed in OOP pseudo-code could look like this:

Screenshot 005 - Inside 'Table.SplitColumn'. Step 2: iteration & split
Figure 5 - Inside 'Table.SplitColumn'. Step 2: Iteration & split
After the loop has finished, the new columns are filled and the original source column can be deleted:
Screenshot 006 - Inside 'Table.SplitColumn'. Step 3: Finished
Figure 6 - Inside 'Table.SplitColumn'. Step 3: Finished

2. Lambda Expressions

The MSDN Syntax Help for "Table.SplitColumns":
  • Table.SplitColumn(
  • table as table,
  • sourceColumn as text,
  • splitter as function,
  • optional columnNamesOrNumber as any,
  • optional default as any,
  • optional extraColumns as any) as table
  • In the Power Query code example above, only the first 4 of the 6 possible arguments are used:
  • - 'table'-Argument: "customer"
  • - 'sourceColumn'-Argument: "Person"
  • - 'function'-Argument: Splitter.SplitTextByDelimiter
  • - 'columnNamesOrNumber'-Argument: "PersonName", "CityName"
The syntax help informs that argument 3 is of type "function".
Whenever you see an argument of type "function", you are free to use a function such as "Splitter.SplitTextByDelimiter". However, you are not limited to BuiltIn functions, syntactically user-defined expressions - lambda expressions - are also allowed.
General syntax for lambda expressions:
Screenshot 008 - Lambda syntax
Figure 7 - Lambda syntax
“=>” is the lambda operator. Together with the expressions on the left and right, you have a lambda expression, an important concept of the Power Query language. Complex code can be written with lambda expressions and problems can be solved that go beyond the capabilities of the Power Query User Interface or BuiltIn functions. In addition to the native Lambda syntax, there are also syntax alternatives that are generally used more frequently. They are more compact and seem easier for beginners to understand. They are explained at the end of the article.
A lambda expression consists of three parts: on the left side of the lambda operator "=>" there are round brackets that enclose zero, one or more variables. As with any variable, you are free to choose an appropriate name. On the right side of the lambda operator is an expression that can use the variable(s) on the left.
The code above can be rewritten using the native syntax as follows:
Screenshot 008 - Lambda expression rewrite (detailed syntax)
Figure 8 - Lambda function rewrite (native syntax)

3. Interaction of Iterator and Lambda Expression

Step 1: The iterator "Table.SplitColumn" calls the lambda expression in each run. In pass 1, the value "Judy, Munich" is passed to the lambda expression. The variable "column" on the left side of the lambda expression is responsible for receiving the value.
Step 2: The expression on the right side of the lambda operator - "Splitter.SplitTextByDelimiter" - takes the lambda variable "column" as an input argument. In the next step, the "Splitter.SplitTextByDelimiter" function splits the string and returns a string array as the result - for the first line in the example { "Judy", "Munich" }. The outer function "Table.SplitColumn" receives the array and writes the array items into the appropriate new columns. This process is repeated for each row of the table.
There is a fundamental challenge in using lambda expressions in Power Query. It is clear that an outer function such as "Table.SplitColumn" can pass values to the lambda variables and it is also clear, that it is able to get values back from the lambda expression. However, the MSDN syntax help does not provide detailed information about this traffic:
  • - It´s not clear how many variables the lambda expression must provide (0, 1, n)
  • - It´s not clear what data types are passed to the Lambda Variable
  • - It´s not clear what type the Lambda Expression must return
Mistakes and misunderstandings - Bad variable name. The next example shows that the developer has choosen a bad name for the variable. The code works, but the variable name exposes some kind of misunderstanding. It's not the row that's passed to the variable, it's a single column.
Screenshot 009 - Bad variable name
Figure 9 - Bad variable name
It might be a good practice to always give a variable a generic name first when you start coding. E.g. you could always place the variable name "item" on the left side. Later, when you are sure, what kind of data is carried around through the variable, you could give the variable a more appropriate name, such as "column".
Mistakes and misunderstandings - Wrong number of variables:
Screenshot 010 - Wrong count of variables
Figure 10 - Wrong number of variables
The only way to understand the lambda expression's hidden signature is by trial and error. Luckily the BuiltIn functions are named in a manner that makes them easy to understand. It is therefore intuitive to more or less guess the signature. For example, if a function like "Table.SplitColumn" generates values for multiple new columns from a single string, "Table.SplitColumn" obviously expects the lambda expression to return a list. And surely the list must return the same number of values for each row of the outer function. If there are two new columns to fill, the lambda expression should of course return a list of two values.

4. Alternate Syntax

Each-Syntax. Users tend to use the "Each-syntax". With the "Each syntax", variable declaration "(...)" and lambda operator "=>" are replaced by the expression "Each". The "Each" keyword underlines that the lambda expression is called repeatedly.
each <expression>
The generic variable "_" can be used on the right side. The variable "_" is implicitly created on the left side and filled by the iterator. The variable will probably look familiar to some users: a similar concept of an auto-created and auto-filled variable named "_" exists in PowerShell too.
Screenshot 011 - Each syntax
Figure 11 - Each-syntax

5. Custom Function as Lambda Expression

Lambda expressions come into play when you are not able to solve a task using builtin functions. In the following example, the builtin function "Splitter.SplitTextByDelimiter" is used within a lambda expression. The business case requires us to not only carry out the splitting, but additionally to replace possible NULL values during the operation. Some person names contain NULL and some cities contain NULL as well. During the split operation, these NULL values should be replaced with "n/a".
Screenshot 012 - Replace NULL values
Figure 12 - Split column & replace NULL values
Screenshot 013 - Lambda function
Figure 13 - Lambda expression
The right side lambda expression consists of several statements. Three steps are performed in the "let"-block to compute the return value successively. Among other things, the splitter function is used and the occurrence of NULL values is checked. At the end, the "in"-block passes the string array of person and city to the iterator.
Screenshot 014 - Custom lambda expression
Figure 14 - Finished