Lambda Expressions in Power Query - Part 1 - Introduction

By Lukas Hillesheim, 30 August, 2022
Blog Article & Copyright by Lukas Hillesheim
Part 1 of this series shows how lambda expressions work within Power Query iterators. Lambda expressions placed inside an Power Query iterator function will be executed repeatedly. With each call, the iterator passes different data to the lambda expression.

Content:

Like SQL, Python, DAX etc., Power Query offers a set of iterator functions. In many cases the names of those functions make it obvious that there is a loop inside. In some other cases it's even hard to understand how the respective function encapsulates an iteration.

1. Iterators in Power Query

1.1. Function names that expose their iterative nature

The name of an iterator function, such as List.Accumulate, makes it clear that you pass a list to the function and receive a single item back. It is obvious that a kind of loop must run inside List.Accumulate in order to compute the result.
// Input: 
List: { 10, 50, 30 } 
// Output: 
Scalar: 90
Fig.1: List aggregation
The business logic inside List.Accumulate could be thougt of as follows:
// 'List.Accumulate': OOP pseudocode 
// Function is used to apply a 'sum' to the list:  
Int Accumulate ( List list ) {
	Int result = 0; 
	ForEach ( Int item in list ) {
		result += item; 
	}
	Return result; 
}
Fig.2: 'List.Accumulate' - OOP pseudocode
Please keep in mind that List.Accumulate is not only dedicated to very simple 'sum'-computations, but for all kind of aggregations.

1.2. Function names that hide their iterative nature

For some other Power Query functions it's not so obvious that they run an iteration inside. For example, the Power Query function Table.SplitColumn is able to derive multiple new table columns from a single column. What would the iteration be?
Example: column [Person] to be split:
ig.3: column [Person] to be split
Fig.3: column [Person] to be split
The string Judy, Munich in line 1 can be split using ', ' as a separator. Before the scalar values Judy, Munich can be assigned to new columns, those target columns must already exist. Therefore in the very first step the table schema is changed and two new columns are added. The last argument of the Table.SplitColumn-function instructs the engine to create two columns [PersonName] and [CityName]:
// 'Splitter.SplitTextByDelimiter' as part of a loop: 
let 
	source0 = Sql.Database (".", "PowerQueryDb"), 
	customer0 = source0 {[Schema="dbo", Item="customer"]}[Data], 
	customer1 = Table.SplitColumn ( 
		customer0, 
		"Person", 
		Splitter.SplitTextByDelimiter(","), 
		{ "PersonName", "CityName" }
	)
in 
	customer1 
Fig.4: 'Splitter.SplitTextByDelimiter' as part of a loop
(Logical) step 1. Schema modification:
Fig.5: Schema Modifikation
Fig.5: Schema Modifikation
(Logical) step 2. After the two columns are ready to be filled, the iteration comes into play. The business logic expressed as OOP pseudocode would look like this:
 
// OOP pseudocode: Split per row 
ForEach ( Row r in customer.Rows ) {
	String[] parts = r[Person].Split(","); // Split 'Person'-column. Get string array 
	r[PersonName] = parts[0];              // Fill the 'PersonName'-column with the first array item 
	r[CityName] = parts[1];                // Fill the 'CityName'-column with the second array item 
}
Fig.6: OOP pseudocode - split per row
(Logical) step 3. After the loop has finished, the new columns have their new values and the original source column can be deleted:
Fig.7: Schritt 3
Fig.7: Step 3

2. Lambda Expressions - Example 'Table.SplitColumn'

The following section explains lambda expressions according to the function Table.SplitColumn.
Die MSDN Syntax help for Table.SplitColumn:
Table.SplitColumn ( 
	table as table, 
	sourceColumn as text,
	splitter as function,
	optional columnNamesOrNumber as any,
	optional default as any,
	optional extraColumns as any) as table
)
Fig.8: 'Table.SplitColumn' syntax
In the Power Query code example above only 4 of 6 possible arguments were used:
  • 'table'-Argument: customer
  • 'sourceColumn'-Argument: Person
  • 'function'-Argument: Splitter.SplitTextByDelimiter
  • 'columnNamesOrNumber'-Argument: PersonName, CityName
The syntax help tells us that argument 3 is of type function.
Whenever a Power Query function has an argument of type function, you are free to pass a function like Splitter.SplitTextByDelimiter. But you are not limited to built-in functions, lambda expressions and user-defined functions are also allowed.
General lambda expression syntax:
(<variable>, n) => <expression>
Fig.9: Lambda expression syntax
=> is the lambda operator. The lambda operator including the expressions on the left and rigth side form a lambda expression. Lambda expressions are an important part of the Power Query language. With the help of lambda expressions you are able to find solutions for problems that go beyond the capabilities of the Power Query user interface or even built-in functions. In addition to the native lambda syntax mentioned above, Power Query provides syntax alternatives that are generally used more frequently. Most users would prefer them because they seem to be handier and easier to understand. At the end of this article they will be introduced.
A lambda expression consists of three parts: on the left side beneath the lambda operator => you find a list of variables enclosed in brackets. Syntactically zero, one or more variables are allowed. As with any variable, you are free to choose an appropriate name. On the right side of the lambda operator => you find an expression that may point to variable(s) on the left side.
The code above can be rewritten using the native syntax as follows:
// Code-rewrite using the 'native' lambda expression syntax: 
let 
	source0 = Sql.Database (".", "PowerQueryDb"), 
	customer0 = source0 {[Schema="dbo", Item="customer"]}[Data], 
	customer1 = Table.SplitColumn ( 
		customer0, 
		"Person", 
		(column) => Splitter.SplitTextByDelimiter(",")(column), // 'native' lambda expression syntax 
		{ "PersonName", "CityName" }
	)
in 
	customer1 
Fig.10: 'Table.SplitColumn' incl. native lambda expression syntax

3. Interaction of iterator and lambda expression

Step 1. The lambda expression is invoked for each pass of the iterator Table.SplitColumn. For example, for row 1, the value Judy, Munich is passed to the lambda expression. The variable column on the left side of the lambda expression is designated to accept this value.
Step 2. The expression on the right side of the lambda operator - Splitter.SplitTextByDelimiter - uses the lambda variable column as input argument. As a result, Splitter.SplitTextByDelimiter returns the splitted column value as string array. E.g. for the first line { "Judy", "Munich" } would be returned. The outer function Table.SplitColumn processes the array and writes each array item to the appropriate new column. This is repeated for each table row.
There is a fundamental challenge that arise with lambda expressions in Power Query. There is no doubt that the outer function (the iterator) such as Table.SplitColumn is able to pass values to the lambda variables and it is also undeniable that the outer function is able to process the return value of the lambda expression. However, the MSDN syntax help does not provide any details about this interaction and leave us alone with some uncertainties:
  • It´s not clear how many variables the lambda expression must provide. The general lambda expression syntax allows 0, 1 or n variables
  • It´s not clear what data types are passed to the lambda variables
  • It´s not clear what type the lambda expression is expected to return
Mistakes and misunderstandings. The next example shows that the developer has choosen a bad name for the variable. The code works without errors, but the variable name exposes some kind of misunderstanding. It's not the row that is passed to the variable, it's a single column.
// Wrong variable name 'row'
let 
	source0 = Sql.Database (".", "PowerQueryDb"), 
	customer0 = source0 {[Schema="dbo", Item="customer"]}[Data], 
	customer1 = Table.SplitColumn ( 
		customer0, 
		"Person", 
		(row) => Splitter.SplitTextByDelimiter(",")(row), 
		{ "PersonName", "CityName" }
	)
in 
	customer1 
Fig.11: Wrong variable name 'row'
It may be a good practice to give a variable a generic name when you have just started coding. E.g. you could define a programming rule where the variable on the left side is always named item. Later, when you are sure, what kind of data is carried around through the variable, you could give the variable a more appropriate name, such as column.
Wrong number of variables:
// Wrong number of variables:
let 
	source0 = Sql.Database (".", "PowerQueryDb"), 
	customer0 = source0 {[Schema="dbo", Item="customer"]}[Data], 
	customer1 = Table.SplitColumn ( 
		customer0, 
		"Person", 
		(var1, var2) => Splitter.SplitTextByDelimiter(",")(var1), 
		{ "PersonName", "CityName" }
	)
in 
	customer1 
Fig.12: Wrong number of variables
The only way to understand the lambda expression's hidden signature is to try it out. Luckily the built-in functions follow strict naming conventions and are named in a meaningful manner. You should be able to intuitively guess the signature. For example, if your investigations have determined that Table.SplitColumn derives values for multiple new columns from a single string, you can guess that the lambda expression is required to return a list. And of course: the number of list items must comply to the number of columns to be filled.

4. Alternate Syntax

Each syntax. Users tend to use the each-syntax. With the each syntax, variable declaration (...) and lambda operator => are replaced by the expression each. The each keyword helps the user to understand that the lambda expression is called repeatedly.
each <expression>
The lambda expression is allowed to make use of the generic variable _. This variable is created implicitly. Conceptually, we have an expression like this on the left side: (_). But this declaration is not exposed to the user and occurs inside each. The variable may look familiar to some users: a similar auto-created and auto-filled variable named _ exists in PowerShell too.
The each-keyword underlines that the lambda expression is run within a loop.
Example: code-rewrite using each:
// Alternative syntax with 'each': 
let 
	source0 = Sql.Database (".", "PowerQueryDb"), 
	customer0 = source0 {[Schema="dbo", Item="customer"]}[Data], 
	customer1 = Table.SplitColumn ( 
		customer0, 
		"Person", 
		each Splitter.SplitTextByDelimiter(",")(_), 
		{ "PersonName", "CityName" }
	)
in 
	customer1 
Fig.13: Alternate syntax with 'each'

5. Custom function as lambda expression

Lambda expressions come into play when you are not able to find a solution based on built-in functions. In the following example the built-in function Splitter.SplitTextByDelimiter is used in conjunction with a lambda expression. The business case requires us to carry out the splitting as seen in the examples above and additionally to replace possible NULL values. Some person names may contain NULL and some cities may contain NULL as well. During the split operation, these NULL values are to be replaced with n/a.
Fig.14: Split column and replace NULL values
Fig.14: Split column and replace NULL values
Use a nested let ... in ... as part of a lambda expression in order to implement a complex string manipulation:
// Lambda Expression and nested 'let ... in': 
let 
	source0 = Sql.Database (".", "PowerQueryDb"), 
	customer0 = source0 {[Schema="dbo", Item="customer"]}[Data], 
	customer1 = Table.SplitColumn ( 
		customer0, 
		"Person", 
		(column) => 
			let 
				colValues = Splitter.SplitTextByDelimiter(",")(column) 
				personName0 = if colValues{0} = "NULL" then "n/a" else colValues{0} 
				cityName0 = if colValues{1} = "NULL" then "n/a" else colValues{1} 
			in 
				{ personName0, cityName0 }, 
		{ "PersonName", "CityName" } 		
Fig.15: Lambda expression and nested 'let ... in'
The lambda expression comprises of several statements. Three steps in the "let"-block compose the return value successively. Among other things, the splitter function is used and the occurrence of NULL values is checked. At the end, the in-block returns the string array of person and city to the iterator.
Abb.16: Fertiges Ergebnis
Fig.16: Finished