Blog Article & Copyright by Lukas Hillesheim
Article 3 of this series covers "List.Generate". "List.Generate" is one of the most important Power Query functions. As in other languages, "List.Generate" encapsulates a loop and is the equivalent of a For-loop. Understanding List.Generate requires knowledge of lambda expressions.
1. âList.Generateâ vs. âFor-Loopâ
The MSDN syntax help for "List.Generate":
- List.Generate(
- initial as function,
- condition as function,
- next as function,
- optional selector as nullable function) as list
For-Loop:
- For ( <variable_expression>; <evaluate_expression>; <increment_expression> ) {
- <loop_block_expression>
- }
Most For-loops look like this:
- For ( Int i = 0; i <= 5; i++ ) {
- <loop_block_expression>
- }
Comparison "List.Generate" / "For-Loop":
2. General
Example 1 - Generating a list of 5 numbers
Output:
In Article 1 of this series, I recommended to avoid using the "Each" syntax and instead use the native lambda expression syntax. This advice applies if you're just getting started with Power Query.
Each of the four arguments of "List.Generate" is of type "function" and therefore it is syntactically allowed to use native lambda expressions.
Rewrite of example 1. Use of a native lambda expression:
The three lambda expressions are independent of each other. Therefore it is allowed to use different variable names for each individual lambda expression:
3. Loop Block / Selector
The fourth argument, the "Selector", is optional and most examples on the web, as well as the previous examples, omit it. If you don't use the "Selector" argument, the default behavior of Power Query is that the "Selector" argument is the same as the "Next" argument.
The next code example shows the implicit behavior:
The comparison matrix above shows that the "Selector" argument of "List.Generate" corresponds to the <loop_block_expression> in a "For-Loop". In other words, you can use the "Selector"-argument to control what is inserted into the result set per iteration. In the example above, the value that is inserted into the result set is the loop counter itself.
Using this logic, a For-loop would look like this:
- Int[] result = {};
- For ( int i = 1; i <= 5; i++ ) {
- result += i;
- }
Example 2 - Using the selector.
In the example above, argument 4, the Selector, is used to return the constant value "test" per loop pass. Arguments 1 through 3 cause the loop to run 5 times.
4. Loop-Header / Initial- & Condition- & Next-Lambda Expressions
Example 3 - Passing a Record object.
- Loop counter. In example 3, the loop is run 5 times. The loop counter starts with 1 and is incremented by 1.
- OrderDate. The loop counter is used as an offset between the constant date "3/1/2022" and the OrderDate.
- Customer. The loop counter is used to calculate a letter.
- Amount. A random value between 30 and 60.
Initial Lambda Expression. The first argument, the "Initial" lambda expression, returns a record. The record is inserted as the first item in the result. Please note that "List.Generate" does not pass any data to the "Initial" lambda expression. Hence the brackets are empty; any variable inside the brackets would raise an exception. The case of a missing fourth argument can be imagined in the OOP comparison as follows:
- Record[] result = {};
- For ( Record r = [âŠ], ⊠) { ⊠}
In Power Query, the control flow that can be thought to live inside "List.Generate", could be that the value for the "Initial" Lambda Expression is inserted into the result set before "Next" Lambda Expression and "Condition" lambda expression have any influence on the course of the loop. In the sense of VB, C# etc., this is a Do-While-loop. This is an important conceptual point, because since the control of the loop occurs at the end, the result set will contain at least one element.
- result += r;
Condition Lambda Expression. The "Condition Lambda Expression" is evaluated when the next loop iteration starts. To run the loop 5 times, an upper bound of 5 is defined and the current value is compared to the upper bound for each pass. The "Next Lambda Expression" is responsible for setting the "record" variable and compares the current id with the upper bound - in this example with the value 5. The current id is accessed by the following expression: "record[Id]".
Next Lambda Expression. The "Next Lambda Expression" accesses the variable "record". The "record" variable gets its value from the iterator "List.Generate". "record" represents the last element inserted into the result set. When the loop initially starts, the last element is represented by the "Initial Lambda Expression". The variable on the left is called on the right and the lambda expression generates new data from it. In particular, a new id is generated. Because the "Selector Lambda Expression" is missing, the return value of the "Next Lambda Expression" is used to insert the current record into the result set.
5. Alternate Pattern / Imperative Style
The code above is not that easy to read, nor easy to understand! It can be confusing for a developer that - if the "Selector Lambda Expression" is missing - the "Selector Lambda Expression" is equated with the "Next Lambda Expression". The next code example is the projection of a PowerQuery loop using "List.Generate" without the fourth argument. You can see that the automatic behavior causes the loop variable to be included in the result:
- For ( int i = 0; i <= 5; i++ ) {
- Int[] result = {};
- result += i;
- }
Using the "Selector Lambda Expression" the code can be rewritten into a more imperative pattern. This makes the code easier to understand:
After the rewrite, the difference between loop control elements and the loop block becomes clearer. The first three lambda expressions take care of loop initialization, loop counter and loop counter increment. The fourth lambda expression uses the loop counter to parameterize the result expression. In contrast to the original pattern, the developer does not have to understand the "record" variable, the content of which is passed back and forth.
PLease also note that now that the "Selector Lambda Expression" is present, the "Initial Lambda Expression" does NOT (!) return the first element for the Result Set, but the "Selector Lambda Expression" does!