KQL - Reconstruct missing series items using 'make-series' and 'series-fill-linear'

By Lukas Hillesheim, 15 July, 2024
Blog Article & Copyright by Lukas Hillesheim
Series, such as time series, are useful when it comes to making predictions or recognizing patterns. KQL offers a bunch of features for working with series. The following article shows how make-series and series-fill-linear may help you in deriving missing values from the existing ones based on interpolation.

Content:

1. Scenario

Underlying scenario for this blog article:
  • Step 1. Generate a series that represents a sinusoid and assign the series to a variable
  • Step 2. A range from 0.5 to 1 is removed from the series and applied to a variable called missing_range
  • Step 3. Based on missing_range a new series is derived and assigned to a variable named recomp_xitems. recomp_xitems contains new, artificially generated items that represent the x-axis items of the sinusoid. Each of the new items has a default value of -1
  • Step 4. A new series is computed and applied to the variable recomp_yitems. The default value -1 is replaced with a new value that fits into the sinusoid.

2. Complete KQL code

Complete KQL code for this article. Each code block will be explained separately:
let sinusoid = 
    print x = range(0.01, 5, 0.03) 
    | mv-expand x
    | serialize 
    | extend  
        row_ix = row_number(1), 
        x = round(todouble(x), 2), 
        y = sin(todouble(x))
    | project-reorder 
        row_ix, 
        x, 
        y
    | sort by row_ix asc; 
let missing_range = 
    sinusoid 
    | where x < 0.5 or x > 1; 
let recomp_xitems = 
    missing_range
    | make-series 
        y = avg (todouble(y))
        default = -1 // Set -1 for new items  
        on x
        from 0.1 to 5
        step 0.03; 
let recomp_yitems = 
    recomp_xitems 
    | extend  
        y = series_fill_linear(y, -1, false); 
recomp_yitems  
    | mv-expand x, y 
    | serialize 
    | extend 
        row_ix = row_number(1), 
        x = round(todouble(x), 2), 
        y = todouble(y)
    | render linechart with (xcolumn=x, ycolumns=y)

3. Series without gaps | Visualization

The following KQL code block shows how the sinusoid series is computed and visualized:
  • The series is based on a decimal array (lower bound = 0.01, upper bound = 5) that is generated by the KQL function range. The increment of 0.03 is responsible to avoid any artifacts.
  • A column named row_ix with an auto id is added as row identifier.
  • At least the series is applied to the sinusoid variable and visualized by render
KQL code:
let sinusoid = 
    print x = range(0.01, 5, 0.03) 
    | mv-expand x
    | serialize 
    | extend  
        row_ix = row_number(1), 
        x = round(todouble(x), 2), 
        y = sin(todouble(x))
    | project-reorder 
        row_ix, 
        x, 
        y
    | sort by row_ix asc; 
sinusoid
| render linechart with (xcolumn=x,ycolumns=y)
Output:
Sinusoid
Fig.1: Sinusoid

4. Series with gaps

The KQL code of section 4 simulates a common situation where data is missing. The series contained in the variable sinusoid is modified and the range between 0.5 and 1 is removed from the series. The result is applied to the variable missing_range:
KQL code:
let missing_range = 
    sinusoid 
    | where x < 0.5 or x > 1; 
missing_range
Output:
Missing range 0.5 to 1
Fig.2: Missing range 0.5 to 1

5. Reconstruct missing items

In the next step the KQL function make-series is applied to the missing_range series thereby computing new items with a default value of -1.
KQL code:
... 
let recomp_xitems = 
    missing_range
    | make-series 
        y = avg (todouble(y))
        default = -1 // Set -1 for new items  
        on x
        from 0.1 to 5
        step 0.03; 
recomp_xitems
Output:
Reconstruct missing items (1)
Fig.3: Reconstruct missing items (1)
The screenshot above shows that the x-/y-axis tupel members reside separated from each other in two arrays. In order to extract a list of tupels from the two arrays, both arrays must be crossed simultaneously. E.g. x[0] and y[0] form the tuple (0.1, 0.0998334). In Python a function named map is able to do this, in KQL a similiar function named mv-expand exists. In order to understand, how make-series is going to add new items, the next code block presents the content of both arrays as a 2-column table. This separate step is dedicated to educational purpose only and is not part of the complete KQL code mentioned above. The array expansion would be superfluous because in the next step series-fill-linear requires the data in an array layout. 
The output is restricted to x-items greater or equal 0.46.
KQL code:
... 
let recomp_xitems = 
    missing_range
    | make-series 
        y = avg (todouble(y))
        default = -1 // Set -1 for new items  
        on x
        from 0.1 to 5
        step 0.03; 
recomp_xitems
    | mv-expand x, y 
    | extend 
        x = round(todouble(x), 2) 
    | project-reorder 
        x, 
        y
    | where x >= 0.46 
    | sort by x asc 
Output:
Reconstruct missing items (2)
Fig.4: Reconstruct missing items (2)

6. Reconstruct missing values

In the last step the KQL function series-fill-linear replaces the dummy values -1 with new values that fit into the sinusoid. This is done using interpolation.
KQL code:
... 
let recomp_yitems = 
    recomp_xitems 
    | extend  
        y = series_fill_linear(y, -1, false); 
recomp_yitems
Output:
Reconstruct missing values (1)
Fig.5: Reconstruct missing values (1)
Similar to 5. Reconstruct missing items, the arrays are unzipped in order to present their content:
KQL code:
... 
let recomp_yitems = 
    recomp_xitems 
    | extend  
        y = series_fill_linear(y, -1, false); 
recomp_yitems
    | mv-expand x, y 
    | extend 
        x = round(todouble(x), 2) 
    | project-reorder 
        x, 
        y
    | where x >= 0.46 
    | sort by x asc 
Output:
Reconstruct missing values (2)
Fig.6: Reconstruct missing values (2)

7. Visualization as linechart

The now reconstructed series is presented through a linechart.
KQL code:
... 
recomp_yitems  
    | mv-expand x, y 
    | serialize 
    | extend 
        row_ix = row_number(1), 
        x = round(todouble(x), 2), 
        y = todouble(y)
    | render linechart with (xcolumn=x, ycolumns=y)
Output:
Reconstructed series as linechart
Fig.7: Reconstructed series as linechart
Especially the reconstructed area from 0.5 to 1 is not so smooth as the rest of the linechart. Optimization features could help in order to remove the artifacts.