Skip to content

Statistical Aggregation functions

Avg

Avg() finds the average value of the aggregated data in the expression over a number of records as defined by a group by clause.

Avg( [DISTINCT] expr )

Return data type: numeric

Argument Description
expr The expression or field containing the data to be measured.
DISTINCT If the word distinct occurs before the expression, all duplicates will be disregarded.

Add the example script to your app and run it. Then add, at least, the fields listed in the results column to a sheet in your app to see the result.

Example Result
Temp:
crosstable (Month, Sales) load * inline [ Customer|Jan|Feb|Mar||Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec Astrida|46|60|70|13|78|20|45|65|78|12|78|22 Betacab|65|56|22|79|12|56|45|24|32|78|55|15 Canutility|77|68|34|91|24|68|57|36|44|90|67|27 Divadip|36|44|90|67|27|57|68|47|90|80|94
] (delimiter is '|');

Avg1:
LOAD Customer, Avg(Sales) as MyAverageSalesByCustomer
Resident Temp Group By Customer;

Customer MyAverageSalesByCustomer
Astrida 48.916667
Betacab 44.916667
Canutility 56.916667
Divadip 63.083333
This can be checked in the sheet by creating a table including the measure:
Sum(Sales)/12
Given that the Temp table is loaded as in the previous example:
LOAD Customer,Avg(DISTINCT Sales) as MyAvgSalesDistinct
Resident Temp Group By Customer;

Customer MyAverageSalesByCustomer
Astrida 43.1
Betacab 43.909091
Canutility 55.909091
Divadip 61
Only the distinct values are counted. Divide the total by the number of non-duplicate values.

Correl

Correl() returns the aggregated correlation coefficient for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

Correl( value1, value2 )

Return data type: numeric

Argument Description
value1, value2 The expressions or fields containing the two sample sets for which the correlation coefficient is to be measured.

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

Add the example script to your app and run it. Then add, at least, the fields listed in the results column to a sheet in your app to see the result.

Example Result
Salary:
Load *, 1 as Grp;
LOAD * inline [
"Employee name"|Gender|Age|Salary
Aiden Charles|Male|20|25000
Brenda Davies|Male|25|32000
Charlotte Edberg|Female|45|56000
Daroush Ferrara|Male|31|29000
Eunice Goldblum|Female|31|32000
Freddy Halvorsen|Male|25|26000
Gauri Indu|Female|36|46000
Harry Jones|Male|38|40000
Ian Underwood|Male|40|45000
Jackie Kingsley|Female|23|28000
] (delimiter is '|');

Correl1:
LOAD Grp,
Correl(Age,Salary) as Correl_Salary Resident Salary Group By Grp;

 

In a table with the dimension Correl_Salary, the result of the Correl() calculation in the data load script will be shown: 0.9270611

Fractile

Fractile() finds the value that corresponds to the fractile (quantile) of the aggregated data in the expression over a number of records as defined by a group by clause.

`Fractile( expr, fraction )``

Return data type: numeric

Argument Description
expr The expression or field containing the data to be measured.
fraction A number between 0 and 1 corresponding to the fractile (quantile expressed as a fraction) to be calculated.

Add the example script to your app and run it. Then add, at least, the fields listed in the results column to a sheet in your app to see the result.

Example Result
Table1: crosstable LOAD recno() as ID, * inline [
Observation|Comparison
35|2
40|27
12|38
15|31
21|1
14|19
46|1
10|34
28|3
48|1
16|2
30|3
32|2
48|1
31|2
22|1
12|3
39|29
19|37
25|2 ] (delimiter is '|');

Fractile1:
LOAD Type,
Fractile(Value,0.75) as MyFractile Resident Table1 Group By Type;
In a table with the dimensions Type and MyFractile, the results of the Fractile() calculations in the data load script are:
Type MyFractile
Comparison 27.5
Observation 36

Kurtosis

Kurtosis() returns the kurtosis of the data in the expression over a number of records as defined by a group by clause.

Kurtosis( [distinct ] expr )

Return data type: numeric

Argument Description
expr The expression or field containing the data to be measured.
distinct If the word distinct occurs before the expression, all duplicates will be disregarded.

Add the example script to your app and run it. Then add, at least, the fields listed in the results column to a sheet in your app to see the result.

Example Result
Table1:
crosstable LOAD recno() as ID, * inline [
Observation|Comparison
35|2
40|27
12|38
15|31
21|1
14|19
46|1
10|34
28|3
48|1
16|2
30|3
32|2
48|1
31|2
22|1
12|3
39|29
19|37
25|2 ] (delimiter is '|');

Kurtosis1:
LOAD Type,
Kurtosis(Value) as MyKurtosis1,
Kurtosis(DISTINCT Value) as MyKurtosis2
Resident Table1 Group By Type;
In a table with the dimensions Type, MyKurtosis1,and MyKurtosis2, the results of the Kurtosis() calculations in the data load script are:
Type MyKurtosis1 MyKurtosis2
Comparison -1.1612957 -1.4982366
Observation -1.1148768 -0.93540144

LINEST_B

LINEST_B() returns the aggregated b value (y-intercept) of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

LINEST_B( y_value, x_value[, y0 [, x0 ]] )

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do. 

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

LINEST_DF

LINEST_DF() returns the aggregated degrees of freedom of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

LINEST_DF( y_value, x_value[, y0 [, x0 ]] )

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do. 

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

LINEST_F

This script function returns the aggregated F statistic (r2/(1-r2)) of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

LINEST_F( y_value, x_value[, y0 [, x0 ]] )

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do. 

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

LINEST_M

LINEST_M() returns the aggregated m value (slope) of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

LINEST_M( y_value, x_value[, y0 [, x0 ]] )

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do. 

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

LINEST_R2

LINEST_R2() returns the aggregated r2 value (coefficient of determination) of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

LINEST_R2( y_value, x_value[, y0 [, x0 ]] )

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do. 

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

LINEST_SEB

LINEST_SEB() returns the aggregated standard error of the b value of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

LINEST_SEB( y_value, x_value[, y0 [, x0 ]] )

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do. 

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

LINEST_SEM

LINEST_SEM() returns the aggregated standard error of the m value of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

LINEST_SEM( y_value, x_value[, y0 [, x0 ]] )

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do. 

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

LINEST_SEY

LINEST_SEY() returns the aggregated standard error of the y estimate of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

LINEST_SEY( y_value, x_value[, y0 [, x0 ]] )

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do. 

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

LINEST_SSREG

LINEST_SSREG() returns the aggregated regression sum of squares of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

LINEST_SSREG( y_value, x_value[, y0 [, x0 ]] )

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do. 

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

LINEST_SSRESID

LINEST_SSRESID() returns the aggregated residual sum of squares of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

LINEST_SSRESID( y_value, x_value[, y0 [, x0 ]] )

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do. 

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

Median

Median() returns the aggregated median of the values in the expression over a number of records as defined by a group by clause.

Median( expr )

Return data type: numeric

Argument Description
expr The expression or field containing the data to be measured.

Add the example script to your app and run it. Then build a straight table with Type and MyMedian as dimensions.

Example Result
Table1:
crosstable LOAD recno() as ID, * inline [
Observation|Comparison
35|2
40|27
12|38
15|31
21|1
14|19
46|1
10|34
28|3
48|1
16|2
30|3
32|2
48|1
31|2
22|1
12|3
39|29
19|37
25|2 ] (delimiter is '|');

Median1:

LOAD Type,

Median(Value) as MyMedian

Resident Table1 Group By Type;

The results of the Median() calculation are:

  • Type is MyMedian
  • Comparison is 2.5
  • Observation is 26.5

Skew

Skew() returns the skewness of expression over a number of records as defined by a group by clause.

Skew( [ distinct] expr )

Return data type: numeric

Argument Description
expr The expression or field containing the data to be measured.
DISTINCT If the word distinct occurs before the expression, all duplicates will be disregarded.

Add the example script to your app and run it. Then build a straight table with Type and MySkew as dimensions.

Example Result
Table1:
crosstable LOAD recno() as ID, * inline [
Observation|Comparison
35|2
40|27
12|38
15|31
21|1
14|19
46|1
10|34
28|3
48|1
16|2
30|3
32|2
48|1
31|2
22|1
12|3
39|29
19|37
25|2 ] (delimiter is '|');

Skew1:
LOAD Type,
Skew(Value) as MySkew
Resident Table1 Group By Type;

The results of the Skew() calculation are:

  • Type is MySkew
  • Comparison is 0.86414768
  • Observation is 0.32625351

Stdev

Stdev() returns the standard deviation of the values given by the expression over a number of records as defined by a group by clause.

Stdev( [distinct] expr )

Return data type: numeric

Argument Description
expr The expression or field containing the data to be measured.
distinct If the word distinct occurs before the expression, all duplicates will be disregarded.

Add the example script to your app and run it. Then build a straight table with Type and MyStdev as dimensions.

Example Result
Table1:
crosstable LOAD recno() as ID, * inline [
Observation|Comparison
35|2
40|27
12|38
15|31
21|1
14|19
46|1
10|34
28|3
48|1
16|2
30|3
32|2
48|1
31|2
22|1
12|3
39|29
19|37
25|2 ] (delimiter is '|');

Stdev1:
LOAD Type,
Stdev(Value) as MyStdev
Resident Table1 Group By Type;

The results of the Stdev() calculation are:

  • Type is MyStdev
  • Comparison is 14.61245
  • Observation is 12.507997

Sterr

Sterr() returns the aggregated standard error (stdev/sqrt(n)) for a series of values represented by the expression iterated over a number of records as defined by a group by clause.

Sterr( [distinct] expr ) Return data type: numeric

Argument Description
expr The expression or field containing the data to be measured.
distinct If the word distinct occurs before the expression, all duplicates will be disregarded.

Text values, NULL values and missing values are disregarded.

Add the example script to your app and run it. Then add, at least, the fields listed in the results column to a sheet in your app to see the result.

Example Result
Table1:

crosstable LOAD recno() as ID, * inline [
Observation|Comparison
35|2
40|27
12|38
15|31
21|1
14|19
46|1
10|34
28|3
48|1
16|2
30|3
32|2
48|1
31|2
22|1
12|3
39|29
19|37
25|2 ] (delimiter is '|');
 
Sterr1:
LOAD Type,
Sterr(Value) as MySterr
Resident Table1 Group By Type;

In a table with the dimensions Type and MySterr, the results of the Sterr() calculation in the data load script are:

Type MySterr

Comparison 3.2674431

Observation 2.7968733

STEYX

STEYX() returns the aggregated standard error of the predicted y-value for each x-value in the regression for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

`STEYX( y_value, x_value )``

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

Add the example script to your app and run it. Then add, at least, the fields listed in the results column to a sheet in your app to see the result.

Example Result
Trend:
Load *, 1 as Grp;
LOAD * inline [
Month|KnownY|KnownX
Jan|2|6 Feb|3|5
Mar|9|11
Apr|6|7
May|8|5
Jun|7|4
Jul|5|5
Aug|10|8
Sep|9|10
Oct|12|14
Nov|15|17
Dec|14|16
] (delimiter is '|');

STEYX1:
LOAD Grp,
STEYX(KnownY, KnownX) as MySTEYX
Resident Trend Group By Grp;

In a table with the dimension MySTEYX, the result of the STEYX() calculation in the data load script is 2.0714764.