Aggregate Functions
AMPS provides a set of aggregation functions that can be used in a Field
constructor for a view and in the projection
option of an aggregated subscription. These functions return a single value for each distinct group of messages, as identified by distinct combinations of values in the Grouping
clause.
These functions produce an aggregation over a literal value, an identifier directing AMPS to extract the value from the message, or the result of a function.
For example, given a set of messages like the following:
{"id":1, "item":1,"qty":10, "oid":1, ...}
{"id":2, "item":2,"qty":10, "oid":1, ...}
{"id":3, "item":3,"qty":25, "oid":1, ...}
With a view definition that has a Projection
clause and Grouping
clause like the following:
<Projection>
<Field>/oid</Field>
<Field>SUM(/qty) AS /totalOrderQty</Field>
<Field>SUM(IF((/qty % 10) == 0,1,0)) AS /evenTensOrderCount</Field>
</Projection>
<Grouping>
<Field>/oid</Field>
</Grouping>
AMPS will produce the following record:
{"oid":1,"totalOrderQty":45,"evenTensOrderCount":2}
Notice that the first SUM()
function simply extracts the value of the /qty from each message, while the second SUM()
function uses the output of the IF
statement for each message.
Since aggregate functions operate over groups of messages, these functions are only available when constructing fields for aggregate purposes, either in a view or an aggregated subscription. The functions described in this section are not available to filters, and are not available for constructing fields during SOW topic enrichment.
The set of functions provided in AMPS have been chosen to be efficient to compute over high volumes of rapidly changing data.
AVG
AVG
AVG(expression)
Averages an expression.
Parameters
expression
: The expression to average.
Returns
The mean value of the values specified by the expression.
ANY
ANY
ANY(expression)
Returns one of the set of values in the expression.
Parameters
expression
: The expression to evaluate.
Returns
One of the set of values in the expression.
COUNT
COUNT
COUNT(expression)
Counts the values in an expression.
Parameters
expression
: The expression to count values from.
Returns
The number of values specified by the expression.
COUNT_DISTINCT
COUNT_DISTINCT
COUNT_DISTINCT(expression)
Counts the number of distinct values in an expression, ignoring NULL
.
Parameters
expression
: The expression to count distinct values from.
Returns
The number of distinct values in the expression. AMPS type conversion rules apply when determining distinct values.
GROUP_CONCAT
GROUP_CONCAT
GROUP_CONCAT(expression, [delimiter])
Creates a list of the distinct values in the specified expression, using the second argument as the delimiter.
Parameters
expression
: The expression that provides the values to concatenate.delimiter
: Optional. The delimiter to use between values. If no delimiter is provided, the delimiter defaults to,
(a comma).
Returns
A string that contains the distinct values from the expression, separated by the delimiter. This function returns a string, regardless of the types of the values in the expression. The order of the values within the string is not guaranteed.
MIN
MIN
MIN(expression)
Returns the minimum value out of the values specified by the expression.
Parameters
expression
: The expression to find the minimum value from.
Returns
The minimum value.
MAX
MAX
MAX(expression)
Returns the maximum value out of the values specified by the expression.
Parameters
expression
: The expression to find the maximum value from.
Returns
The maximum value.
STDDEV_POP
STDDEV_POP
STDDEV_POP(expression)
Calculates the population standard deviation of an expression.
Parameters
expression
: The expression for which to calculate the standard deviation.
Returns
The calculated standard deviation.
STDDEV_SAMP
STDDEV_SAMP
STDDEV_SAMP(expression)
Calculates the sample standard deviation of an expression.
Parameters
expression
: The expression for which to calculate the standard deviation.
Returns
The calculated standard deviation.
SUM
SUM
SUM(expression)
Calculates the summation over an expression.
Parameters
expression
: The expression to sum.
Returns
The total value of the values specified by the expression.
UNIQUE
UNIQUE
UNIQUE(expression)
Determines if all of the values in a given field match within the group.
Parameters
expression
: The expression to check for uniqueness.
Returns
If all of the values match, returns the value. Otherwise, returns NULL
.
Null values are not included in aggregate expressions with AMPS, nor in ANSI SQL. COUNT
will count only non-null values, SUM
will add only non-null values, AVG
will average only non-null values, and MIN
and MAX
ignore NULL
values, and so on.
MIN
and MAX
can operate on either numbers or strings, or a combination of the two. AMPS compares values using the principles described for comparison operators. For MIN
and MAX
, AMPS determines order based on these rules:
- Numbers sort in numeric order.
- String values sort in ASCII order.
- When comparing a number to a string, convert the string to a number, and use a numeric comparison. If that is not successful, the value of the string is higher than the value of the number.
For example, given a field that has the following values across a set of messages:
24, 020, 'cat', 75, 1.3, 200, '75', '42'
MIN
will return 1.3
, MAX
will return 'cat'
. Notice that different message types may have different support for converting strings to numeric values: AMPS relies on the parsing done by the message type to determine the numeric value of a string.