Aggregators turn many values into one value. Primarily used in uniq, where they are associated with an input and output column.
Syntax is Method,Pattern. Available methods include :
- count The number of things aggregated
- prefix The longest common prefix of the input values
- suffix The longest common suffix of the input values
- min,comp Keep the minimum value, Pattern is the associated Comparator
- max,comp Keep the maximum value, Pattern is the associated Comparator
- mean The arithmetic mean of the values
- sum The sum of the values
- amin,counter The min of the values, as determined by the associated Counter
- amax,counter The max of the values, as determined by the associated Counter
- asum,counter The sum of the values, as determined by the associated Counter
- amean,counter The mean of the values, as determined by the associated Counter
- merge Merge into a delimited list of values
The pattern for a merge
aggregator is a period delimited set of any of the following :
- sort -- sort the parts
- uniq -- Remove any adjacent equal parts. Happens after sorting.
- count -- Display the count of parts, not the parts themselves.
- max_len:N -- Discard any parts longer than N bytes.
- min_len:N -- Discard any parts shorter than N bytes.
- max_parts:N -- Display only the first N parts.
- Dxy -- 'x' is the input delimiter and 'y' is the output delimiter. Default is comma for both.
- comp:spec -- Spec for comparator for sort or uniq. Must be last piece.
A Counter spec is a period delimited list of modofiers and Counters
Modifiers
- utf8 -- do the unicode thing, rather than the ascii thing
Counters
- chars -- count the number of bytes, or chars with utf8.
- swords -- count the number of words, define as whitespace separating non-whitespace
- awords -- count the number of words, define as non-alphanumerics separating alphanumerics