Syntax

  FACTORY_DEF {*} StatisticsCalculatorFactory
    FACTORY_NAME $(XFORMER_NAME)
    $(INPUT_LINES)
    GROUP_BY [+]*
    FLUSH_WHEN_GROUPS_CHANGE (YES|NO)
    PREPEND_ATTR_NAME $(PREPEND_ATTR_NAME_OBS)
    SUFFIX_NAME_MAP $(SUFFIX_NAME_MAP)
    FEAT_STATS [[,]*,?][;[,]*,?]*
    ADVANCED_MODE (YES|NO)
    CUMULATIVE_STATS [[,]*][;[,]*]*
    CALCULATION_MODE (SCHEMA|LEXICAL|NUMERIC)
    ATTR_ACCUM_MODE (None|All|One)
    ATTRIBUTE_NAMES_AND_TYPES $(XFORMER_SCHEMA)
    TREAT_INVALID_AS_NULL (Yes|No)
    OUTPUT CUMULATIVE FEATURE_TYPE $(OUTPUT_CUMULATIVE_FTYPE)
      $(OUTPUT_CUMULATIVE_FUNCS)
    OUTPUT COMPLETE FEATURE_TYPE $(OUTPUT_COMPLETE_FTYPE)
      $(OUTPUT_COMPLETE_FUNCS)
    OUTPUT SUMMARY FEATURE_TYPE $(OUTPUT_SUMMARY_FTYPE)
      $(OUTPUT_SUMMARY_FUNCS)

Overview

This factory takes in features and calculates the user-requested statistics. If Group By attributes are chosen, statistics are calculated independently within each group of features. It also supports Bulk Mode.

The following statistics may be calculated:

- Minimum: The numerical minimum for numeric attributes. The lexical minimum for string attributes.
- Maximum: The numerical maximum for numeric attributes. The lexical maximum for string attributes.
- Median: The middle value of the ordered attribute values. If the number of values is even, Median returns the average of the two middle values. For string attributes, the first middle value is always used.
- Total Count: The input feature count.
- Numeric Count: The number of numeric values that entered the transformer. Missing, null, and NaN values are ignored and are not included in this count.
- Sum: The sum of all values. Undefined for string attributes.
- Range: The maximum minus the minimum. Undefined for string attributes.
- Mean: The average value, calculated as the sum of values divided by the number of values. Undefined for string attributes.
- Standard Deviation (Sample): The standard deviation of all the numeric values, which are assumed to represent a sample of a population (calculated using the "nonbiased" or "n-1" method). Undefined for string attributes.
- Standard Deviation (Population): The standard deviation of all the numeric values, which are assumed to comprise the entire population. Undefined for string attributes.
- Mode: The most frequent of all the values. If the dataset is multimodal (two or more values occur with the highest frequency), one of those values is returned at random.
- Histogram: A count for each unique value encountered for the analyzed attribute. The results are given as a structured list of attributes that present (value,count) pairs.

Statistics will be stored as attributes on the feature. Attributes will be named ".".

Input features must contain Feature Table geometries only.

Output Tags

The StatisticsCalculatorFactory supports the following output tags.

CUMULATIVE
All input features are passed through this output with the statistics attributes to date for their group added onto them. The features pass through this port immediately, each carrying the statistics computed for the set of features from the first feature in the group through to the current feature. (Note that this differs from the "final" statistics emitted through the COMPLETE output.)

COMPLETE
All input features are passed through this output with the final statistics attributes for their group added onto them. Note that this requires all input features to be stored until the end of translation, which can greatly increase memory and/or temporary disk storage usage.

SUMMARY
One new feature is output per group, containing the statistics attributes for that group. If features are not grouped, the port emits a single feature containing the statistics for the whole set of input features. No summary data will be generated if no input is received.
Input attributes will be accumulated on the summary features according to ATTR_ACCUM_MODE.
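
To make the statistic definitions and the difference between the CUMULATIVE and SUMMARY outputs concrete, here is a minimal Python sketch. It is not FME code and does not claim to reproduce the factory's implementation; it only mirrors the definitions given above. The names `summarize`, `run`, `zone`, and `height` are hypothetical, features are modeled as plain dictionaries, the histogram is restricted to numeric values for simplicity, and string attributes and ATTR_ACCUM_MODE are not modeled.

```python
import math
import statistics
from collections import Counter, defaultdict


def numeric_values(values):
    """Keep usable numbers only; None (missing/null) and NaN are skipped."""
    return [v for v in values
            if isinstance(v, (int, float)) and not math.isnan(v)]


def summarize(values):
    """Statistics for one group of attribute values (hypothetical helper)."""
    nums = numeric_values(values)
    stats = {
        "total_count": len(values),    # every input feature counts
        "numeric_count": len(nums),    # missing, null, and NaN excluded
        "histogram": dict(Counter(nums)),
    }
    if nums:
        stats.update(
            min=min(nums),
            max=max(nums),
            sum=sum(nums),
            range=max(nums) - min(nums),
            mean=sum(nums) / len(nums),
            # Averages the two middle values when the count is even.
            median=statistics.median(nums),
            # All tied values; the factory is described as returning one.
            modes=statistics.multimode(nums),
            stdev_population=statistics.pstdev(nums),
        )
    if len(nums) >= 2:
        # Sample standard deviation uses the n-1 method.
        stats["stdev_sample"] = statistics.stdev(nums)
    return stats


def run(features, group_by, attr):
    """Return (cumulative, summary) outputs for a list of feature dicts."""
    seen = defaultdict(list)
    cumulative = []
    for feature in features:
        key = feature.get(group_by)
        seen[key].append(feature.get(attr))
        # CUMULATIVE: statistics from the group's first feature up to this one.
        cumulative.append({**feature, "stats": summarize(seen[key])})
    # SUMMARY: one new feature per group; empty when there was no input.
    summary = [{group_by: key, "stats": summarize(vals)}
               for key, vals in seen.items()]
    return cumulative, summary


features = [
    {"zone": "A", "height": 2.0},
    {"zone": "A", "height": 4.0},
    {"zone": "B", "height": 10.0},
    {"zone": "A", "height": None},
    {"zone": "A", "height": 6.0},
    {"zone": "A", "height": 8.0},
]
cumulative, summary = run(features, "zone", "height")
```

In this sketch a COMPLETE output would attach the final per-group statistics from `summary` to every stored input feature, which is why that output needs all features held until the end, whereas `cumulative` can be emitted one feature at a time.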