NeighborhoodAggregator
Creates aggregates of features based on their proximity to each other. Each aggregate that is created covers approximately the neighborhood width and height (measured in feature ground units).
This transformer is used to reduce the data volume of "wallpaper" types of features that have no individual attributes. The resulting aggregates can be output to a system using far fewer records than if each feature were output by itself. For systems that support aggregates, or multi-part features, this can result in substantial performance improvements and greatly decrease storage requirements.
Parameters
Transformer
Features that leave this transformer will have only the group-by attributes present on them. Any other feature attributes are lost.
Note: For detailed information on how parallel processing works with FME, see About Parallel Processing.
This parameter determines whether or not the transformer should perform the work across parallel processes. If it is enabled, a process will be launched for each group specified by the Group By parameter.
Parallel Processing Levels
For example, on a quad-core machine, minimal parallelism will result in two simultaneous FME processes. Extreme parallelism on an 8-core machine would result in 16 simultaneous processes.
You can experiment with this feature and view the information in the Windows Task Manager and the Workbench Log window.
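The two figures quoted above can be reproduced with a small calculation. This is only a sketch: the factors (half the core count for minimal, twice the core count for extreme) are inferred from those two examples, the names `LEVEL_FACTORS` and `process_count` are hypothetical, and any other levels are left out because this text does not state them.

```python
# Factors inferred from the two examples in the text (an assumption,
# not an official table of parallel processing levels).
LEVEL_FACTORS = {"minimal": 0.5, "extreme": 2.0}


def process_count(cores, level):
    """Estimate the number of simultaneous FME processes for a level."""
    factor = LEVEL_FACTORS[level]
    # Never report fewer than one process.
    return max(1, int(cores * factor))


quad_core_minimal = process_count(4, "minimal")
eight_core_extreme = process_count(8, "extreme")
```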
No: This is the default behavior. Processing will only occur in this transformer once all input is present.
By Group: This transformer will process input groups in order. Changes of the value of the Group By parameter on the input stream will trigger batch processing on the currently accumulating group. This will improve overall speed if groups are large/complex, but could cause undesired behavior if input groups are not truly ordered.
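The effect of By Group can be illustrated outside FME with a short Python sketch. It is not how the transformer is implemented; it only mimics the described behavior, where a change in the group value closes the current batch. The attribute name `zone` and the function `process_by_group` are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def process_by_group(features, key="zone"):
    """Emit a batch every time the group-by value changes on the stream."""
    for value, batch in groupby(features, key=itemgetter(key)):
        yield value, list(batch)


ordered = [
    {"zone": "A", "id": 1},
    {"zone": "A", "id": 2},
    {"zone": "B", "id": 3},
]
# Same features, but group "A" is interrupted by a "B" feature.
unordered = [ordered[0], ordered[2], ordered[1]]

ordered_batches = [(v, len(b)) for v, b in process_by_group(ordered)]
unordered_batches = [(v, len(b)) for v, b in process_by_group(unordered)]
```

With ordered input, each group is processed once. With unordered input, group "A" is split into two separate batches, which is the kind of undesired behavior the parameter description warns about.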
Using Ordered input can provide performance gains in some scenarios; however, it is not always preferable, or even possible. Consider the following when using it, with both one- and two-input transformers.
Single Datasets/Feature Types: Are generally the optimal candidates for Ordered processing. If you know that the dataset is correctly ordered by the Group By attribute, using Input is Ordered By can improve performance, depending on the size and complexity of the data.
If the input is coming from a database, using ORDER BY in a SQL statement to have the database pre-order the data can be an extremely effective way to improve performance. Consider using a database reader with a SQL statement, or the SQLCreator transformer.
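As a rough illustration of pre-ordering in the database, the following sketch uses Python's built-in sqlite3 module with an in-memory table. The table `parcels` and the column `zone` are hypothetical stand-ins for a real source table and its Group By attribute; in a workspace, the same ORDER BY clause would go into the reader's or SQLCreator's SQL statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (id INTEGER, zone TEXT)")
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?)",
    [(1, "B"), (2, "A"), (3, "B"), (4, "A")],
)

# Let the database deliver rows already ordered by the group-by column.
rows = conn.execute(
    "SELECT id, zone FROM parcels ORDER BY zone, id"
).fetchall()
zones = [zone for _, zone in rows]
conn.close()
```

All rows for one zone now arrive together, so a change in the zone value reliably marks the end of a group.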
Multiple Datasets/Feature Types: Since all features matching a Group By value need to arrive before any features (of any feature type or dataset) belonging to the next group, using Ordering with multiple feature types is more complicated than processing a single feature type.
Multiple feature types, and features from multiple datasets, do not generally arrive in the correct order on their own.
One approach is to send all features through a Sorter, sorting on the expected Group By attribute. The Sorter is a feature-holding transformer, collecting all input features, performing the sort, and then releasing them all. They can then be sent through an appropriate filter (TestFilter, AttributeFilter, GeometryFilter, or others), which are not feature-holding, and will release the features one at a time to the transformer using Input is Ordered By, now in the expected order.
The processing overhead of sorting and filtering may negate the performance gains you would get from using Input is Ordered By. In this case, using Group By without Input is Ordered By may be an equivalent and simpler approach.
In all cases when using Input is Ordered By, if you are not sure that the incoming features are properly ordered, they should be sorted (if a single feature type), or sorted and then filtered (for more than one feature or geometry type).
As with many scenarios, testing different approaches in your workspace with your data is the only definitive way to identify performance gains.
Parameters
These parameters, measured in ground units, divide the input space into cells. The result is a grid of cells that expands in all directions from the origin (0,0). The center of the bounding box of each input feature is used to determine the cell for the feature. Once all input features have been read, an aggregate feature is created from all features in each cell. If linear features are input, they will have pseudo nodes removed from within their cells to further reduce the number of separate entities. No such reduction is done to any polygons or donuts that enter.
Note: To view the grid of cells that is created from these parameters, use the 2DGridCreator. Specify 0,0 for Starting X Coordinate and Starting Y Coordinate, respectively, and the same values for Column Width and Row Height as Neighborhood Width and Neighborhood Height, respectively.
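The cell assignment described above can be sketched in Python as follows. This is an illustration under assumptions, not the transformer's actual code: features are reduced to bounding boxes given as `(xmin, ymin, xmax, ymax)`, the function names are hypothetical, and the treatment of centers that fall exactly on a cell edge (assigned to the higher-numbered cell by the floor operation) is a guess.

```python
import math
from collections import defaultdict


def cell_index(bbox, width, height):
    """Return the (column, row) of the grid cell containing the bbox center.

    The grid is anchored at the origin (0, 0) and extends in all directions,
    so negative coordinates produce negative indices.
    """
    xmin, ymin, xmax, ymax = bbox
    cx = (xmin + xmax) / 2
    cy = (ymin + ymax) / 2
    return (math.floor(cx / width), math.floor(cy / height))


def group_by_cell(bboxes, width, height):
    """Collect bounding boxes into one list per cell, like one aggregate each."""
    cells = defaultdict(list)
    for bbox in bboxes:
        cells[cell_index(bbox, width, height)].append(bbox)
    return dict(cells)


boxes = [(1, 1, 3, 3), (4, 4, 6, 6), (12, 3, 18, 7), (-4, -4, -2, -2)]
cells = group_by_cell(boxes, width=10, height=10)
```

Here the first two boxes share cell (0, 0), the third falls in cell (1, 0), and the last one falls in cell (-1, -1), giving three aggregates from four inputs.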
When you set this parameter, neighborhoods with fewer than the specified number of features are merged with a vertically adjacent neighborhood to increase the number of members. You can prevent this from happening by setting the parameter to 0 (zero).
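One possible reading of this merge rule is sketched below. The text does not say which vertical neighbor is chosen, or whether merging repeats until the minimum is met, so the sketch makes loud assumptions: it prefers the cell directly above, falls back to the cell directly below, makes a single pass, and leaves a small cell alone if it has no occupied vertical neighbor. The function name `merge_small_cells` is hypothetical.

```python
def merge_small_cells(cells, minimum):
    """Merge under-populated cells into a vertically adjacent cell.

    cells maps (column, row) to a list of features. A minimum of 0
    disables merging, matching the parameter description.
    """
    merged = {key: list(members) for key, members in cells.items()}
    if minimum <= 0:
        return merged
    for col, row in sorted(cells):  # fixed order keeps the result repeatable
        members = merged.get((col, row))
        if members is None or len(members) >= minimum:
            continue
        # Assumption: try the cell above first, then the cell below.
        for neighbor in ((col, row + 1), (col, row - 1)):
            if neighbor in merged:
                merged[neighbor].extend(members)
                del merged[(col, row)]
                break
    return merged


cells = {(0, 0): ["a"], (0, 1): ["b", "c", "d"], (1, 0): ["e"]}
result = merge_small_cells(cells, minimum=2)
unchanged = merge_small_cells(cells, minimum=0)
```

In this example the single feature in cell (0, 0) joins cell (0, 1), while cell (1, 0) stays as it is because it has no occupied neighbor above or below.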
Example
Editing Transformer Parameters
Using a set of menu options, transformer parameters can be assigned by referencing other elements in the workspace. More advanced functions, such as an advanced editor and an arithmetic editor, are also available in some transformers. To access a menu of these options, click beside the applicable parameter. For more information, see Transformer Parameter Menu Options.