FeatureMerger
Merges the attributes and/or geometry of one set of features onto another set of features, based on matching key attribute values and expressions.
Typical Uses
- Combining attributes and/or geometry from two different streams of features, based on a common key attribute value or expression.
How does it work?
The FeatureMerger receives two streams of features via its input ports.
Requestor: Requestors are the features that will receive new attributes and/or geometry.
Supplier: Suppliers provide attributes and/or geometry to be merged onto the Requestors.
Matches between Requestor and Supplier are identified according to the Join On configuration in the parameters dialog. The Join conditions can be simple or complex, using attribute values, constants, functions, or a combination of any of these in expression form. Multiple join conditions can be defined (features must meet all conditions to match).
When a Requestor finds a matching Supplier, the attributes and/or geometry from the Supplier are merged onto the Requestor.
- If the Requestor already has an attribute that the Supplier also has, the Requestor's original value for that attribute can be preserved or overridden.
- A single Supplier may be used by many Requestors.
- Many Suppliers can be merged onto a single Requestor.
- When attribute names conflict, you can choose whether Requestor or Supplier attribute values are maintained by using the Conflict Resolution parameter, and whether null values follow the same behavior by using the Ignore Nulls parameter.
Requestor features match Supplier features when every pair of keys (which support expressions) specified in the Join On table has the same value for both the Requestor and Supplier features.
Examples
In this example, we have two datasets - a CSV text file of historical crime reports, and a shapefile of public streets. Both of the datasets have one piece of information in common, that is an address in the form of “hundred” block.
The FeatureJoiner would likely be more effective to do a simple attribute value join, however, the format of the address is slightly different in each dataset. The CSV data uses the string “XX” in place of zeroes in the address:
And the street data has the more conventional form, with zeroes:
Our goal is to merge the street geometry onto the crime data - producing one output feature for each crime record. And so, The CSV data is routed into the Requestor input port, and the streets are routed into the Supplier port.
In the parameters dialog, we need to create a Join On pair of Requestor and Supplier values. As the format of the addresses is not identical, we will adjust one, constructing a key. With the assistance of the Text Editor, we create an expression that will replace the “XX” of the address with “00”, and so match the format of the street (Supplier) data.
The other half of the pairing is the attribute value HBLOCK from the street data.
Feature Merge Type is set to Attributes and Geometry. The default values for Attribute Accumulation (Merge Supplier) will provide the correct results.
The Merged output contains crime records, now with geometry for the street and associated street attributes. Note that the address format was not actually changed by the constructed expression - it was just used for matching evaluation. In this location, 137 crime records were found. Each one now has (identical) geometry.
To see this example reversed - attaching all crime records to a single street - see the next example.
In this example, we reverse the scenario in the above example. We want to merge all the historical crime records onto the street data, producing one record per street line segment (block), with a list attribute containing the crime data.
In this case, the streets are the Requestor features, and the crime data is the Supplier.
In the parameters dialog, we again create a Join On pair, using an expression to match the two (slightly different) hundred block formats. The Feature Merge Type is Attributes Only, as the Requestors (streets) already have the desired geometry.
Process Duplicate Suppliers is enabled. Without this, the first match would be processed and the remainder discarded. A count of how many Suppliers (crimes) matched will be kept in the Number of Suppliers Attribute, named Incident_Count.
Generate List is enabled, and given the List Name of Crime_Records. We will add only Selected Attributes to the List, and choose TYPE and YEAR.
The Merged output features now have a list attribute, containing all matching records from the CSV. Note the Incident_Count attribute.
Usage Notes
- For simple joins, the FeatureJoiner may provide better performance than the FeatureMerger. However, the FeatureJoiner only accepts attribute values as keys and not constructed expressions, and does not support list attributes. Additionally, the FeatureMerger is able to (optionally) restrict output to one feature in the case of multiple matching Suppliers, whereas the FeatureJoiner will create multiple features for all matches.
- For complex joins using SQL syntax, or more than two input feature streams, consider using the InlineQuerier.
- Where multiple FeatureMergers are required, consider using the InlineQuerier instead.
- If all the data to be queried already exists in a SQL-capable data source, it is always more efficient to use the SQLCreator or SQLExecutor, which allows the queries and filtering of the data to be executed directly by the database before it enters the FME environment.
- To perform a join between features already in the workspace and data residing in an external database, consider the DatabaseJoiner.
- To perform a join where the Requestor key is a list attribute, consider using the ListBasedFeatureMerger.
- To join features on matching geometry, consider the Matcher. The FeatureMerger does not accept geometry as a key.
Choosing a Feature Joining Method
Many transformers can perform data joining based on matching attributes, expressions and/or geometry. When choosing one for a specific joining task, considerations include the complexity of the join, data format, indexing, conflict handling, and desired results. Some transformers use SQL syntax, and some access external databases directly. They may or may not support list attribute reading and creation.
Generally, choosing the one that is most specific to the task you need to accomplish will provide the optimal performance results. If there is more than one way to do it (which is frequently the case), time spent on performance testing alternate methods may be worthwhile. Performance may vary greatly depending on the existence of key indexes when reading external tables (as opposed to features already in the workspace).
Transformer |
Match By |
Uses SQL Statements |
Can Create List |
Input Type |
Notable |
Description |
---|---|---|---|---|---|---|
FeatureJoiner | Attributes | No | No | Features |
|
Joins features by combining the attributes and/or geometry of features based on common key attribute values. Performs the equivalent of Inner, Left, and Full SQL joins. |
FeatureMerger | Attributes | No | Yes | Features |
|
Merges the attributes and/or geometry of one set of features onto another set of features, based on matching key attribute values and expressions. |
ListBasedFeatureMerger | List Attribute to Single Attribute | No | Yes | Features |
|
Merges the attributes and/or geometry of one set of features onto another set of features, based on matching list attribute values with key attribute values and expressions. |
InlineQuerier | SQL query | Yes | No | Features |
|
Creates a set of SQLite database tables from incoming features, executes SQL queries against them, and outputs the results as features. |
SQLCreator | SQL query | Yes | No | External DB |
|
Generates FME features from the results of a SQL query executed once against a database. One FME feature is created for each row of the results of the SQL query. |
SQLExecutor | SQL query | Yes | No | External DB |
|
Executes SQL queries against a database. One query is issued to the database for each initiating feature that enters the transformer. Both the initiating features and the results of the query may be output as features. |
DatabaseJoiner | Attributes | No | Yes | External DB and Features |
|
Joins attributes from an external table to features already in a workspace, based on a common key or keys. SQL knowledge not required. Non-blocking transformer. |
Matcher | Geometry and/or Attributes | No | Yes | Features |
|
Detects features that are matches of each other. Features are declared to match when they have matching geometry, matching attribute values, or both. A list of attributes which must differ between the features may also be specified. If matching on attributes only (not geometry), using the FeatureMerger or another method will give better performance. |
Configuration
Input Ports
Features that will receive new attributes and/or geometry from the features connected to the Supplier port.
The source of new attributes and/or geometry for features that enter through the Requestor port.
Note that by setting Process Duplicate Suppliers to Yes, more than one Supplier may be merged to each Requestor. These Suppliers potentially could come from different sources (such as different database tables).
Output Ports
Requestors that match a Supplier (or possible multiple Suppliers, if Process Duplicate Suppliers is set to Yes).
Requestors that do not match any Suppliers. If a Join On key is null or missing on a Requestor feature, that feature is output onto this port.
Suppliers that are found by at least one Requestor.
Note: A Supplier may still be output through this port even though none of its attributes were merged. This can occur because all of the attributes were already present in the Requestor or were provided by earlier Suppliers.
Suppliers that do not match any Requestors. If a Join On key is null or missing on a Supplier feature, that feature is output onto this port.
If Process Duplicate Suppliers is No, then duplicate suppliers will be rejected. If Reject Null and Missing Keys is Yes, then keyless features will be rejected.
Rejected Feature Handling: can be set to either terminate the translation or continue running when it encounters a rejected feature. This setting is available both as a default FME option and as a workspace parameter.
Parameters
Group By |
The input features may be partitioned by the Group By parameter. If you choose any Group By attributes, then references between features will only be resolved if they share a common value for the selected attributes. If you do not choose any Group By attributes, all features are processed together. If you have more than one Reader, a typical use is to group by reader_id to ensure that references are resolved within the correct set of features. |
Group By Mode |
Process At End (Blocking): This is the default behavior. Processing will only occur in this transformer once all input is present. Process When Group Changes (Advanced): This transformer will process input groups in order. Changes of the value of the Group By parameter on the input stream will trigger processing on the currently accumulating group. This may improve overall speed (particularly with multiple, equally-sized groups), but could cause undesired behavior if input groups are not truly ordered. Considerations for Using Group By
There are two typical reasons for using Process When Group Changes (Advanced) . The first is incoming data that is intended to be processed in groups (and is already so ordered). In this case, the structure dictates Group By usage - not performance considerations. The second possible reason is potential performance gains. Performance gains are most likely when the data is already sorted (or read using a SQL ORDER BY statement) since less work is required of FME. If the data needs ordering, it can be sorted in the workspace (though the added processing overhead may negate any gains). Sorting becomes more difficult according to the number of data streams. Multiple streams of data could be almost impossible to sort into the correct order, since all features matching a Group By value need to arrive before any features (of any feature type or dataset) belonging to the next group. In this case, using Group By with Process At End (Blocking) may be the equivalent and simpler approach. Note: Multiple feature types and features from multiple datasets will not generally naturally occur in the correct order. As with many scenarios, testing different approaches in your workspace with your data is the only definitive way to identify performance gains. |
Suppliers First |
When this option is enabled, the transformer will assume that all Suppliers will enter the transformer before any Requestors. Once the first Requestor arrives, it will process and output Requestors as they come in. The Suppliers will be output after all the Requestors have been processed. Note: It is the user’s responsibility to make sure all the Suppliers arrive before the Requestors. The transformer will stop accepting Suppliers once it receives its first Requestor when this option is enabled. |
Requestor |
Specify an expression from the Requestor feature to match with the Supplier feature expression. An expression can be a constant, attribute value, function or mixture. Leading and trailing space characters are trimmed from the evaluated expressions. Matches are made when the trimmed values of all the Requestor expressions equal the trimmed values of their corresponding Supplier expressions. |
Supplier |
Specify an expression from the Supplier feature to match with the Requestor feature expression. An expression can be a constant, attribute value, function or mixture. Leading and trailing space characters are trimmed from the evaluated expressions. Matches are made when the trimmed values of all the Requestor expressions equal the trimmed values of their corresponding Supplier expressions. Generally, the Suppliers will have expressions defined that result in a unique match with a Requestor, and any duplicate Suppliers are ignored by the transformer. However, if the Process Duplicate Suppliers parameter is set to Yes, then all Suppliers whose expressions match their corresponding Requestor expressions will be combined onto that Requestor. |
Comparison Mode | The Comparison Mode column specifies how to perform the comparison between Requestor and Supplier attribute values. If Automatic or Numeric is specified, an attempt will be made to convert attribute values to numbers before comparing them. |
Feature Merge Type |
This parameter specifies what to merge onto the completed feature. Choices include:
|
Reject Null and Missing Keys |
Yes: Any Requestor or Supplier feature that contains a null or missing join key is output via the <Rejected> port. No: Null and missing keys are treated as empty strings. |
If more than one Supplier is found for a given Requestor, and Process Duplicate Suppliers is not enabled, then every Supplier after the first is output via the <Rejected> port and only the first of the Suppliers will be matched with a Requestor.
If Process Duplicate Suppliers is enabled, duplicate Suppliers are all matched with the corresponding Requestor. Attributes are merged based on the Attribute Accumulation mode, and optionally output as a list attribute of the Requestor, using the specified List Name. The Supplier geometry is merged using the specified Geometry Merge Type.
Note that enabling Generate List also implicitly enables Process Duplicate Suppliers.
Geometry Merge Type |
Specifies how to merge duplicate Suppliers onto the Requestor. It is applicable only if Feature Merge Type specifies to merge geometry. Choices include:
|
||||||||||||||||||
Tolerance |
The minimum distance between geometries in 2D before they are considered equal, in ground units. If the tolerance is None, the geometries must be exactly identical to be considered equal. If the tolerance is Automatic, a tolerance will be automatically computed based on the location of the input geometries. Additionally, a custom tolerance may be used. |
||||||||||||||||||
Connect Z Mode |
If applicable, select a method for handling z values. Connect Z Mode
When viewed in 2D (ignoring Z), a path (which may define the border of a polygon) may appear to be closed as shown in the left figure below. This same path, when viewed in 3D, may appear to be open as shown in the right figure below.
To specify how (and if) paths should be closed in 3D, select one of the listed modes.
|
||||||||||||||||||
Number of Suppliers Attribute | The name of an attribute which will store a count of the number of Suppliers matched a Requestor. |
If attributes on the Supplier and Requestor feature share the same name, but are not geometry attributes that start with fme_, then they are deemed conflicted.
Accumulation Mode |
Merge Supplier: The Requestor feature will retain all of its own un-conflicted attributes, and will additionally acquire any un-conflicted attributes that the supplier feature has. This mode will handle conflicted attributes based on the Conflict Resolution parameter. Prefix Supplier: The Requestor feature will retain all of its own attributes. In addition, the Requestor will acquire attributes reflecting the supplier feature’s attributes, with the name prefixed with the Prefix parameter. Only Use Supplier: The Requestor feature will have all of its attributes removed, except geometry attributes that start with fme_. Then, all of the attributes from one (arbitrary) supplier feature will be placed onto the Requestor. |
Conflict Resolution |
Use Requestor: If a conflict occurs, the Requestor values will be maintained. Use Supplier: If a conflict occurs, the values of the supplier will be transferred onto the Requestor. |
Ignore Nulls |
No: Treat null attribute values like other attribute values. Yes: Treat null attribute values as less important than other attribute values. Whenever a null value is merged with a non-null value, the non-null value shall prevail, regardless of what Conflict Resolution is set to. Note: Null and other attribute values are always more dominant than missing attribute values. That is, whenever an attribute value is merged with a missing attribute value, the attribute value shall prevail, regardless of the Ignore Nulls and Conflict Resolution settings. |
Prefix |
To prevent a Supplier attribute from being ignored because the Requestor attribute already exists, you can optionally specify a prefix that will be applied to each Supplier attribute when it is added to the Requestor. When there are multiple Supplier features for a Requestor feature, multiple Supplier attribute values are merged into the same prefixed attribute. When multiple Supplier features have the same attribute, generally the resulting attribute value is taken from the last of these features. However, this process is governed by the Ignore Nulls parameter. |
If there are duplicate Suppliers and Generate List is enabled, then any Suppliers that are combined with a Requestor will have their attributes added to the specified list on the Requestor. Note that enabling Generate List also implicitly enables Process Duplicate Suppliers.
Note: List attributes are not accessible from the output schema in Workbench unless they are first processed using a transformer that operates on them, such as ListExploder or ListConcatenator. Alternatively, AttributeExposer can be used.
List Name | Enter a name for the list attribute. |
Add To List |
All Attributes: Every attribute from a Supplier that combined with a Requestor will be added to the list specified in List Name. Selected Attributes: Only the attributes specified in the Selected Attributes parameter will be added to the list specified in List Name. |
Selected Attributes | The attributes to be added to the list when Add To List is Selected Attributes. |
Preserve Feature Order |
This parameter controls the order in which features exit a transformer. When a transformer has more than one output port, features usually exit one port at a time. At times, it may be useful to keep the order that features arrived in, switching from port to port as necessary. This allows feature order to be preserved, though at a potential cost in processing efficiency. Select a method for feature ordering.
|
Editing Transformer Parameters
Using a set of menu options, transformer parameters can be assigned by referencing other elements in the workspace. More advanced functions, such as an advanced editor and an arithmetic editor, are also available in some transformers. To access a menu of these options, click beside the applicable parameter. For more information, see Transformer Parameter Menu Options.
Defining Values
There are several ways to define a value for use in a Transformer. The simplest is to simply type in a value or string, which can include functions of various types such as attribute references, math and string functions, and workspace parameters. There are a number of tools and shortcuts that can assist in constructing values, generally available from the drop-down context menu adjacent to the value field.
Using the Text Editor
The Text Editor provides a convenient way to construct text strings (including regular expressions) from various data sources, such as attributes, parameters, and constants, where the result is used directly inside a parameter.
Using the Arithmetic Editor
The Arithmetic Editor provides a convenient way to construct math expressions from various data sources, such as attributes, parameters, and feature functions, where the result is used directly inside a parameter.
Conditional Values
Set values depending on one or more test conditions that either pass or fail.
Parameter Condition Definition Dialog
Content
Expressions and strings can include a number of functions, characters, parameters, and more.
When setting values - whether entered directly in a parameter or constructed using one of the editors - strings and expressions containing String, Math, Date/Time or FME Feature Functions will have those functions evaluated. Therefore, the names of these functions (in the form @<function_name>) should not be used as literal string values.
These functions manipulate and format strings. | |
Special Characters |
A set of control characters is available in the Text Editor. |
Math functions are available in both editors. | |
Date/Time Functions | Date and time functions are available in the Text Editor. |
These operators are available in the Arithmetic Editor. | |
These return primarily feature-specific values. | |
FME and workspace-specific parameters may be used. | |
Creating and Modifying User Parameters | Create your own editable parameters. |
Dialog Options - Tables
Transformers with table-style parameters have additional tools for populating and manipulating values.
Row Reordering
|
Enabled once you have clicked on a row item. Choices include:
|
Cut, Copy, and Paste
|
Enabled once you have clicked on a row item. Choices include:
Cut, copy, and paste may be used within a transformer, or between transformers. |
Filter
|
Start typing a string, and the matrix will only display rows matching those characters. Searches all columns. This only affects the display of attributes within the transformer - it does not alter which attributes are output. |
Import
|
Import populates the table with a set of new attributes read from a dataset. Specific application varies between transformers. |
Reset/Refresh
|
Generally resets the table to its initial state, and may provide additional options to remove invalid entries. Behavior varies between transformers. |
Note: Not all tools are available in all transformers.
Reference
Processing Behavior |
|
Feature Holding |
Yes |
Dependencies | None |
Aliases | |
History |
FME Community
The FME Community is the place for demos, how-tos, articles, FAQs, and more. Get answers to your questions, learn from other users, and suggest, vote, and comment on new features.
Search for all results about the FeatureMerger on the FME Community.
Examples may contain information licensed under the Open Government Licence – Vancouver and/or the Open Government Licence – Canada.