OpenStreetMap (OSM) XML Reader Parameters
OpenStreetMap (OSM) is a collaborative mapping project for creating a free and editable map of the whole world. Further information on OSM can be found at http://www.openstreetmap.org.
OpenStreetMap data can be downloaded in a topologically structured XML format. The data primitives in an OSM data file are nodes, ways and relations.
- A node is a lat/lon pair.
- A way is a list of at least two node references describing a linear feature. Ways can be closed, in which case the first and the last node are identical. Areas are not explicitly represented in OSM but are identified via community-approved tags.
- A relation is a group of zero or more primitives, each with an associated role.
All data in OSM are in the WGS-84 datum.
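The three primitives can be illustrated with a minimal Python sketch that reads a small OSM XML document using the standard library. The element and attribute names (node, way, nd, relation, member, tag) follow the OSM XML format as commonly published; the sample data and the helper names parse_osm, read_tags, and is_closed are hypothetical and are not part of FME.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; IDs, coordinates, and tags are made up.
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="49.0" lon="-123.0"/>
  <node id="2" lat="49.0" lon="-122.9"/>
  <node id="3" lat="49.1" lon="-122.9"/>
  <way id="10">
    <nd ref="1"/><nd ref="2"/><nd ref="3"/><nd ref="1"/>
    <tag k="building" v="yes"/>
  </way>
  <relation id="20">
    <member type="way" ref="10" role="outer"/>
    <tag k="type" v="multipolygon"/>
  </relation>
</osm>"""


def read_tags(element):
    """Collect the <tag k="..." v="..."/> children of an element into a dict."""
    return {tag.get("k"): tag.get("v") for tag in element.findall("tag")}


def parse_osm(xml_text):
    """Return (nodes, ways, relations) dictionaries keyed by element id."""
    root = ET.fromstring(xml_text)
    # A node is a lat/lon pair.
    nodes = {
        n.get("id"): (float(n.get("lat")), float(n.get("lon")))
        for n in root.findall("node")
    }
    # A way is an ordered list of node references plus optional tags.
    ways = {
        w.get("id"): {
            "refs": [nd.get("ref") for nd in w.findall("nd")],
            "tags": read_tags(w),
        }
        for w in root.findall("way")
    }
    # A relation groups primitives, each with a role.
    relations = {
        r.get("id"): {
            "members": [
                (m.get("type"), m.get("ref"), m.get("role"))
                for m in r.findall("member")
            ],
            "tags": read_tags(r),
        }
        for r in root.findall("relation")
    }
    return nodes, ways, relations


def is_closed(way):
    """A way is closed when its first and last node references are identical."""
    refs = way["refs"]
    return len(refs) >= 2 and refs[0] == refs[-1]


nodes, ways, relations = parse_osm(SAMPLE_OSM)
print(len(nodes), len(ways), len(relations))
print(is_closed(ways["10"]))
```

The sketch loads the whole document into memory, which is fine for illustration but is not how a reader would handle a large extract.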
OSM has no explicit schema (feature type) definitions. Each node, way, and relation can have an arbitrary number of attributes, called tags in OSM. A tag is composed of a key and a value. The OpenStreetMap wiki does define a set of recommended tags that can be used to classify the nodes and ways into higher-level groupings, that is, feature types. The community-defined feature types can be found at http://wiki.openstreetmap.org/index.php/Map_Feature.
The FME OSM reader settings influence how the OSM data being read is classified into feature types.
Feature Types
These parameters are used only when generating an initial workspace, so they are not editable within Workbench after the workspace has been generated.
You can choose either OSM Map Features, or Basic Element Feature Types Only. This selection determines whether OSM Map Feature parameters below are enabled or disabled.
This parameter allows you to add a custom map features configuration file to define features or modify/remove existing features.
Any changes to the map features made here will be reflected in the map features tree view below. Please refer to OpenStreetMap (OSM) XML User Configuration File Guidelines for information on how to write your own config file.
OSM Map Features
These parameters are used only when generating an initial workspace, so they are not editable within Workbench after the workspace has been generated.
Click the browse button to open a tree view that displays all the map features listed in the wiki at http://wiki.openstreetmap.org/wiki/Map_Features. Additionally, any added/removed features specified in the user config file will take effect here.
Expanding the root osm element in the list displays the items defined as broad features (keys) by the wiki. Expanding any broad feature shows its specific features (values). For example, building, geological, and natural are broad features. Specific features are prefixed by the broad feature's name, as in highway_primary.
Note that if a specific feature contains attributes or geometry types that its broad feature does not, the broad feature will include these items when it is selected and the specific feature is not. For example, if highway_primary includes an attribute width but highway does not, highway must represent highway_primary and therefore includes width in its schema.
The list also includes unfiltered and unknown feature types. When the OSM reader reads the dataset, any elements that have map features defined in the wiki or user configuration file are emitted as unfiltered features if they were not selected in this list. Any elements that do not contain any defined map features will be emitted as unknown. In the list, unknown and unfiltered can also be expanded into respective nodes, ways, and relations. This will further filter unknown and unfiltered elements into their primitive categories.
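The selected, unfiltered, and unknown outcomes described above can be sketched as a small Python function. This is only an illustration of the rules as stated here, not the reader's actual implementation: the function name classify, the shape of the map_features table, and the choice to prefer a selected specific feature over its broad feature are all assumptions.

```python
def classify(tags, map_features, selected):
    """Pick a feature type name for one OSM element.

    tags: dict of the element's key/value tags.
    map_features: dict mapping a broad feature (key) to the set of its
        known specific values (hypothetical stand-in for the wiki list
        plus any user configuration file).
    selected: set of feature type names chosen in the tree view.
    """
    has_known_feature = False
    for key, value in tags.items():
        if key not in map_features:
            continue  # not a map feature tag (for example, name)
        has_known_feature = True
        specific = f"{key}_{value}"
        # Assumption: a selected specific feature wins over its broad feature.
        if value in map_features[key] and specific in selected:
            return specific
        if key in selected:
            return key
    # Defined map feature but nothing selected -> unfiltered; none at all -> unknown.
    return "unfiltered" if has_known_feature else "unknown"


# Hypothetical configuration for illustration only.
MAP_FEATURES = {"highway": {"primary", "residential"}, "building": {"yes"}}
SELECTED = {"highway_primary", "building"}

print(classify({"highway": "primary"}, MAP_FEATURES, SELECTED))
print(classify({"highway": "residential"}, MAP_FEATURES, SELECTED))
print(classify({"name": "Main Street"}, MAP_FEATURES, SELECTED))
```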
Some OSM elements can contain multiple map feature tags.
- If the Use First option is selected, the first map feature tag will be constructed as a feature. This option will not check for alternative map features.
- If the Use First and List Alternatives option is selected, the first map feature tag will be constructed as a feature, while the other map features are set as a list attribute called alternative_map_features.
- If the Duplicate Features option is selected, a feature will be constructed for each map feature tag. For example, if an OSM element has one aerialway tag, one barrier tag, and one craft tag, three nearly identical features are created for this element. They differ only in their feature type names: aerialway, barrier, and craft, respectively.
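The three options can be compared with a short Python sketch. The function build_features, the mode strings, and the dictionary layout of a feature are hypothetical; only the attribute name alternative_map_features comes from the description above.

```python
def build_features(tags, known_keys, mode):
    """Return the features built from one element under a given option.

    tags: dict of the element's tags, in file order.
    known_keys: set of keys that count as map features (assumed input).
    mode: "use_first", "use_first_and_list_alternatives",
        or "duplicate_features" (hypothetical names for the three options).
    """
    map_keys = [key for key in tags if key in known_keys]
    if not map_keys:
        return []
    if mode == "use_first":
        # Only the first map feature tag is used; alternatives are not checked.
        return [{"feature_type": map_keys[0], "tags": dict(tags)}]
    if mode == "use_first_and_list_alternatives":
        # The remaining map features are recorded as a list attribute.
        return [{
            "feature_type": map_keys[0],
            "tags": dict(tags),
            "alternative_map_features": map_keys[1:],
        }]
    if mode == "duplicate_features":
        # One nearly identical feature per map feature tag.
        return [{"feature_type": key, "tags": dict(tags)} for key in map_keys]
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical element carrying three map feature tags.
element_tags = {"aerialway": "station", "barrier": "gate", "craft": "carpenter"}
known = {"aerialway", "barrier", "craft"}

for feature in build_features(element_tags, known, "duplicate_features"):
    print(feature["feature_type"])
```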
The reader provides an option to perform a schema scan on top of the selected features in the tree view. In this case, the reader will scan the entire data file and any features found that are also selected in the tree view will be constructed.
A schema scan on a large file can be a slow process: in these cases, it may be more desirable to simply select feature types manually.
Geometry and Attribute Options
Controls whether way feature types should have the list of nodes added to the features as a list attribute.
Controls how multipolygon relation feature types (that is, relations whose type tag has the value multipolygon) are handled: if checked, their geometry is built; if left unchecked, they are kept with a list attribute containing all the members.
Controls whether way feature types should allow consecutive duplicate coordinates to be added to the line.
For example, if two unique nodes referenced by a line have the same location, the usual behavior is to remove duplicate points from the line. Since removing a duplicate point in this case would mean that the number of nodes and vertices would no longer match, checking this parameter allows the duplicate point to remain.
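The effect on vertex counts can be seen in a minimal Python sketch. The function build_line and its allow_duplicates flag are hypothetical names used for illustration; they are not FME parameter names.

```python
def build_line(node_refs, node_coords, allow_duplicates=False):
    """Build a line from node references.

    node_refs: ordered node ids referenced by the way.
    node_coords: dict mapping node id -> (lat, lon).
    allow_duplicates: when False, a vertex equal to the previous one is dropped.
    """
    line = []
    for ref in node_refs:
        coord = node_coords[ref]
        if not allow_duplicates and line and line[-1] == coord:
            continue  # consecutive duplicate coordinate removed
        line.append(coord)
    return line


# Two distinct nodes ("a" and "b") share the same location.
coords = {"a": (49.0, -123.0), "b": (49.0, -123.0), "c": (49.1, -123.0)}
refs = ["a", "b", "c"]

print(len(build_line(refs, coords)))                         # vertices without duplicates
print(len(build_line(refs, coords, allow_duplicates=True)))  # one vertex per node reference
```

With duplicates removed, the line has fewer vertices than the way has node references; allowing duplicates keeps the two counts equal.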
An .osm XML file is intended to have a strict structure in which elements are grouped into three blocks (nodes, ways, and relations), in exactly that order.
The default behavior of the OSM reader is to expect .osm files to uphold this structure, and the reader performs best when this is the case. There may be instances, however, where an .osm file contains all valid elements, but the elements do not strictly follow this ordering rule.
To handle these cases, enable the Nodes, Ways, and Relation Blocks in Non-Standard Order parameter. This tells the reader to account for data in the file that may not be in the expected block, so that the file is parsed correctly. This is accomplished by internally caching any ways and relations that reference data not yet read from the file. (Note that because it caches the ways and relations, the reader may take longer to run.)
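The caching idea can be sketched in Python as follows. This is an assumption-laden illustration, not the reader's implementation: the function resolve_ways, the tuple layout of the input elements, and the two returned dictionaries are all hypothetical, and relations are omitted for brevity.

```python
def resolve_ways(elements):
    """Resolve way geometry from elements that may arrive in any order.

    elements: iterable of ("node", id, (lat, lon)) or ("way", id, [node ids]).
    Returns (complete, incomplete): complete maps way id -> coordinate list;
    incomplete maps way id -> node ids that never appeared in the input.
    """
    nodes = {}
    cached = {}    # ways waiting on nodes that have not been read yet
    complete = {}
    for kind, elem_id, payload in elements:
        if kind == "node":
            nodes[elem_id] = payload
        elif kind == "way":
            if all(ref in nodes for ref in payload):
                # Every node is already known, so no caching is needed.
                complete[elem_id] = [nodes[ref] for ref in payload]
            else:
                cached[elem_id] = payload
    # Second pass: revisit cached ways once the whole input has been read.
    incomplete = {}
    for way_id, refs in cached.items():
        missing = [ref for ref in refs if ref not in nodes]
        if missing:
            incomplete[way_id] = missing
        else:
            complete[way_id] = [nodes[ref] for ref in refs]
    return complete, incomplete


# Hypothetical input in which a way appears before its nodes.
stream = [
    ("way", "w1", ["a", "b"]),
    ("node", "a", (0.0, 0.0)),
    ("node", "b", (1.0, 1.0)),
    ("way", "w2", ["a", "z"]),  # "z" never appears in the input
]
complete, incomplete = resolve_ways(stream)
print(sorted(complete), sorted(incomplete))
```

The cache is what costs time and memory: in a correctly ordered file no way is ever cached, while in a badly ordered one every way may be.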
Note: When enabling this parameter in conjunction with the Search Envelope parameter, the recommended setting for Option for Ways that Cross Envelope Boundary is Preserve.
When encountering ways or relations referencing missing data in the source file, the default behavior for the OSM reader is to not pass these features back to FME. When this parameter is enabled, the reader will output ways and relations with incomplete data.
Note: When this parameter is enabled, the Search Envelope cannot be used. Additionally, enabling this parameter will likely affect performance: ways and relations are cached internally during read, and this will have an effect on overall run time.
Schema Attributes
Use this parameter to expose Format Attributes in Workbench when you create a workspace:
- In a dynamic scenario, these attributes can be passed to the output dataset at runtime.
- In a non-dynamic scenario, you can use this parameter to expose additional attributes on multiple feature types.
Use Search Envelope
Using the minimum and maximum x and y parameters, define a bounding box that will be used to filter the input features. Only features that intersect with the bounding box are returned.
If all four coordinates of the search envelope are specified as 0, the search envelope will be disabled.
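The envelope test can be approximated with a short Python sketch. It compares bounding boxes only, which is a simplification of a true geometric intersection, and the function names and the "bbox" key on each feature are hypothetical.

```python
def envelope_is_enabled(min_x, min_y, max_x, max_y):
    """The envelope is disabled when all four coordinates are 0."""
    return any(value != 0 for value in (min_x, min_y, max_x, max_y))


def bbox_intersects(a, b):
    """a and b are (min_x, min_y, max_x, max_y); touching edges count."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def filter_features(features, envelope):
    """Keep only features whose bounding box intersects the envelope."""
    if not envelope_is_enabled(*envelope):
        return list(features)  # no filtering when the envelope is disabled
    return [f for f in features if bbox_intersects(f["bbox"], envelope)]


# Hypothetical features described only by their bounding boxes.
features = [
    {"id": "inside", "bbox": (1, 1, 2, 2)},
    {"id": "crossing", "bbox": (9, 9, 12, 12)},
    {"id": "outside", "bbox": (20, 20, 21, 21)},
]

print([f["id"] for f in filter_features(features, (0, 0, 10, 10))])
print(len(filter_features(features, (0, 0, 0, 0))))
```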
When selected, Clip to Search Envelope removes any portions of the features being read that lie outside the Search Envelope.
The example below illustrates the results of the Search Envelope when Clip to Search Envelope is not selected (set to No) and when it is selected (set to Yes).
- No: Any features that cross the search envelope boundary will be read, including the portion that lies outside of the boundary.
- Yes: Any features that cross the search envelope boundary will be clipped at the boundary, and only the portion that lies inside the boundary will be read. The underlying operation of the Search Envelope is an intersection; when Clip to Search Envelope is selected, a clipping operation is performed in addition to that intersection.
When Use Search Envelope is enabled, only nodes that fall within the specified bounding box are read into FME. The Option for Ways that Cross Envelope Boundary parameter has three settings that control how ways crossing from the inside to the outside of the Search Envelope boundary are handled:
- Discard: Any ways that cross the boundary are not read into FME. This is the fastest option.
- Preserve: All ways that cross the boundary are read into FME. Generally, this is the slowest of the three options; however, this is the recommended setting when used in conjunction with Nodes, Ways, and Relation Blocks in Non-Standard Order.
- Preserve within Margin: All ways that are completely within a user-specified margin are read into FME. The margin is specified by two parameters: the amount of padding to be added to the Left/Right sides of the Search Envelope, and the amount of padding to be added to the Top/Bottom of the Search Envelope. The margin units are fixed to be in the LL84 coordinate system.
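The three options can be contrasted with a minimal Python sketch. It is an illustration under stated assumptions rather than the reader's logic: the function names, the option strings, the (x, y) point layout, and the decision to drop ways that have no node inside the envelope are all hypothetical.

```python
def inside(point, envelope, pad_x=0.0, pad_y=0.0):
    """True when point (x, y) lies within the envelope grown by the padding."""
    x, y = point
    min_x, min_y, max_x, max_y = envelope
    return (min_x - pad_x <= x <= max_x + pad_x
            and min_y - pad_y <= y <= max_y + pad_y)


def keep_way(coords, envelope, option, margin_x=0.0, margin_y=0.0):
    """Decide whether a way is read, given the boundary-crossing option.

    option: "discard", "preserve", or "preserve_within_margin".
    margin_x / margin_y: padding for the Left/Right and Top/Bottom sides.
    """
    flags = [inside(p, envelope) for p in coords]
    if all(flags):
        return True   # entirely inside: the option does not apply
    if not any(flags):
        return False  # assumption: ways with no node inside are not read
    # From here on the way crosses the boundary.
    if option == "discard":
        return False
    if option == "preserve":
        return True
    if option == "preserve_within_margin":
        return all(inside(p, envelope, margin_x, margin_y) for p in coords)
    raise ValueError(f"unknown option: {option}")


envelope = (0.0, 0.0, 10.0, 10.0)
crossing = [(5.0, 5.0), (10.5, 5.0)]  # second vertex lies 0.5 outside

print(keep_way(crossing, envelope, "discard"))
print(keep_way(crossing, envelope, "preserve"))
print(keep_way(crossing, envelope, "preserve_within_margin", margin_x=1.0, margin_y=1.0))
```

Discard needs no data beyond the envelope, which is consistent with it being described as the fastest option; the other two must look at nodes outside the envelope before deciding.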