OpenStreetMap (OSM) XML Reader Parameters
OpenStreetMap (OSM) is a collaborative mapping project for creating a free and editable map of the whole world. Further information on OSM can be found at http://www.openstreetmap.org.
OpenStreetMap data can be downloaded in a topologically structured XML format. The data primitives in an OSM data file are nodes, ways and relations.
- A node is a lat/lon pair.
- A way is a list of at least two node references describing a linear feature. Ways can be closed, in which case the first and the last node are identical. Areas are not explicitly represented in OSM but are identified via community-approved tags.
- A relation is a group of zero or more primitives, each with an associated role.
All data in OSM is in the WGS-84 datum.
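The three primitives can be seen directly in the XML. The following is a minimal sketch, not part of the FME reader: it uses Python's standard xml.etree.ElementTree module on a small hand-written fragment. The sample data and the read_primitives helper are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Hand-written illustrative fragment, not a real OSM extract.
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="49.10" lon="-122.80"/>
  <node id="2" lat="49.11" lon="-122.79"/>
  <node id="3" lat="49.12" lon="-122.80"/>
  <way id="10">
    <nd ref="1"/><nd ref="2"/><nd ref="3"/><nd ref="1"/>
    <tag k="building" v="yes"/>
  </way>
  <relation id="100">
    <member type="way" ref="10" role="outer"/>
    <tag k="type" v="multipolygon"/>
  </relation>
</osm>"""


def read_tags(element):
    """Collect the key/value tags attached to one primitive."""
    return {t.get("k"): t.get("v") for t in element.findall("tag")}


def read_primitives(xml_text):
    """Hypothetical helper: split an OSM document into nodes, ways, relations."""
    root = ET.fromstring(xml_text)
    # A node is a lat/lon pair.
    nodes = {
        n.get("id"): (float(n.get("lat")), float(n.get("lon")))
        for n in root.findall("node")
    }
    # A way is an ordered list of node references; it is closed when the
    # first and last references are identical.
    ways = {}
    for w in root.findall("way"):
        refs = [nd.get("ref") for nd in w.findall("nd")]
        ways[w.get("id")] = {
            "refs": refs,
            "closed": len(refs) > 2 and refs[0] == refs[-1],
            "tags": read_tags(w),
        }
    # A relation is a group of members, each with a role.
    relations = {}
    for r in root.findall("relation"):
        members = [
            (m.get("type"), m.get("ref"), m.get("role"))
            for m in r.findall("member")
        ]
        relations[r.get("id")] = {"members": members, "tags": read_tags(r)}
    return nodes, ways, relations


nodes, ways, relations = read_primitives(SAMPLE_OSM)
```

In this fragment the way is closed and tagged building=yes, which is how an area would typically be recognized.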
OSM has no explicit schema (feature type) definitions. Each node, way, and relation can have an arbitrary number of attributes, called tags in OSM. A tag is composed of a key and a value. The OpenStreetMap wiki does define a set of recommended tags that can be used to classify the nodes and ways into higher-level groupings, i.e., feature types. The community-defined feature types can be found at http://wiki.openstreetmap.org/index.php/Map_Feature.
The FME OSM reader settings can help influence the classification of the OSM data being read.
Feature Types
You can choose either OSM Map Features, or Basic Element Feature Types Only. This selection determines whether OSM Map Feature parameters below are enabled or disabled.
This parameter allows you to add a custom map features configuration file to define features or modify/remove existing features.
Any changes to the map features made here will be reflected in the map features tree view below. Please refer to OpenStreetMap (OSM) XML User Configuration File Guidelines for information on how to write your own config file.
OSM Map Features
These parameters are used only when generating an initial workspace, so they are not editable within FME Workbench after the workspace has been generated.
Click the browse button to open a tree view that displays all the map features listed in the wiki at http://wiki.openstreetmap.org/wiki/Map_Features. Additionally, any added/removed features specified in the user config file will take effect here.
Expanding the root osm element in the list displays the items defined as broad features (keys) by the wiki. Expanding any broad feature shows its specific features (values). For example, building, geological, and natural are broad features. Specific features are prefixed by the broad feature's name (for example, highway_primary).
Note that if a specific feature contains attributes or geometry types that its broad feature does not, the broad feature's schema will include these items whenever the broad feature is selected and the specific feature is not. For example, if highway_primary includes an attribute width but highway does not, then highway must represent highway_primary and therefore includes width in its schema.
The list also includes unfiltered and unknown feature types. When the OSM reader reads the dataset, any elements that have map features defined in the wiki or user configuration file are emitted as unfiltered features if they were not selected in this list. Any elements that do not contain any defined map features will be emitted as unknown. In the list, unknown and unfiltered can also be expanded into respective nodes, ways, and relations. This will further filter unknown and unfiltered elements into their primitive categories.
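The selected, unfiltered, and unknown outcomes described above can be sketched as a small decision function. This is an illustration under simplifying assumptions, not the reader's implementation: map features are matched by tag key only (broad features), and classify, defined_keys, and selected_keys are hypothetical names.

```python
def classify(tags, defined_keys, selected_keys):
    """Return an illustrative feature type name for one OSM element.

    tags          -- dict of the element's key/value tags
    defined_keys  -- map feature keys known from the wiki or a user config
    selected_keys -- the subset of defined keys chosen in the tree view
    """
    matches = [key for key in tags if key in defined_keys]
    if not matches:
        # No defined map feature on this element at all.
        return "unknown"
    for key in matches:
        if key in selected_keys:
            return key
    # Defined map features exist, but none of them was selected.
    return "unfiltered"


defined = {"highway", "building", "natural"}
selected = {"building"}
```

With these sets, an element tagged building=yes is emitted as building, one tagged highway=primary as unfiltered, and one carrying only a name tag as unknown.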
Some OSM elements can contain multiple map feature tags.
- Use First – The first map feature tag will be constructed as a feature. This option will not check for alternative map features.
- Use First and List Alternatives – The first map feature tag will be constructed as a feature, while the other map features are set as a list attribute called alternative_map_features.
- Duplicate Features – A feature will be constructed for each map feature tag. For example, an OSM element has 1 aerialway tag, 1 barrier tag, and 1 craft tag. Three nearly identical features will be created for this element – the difference is that the feature type names will be aerialway, barrier, and craft, respectively.
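The three options can be compared side by side with a short sketch. It assumes that "first" means the first map feature tag in the element's tag order, and it represents features as plain dictionaries; build_features and the mode strings are hypothetical names chosen for illustration.

```python
def build_features(tags, map_feature_keys, mode):
    """Illustrative handling of an element with several map feature tags."""
    found = [key for key in tags if key in map_feature_keys]
    if not found:
        return []
    if mode == "use_first":
        # Only the first map feature tag is considered.
        return [{"feature_type": found[0]}]
    if mode == "use_first_and_list_alternatives":
        # The remaining map features are kept as a list attribute.
        return [{
            "feature_type": found[0],
            "alternative_map_features": found[1:],
        }]
    if mode == "duplicate_features":
        # One nearly identical feature per map feature tag.
        return [{"feature_type": key} for key in found]
    raise ValueError(f"unknown mode: {mode}")


element_tags = {"aerialway": "station", "barrier": "gate", "craft": "carpenter"}
keys = {"aerialway", "barrier", "craft"}
```

For the element above, Duplicate Features yields three features named aerialway, barrier, and craft, while the other two modes yield a single aerialway feature.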
The reader provides an option to perform a schema scan on top of the selected features in the tree view. In this case, the reader will scan the entire data file and any features found that are also selected in the tree view will be constructed.
A schema scan on a large file can be a slow process: in these cases, it may be more desirable to simply select feature types manually.
Reader Options
Controls whether way feature types should have the list of nodes added to the features as a list attribute.
Controls whether way feature types should allow consecutive duplicate coordinates to be added to the line.
For example, if two unique nodes referenced by a line have the same location, the usual behavior is to remove duplicate points from the line. Since removing a duplicate point in this case would mean that the number of nodes and vertices would no longer match, checking this parameter allows the duplicate point to remain.
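The effect of this parameter on vertex counts can be sketched as follows. The build_line function and its allow_duplicates flag are hypothetical stand-ins for the reader's behavior, shown only to illustrate why the node and vertex counts can diverge.

```python
def build_line(refs, nodes, allow_duplicates=False):
    """Turn a way's node references into a list of coordinates.

    When allow_duplicates is False, a vertex identical to the previous
    one is dropped, so the line may have fewer vertices than the way
    has nodes.
    """
    coords = []
    for ref in refs:
        point = nodes[ref]
        if not allow_duplicates and coords and coords[-1] == point:
            continue
        coords.append(point)
    return coords


# Nodes "a" and "b" are distinct nodes that share one location.
sample_nodes = {"a": (0.0, 0.0), "b": (0.0, 0.0), "c": (1.0, 1.0)}
sample_refs = ["a", "b", "c"]
```

With duplicates removed, the three-node way produces a two-vertex line; with duplicates allowed, the vertex count matches the node count.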
An .osm XML file is intended to have a strict structure of elements grouped into three blocks (nodes, ways, and relations), in exactly that order.
By default, the OSM reader expects .osm files to follow this structure, and it performs best when they do. There may be instances, however, where an .osm file contains only valid elements, but the elements do not strictly follow this ordering rule.
To handle these cases, you can enable this parameter. This tells the reader to account for data in the file that may not exist in the expected block, in order to correctly parse the file. This is accomplished by internally caching any ways and relations that contain incomplete data not yet read from the file. (Note that because it caches the ways and relations, the reader may take longer to run.)
When encountering ways or relations referencing missing data in the source file, the default behavior for the OSM reader is to not pass these features back to FME. When this parameter is enabled, the reader will output ways and relations with incomplete data.
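The caching approach and the handling of incomplete data can be illustrated together. The sketch below is an assumption about the general technique, not the reader's actual code: ways whose nodes have not been seen yet are cached and resolved after the whole stream has been read, and ways that still reference missing nodes are either dropped or emitted as incomplete. Relations are omitted for brevity, and resolve_ways and emit_incomplete are hypothetical names.

```python
def resolve_ways(elements, emit_incomplete=False):
    """Resolve ways from a stream of (kind, id, payload) tuples in any order.

    For nodes the payload is a coordinate; for ways it is a list of node ids.
    """
    nodes = {}
    pending = []  # ways cached because some referenced node is not yet known
    resolved = []

    def make_way(ident, refs):
        coords = [nodes[r] for r in refs if r in nodes]
        return {"id": ident, "coords": coords, "complete": len(coords) == len(refs)}

    for kind, ident, payload in elements:
        if kind == "node":
            nodes[ident] = payload
        elif kind == "way":
            if all(ref in nodes for ref in payload):
                resolved.append(make_way(ident, payload))
            else:
                pending.append((ident, payload))

    # The whole file has been read: retry everything that was cached.
    for ident, refs in pending:
        way = make_way(ident, refs)
        if way["complete"] or emit_incomplete:
            resolved.append(way)
    return resolved


# Way w1 appears before its nodes; way w2 references a node that never appears.
stream = [
    ("way", "w1", ["n1", "n2"]),
    ("node", "n1", (0.0, 0.0)),
    ("node", "n2", (1.0, 1.0)),
    ("way", "w2", ["n1", "n9"]),
]
```

The cache is what makes out-of-order files readable, and it is also the reason for the extra run time noted above.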
Schema Attributes
Use this parameter to expose Format Attributes in FME Workbench when you create a workspace:
- In a dynamic scenario, it means these attributes can be passed to the output dataset at runtime.
- In a non-dynamic scenario, this parameter allows you to expose additional attributes on multiple feature types. Click the browse button to view the available format attributes (which are different for each format) for the reader.
Spatial
Coordinate systems may be extracted from input feature data sources, may come predefined with FME, or may be user-defined. FME allows different output and input coordinate systems, and performs the required coordinate conversions when necessary.
If a coordinate system is specified in both the source format and the workspace, the coordinate system in the workspace is used. The coordinate system specified in the source format is not used, and a warning is logged. If a source coordinate system is not specified in the workspace and the format or system does not store coordinate system information, then the coordinate system is not set for the features that are read.
If a destination coordinate system is set and the feature has been tagged with a coordinate system, then a coordinate system conversion is performed to put the feature into the destination system. This happens right before the feature enters into the writer.
If the destination coordinate system was not set, then the features are written out in their original coordinate system.
If a destination coordinate system is set, but the source coordinate system was not specified in the workspace or stored in the source format, then no conversion is performed. The features are simply tagged with the output system name before being written to the output dataset.
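The rules above amount to a small decision table, sketched here in Python. This is a paraphrase of the documented behavior, not FME code: the function names and return values are hypothetical, None stands for "not set", and the coordinate system names in the usage note are illustrative.

```python
def resolve_source_cs(format_cs, workspace_cs):
    """Pick the source coordinate system and an optional warning message."""
    if workspace_cs and format_cs:
        # The workspace setting wins; the format's value is ignored.
        return workspace_cs, "coordinate system from source format ignored"
    # Either one of them is set, or neither is (result is then None).
    return workspace_cs or format_cs, None


def plan_output(source_cs, destination_cs):
    """Decide what happens just before features reach the writer."""
    if destination_cs is None:
        # No destination set: features keep their original system.
        return "write_as_is", source_cs
    if source_cs is None:
        # Nothing to convert from: features are only tagged.
        return "tag_only", destination_cs
    return "reproject", destination_cs
```

For example, a source tagged LL84 with a destination of UTM83-10 is reprojected, whereas a source with no coordinate system is only tagged with UTM83-10 and its coordinates are left unchanged.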
For systems that know their coordinate system, the Coordinate System field will display Read from Source and FME will read the coordinate system from the source dataset. For most other input sources, the field will display Unknown (which simply means that FME will use default values). In most cases, the default value is all you'll need to perform the translation.
You can always choose to override the defaults and choose a new coordinate system. Select More Coordinate Systems from the drop-down menu to open the Coordinate System Gallery.
Changing a Reprojection
To perform a reprojection, FME typically uses the CS-MAP reprojection engine, which includes definitions for thousands of coordinate systems, with a large variety of projections, datums, ellipsoids, and units. However, GIS applications have slightly different algorithms for reprojecting data between different coordinate systems. To ensure that the data FME writes exactly matches your existing data, you can use the reprojection engine from a different application.
To change the reprojection engine, select Workspace Parameters > Spatial > Reprojection Engine. For example, you can select Esri; the engines available depend on your installed applications.
- The coordinate systems file coordsys.db in the FME installation folder contains the names and descriptions of all predefined coordinate systems.
- Some users may wish to use coordinate systems that do not ship with FME, and in those cases, FME also supports custom coordinate systems.
- Learn more about Working with Coordinate Systems in FME.
When checked, builds geometry for multipolygon relation feature types (for example, relations whose type tag has the value multipolygon).
When unchecked, keeps multipolygon relation feature types as a list attribute containing all the members.
A search envelope (also known as a bounding box) is a rectangular area that defines a geographic area. In FME, the easiest way to define a search envelope is to use search envelope parameters.
Defining a search envelope is the most efficient method of selecting an area of interest because FME will read only the data that is necessary – it does not have to read an entire dataset. Search Envelope parameters apply to both vector and raster datasets and can be particularly efficient if the source format has a spatial index.
Most FME readers have parameters to define the search envelope of data that is being read.
The parameters include the x and y coordinates of the bounding box as well as a parameter that defines the coordinate system.
How to Define the Bounding Box
Using the minimum and maximum x and y parameters, define a bounding box that will be used to filter the input features. Only features that intersect with the bounding box are returned. Note that the bounding box intersection is not a full geometry intersection (based on spatial relationships) that would be returned by a transformer like the SpatialFilter.
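The difference between a bounding box intersection and a full geometry intersection can be shown with a short sketch. It assumes that the test compares the feature's own bounding box with the search envelope; that interpretation, and the function names, are assumptions made for illustration.

```python
def bounds(coords):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def envelope_intersects(coords, envelope):
    """True when the feature's bounding box overlaps the search envelope."""
    f_min_x, f_min_y, f_max_x, f_max_y = bounds(coords)
    min_x, min_y, max_x, max_y = envelope
    return (
        f_min_x <= max_x and f_max_x >= min_x
        and f_min_y <= max_y and f_max_y >= min_y
    )


# A diagonal line from (0, 0) to (10, 10); its bounding box is the whole square.
diagonal = [(0.0, 0.0), (10.0, 10.0)]
# An envelope in the lower-right part of that square, which the line never enters.
search_envelope = (6.0, 1.0, 9.0, 4.0)
```

Here the diagonal line passes the bounding box test even though the line itself never enters the envelope, which is why a transformer such as the SpatialFilter is needed when a true geometric test is required.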
Search Envelope Coordinate System
Specifies the coordinate system of the search envelope if it is different from the coordinate system of the data. The coordinate system associated with the data to be read must always be set if this parameter is set. If this parameter is set, the minimum and maximum points of the search envelope are reprojected from the Search Envelope Coordinate System to the reader’s coordinate system prior to applying the envelope.
Clip to Search Envelope
The underlying function for Use Search Envelope is an intersection; however, when Clip to Search Envelope is checked, a clipping operation is also performed.
When Use Search Envelope is checked, only nodes that fall within the specified bounding box are read into FME. These three options affect the storage of ways that cross from the inside to the outside of the Search Envelope boundary:
- Discard – Any ways that cross the boundary are not read into FME. This is the fastest option.
- Preserve – All ways that cross the boundary are read into FME. Generally, this is the slowest of the three options; however, this is the recommended setting when used in conjunction with Nodes, Ways, and Relation Blocks in Non-Standard Order.
- Preserve within Margin – All ways that are completely within a user-specified margin are read into FME. The margin is specified by two parameters: the amount of padding to be added to the Left/Right sides of the Search Envelope, and the amount of padding to be added to the Top/Bottom of the Search Envelope. The margin units are fixed to be in the LL84 coordinate system.
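The three options can be sketched as a single predicate over a way's vertices. This is an interpretation of the descriptions above rather than the reader's implementation: it assumes a way is a candidate only if at least one of its nodes lies inside the Search Envelope, and that the margin simply pads the envelope on each side. The keep_way function, its mode strings, and the margin tuple are hypothetical.

```python
def inside(point, envelope):
    """True when an (x, y) point lies within (min_x, min_y, max_x, max_y)."""
    x, y = point
    min_x, min_y, max_x, max_y = envelope
    return min_x <= x <= max_x and min_y <= y <= max_y


def keep_way(coords, envelope, mode, margin=(0.0, 0.0)):
    """Decide whether a way is read, given the boundary-crossing mode."""
    flags = [inside(p, envelope) for p in coords]
    if not any(flags):
        return False  # no node inside the envelope at all
    if all(flags):
        return True   # entirely inside: kept in every mode
    # From here on, the way crosses the envelope boundary.
    if mode == "discard":
        return False
    if mode == "preserve":
        return True
    if mode == "preserve_within_margin":
        pad_x, pad_y = margin  # Left/Right padding, Top/Bottom padding
        padded = (
            envelope[0] - pad_x, envelope[1] - pad_y,
            envelope[2] + pad_x, envelope[3] + pad_y,
        )
        return all(inside(p, padded) for p in coords)
    raise ValueError(f"unknown mode: {mode}")


env = (0.0, 0.0, 10.0, 10.0)
crossing = [(5.0, 5.0), (12.0, 5.0)]  # leaves the envelope on the right side
```

With this envelope, the crossing way is dropped by Discard, kept by Preserve, and kept by Preserve within Margin only when the Left/Right padding is at least 2.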