Esri Shapefile Reader Parameters
This parameter controls which character encoding is used to interpret text attributes from the shapefile. This parameter is useful when the character encoding information stored in the shapefile is missing or incorrect.
By default, the character encoding will be automatically detected from the source shapefile (fme-source-encoding).
If you select any other character encoding, it will take precedence over the automatically detected character encoding.
FME supports most encodings.
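The precedence rule above can be illustrated with a minimal Python sketch. It is not FME code: the function names and the `latin-1` fallback are assumptions made for the example, and only the "explicit selection beats detected encoding" rule comes from the text.

```python
from typing import Optional


def resolve_encoding(selected: Optional[str], detected: Optional[str],
                     fallback: str = "latin-1") -> str:
    """Return the encoding to use for text attributes.

    An explicit selection wins over the detected encoding. The value
    "fme-source-encoding" stands for "use whatever was detected".
    The fallback is an assumption for this sketch only.
    """
    if selected and selected != "fme-source-encoding":
        return selected
    return detected or fallback


def decode_attribute(raw: bytes, encoding: str) -> str:
    """Decode one attribute value, replacing bytes that cannot be decoded."""
    return raw.decode(encoding, errors="replace")


raw = "Zürich".encode("cp1252")

# Detected encoding is used when nothing else is selected.
print(decode_attribute(raw, resolve_encoding(None, "cp1252")))

# An explicit (here, wrong) selection takes precedence and garbles the text.
print(decode_attribute(raw, resolve_encoding("utf-8", "cp1252")))
```

The second call shows why the override should only be used when the stored encoding information is known to be missing or incorrect.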
Shapefiles store attributes in a text-based manner, so numeric attributes are converted to text before being written to disk. This parameter controls how FME reads these numeric types and influences how they will be written out at the end of a transformation.
- This parameter defaults to Standard Types, which will cause the reader to convert the text representation of numeric attributes to a binary one, choosing a size that will safely contain all possible values of the fixed-width field. This can be useful when converting from shapefiles to a format that supports binary storage of numbers. Selecting this option will cause FME to produce short and long types for numeric attributes with no decimal places, and float or double types for numeric attributes with decimals.
- The other option available is Explicit Width and Precision, which will keep the attributes as fixed-length text-based fields. This option is preferred when performing a shapefile-to-shapefile translation, as the field widths will remain the same on write. Selecting this option will cause FME to produce the same number(width, precision) types as the underlying dBASE file contains, which will ensure that writing to other fixed-width format types will not cause attribute sizes to increase during a translation.
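To make the difference between the two options concrete, here is a hedged Python sketch. The width thresholds are assumptions chosen so that every value a field of that width could hold fits in the binary type; they are not FME's documented cut-offs, and the fallback for very wide integer fields is hypothetical.

```python
def standard_type(width: int, precision: int) -> str:
    """Pick a binary type able to hold any value of a number(width, precision) field.

    Thresholds are illustrative assumptions, not FME's actual rules.
    """
    if precision > 0:
        # Roughly 7 significant digits fit a 32-bit float, about 15 a double.
        return "float" if width <= 7 else "double"
    if width <= 4:
        # Largest 4-character value is 9999, which fits a 16-bit integer.
        return "short"
    if width <= 9:
        # Largest 9-character value is 999999999, which fits a 32-bit integer.
        return "long"
    # Hypothetical fallback for wider integer fields: keep the fixed-width type.
    return f"number({width},0)"


def explicit_type(width: int, precision: int) -> str:
    """Keep the dBASE field definition unchanged."""
    return f"number({width},{precision})"


for width, precision in [(4, 0), (9, 0), (6, 2), (12, 4)]:
    print((width, precision), standard_type(width, precision),
          explicit_type(width, precision))
```

With the explicit option the field definition survives a round trip unchanged, which is why it suits shapefile-to-shapefile translations.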
Schema Attributes
Use this parameter to expose Format Attributes in FME Workbench when you create a workspace:
- In a dynamic scenario, it means these attributes can be passed to the output dataset at runtime.
- In a non-dynamic scenario, this parameter allows you to expose additional attributes on multiple feature types. Click the browse button to view the available format attributes (which are different for each format) for the reader.
Spatial
Coordinate systems may be extracted from input feature data sources, may come predefined with FME, or may be user-defined. FME allows different output and input coordinate systems, and performs the required coordinate conversions when necessary.
If a coordinate system is specified in both the source format and the workspace, the coordinate system in the workspace is used. The coordinate system specified in the source format is not used, and a warning is logged. If a source coordinate system is not specified in the workspace and the format or system does not store coordinate system information, then the coordinate system is not set for the features that are read.
If a destination coordinate system is set and the feature has been tagged with a coordinate system, then a coordinate system conversion is performed to put the feature into the destination system. This happens right before the feature enters into the writer.
If the destination coordinate system was not set, then the features are written out in their original coordinate system.
If a destination coordinate system is set, but the source coordinate system was not specified in the workspace or stored in the source format, then no conversion is performed. The features are simply tagged with the output system name before being written to the output dataset.
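The rules above can be summarized as a small decision function. This is a sketch of the logic as described in the text, not FME's implementation; the `Plan` type and the function name are hypothetical.

```python
from typing import NamedTuple, Optional


class Plan(NamedTuple):
    source_cs: Optional[str]   # system the features are tagged with on read
    output_cs: Optional[str]   # system the features carry when written
    reproject: bool            # whether a coordinate conversion is performed
    warn: bool                 # whether a warning is logged


def plan_coordinate_handling(workspace_src: Optional[str],
                             format_src: Optional[str],
                             destination: Optional[str]) -> Plan:
    # The workspace setting wins over the one stored in the source format.
    source = workspace_src or format_src
    # A warning is logged when both are specified.
    warn = bool(workspace_src and format_src)

    if destination is None:
        # No destination set: features keep their original system.
        return Plan(source, source, False, warn)
    if source is None:
        # Nothing to convert from: features are only tagged with the output name.
        return Plan(None, destination, False, warn)
    # Both known: convert just before the features enter the writer.
    return Plan(source, destination, True, warn)


print(plan_coordinate_handling("UTM83-10", "LL84", "LL84"))
print(plan_coordinate_handling(None, None, "LL84"))
```

The coordinate system names in the usage lines are placeholders for whatever systems a workspace actually uses.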
For systems that know their coordinate system, the Coordinate System field will display Read from Source and FME will read the coordinate system from the source dataset. For most other input sources, the field will display Unknown (which simply means that FME will use default values). In most cases, the default value is all you'll need to perform the translation.
You can always choose to override the defaults and choose a new coordinate system. Select More Coordinate Systems from the drop-down menu to open the Coordinate System Gallery.
Changing a Reprojection
To perform a reprojection, FME typically uses the CS-MAP reprojection engine, which includes definitions for thousands of coordinate systems, with a large variety of projections, datums, ellipsoids, and units. However, GIS applications have slightly different algorithms for reprojecting data between different coordinate systems. To ensure that the data FME writes exactly matches your existing data, you can use the reprojection engine from a different application.
To change the reprojection engine, select Workspace Parameters > Spatial > Reprojection Engine. For example, you can select Esri, although the choices available depend on your installed applications.
- The coordinate systems file coordsys.db in the FME installation folder contains the names and descriptions of all predefined coordinate systems.
- Some users may wish to use coordinate systems that do not ship with FME, and in those cases, FME also supports custom coordinate systems.
- Learn more about Working with Coordinate Systems in FME.
When creating donut geometries, this parameter specifies the criteria that FME will use to detect the geometric properties of the donut(s).
- Orientation Only – FME will detect donut geometry based only on the orientation of the rings. The Shapefile specification states that outer boundaries of donut geometries must have clockwise orientation, and any donut holes must have counter-clockwise orientation. This option preserves the order of the areas.
- Orientation and Spatial Relationship – FME will detect donut geometry initially by orientation, and will perform additional geometric validation by analyzing the spatial relationships between the donut’s outer rings and holes. If any invalid donut geometries are identified, FME will attempt to correct geometric anomalies (for example, holes larger than outer ring, holes within holes, etc.). If your dataset is supposed to have holes but FME does not correctly produce them, select Orientation and Spatial Relationship.
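The orientation test itself can be sketched with the shoelace formula: in an x-right, y-up coordinate system, a ring with negative signed area is clockwise. The following Python sketch illustrates the idea only. It is not FME's implementation, the function names are made up for the example, and it assumes rings are closed (first vertex repeated as the last).

```python
from typing import List, Tuple

Ring = List[Tuple[float, float]]


def signed_area(ring: Ring) -> float:
    """Shoelace formula: positive for counter-clockwise, negative for clockwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def classify_ring(ring: Ring) -> str:
    """Classify a closed ring by orientation, following the Shapefile convention."""
    area = signed_area(ring)
    if area < 0:
        return "outer"       # clockwise
    if area > 0:
        return "hole"        # counter-clockwise
    return "degenerate"      # zero area, orientation undefined


outer = [(0, 0), (0, 10), (10, 10), (10, 0), (0, 0)]   # clockwise
hole = [(2, 2), (8, 2), (8, 8), (2, 8), (2, 2)]        # counter-clockwise

print(classify_ring(outer), classify_ring(hole))
```

An orientation-only test like this cannot tell whether a hole actually lies inside its outer ring, which is the gap the Orientation and Spatial Relationship option is described as covering.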
This parameter controls the handling of measures data associated with geometric data or attributes on the incoming features:
- No – Preserves the measures on the features. This is the default.
- Yes – Measures data is created from the z values on the incoming features, if the z values exist. If measures data exists, it is not overwritten by the z values on the feature.
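As a rough illustration of the Yes behavior, the sketch below copies z into a measure only where z exists and no measure is already present. The vertex representation (a dict with optional `z` and `m` keys) and the function name are assumptions for the example. Whether FME keeps the z value afterwards is not stated above, so the sketch leaves it in place.

```python
from typing import Dict, List

Vertex = Dict[str, float]


def apply_measures_from_z(vertices: List[Vertex], enabled: bool) -> List[Vertex]:
    """Return copies of the vertices, creating measures from z when enabled."""
    result = []
    for vertex in vertices:
        vertex = dict(vertex)  # do not modify the caller's data
        # Existing measures are never overwritten by z values.
        if enabled and "z" in vertex and "m" not in vertex:
            vertex["m"] = vertex["z"]
        result.append(vertex)
    return result


vertices = [
    {"x": 0.0, "y": 0.0, "z": 5.0},            # z only: measure is created
    {"x": 1.0, "y": 0.0, "z": 6.0, "m": 1.5},  # measure exists: kept as-is
    {"x": 2.0, "y": 0.0},                      # no z: nothing to create
]
print(apply_measures_from_z(vertices, enabled=True))
```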
This parameter controls whether the reader reports geometric anomalies in input shapefiles.
To ensure the validity of input features, the reader will close unclosed polygons and replace empty elements with null geometry.
If this option is set to Yes, the shape_geometry_error attribute will be set on input features, and will contain error codes as geometric anomalies are detected and/or fixed. If multiple codes exist on a feature, they will be comma-separated. The error codes are listed below:
| Error Code | Description |
|---|---|
| INVALID_POLYGON_CLOSED | This polygon did not specify an end vertex that corresponded to the first vertex. The geometry has been fixed to ensure validity by adding an additional point that closes the polygon. |
| INVALID_GEOMETRY_EMPTY | This geometry had 0 parts/points and was collapsed into a Null shape instead (for example, a Multipoint with 0 points contained). The geometry has been changed. |
| INVALID_POLYGON_ORIENTATION | This polygon does not follow the Shapefile specification of vertex ordering. Geometry is not changed, but donuts may not have been formed correctly. This anomaly is only produced when the reader parameter Donut Geometry Detection is set to Orientation Only. |
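Downstream logic, for example in a scripted step, may need to split the comma-separated shape_geometry_error value and decide whether the geometry was altered. The Python sketch below assumes the attribute value is available as a plain string. The helper names are hypothetical, and the changed/unchanged flags are taken from the descriptions in the table.

```python
from typing import List, Optional

# True where the table says the reader changed the geometry.
GEOMETRY_CHANGED = {
    "INVALID_POLYGON_CLOSED": True,
    "INVALID_GEOMETRY_EMPTY": True,
    "INVALID_POLYGON_ORIENTATION": False,
}


def parse_geometry_errors(value: Optional[str]) -> List[str]:
    """Split a comma-separated shape_geometry_error value into codes."""
    if not value:
        return []
    return [code.strip() for code in value.split(",") if code.strip()]


def geometry_was_changed(value: Optional[str]) -> bool:
    """Report whether any listed code indicates a modified geometry."""
    return any(GEOMETRY_CHANGED.get(code, False)
               for code in parse_geometry_errors(value))


value = "INVALID_POLYGON_CLOSED,INVALID_POLYGON_ORIENTATION"
print(parse_geometry_errors(value), geometry_was_changed(value))
```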
A search envelope (also known as a bounding box) is a rectangle that defines a geographic area of interest. In FME, the easiest way to define a search envelope is to use search envelope parameters.
Defining a search envelope is the most efficient method of selecting an area of interest because FME will read only the data that is necessary – it does not have to read an entire dataset. Search Envelope parameters apply to both vector and raster datasets and can be particularly efficient if the source format has a spatial index.
Most FME readers have parameters to define the search envelope of the data being read. These include the minimum and maximum x and y coordinates of the bounding box, as well as a parameter that defines the coordinate system.
How to Define the Bounding Box
Using the minimum and maximum x and y parameters, define a bounding box that will be used to filter the input features. Only features that intersect with the bounding box are returned. Note that the bounding box intersection is not a full geometry intersection (based on spatial relationships) that would be returned by a transformer like the SpatialFilter.
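The difference between an envelope test and a full geometry intersection can be shown with a short Python sketch. It assumes the filter compares each feature's own bounding box with the search envelope, which is one common reading of "bounding box intersection". The function names are made up for the example.

```python
from typing import List, Tuple

Envelope = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def envelope_of(points: List[Tuple[float, float]]) -> Envelope:
    """Compute the bounding box of a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def envelopes_intersect(a: Envelope, b: Envelope) -> bool:
    """True when two axis-aligned rectangles overlap or touch."""
    a_min_x, a_min_y, a_max_x, a_max_y = a
    b_min_x, b_min_y, b_max_x, b_max_y = b
    return (a_min_x <= b_max_x and b_min_x <= a_max_x and
            a_min_y <= b_max_y and b_min_y <= a_max_y)


search = (6.0, 0.0, 10.0, 4.0)
diagonal = [(0.0, 0.0), (10.0, 10.0)]   # never enters the search rectangle
far_away = [(20.0, 20.0), (30.0, 30.0)]

print(envelopes_intersect(envelope_of(diagonal), search))
print(envelopes_intersect(envelope_of(far_away), search))
```

The diagonal line passes the envelope test even though the line itself never crosses the search rectangle; a true spatial test such as the SpatialFilter would be needed to exclude it.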
| Parameter | Description |
|---|---|
| Search Envelope Coordinate System | Specifies the coordinate system of the search envelope if it is different from the coordinate system of the data. The coordinate system associated with the data to be read must always be set if this parameter is set. If this parameter is set, the minimum and maximum points of the search envelope are reprojected from the Search Envelope Coordinate System to the reader’s coordinate system prior to applying the envelope. |
| Clip to Search Envelope | The underlying function for Use Search Envelope is an intersection; however, when Clip to Search Envelope is checked, a clipping operation is also performed. |
Advanced
This parameter specifies whether the reader should trim preceding spaces of attribute values:
- Yes – Preceding spaces in attribute values will be discarded. This is the default.
- No – Preceding spaces will be left intact.
Some programs write Shapefiles with non-standard padding characters in the .dbf.
If FME is reading 0 for values instead of a null or missing value, setting this option to Blank may correct this. In order to produce results similar to what ArcGIS reads, the default value for this option is Zero (0).
Whether these values are read as null or missing is controlled by the Read Blank Fields as parameter.
Shapefile makes no distinction between null and missing attribute values. This option determines how FME will read features for which the attribute has no value in the .dbf.
- Missing (default) – Empty values will be read back as missing attributes on the output feature.
- Null – Empty values will be read back as null attributes on the output feature.
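The interaction between the padding option (Zero or Blank) and the Read Blank Fields as option (Missing or Null) can be sketched as follows. This is an illustration only: the function name, the `MISSING` sentinel, and the set of characters treated as non-standard padding are assumptions, since the text does not say which padding characters occur.

```python
MISSING = object()  # sentinel standing in for "attribute absent from the feature"


def read_numeric_field(raw: bytes, padding_as: str = "Zero",
                       blank_as: str = "Missing",
                       padding_chars: str = "\x00*"):
    """Interpret one fixed-width numeric .dbf field.

    padding_chars is a hypothetical set of non-standard padding characters.
    """
    text = raw.decode("ascii", errors="replace")
    blank_value = MISSING if blank_as == "Missing" else None

    if text.strip(" ") == "":
        # Standard space-filled field: no value at all.
        return blank_value
    if text.strip(" " + padding_chars) == "":
        # Field filled only with non-standard padding.
        return 0 if padding_as == "Zero" else blank_value
    return float(text.strip(" "))


print(read_numeric_field(b"\x00\x00\x00\x00"))                      # padding read as zero
print(read_numeric_field(b"\x00\x00\x00\x00", padding_as="Blank",
                         blank_as="Null"))                          # padding read as null
print(read_numeric_field(b"  12.5"))
```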
Shapefile uses strings to store floating point numbers. This takes more storage space, but avoids some precision loss from binary encoding. For efficiency, FME reads float and double type attributes into binary fields which are faster to manipulate, but may result in precision loss when values have a high number of significant figures versus the width of the field in the .dbf. This option controls whether this conversion will take place during the read operation.
- Yes – The reader will produce string-backed values for all float and double attributes, allowing exact representations of values in the .dbf file. Note: These values may be converted to binary representation anyway if any transformers operate on them, or when writing to other formats that support binary representations for float and double types. Enabling this option may have an impact on performance.
- No (default) – The reader will produce native binary-backed values for all float and double attributes.
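The precision loss described above is a general property of binary floating point and can be reproduced in a few lines of Python. The sample value is made up for illustration; `Decimal` stands in for a string-backed value, and `float` for a binary-backed double.

```python
from decimal import Decimal

# A value as it might appear in a wide .dbf numeric field (19 significant digits).
raw = "1234567890.123456789"

string_backed = Decimal(raw)   # keeps every digit exactly as stored
binary_backed = float(raw)     # nearest 64-bit double, roughly 15-17 digits

print(string_backed)
print(repr(binary_backed))

# Converting the double back to an exact decimal shows it is a different number.
print(Decimal(binary_backed) == string_backed)
```

The final comparison prints `False`, because the double can only hold the nearest representable value rather than all 19 digits.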
If the contents of a .shz file contain international characters, the reader can incorrectly interpret the Unicode-encoded filename strings. This can result in a failed translation due to seemingly mismatched base names:
SHAPEFILE reader: Shapefile Reader: Failed to open filename.shz for reading. A .shz must have the same base name as the shapefile it compresses.
This option ensures that the reader will read the .shz and contained files.