Esri Shapefile Reader/Writer
The Esri® Shapefile Reader/Writer provides FME with access to data in Esri’s Shapefile format.
The Shapefile format is a geospatial vector data format for geographic information system (GIS) software. It is developed and regulated by Esri as a (mostly) open specification for data interoperability among Esri and other GIS software products.
For example, geographic features (campsites, campgrounds, forests, roads) can be represented in a shapefile by points, lines, and polygons (areas). Each item usually has attributes that describe it, such as name, temperature, or XY coordinates.
Overview
Esri Shapefiles store both geometry and attributes for features. No topological information is stored.
A shapefile is a logical construct that consists of a series of physical files with different extensions. These extensions are added to the base name of the shapefile. All files must reside in the same folder.
Shapefile does not support binary integer or floating point data types. Instead, it supports a number(x,y) data type. The equivalent data types are:
- short integer (16-bit): number(6,0)
- long integer (32-bit): number(11,0)
- float (32-bit): number(13,11)
- double (64-bit): number(19,11)
Floating point values may lose precision in trailing decimals in order to store them in the space given for the values.
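The precision loss can be illustrated with a minimal Python sketch. The helper `to_number_field` is hypothetical (it is not part of FME or of any Shapefile library) and assumes plain fixed-point text formatting in which trailing decimals are dropped until the value fits the field width. Actual writers may round or truncate differently.

```python
def to_number_field(value: float, width: int, decimals: int) -> str:
    """Format value as fixed-point text that fits a number(width, decimals) field.

    Hypothetical helper for illustration only: trailing decimals are dropped
    until the text fits; a value whose integer part is too wide raises an error.
    """
    for d in range(decimals, -1, -1):
        text = f"{value:.{d}f}"
        if len(text) <= width:
            # Right-justify so the result always occupies the full field width.
            return text.rjust(width)
    raise ValueError(f"{value!r} does not fit in number({width},{decimals})")


# A small double keeps all 11 decimals because it fits in 19 characters.
print(to_number_field(3.141592653589793, 19, 11))
# A large value gives up trailing decimals to stay within the field width.
print(to_number_field(123456789012.123456, 19, 11))
```

In the second call the twelve-digit integer part leaves room for only six decimals in a 19-character field, which is the kind of loss described above.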
File Extension | Contents |
---|---|
.shp | Shape format – the feature geometry itself. This is a variable-record-length file in which each record describes a shape (feature) with a list of its vertices. A single .shp file can contain only one type of geometry. Supported geometries are point, multipoint, polyline, polygon, and multipatch. Each entity in a .shp file has a corresponding entry in the .shx index file and a corresponding row of attributes in the associated .dbf file. The order of the entries in each of these files is synchronized. For example, the third geometric entity in the .shp file is pointed to by the third entry in the .shx index file and has the attributes held in the third row of the .dbf file. |
.shx | Index file that stores the index of the feature geometry. In the .shx file, each record contains the offset of the corresponding main file record from the beginning of the main file, and the size of the record in the main file. These index files are optional, but increase the reliability and speed of reading features. FME will always write them. |
.dbf | The dBASE file (.dbf) contains feature attributes, with one record per feature – that is, a one-to-one relationship between a record in the main file and its attributes in the dBASE file – based on record number. For example, if the type of geometry in the main file is multipoint, the .dbf file will have one row for each set of points held in the main file. If the type of geometry in the main file is point, there will be one row in the .dbf file for each point. Attribute records in the dBASE file must be in the same order as records in the main file. If no .dbf is available for the target .shp, geometry will be produced but features will have no attributes associated with them. Any single DBF (attribute) file can have a maximum file size of 2 GB, a limit imposed by the dBASE III specification. Files larger than 2 GB may be readable, but are not officially supported. Files larger than 2 GB are not writable, and attempting to write one will produce an error message. |
.sbn and .sbx | Spatial index for the geometric data. These two files will not be written unless Write Spatial Index is selected in the Shapefile writer parameters dialog. |
.atx | Attribute index for the geometric data. These files are named as filename.attributename.atx. Attribute indexes are created for any user attributes that are flagged for indexing. For more information, see Esri Shapefile User Attributes. |
.shz | Zipped file that contains all the files that comprise a Shapefile dataset. For example, coast.shz will contain coast.shp, coast.dbf, coast.shx, and optionally other shapefiles if applicable. |
.prj | Projection information for spatial referencing geometry to location. If present, will be used by FME to georeference geometry. |
.cpg | Encoding information for attributes. If present, will be used by FME to automatically determine the correct character encoding for attribute names and values. |
As a minimum, a .shp or .shz must be present to read any features. If a .dbf is present, attributes will be read on features.
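Before reading, it can be useful to check which companion files actually sit next to a .shp file. The following is a minimal sketch using only the Python standard library; `find_components` and `SIDECAR_EXTENSIONS` are hypothetical names introduced for illustration, and the check assumes lowercase extensions and the same base name in the same folder.

```python
from pathlib import Path

# Extensions taken from the table above. ".atx" is omitted because those
# files are named <base>.<attribute>.atx rather than <base>.atx.
SIDECAR_EXTENSIONS = (".shx", ".dbf", ".sbn", ".sbx", ".prj", ".cpg")


def find_components(shp_path: str) -> dict[str, bool]:
    """Report which companion files exist beside a .shp file."""
    shp = Path(shp_path)
    # with_suffix swaps ".shp" for each companion extension in the same folder.
    return {ext: shp.with_suffix(ext).exists() for ext in SIDECAR_EXTENSIONS}


# "data/coast.shp" is a placeholder path, not a file shipped with FME.
components = find_components("data/coast.shp")
if not components[".dbf"]:
    print("No .dbf found: features would be read without attributes.")
if not components[".prj"]:
    print("No .prj found: geometry would have no coordinate system information.")
```

On case-sensitive file systems a file named coast.DBF would not be detected by this sketch, so a real check may need to compare extensions case-insensitively.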
Shapefile datasets larger than 2 GB are considered invalid (and were probably not created with Esri software), due to the following:
- Internal pointers between the index file (.shx) and main file (.shp) are stored as signed 32-bit integers. This is a limitation of operating system architecture.
- Attribute files (.dbf) also have a 2 GB size limit.
- The main file (.shp) header contains information on the size of the file, specified as a signed integer. Writing a shapefile dataset greater than 2 GB would invalidate the file header.
Because offsets and lengths in the index are measured in 16-bit "words" rather than bytes, FME can read and write 4 GB files. However, these files may not function properly with other applications. Further, on some 32-bit operating systems, there is no way to reference a location in a file more than 2 GB from the beginning.
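The word-based arithmetic can be sketched in Python. This example relies on the published Shapefile layout (a 100-byte header beginning with file code 9994, the file length at bytes 24 to 27, and 8-byte .shx records, all stored as big-endian 32-bit integers counted in 16-bit words). The function names are hypothetical, and the sketch does not reflect how FME itself is implemented.

```python
import struct

WORD_BYTES = 2          # the format counts sizes and offsets in 16-bit words
MAX_INT32 = 2**31 - 1   # largest signed 32-bit value

# Largest byte position a signed 32-bit word count can describe: just under 4 GB.
MAX_ADDRESSABLE_BYTES = MAX_INT32 * WORD_BYTES


def shp_file_length_bytes(header: bytes) -> int:
    """Return the file length recorded in a 100-byte .shp or .shx header, in bytes."""
    if len(header) < 100:
        raise ValueError("a shapefile header is 100 bytes long")
    (file_code,) = struct.unpack(">i", header[0:4])
    if file_code != 9994:
        raise ValueError("not a shapefile header")
    (length_in_words,) = struct.unpack(">i", header[24:28])
    return length_in_words * WORD_BYTES


def shx_records(index: bytes):
    """Yield (byte_offset, content_byte_length) for each 8-byte .shx record."""
    for pos in range(100, len(index) - 7, 8):
        offset_words, length_words = struct.unpack(">ii", index[pos:pos + 8])
        yield offset_words * WORD_BYTES, length_words * WORD_BYTES


print(MAX_ADDRESSABLE_BYTES)
```

A reader that treats these values as byte counts, or that stores byte positions in a signed 32-bit integer, stops at 2 GB, which is one reason files between 2 GB and 4 GB may not work in other applications.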
If your dataset grows beyond 2 GB, consider switching to a different format that can more easily handle the size.
Shapefiles can hold both two- and three-dimensional geometry, as well as an optional measure value on each vertex. However, all features within a single shapefile will have the same dimensionality. Note that while older Esri products may only support two-dimensional shapefiles, FME can read and write both two- and three-dimensional shapefiles. FME can also handle measure data associated with features.
Note: Aggregate linear features and aggregate polygonal features may be created using the Aggregator transformer. They may be broken into their component pieces for output to formats that do not support aggregation using the Deaggregator transformer.
Note: If a polygon containing holes is written to a Shapefile, any adjacent holes will be merged into a single hole before the polygon is output.
If the FME feature contains an "unnamed" measure and the destination feature type is set to 2D + Measures or 3D + Measures, then FME will write the measure.
In the FME Data Inspector, these measures are labeled <default_measure>. If the feature has only a named measure (for example, distanceMeasure), the Shapefile writer will ignore it, and the measures on the destination geometry will be undefined.
The Shapefile reader will automatically load the <default_measure> if a geometry type with measures is read.
Note: Some 2D + Measure and 3D + Measure files contain records that do not include measure data. If the Shapefile reader encounters a record that does not have measures, the reader will not produce measures for that feature. If a feature with no measures is written by a Shapefile writer set to 2D + Measure or 3D + Measure mode, that record will not have measures.
Reader Overview
The Shapefile reader produces FME features for all feature data held in shapefiles that reside in the specified folder.
- Specify Reader Format (Esri Shapefile) and Dataset (.shp file).
- Optional: Specify Esri Shapefile Reader Parameters.
- Click OK.
The Shapefile reader extracts features one at a time from the file and passes them on to the rest of FME for further processing. When the file is exhausted, the Shapefile reader starts on the next file in the folder.
Writer Overview
The Shapefile writer creates and writes feature data to shapefiles in the folder specified in the writer dataset field.
- Specify Writer Format (Esri Shapefile) and Dataset (folder name).
- Optional: Specify Esri Shapefile Writer Parameters.
- Click OK.
Any old shapefiles in the folder are overwritten with the new feature data. As features are routed to the Shapefile writer by FME, it determines the file they are to be written to and outputs them according to the type of the file.
Many shapefiles can be written during a single FME session.