Esri Shapefile Troubleshooting
The Short, Long, Float, and Double attribute types on a shapefile are a common source of confusion. A shapefile stores attribute values as fixed-length text fields, not in a binary representation, whereas other ArcGIS formats do allow binary storage of numeric attributes. As a result, the field definitions FME writes can vary depending on the source schema.
Shapefile Precision and Scale vs. FME Width and Decimals
In ArcGIS, the precision of a shapefile's number field is the total number of characters that can be stored in that field. This includes the decimal point and the negative sign, if present, as well as the digits 0 through 9. In FME, this is referred to as the Width of the field. Scale (in ArcGIS) and Decimals (in FME) both represent the number of digits that come after the decimal point. ArcGIS does have some quirks when displaying these, however:
- A Float with Precision and Scale of 13, 11 will display in ArcGIS as Float(0,0), but FME reports an fme_decimal(13,11).
- Doubles with Precision and Scale of 19, 11 will display in ArcGIS as Double(0,0), but FME reports an fme_decimal(19,11).
- Since ArcGIS 10.3.1, an integer field of width 5 is shown as Long Integer, even though the default width of a Short Integer is 5. As a result, a newly created attribute with the Short Integer type and an unspecified width will be displayed as a Long Integer with a width of 5 as soon as the properties window is reopened.
The Float(0,0) and Double(0,0) displays are a result of ArcGIS supporting a number of other formats: a field with 0,0 as its precision and scale is stored in binary only if the format supports it, and Shapefile is not one of these formats. In a shapefile, these fields are the same as Float(13,11) and Double(19,11) respectively. If Numeric Attribute Type Handling is set to Preserve Fixed-Width Numeric Field Size, FME will always report the actual width and decimals of the field.
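As a rough illustration of how Width and Decimals are counted, the following Python sketch measures a value as it would appear in a text field. The function name `width_and_decimals` is hypothetical and is not part of FME or ArcGIS; it only demonstrates the counting rule described above.

```python
def width_and_decimals(text: str) -> tuple[int, int]:
    """Return (width, decimals) for a number written as plain text.

    Width counts every character, including the sign and decimal point.
    Decimals counts only the digits after the decimal point.
    """
    text = text.strip()
    width = len(text)
    decimals = len(text.split(".", 1)[1]) if "." in text else 0
    return width, decimals


print(width_and_decimals("-123.45"))  # (7, 2)
print(width_and_decimals("32767"))    # (5, 0)
```

Here "-123.45" needs a width of 7 because the negative sign and the decimal point each occupy one character.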
Widths of Numeric Attributes are Inflated on Write
When FME writes to shapefile, it has to convert any numeric attributes from binary to text, and in order to do so, selects a field width that can represent the full range of values of that binary representation.
This can mean that a 16-bit integer that was originally stored as 3 characters in a shapefile will be written out as a 6-character field. Since the range of values of a 16-bit integer is -32,768 to 32,767, in order to represent the negative end of that range as text, 6 characters are required: -32768 (since without the negative sign the number will be read back as positive).
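The arithmetic behind this inflation can be sketched in Python. This is only an illustration of the reasoning above for signed integers; the helper name `text_width_for_signed_int` is hypothetical, and the widths FME actually selects for each type may differ.

```python
def text_width_for_signed_int(bits: int) -> int:
    """Characters needed to write any value of a signed integer type as text."""
    lowest = -(2 ** (bits - 1))      # e.g. -32768 for 16 bits
    highest = 2 ** (bits - 1) - 1    # e.g.  32767 for 16 bits
    # The negative end is the widest because of the leading minus sign.
    return max(len(str(lowest)), len(str(highest)))


for bits in (16, 32, 64):
    print(bits, text_width_for_signed_int(bits))
# 16 6
# 32 11
# 64 20
```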
If you notice that FME is inflating the field width of your numeric attributes in an existing workspace, you can do the following:
- Open the FME Workbench Navigator pane.
- Right-click on the Shapefile reader and click Update Reader.
- Click the Parameters button and change the Numeric Attribute Type Handling parameter to Explicit Width and Precision. FME will then keep the attributes as fixed-width numeric fields rather than converting to binary representation and making the writer guess the field width.
Files With Many Attributes Cannot be Read With ArcGIS
ArcGIS has a limitation on the number of bytes used to store feature attribution – that is, the sum of all attribute widths in a dataset. For example, the following schema would take 274 bytes to store in the .dbf file:
- ID: number(5,0)
- Name: varchar(15)
- Description: varchar(254)
The maximum number of bytes officially supported by ArcGIS is 4000, but it is possible to write datasets with FME that require more. At runtime, a warning will be logged if the dataset would exceed this limit. If you encounter errors reading shapefile datasets in ArcGIS that were written with FME, check the writer’s schema for affected feature types, then remove unneeded attributes or reduce attribute widths to the minimum required.
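A schema can be checked against this limit before writing by summing the attribute widths. The sketch below uses the example schema from this section; the dictionary layout and the names `record_width` and `ARCGIS_RECORD_LIMIT` are assumptions made for illustration and are not an FME API.

```python
ARCGIS_RECORD_LIMIT = 4000  # bytes, the limit stated in this section


def record_width(schema: dict[str, int]) -> int:
    """Sum the attribute widths (in bytes) of a schema."""
    return sum(schema.values())


# Attribute name -> field width, taken from the example schema.
schema = {"ID": 5, "Name": 15, "Description": 254}

total = record_width(schema)
print(total)                        # 274
print(total > ARCGIS_RECORD_LIMIT)  # False
```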
For example, if a varchar attribute is set to a width of 254 by default, but the longest value is only 50 bytes long, it may be possible to reduce the width and still preserve all strings. (Note: Non-ASCII UTF-8 characters take more than 1 byte to represent.)
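To find the smallest safe width for a text attribute, measure the values in bytes, not characters. The following is a minimal sketch that assumes the values will be written as UTF-8; the function name `max_utf8_bytes` and the sample values are hypothetical.

```python
def max_utf8_bytes(values: list[str]) -> int:
    """Return the length in bytes of the longest value when encoded as UTF-8."""
    return max((len(v.encode("utf-8")) for v in values), default=0)


names = ["Oslo", "Zürich"]
print(len("Zürich"))          # 6 characters
print(max_utf8_bytes(names))  # 7 bytes, because "ü" takes 2 bytes in UTF-8
```

If the dataset is written with a different character encoding, the byte counts should be measured with that encoding instead.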
Some versions of ArcGIS may be able to read some files with a record length of more than 4000 bytes, but because 4000 bytes is the stated limitation, this is the threshold at which FME will warn.