CSV (Comma Separated Value) Reader Parameters
Dataset Parameters
This parameter controls the feature type naming scheme and, with it, the number of feature types the reader generates.
Options:
- From File Name(s) – Generates one feature type per source filename.
- From Format Name – Produces only a single feature type containing the format name.
Fields
Delimiter Character
The sequence of one or more characters specified as the delimiter between values.
Field Names Line
The line number that contains the field names. Note that the first line in the file is considered to be line number 1. If the file does not contain field names, leave this blank.
When the file does not contain field names, the columns of the CSV table are given default names (for example, col0, col1, ..., colN).
Data Start Line
The line number at which the data starts. Note that the first line in the file is considered to be line number 1.
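As a rough illustration of how the two line numbers interact, the following Python sketch treats the first line as line 1 and falls back to col0, col1, ... when no field names line is given. This is not FME code: the function name read_csv_lines and its arguments are hypothetical, and the sketch assumes no quoted newlines, so that each parsed row corresponds to one line of the file.

```python
import csv
import io

def read_csv_lines(text, field_names_line=None, data_start_line=1, delimiter=","):
    """Hypothetical helper: parse CSV text using 1-based line numbers."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    data = rows[data_start_line - 1:]  # line 1 is rows[0]
    if field_names_line is not None:
        names = rows[field_names_line - 1]
    else:
        # No field names line: generate default names col0, col1, ...
        width = max((len(row) for row in data), default=0)
        names = [f"col{i}" for i in range(width)]
    return [dict(zip(names, row)) for row in data]

sample = "id,city\n1,Vancouver\n2,Surrey\n"
with_header = read_csv_lines(sample, field_names_line=1, data_start_line=2)
without_header = read_csv_lines("1,Vancouver\n2,Surrey\n")
```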
Preview
Shows a preview of the input CSV dataset, as read with the current options.
Attributes
Shows the schema of the dataset, as read with the current options:
| Read | Name | Type |
|---|---|---|
| Whether to read this field as an attribute. | The name of the attribute. | The type of the attribute. |
Schema Attributes
Use this parameter to expose Format Attributes in FME Workbench when you create a workspace:
- In a dynamic scenario, it means these attributes can be passed to the output dataset at runtime.
- In a non-dynamic scenario, this parameter allows you to expose additional attributes on multiple feature types. Click the browse button to view the available format attributes (which are different for each format) for the reader.
Spatial
When selected, the feature type will produce point geometries. At least the X and Y coordinates must be supplied in order to build the points. Z is optional and, when specified, will produce 3D points if the X and Y coordinates are also set.
The first time this group is enabled, FME will attempt to detect appropriate column names in the file and correlate them to the X Coordinate Attribute, Y Coordinate Attribute, Z Coordinate Attribute, and Coordinate System parameters. You can override this guess by manually selecting attribute names.
For more information, see Latitude/Longitude and x, y, z coordinates.
Specifies the attribute from which points take their X values.
Specifies the attribute from which points take their Y values.
Specifies the attribute from which points take their Z values.
This value is optional.
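The relationship between the X, Y, and optional Z attributes can be sketched in Python as follows. The function build_point is hypothetical, and returning None when X or Y is unusable is an assumption made for the example, not documented FME behavior.

```python
def build_point(row, x_attr, y_attr, z_attr=None):
    """Hypothetical helper: return (x, y), (x, y, z), or None if X or Y is unusable."""
    try:
        x = float(row[x_attr])
        y = float(row[y_attr])
    except (KeyError, TypeError, ValueError):
        return None  # assumption: no point without both X and Y
    if z_attr is not None:
        try:
            return (x, y, float(row[z_attr]))  # 3D point
        except (KeyError, TypeError, ValueError):
            pass  # Z is optional, so fall back to a 2D point
    return (x, y)

point_2d = build_point({"x": "-123.1", "y": "49.3"}, "x", "y")
point_3d = build_point({"x": "-123.1", "y": "49.3", "z": "70"}, "x", "y", "z")
```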
Coordinate systems may be extracted from input feature data sources, may come predefined with FME, or may be user-defined. FME allows different output and input coordinate systems, and performs the required coordinate conversions when necessary.
If a coordinate system is specified in both the source format and the workspace, the coordinate system in the workspace is used. The coordinate system specified in the source format is not used, and a warning is logged. If a source coordinate system is not specified in the workspace and the format or system does not store coordinate system information, then the coordinate system is not set for the features that are read.
If a destination coordinate system is set and the feature has been tagged with a coordinate system, then a coordinate system conversion is performed to put the feature into the destination system. This happens right before the feature enters into the writer.
If the destination coordinate system was not set, then the features are written out in their original coordinate system.
If a destination coordinate system is set, but the source coordinate system was not specified in the workspace or stored in the source format, then no conversion is performed. The features are simply tagged with the output system name before being written to the output dataset.
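The rules in the preceding paragraphs can be summarized as a small decision function. This Python sketch only restates the text above. The function name, its arguments, and the returned record are hypothetical, and the coordinate system names are used purely as example strings.

```python
def plan_coordinate_handling(format_cs=None, workspace_cs=None, destination_cs=None):
    """Hypothetical sketch: decide the source system, output system, and whether to convert."""
    # The workspace setting takes precedence over the system stored in the source format.
    source_cs = workspace_cs if workspace_cs is not None else format_cs
    # A warning is logged when both are specified, because the format's system is ignored.
    warning = workspace_cs is not None and format_cs is not None
    if destination_cs is None:
        output_cs, convert = source_cs, False      # written in the original system
    elif source_cs is None:
        output_cs, convert = destination_cs, False  # tagged only, no conversion
    else:
        output_cs, convert = destination_cs, True   # converted before the writer
    return {"source": source_cs, "output": output_cs, "convert": convert, "warning": warning}

both_set = plan_coordinate_handling("LL84", "UTM83-10", "LL84")
no_source = plan_coordinate_handling(destination_cs="LL84")
```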
For systems that know their coordinate system, the Coordinate System field will display Read from Source and FME will read the coordinate system from the source dataset. For most other input sources, the field will display Unknown (which simply means that FME will use default values). In most cases, the default value is all you'll need to perform the translation.
You can always choose to override the defaults and choose a new coordinate system. Select More Coordinate Systems from the drop-down menu to open the Coordinate System Gallery.
Changing a Reprojection
To perform a reprojection, FME typically uses the CS-MAP reprojection engine, which includes definitions for thousands of coordinate systems, with a large variety of projections, datums, ellipsoids, and units. However, GIS applications use slightly different algorithms for reprojecting data between coordinate systems. To ensure that the data FME writes exactly matches your existing data, you can use the reprojection engine from a different application.
To change the reprojection engine, select Workspace Parameters > Spatial > Reprojection Engine. For example, you can select Esri; the engines available depend on your installed applications.
- The coordinate systems file coordsys.db in the FME installation folder contains the names and descriptions of all predefined coordinate systems.
- Some users may wish to use coordinate systems that do not ship with FME, and in those cases, FME also supports custom coordinate systems.
- Learn more about Working with Coordinate Systems in FME.
Advanced
If selected, multiple contiguous delimiters are treated as a single delimiter; otherwise, each delimiter is treated as if it delimits a different field.
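The difference between the two settings can be illustrated with a short Python sketch. The function split_fields is hypothetical, and it ignores field qualifiers for simplicity.

```python
import re

def split_fields(line, delimiter=",", merge_consecutive=False):
    """Hypothetical helper: split a line, optionally merging runs of the delimiter."""
    if merge_consecutive:
        # One or more contiguous delimiters count as a single separator.
        return re.split(f"(?:{re.escape(delimiter)})+", line)
    # Each delimiter separates a field, so empty fields are preserved.
    return line.split(delimiter)

separate = split_fields("a,,b")
merged = split_fields("a,,b", merge_consecutive=True)
```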
Specifies the character that encloses field values. When a field starts with this character, all text that follows this character and precedes the next occurrence of the character will be treated as one value, even if that text contains a delimiter or newline character.
For example, if the delimiter is a comma (,) and the field qualifier is a quotation mark ("), then the value
"Vancouver, BC"
will be treated as one value
Vancouver, BC
rather than two separate values
Vancouver
BC
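Python's standard csv module follows the same field qualifier convention, so it can be used to illustrate the behavior described above. This is an analogy only and says nothing about how FME implements the reader.

```python
import csv
import io

line = '"Vancouver, BC",Canada\n'

# With a field qualifier, the quoted text is kept together as one value.
qualified = next(csv.reader(io.StringIO(line), delimiter=",", quotechar='"'))

# Without one, every comma separates a value and the quotes are ordinary characters.
unqualified = next(csv.reader(io.StringIO(line), delimiter=",", quoting=csv.QUOTE_NONE))
```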
Specifies the character that escapes the field qualifier character. Use it to include a literal field qualifier character inside a qualified value.
For example, if the field qualifier character is a quotation mark (") and the escape character is a backslash (\), then the value
"Vancouver \"Lotusland\", BC"
will be read as
Vancouver "Lotusland", BC
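The same example can be reproduced with Python's standard csv module, which also supports an escape character. This is shown as an analogy only; it does not describe FME internals.

```python
import csv
import io

# Raw string: the backslashes below are literal characters in the CSV text.
line = r'"Vancouver \"Lotusland\", BC",Canada' + "\n"

reader = csv.reader(
    io.StringIO(line),
    delimiter=",",
    quotechar='"',
    escapechar="\\",    # backslash escapes the field qualifier
    doublequote=False,  # qualifiers are escaped, not doubled
)
row = next(reader)
```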
Field Names
Specifies whether the field names should be matched against the schema in a case-sensitive or case-insensitive manner.
For example, suppose the schema contains the attribute "MyField" but the file contains the field "myfield":
- If field names are case-sensitive, these are considered to not match, and the attribute "MyField" will not be read.
- If field names are not case-sensitive, these are considered to match, and values from the "myfield" column will be read for attribute "MyField".
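A minimal Python sketch of the two matching modes follows. The function match_fields is hypothetical and only illustrates the comparison described above.

```python
def match_fields(schema_attrs, file_fields, case_sensitive=True):
    """Hypothetical helper: map each schema attribute to its file field, or None."""
    if case_sensitive:
        lookup = {field: field for field in file_fields}
        return {attr: lookup.get(attr) for attr in schema_attrs}
    # Case-insensitive: compare case-folded names.
    lookup = {field.casefold(): field for field in file_fields}
    return {attr: lookup.get(attr.casefold()) for attr in schema_attrs}

strict_match = match_fields(["MyField"], ["myfield"], case_sensitive=True)
loose_match = match_fields(["MyField"], ["myfield"], case_sensitive=False)
```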
Specifies whether to enforce a strict schema.
- If this parameter is set to Yes and the fields in the file do not match the attributes on the schema in FME, the reader will fail.
- If this parameter is set to No, the reader will warn about any attributes that exist on the schema but are not present in the file, and will continue reading.
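The two outcomes can be sketched as follows. The function check_schema is hypothetical, and the sketch assumes that "do not match" means a schema attribute is absent from the file; the exact matching rule FME applies is not stated here.

```python
def check_schema(schema_attrs, file_fields, strict):
    """Hypothetical helper: return schema attributes missing from the file.

    Raises ValueError in strict mode when any attribute is missing.
    """
    missing = [attr for attr in schema_attrs if attr not in file_fields]
    if missing and strict:
        raise ValueError(f"file does not match schema, missing: {missing}")
    # Non-strict mode: the caller can log a warning for each name and continue.
    return missing

to_warn_about = check_schema(["id", "city"], ["id"], strict=False)
```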
Specifies whether to trim leading and trailing whitespace from the field names.
- If this parameter is set to Yes, then leading and trailing whitespace will be trimmed from the field names.
- If this parameter is set to No, then whitespace will not be trimmed.
Field Values
Specifies whether to trim the field qualifier character from values. Note that these characters are trimmed only when they serve as field qualifiers: that is, when the value starts with the field qualifier character, that character and its next occurrence are removed.
For example, if the field qualifier is a quotation mark ("), then the value
"Vancouver, BC" and "More"
will be read as
Vancouver, BC and "More"
Specifies whether to read empty values as Null or Missing in FME.
Historically, for string field types, FME read empty values as empty strings; and for numeric field types, FME read empty values as Null. This behavior will continue in workspaces created before FME 2021. In newer (FME 2021+) workspaces containing readers with this option, both string and numeric fields containing empty values will be read as either Null or Missing, as specified.
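The difference between Null and Missing can be pictured with a Python dictionary. As an assumption for illustration only, Null is represented by a key whose value is None, and Missing by the key being absent; the function name and the option strings are hypothetical.

```python
def apply_empty_policy(row, read_empty_as="Null"):
    """Hypothetical helper: map empty strings to None (Null) or drop them (Missing)."""
    result = {}
    for name, value in row.items():
        if value != "":
            result[name] = value
        elif read_empty_as == "Null":
            result[name] = None  # attribute exists, value is null
        # "Missing": the attribute is not set on the feature at all
    return result

as_null = apply_empty_policy({"id": "1", "city": ""}, read_empty_as="Null")
as_missing = apply_empty_policy({"id": "1", "city": ""}, read_empty_as="Missing")
```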
Specifies whether to trim leading and trailing whitespace from the field values.
- If this parameter is set to Yes, then leading and trailing whitespace will be trimmed from the field values.
- If this parameter is set to No, then whitespace will not be trimmed.
Encoding
This parameter is applicable if you are working with extended (not basic ASCII) character sets. If your source data contains non-ASCII characters, using this parameter along with the encoding value ensures that the original data is preserved from the reader to the writer.
By default, the character encoding will be automatically detected from the source file if there is a Byte Order Mark (BOM) present in the source file. If you select any other character encoding, it will override the automatically detected character encoding.
Note that only UTF encodings are identified using the BOM – all other character sets must be explicitly identified or they will be read as system encoding. (System encoding is dependent on your computer's operating system locale setting.)
FME supports most encodings.
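The selection order described above (explicit choice first, then BOM, then system encoding) can be sketched in Python. The function choose_encoding is hypothetical, and the cp1252 fallback is just a stand-in for whatever the system encoding happens to be.

```python
import codecs

# Check the longer marks first: the UTF-32 LE BOM begins with the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def choose_encoding(raw, selected=None, system_encoding="cp1252"):
    """Hypothetical helper: pick an encoding for the raw bytes of a file."""
    if selected is not None:
        return selected  # an explicit choice overrides detection
    for bom, name in BOMS:
        if raw.startswith(bom):
            return name  # only UTF encodings can be identified this way
    return system_encoding  # no BOM: fall back to the system encoding

detected = choose_encoding(codecs.BOM_UTF8 + b"id,city\n")
fallback = choose_encoding(b"id,city\n")
```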
Specifies whether string attributes will be set in the file encoding.
- Yes – String attributes will always be in the encoding of the file.
- No – String attributes may be in the file encoding, but may also be in a Unicode encoding. Setting this parameter to No may improve performance when reading from an encoded file.
Skipped Lines
Specifies whether to read lines from the file that occur before the data start line. (Note: The field names line is never read as a feature.)
If set to Yes, the reader will produce features for these lines, where the attribute csv_skipped_line is set to the content of that line.
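A minimal Python sketch of this behavior, using dictionaries as stand-ins for features. The function name and arguments are hypothetical; only the attribute name csv_skipped_line comes from the description above.

```python
def skipped_line_features(lines, data_start_line, field_names_line=None):
    """Hypothetical helper: one feature per line before the data start line."""
    features = []
    for number, content in enumerate(lines, start=1):  # first line is line 1
        if number >= data_start_line:
            break
        if number == field_names_line:
            continue  # the field names line is never read as a feature
        features.append({"csv_skipped_line": content})
    return features

lines = ["# exported 2021", "# source: survey", "id,city", "1,Vancouver"]
skipped = skipped_line_features(lines, data_start_line=4, field_names_line=3)
```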
If the field structure of the first several rows of a file is representative of the remainder of the file, this option can be set to prevent FME from unnecessarily reading further rows from a potentially large file when determining its schema.
If left blank, there will be no limit and all rows will be read.
Specifies whether to try to determine the types of attributes when scanning for schema.
- No – All attributes will be treated as strings.
- Yes (default) – FME will attempt to determine the correct type for each attribute (for example, int32, real64, etc.). (For more information, see Usage Notes.)
Using properly typed attributes may improve reading and processing performance. However, if an attribute value is not valid for a scanned type (for example, because the value was not included when scanning for schema), it will be set to Null or Missing, depending on the value of the Read Empty Values As parameter.
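To make the trade-off concrete, the following Python sketch infers a column type from a limited number of rows and then reads every row with that type. It is an illustration under stated assumptions: the function names and the max_scan_rows argument are hypothetical, Python's int and float stand in for types such as int32 and real64, and None stands in for Null or Missing.

```python
def scan_type(values):
    """Hypothetical helper: return 'int', 'float', or 'string' for sample values."""
    sample = [v for v in values if v != ""]
    if not sample:
        return "string"
    for name, cast in (("int", int), ("float", float)):
        try:
            for v in sample:
                cast(v)
            return name
        except ValueError:
            continue
    return "string"

def read_column(values, max_scan_rows=None):
    """Scan up to max_scan_rows values for a type, then convert all values."""
    scanned = values if max_scan_rows is None else values[:max_scan_rows]
    kind = scan_type(scanned)
    cast = {"int": int, "float": float, "string": str}[kind]
    converted = []
    for v in values:
        try:
            converted.append(cast(v))
        except ValueError:
            converted.append(None)  # value is not valid for the scanned type
    return kind, converted

limited = read_column(["1", "2", "x"], max_scan_rows=2)
full = read_column(["1", "2", "x"])
```

In this sketch, scanning only the first two rows yields an integer column in which the third value cannot be represented, whereas scanning all rows yields a string column that keeps every value.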
When scanning for types, FME will also attempt to automatically map fields to coordinates (for example, a field named x will be given a type of x_coordinate).
Specifies which numeric attribute types to scan for:
- Standard Types (default) – Standard numeric types.
- Explicit Width and Precision – Number types with an explicit width and precision.
Specifies which string attribute types to scan for:
- Explicit Width (default) – String types with an explicit width.
- Standard Types – Standard string types.
The input format string from which to detect and create FME-formatted date, time, and datetime attributes.
When Scan for Types is set to Yes, this option is enabled.
Specify a format string inline, or specify an attribute that contains a format string.
The default is set to ISO (auto detect), and there is a list of additional presets to choose from.
For information on the presets <Auto detect FME and ISO formats>, FME (auto detect), and ISO (auto detect), please see the documentation about FME and ISO in the topic Format String Flags and Examples.
Specifies whether FME should scan for additional fields, beyond those found in the field names row.
- Yes – FME will attempt to find additional fields that aren’t included in the field names row.
- No – The field names row is assumed to contain all fields in the file.
This option has no effect when the file does not contain field names. If FME does not scan for additional fields and extra data is found beyond the defined fields, these data values will be Missing (that is, not present as attributes on the output data features).
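The effect of this option can be sketched in Python with dictionaries standing in for features. The function read_rows is hypothetical, and naming the extra fields col2, col3, ... is an assumption made for the example; the names FME assigns to additional fields are not stated here.

```python
def read_rows(header, rows, scan_additional=False):
    """Hypothetical helper: build features, optionally adding fields beyond the header."""
    names = list(header)
    if scan_additional:
        widest = max((len(row) for row in rows), default=0)
        # Assumed naming for fields that are not in the field names row.
        names += [f"col{i}" for i in range(len(names), widest)]
    # zip stops at the shorter sequence, so values beyond the known
    # names are simply not set as attributes (that is, they are Missing).
    return names, [dict(zip(names, row)) for row in rows]

rows = [["1", "Vancouver", "BC"], ["2", "Surrey"]]
names_off, features_off = read_rows(["id", "city"], rows)
names_on, features_on = read_rows(["id", "city"], rows, scan_additional=True)
```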