Elastic Elasticsearch Reader/Writer
FME provides read and write access to Elasticsearch clusters.
There are two versions of the FME Elasticsearch Reader/Writer: one for Elasticsearch version 7 and later, and one for version 6.8 and earlier.
FME Format Name | FME Format Identifier (Short Name) | Elasticsearch Version
---|---|---
Elastic Elasticsearch | ELASTICSEARCH_CLUSTER | Use with Elasticsearch version 7 and later.
Elastic Elasticsearch v6.8 and earlier (deprecated) | ELASTICSEARCH | Use with Elasticsearch version 6 and earlier.
Overview
FME treats each document as a feature, and each field in a document is treated as an attribute.
Elasticsearch is an open source full-text search index. Elasticsearch indexes are JSON document stores that support LonLat or GeoJSON geometry.
More information about Elasticsearch can be found at www.elastic.co.
Elastic Elasticsearch Product and System Requirements
Reader/Writer | FME Desktop License | FME Server | FME Cloud | Windows 64-bit | Linux | Mac
---|---|---|---|---|---|---
Reader | Available in FME Professional Edition and higher | Yes | Yes | Yes | Yes | Yes
Writer | Available in FME Professional Edition and higher | Yes | Yes | Yes | Yes | Yes
Version Differences
Elasticsearch Clusters are organized in a hierarchy:
- Cluster, which can contain 1 or more Indices
- Index, which can contain:
  - Elasticsearch v6 and earlier: 1 or more Types
Elasticsearch Types were removed in version 7.
The tables below illustrate the major differences between FME Elasticsearch format versions.
FME Reader/Writer Element | ELASTICSEARCH (v6.8 and earlier) | ELASTICSEARCH_CLUSTER (v7+)
---|---|---
Dataset | The dataset for Elasticsearch version 6 and earlier is an Index. Each Elasticsearch Cluster can contain multiple Indices, but this version of the Reader/Writer can only access one of those Indices. | The dataset for Elasticsearch version 7 and later is a Cluster.
Feature Type | The feature type for Elasticsearch version 6 and earlier is a Type. Each Elasticsearch Index can contain multiple Types, but these Types must have compatible Field Mappings (schemas). To be compatible, all Fields with the same name must have the same Mapping. | The feature type for Elasticsearch version 7 and later is an Index. Each Elasticsearch Cluster can contain multiple Indices.
Attribute Types and Attribute Index Types
The following Attribute Types have changed between Elasticsearch versions:
- string
- text
- keyword
The following Attribute Index Types have changed between Elasticsearch versions:
- Analyzed
- NotAnalyzed
String is a field type from Elasticsearch v2 and earlier. string,Analyzed is exactly equivalent to text, and string,NotAnalyzed is exactly equivalent to keyword.
ELASTICSEARCH (v6.8 and earlier) | ELASTICSEARCH_CLUSTER (v7+)
---|---
This version of the Reader/Writer accepts either string,Analyzed or the equivalent text for FME Attribute Types. It uses the former terminology when communicating with v2 and earlier clusters, and the latter terminology when communicating with later version clusters. Similarly, the Reader/Writer accepts either string,NotAnalyzed or the equivalent keyword for FME Attribute Types. | This version of the Reader/Writer accepts text and keyword for FME Attribute Types. The NotIndexed Attribute Index Type is still supported for all Attribute Types.
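
The equivalence between the legacy and current terminology can be sketched as a small lookup. This is only an illustration: the function name and its behaviour for pairs other than the two equivalences stated above are assumptions, not part of FME.

```python
# Hypothetical helper, for illustration only; not an FME API.
# Maps the legacy (string, index type) pairs to their modern equivalents.
LEGACY_STRING_EQUIVALENTS = {
    "Analyzed": "text",
    "NotAnalyzed": "keyword",
}


def normalize_attribute_type(attr_type, index_type=None):
    """Return the modern field type for an (attribute type, index type) pair.

    Only the two documented equivalences are translated. Any other pair is
    returned unchanged (an assumption made for this sketch).
    """
    if attr_type == "string":
        return LEGACY_STRING_EQUIVALENTS.get(index_type, attr_type)
    return attr_type


print(normalize_attribute_type("string", "Analyzed"))
print(normalize_attribute_type("string", "NotAnalyzed"))
print(normalize_attribute_type("keyword"))
```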
Format Usage Notes
- There are two types of Elasticsearch geometry fields: geo_point and geo_shape. geo_point fields can only contain point geometries, while geo_shape fields can contain any geometry that is representable as GeoJSON.
- You can write features from most coordinate systems, but they will all be reprojected to LL-WGS84 when converted to GeoJSON. The GeoJSON specification requires this: "The coordinate reference system for all GeoJSON coordinates is a geographic coordinate reference system, using the World Geodetic System 1984 (WGS 84) [WGS84] datum" [Reference: The GeoJSON Format].
- Writer: If a non-point geometry is written to a geo_point geometry field, then the geometry will be converted to its centroid point before writing.
- Writer: Each Elasticsearch document has a unique Document ID. This ID can be specified on a feature with an attribute selected in the Writer Feature Type Parameters. If a document with that ID already exists, then the translation will fail.
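
To make the two geometry field types concrete, the following sketch builds the values a document might carry for each. The field names (`location`, `boundary`), the sample coordinates, and the helper functions are hypothetical. The sketch shows the document shape only and does not represent how FME constructs its requests.

```python
import json


def geo_point_value(lon, lat):
    """Return a geo_point value as a [lon, lat] array (GeoJSON axis order)."""
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError(f"coordinates out of WGS 84 range: {lon}, {lat}")
    return [lon, lat]


def geo_shape_value(geojson_type, coordinates):
    """Return a geo_shape value as a GeoJSON geometry object."""
    return {"type": geojson_type, "coordinates": coordinates}


# A closed ring: the first and last positions are identical.
ring = [
    [-123.2, 49.2],
    [-123.0, 49.2],
    [-123.0, 49.3],
    [-123.2, 49.3],
    [-123.2, 49.2],
]

# Hypothetical document with one field of each geometry type.
document = {
    "name": "Example",
    "location": geo_point_value(-123.1, 49.25),
    "boundary": geo_shape_value("Polygon", [ring]),
}

print(json.dumps(document, indent=2))
```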
Reader Overview
Version 6 and earlier (deprecated)
The Elasticsearch reader supports reading multiple types from the same Elasticsearch index. Because the reader's dataset is a single index, a separate reader must be created for each Elasticsearch index.
Version 7 and later
The Elasticsearch reader supports reading multiple indices from the same Elasticsearch cluster. Because the reader's dataset is a single cluster, a separate reader must be created for each Elasticsearch cluster.
The feature types must be defined in the workspace before they can be read.
Multiple Geometry
The Elasticsearch reader supports reading multiple geometry fields from the same Elasticsearch feature type. If there is more than one geometry field in the Elasticsearch Mapping, then geometry will be read as FME Multiple Geometry. Each geometry part will be named after the corresponding Elasticsearch geometry field.
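
As a rough illustration of this naming rule, the sketch below scans a mapping for geometry fields and pairs each one found in a document with its field name. The mapping uses the typeless `properties` layout of Elasticsearch 7. The field names and both helper functions are hypothetical, and only top-level fields are considered.

```python
GEOMETRY_FIELD_TYPES = {"geo_point", "geo_shape"}


def geometry_field_names(mapping):
    """List the top-level fields whose mapping type is a geometry type."""
    properties = mapping.get("properties", {})
    return [
        name
        for name, spec in properties.items()
        if spec.get("type") in GEOMETRY_FIELD_TYPES
    ]


def named_geometry_parts(mapping, source):
    """Pair each geometry field present in the document with its raw value.

    Each key stands in for the name a geometry part would be given.
    """
    return {
        name: source[name]
        for name in geometry_field_names(mapping)
        if name in source
    }


# Hypothetical mapping with two geometry fields and one text field.
mapping = {
    "properties": {
        "name": {"type": "text"},
        "entrance": {"type": "geo_point"},
        "footprint": {"type": "geo_shape"},
    }
}
source = {
    "name": "Library",
    "entrance": [-123.1, 49.25],
    "footprint": {"type": "Point", "coordinates": [-123.1, 49.25]},
}

print(named_geometry_parts(mapping, source))
```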
Writer Overview
The Elasticsearch writer stores documents in an Elasticsearch index (version 7 and later) or in a type associated with an Elasticsearch index (version 6 and earlier). The Elasticsearch writer provides the following capabilities:
Type Creation (version 6 and earlier)
The Elasticsearch writer uses the information within the FME workspace to automatically create Elasticsearch types as required. A type will be created when the first input feature is processed. If no features are sent to a feature type, then the corresponding type will not be created.
Each Type is created with a Mapping (schema) based on the feature type’s User Attributes. The fields of each JSON document that is written to the Type will be parsed according to that Mapping. If the document contains any fields that do not appear in the Mapping, then those fields will be automatically added to it. This can occur if the Document Source of the feature type is a JSON Attribute.
Index Creation (version 7 and later)
The Elasticsearch writer uses the information within the FME workspace to automatically create Elasticsearch indices as required. An index will be created when the first input feature is processed. If no features are sent to a feature type, then the corresponding index will not be created.
Each Index is created with a Mapping (schema) based on the feature type’s User Attributes. The fields of each JSON document that is written to the Index will be parsed according to that Mapping. If the document contains any fields that do not appear in the Mapping, then those fields will be automatically added to it. This can occur if the Document Source of the feature type is a JSON Attribute.
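
The relationship between User Attributes and the Mapping can be sketched as follows. The body shown is the typeless `mappings`/`properties` layout that Elasticsearch 7 accepts when an index is created. The attribute names and helper functions are hypothetical, and the exact request FME sends is not described here. The second helper lists the top-level document fields that are missing from the mapping, which are the ones that would be added to it automatically.

```python
def build_index_body(user_attributes):
    """Build an index-creation body from {attribute name: field type}."""
    return {
        "mappings": {
            "properties": {
                name: {"type": field_type}
                for name, field_type in user_attributes.items()
            }
        }
    }


def unmapped_fields(index_body, document):
    """Return the top-level document fields that the mapping does not define."""
    known = index_body["mappings"]["properties"]
    return sorted(name for name in document if name not in known)


# Hypothetical User Attributes for one feature type.
body = build_index_body({"title": "text", "status": "keyword", "location": "geo_point"})

# A document from a JSON attribute may carry fields the mapping lacks.
document = {"title": "Report", "status": "open", "reviewer": "jdoe"}

print(body)
print(unmapped_fields(body, document))
```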
Overwrite Index (version 6 and earlier only)
If the Overwrite Index parameter for the writer is set to Yes, then the writer will drop and re-create the index before writing any features to it.
Indices will be overwritten when the first input feature is processed. If no features are sent to any of the writer’s feature types, then the corresponding index will not be overwritten.
Multiple Geometry
The Elasticsearch writer supports writing to multiple geometry fields in the same Elasticsearch Mapping.
If there is more than one geometry field in the existing mapping, then a feature's geometry must have the same name as the destination Elasticsearch geometry field. Otherwise, no geometry will be written.
If a feature's FME Multiple Geometry has multiple parts, then each part can be written to a different Elasticsearch geometry field. Each part will be written to the Elasticsearch geometry field corresponding to its geometry part name, provided that the field exists.
Nested geometry fields can be created and/or written to by naming the geometry in the form:
<outer_name>.<inner_name>
For example, a geo_point geometry field called address.location would result in data similar to the following:
{
  "address": {
    "location": [ <lon>, <lat> ]
  }
}
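
The dotted-name convention can be sketched as a function that expands a geometry name into nested objects. The function name and the sample coordinates are hypothetical; this illustrates the resulting document shape rather than FME's implementation.

```python
def nest_geometry(geometry_name, value):
    """Expand a dotted geometry name into nested dictionaries.

    "address.location" becomes {"address": {"location": value}}.
    """
    parts = geometry_name.split(".")
    document = {}
    node = document
    # Create one nested object per outer name, then set the innermost field.
    for part in parts[:-1]:
        node = node.setdefault(part, {})
    node[parts[-1]] = value
    return document


print(nest_geometry("address.location", [-123.1, 49.25]))
```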