Apache Parquet Writer Feature Type Parameters
To access feature type parameters, click the gear icon.
Tip: To always display the editor in FME Workbench, select View > Windows > Parameter Editor.
General
All feature types share similar General parameters, which may include Feature Type Name, Reader or Writer information, and Geometry. In most Writer Feature Type parameter dialogs, you can also control Dynamic Schema Definitions. Some database formats accept Table or Index Qualifier prefixes on the output table feature type.
Partition
When this option group is enabled, the writer writes a partitioned dataset to the writer's specified output directory instead of a single .parquet file. See the Parquet File Extensions section for more details.
Partition Type
- Hive – When partitioning by an attribute, the subdirectory is named in the form attrName=attrValue. This partitioning scheme was introduced by Apache Hive.
  Note that attribute values partitioned via the Hive partition type are URI-encoded: special characters (such as spaces, question marks, ampersands, hashes, parentheses, braces, brackets, and other punctuation) are percent-encoded.
  For example, an attribute called Name with the value John Smith results in a partitioned subdirectory called Name=John%20Smith.
- Directory – When partitioning by an attribute, the subdirectory is named in the form attrValue. This is a simple type of directory partitioning.
  Note that attribute values containing forward or backward slashes produce additional subdirectories when partitioned via the Directory partition type.
  For example, an attribute called Date with the value 2023/01/02 results in a partitioned Parquet file under nested directories named 2023, 01, and 02.
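The two naming schemes can be illustrated with a minimal Python sketch. This is not the writer's implementation: the helper names hive_subdir and directory_subdirs are hypothetical, and Python's urllib.parse.quote is only assumed to approximate the writer's URI encoding (the exact set of characters the writer encodes may differ).

```python
from urllib.parse import quote


def hive_subdir(attr_name: str, attr_value: str) -> str:
    """Build a Hive-style subdirectory name, attrName=attrValue."""
    # Assumption: percent-encoding via quote() stands in for the writer's
    # URI encoding; safe="" also encodes "/" so the value stays one directory.
    return f"{attr_name}={quote(attr_value, safe='')}"


def directory_subdirs(attr_value: str) -> list[str]:
    """Split a Directory-style value into the nested directories it creates."""
    # Forward and backward slashes both act as path separators.
    # Assumption: empty segments (e.g. from a doubled slash) are dropped.
    return [part for part in attr_value.replace("\\", "/").split("/") if part]


print(hive_subdir("Name", "John Smith"))  # Name=John%20Smith
print(directory_subdirs("2023/01/02"))    # ['2023', '01', '02']
```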
Partition Attributes
The list of attributes to partition by. Attributes are partitioned in the order in which they appear on the schema.
These types include: bson, decimal, enum, interval, json, uuid.
Partitioning by types that are written as binary (decimal, interval, uuid, bson, binary) can produce subdirectory paths with embedded null characters, which results in a writer error.
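To see why binary-written types are risky, consider a UUID: its 16-byte binary form can contain zero bytes, while its text form cannot. The sketch below is illustrative only; has_embedded_null is a hypothetical helper, not part of FME.

```python
import uuid


def has_embedded_null(raw: bytes) -> bool:
    """Return True if the binary value contains a null (0x00) byte."""
    return b"\x00" in raw


value = uuid.UUID("00000000-0000-0000-0000-000000000001")
print(has_embedded_null(value.bytes))          # True: binary form has 0x00 bytes
print(has_embedded_null(str(value).encode()))  # False: text form has no nulls
```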
Maximum Number of Partitions
The maximum number of partitions or subdirectories to create.
Default: 1024
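Before writing, it can help to estimate how many partitions a dataset would produce. The sketch below assumes that each distinct combination of partition attribute values yields one partition; count_partitions is a hypothetical helper, and the sketch does not model how the writer behaves when the limit is exceeded.

```python
MAX_PARTITIONS = 1024  # the documented default


def count_partitions(rows: list[dict], partition_attrs: list[str]) -> int:
    """Count distinct combinations of the partition attribute values."""
    # Assumption: one partition per distinct value combination.
    return len({tuple(row[attr] for attr in partition_attrs) for row in rows})


rows = [
    {"Country": "CA", "Year": 2023},
    {"Country": "CA", "Year": 2024},
    {"Country": "US", "Year": 2023},
    {"Country": "CA", "Year": 2023},  # repeats the first combination
]
n = count_partitions(rows, ["Country", "Year"])
print(n, n <= MAX_PARTITIONS)  # 3 True
```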