About Format Attributes and User Attributes
Data Transformation is FME's ability to manipulate data. The transformation step occurs during the process of format translation. Data is read, transformed, and then written to the new format.
FME Workbench provides many options to control data transformation. Data transformation can be subdivided into two distinct types: Structural Transformation and Content Transformation.
Structural Transformation
This type of transformation is perhaps better called reorganization. It refers to FME's ability to channel data from source to destination in an almost infinite number of arrangements. This includes the ability to merge data, divide data, re-order data, and define custom data structures. Transforming a dataset’s structure requires knowledge of schemas and how to use FME to manipulate them.
Transforming the structure of a dataset is carried out by manipulating its schema.
Content Transformation
This type of transformation is perhaps better called revision. It refers to the ability to alter the substance of a dataset. Manipulating a feature's geometry or attribute values is the best example of how FME can transform content.
Content transformation can take place independently or alongside structural transformation.
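The difference between the two types can be sketched in a few lines of Python. This is an illustration only: features are modeled as plain dictionaries rather than FME objects, and the function names, field names, and layer names are hypothetical.

```python
# Illustrative only: features are plain dicts, not FME feature objects.
features = [
    {"layer": "roads", "name": "main st", "lanes": 2},
    {"layer": "roads", "name": "oak ave", "lanes": 4},
]

def restructure(feature):
    """Structural transformation: rename a field and route the feature to a new layer."""
    out = dict(feature)
    out["road_name"] = out.pop("name")  # re-label the attribute
    out["layer"] = "major_roads" if out["lanes"] >= 4 else "minor_roads"  # divide the data
    return out

def revise(feature):
    """Content transformation: change attribute values, leave the structure alone."""
    out = dict(feature)
    out["name"] = out["name"].title()
    return out

restructured = [restructure(f) for f in features]
revised = [revise(f) for f in features]
```

In the first case the set of fields and layers changes; in the second only the values do.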
Schema Concepts
A schema is the structure of a dataset or, more accurately, a formal definition of a dataset’s structure.
Each dataset has its own unique structure (schema) that includes feature types (layers), permitted geometries, user-defined attributes, and other rules that define or restrict its content.
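As a rough mental model, a schema can be pictured as a small nested data structure. The sketch below is not how FME stores schemas internally; the class names, field names, and the sample values are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Attribute:
    # A user attribute definition: name, data type, width, decimal places.
    name: str
    data_type: str
    width: int = 0
    decimals: int = 0

@dataclass
class FeatureType:
    # A feature type (layer) with its permitted geometries and attributes.
    name: str
    geometries: list
    attributes: list = field(default_factory=list)

@dataclass
class Schema:
    # The formal definition of one dataset's structure.
    format_name: str
    feature_types: list = field(default_factory=list)

roads = FeatureType(
    "Roads",
    ["line"],
    [Attribute("name", "char", 50), Attribute("length_m", "number", 10, 2)],
)
schema = Schema("EXAMPLE_FORMAT", [roads])
```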
When a new workspace is created, FME scans all of the source datasets. From this it creates a visual representation of the data’s schema on the left side of the canvas. On the right side, it creates a visual representation of how this schema will be duplicated in the chosen output format.
Here are source and destination schemas as they are represented in Workbench.
Each object on the canvas is a separate Feature Type within a dataset.
The workspace reads from left to right.
At this point, the Reader schema represents what we have (that is, FME's view of the source datasets). The Writer schema represents what we want (that is, the data required by the user).
By default, the Writer schema is a mirror image of the source; differences only occur when demanded by limitations of the selected destination format. This allows quick translation to occur with no further workspace editing required.
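The default mirroring behavior can be pictured as a deep copy that is adjusted only where the destination format imposes a limit. The following is a minimal sketch under stated assumptions: schemas are plain dictionaries, `mirror_schema` is a hypothetical helper, and a maximum attribute name length stands in for whatever limitations a real format might have.

```python
import copy

reader_schema = {
    "Roads": {
        "geometry": "line",
        "attributes": {"road_surface_type": "char", "lanes": "number"},
    },
}

def mirror_schema(reader, max_name_len=None):
    """Copy the reader schema; adjust only where the (assumed) format limit demands it."""
    writer = copy.deepcopy(reader)
    if max_name_len is not None:
        for feature_type in writer.values():
            # Truncate attribute names that exceed the assumed limit.
            # A real tool would also need to resolve name collisions.
            feature_type["attributes"] = {
                name[:max_name_len]: data_type
                for name, data_type in feature_type["attributes"].items()
            }
    return writer

writer_schema = mirror_schema(reader_schema)               # exact mirror
limited_schema = mirror_schema(reader_schema, max_name_len=10)  # format-limited mirror
```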
Viewing the Schema in FME Workbench
A schema goes beyond what can be seen on the workspace canvas; there are other components in various dialogs that also represent the structure of a dataset.
Some parts of the schema relate specifically to a single feature type only. Attributes are one such component. These components are shown in the Feature Type editing dialog.
Open the Feature Type dialog by clicking the gear button at the right side of the feature type.
The Feature Type dialog contains several tabs:
- Parameters: Feature Type name, geometry types, name of the parent dataset, and other editable parameters.
- User Attributes: Pieces of user-created information that belong to a feature. An attribute may have been part of a source dataset or may have been created in an ad-hoc manner within FME. Not all formats accept user attributes, and the ones that do sometimes put restrictions on them. Each listed attribute is defined by its name, data type, width, and number of decimal places.
- Format Attributes: Built-in FME attributes that you can "expose" or make visible so that you can set them to particular values and connect them to other format-specific attributes. These attributes allow a wide variety of special things to be done with formats (like setting line thickness, creating special entities, and setting particular bits or bytes).
Feature Type Names use format-specific terminology, so instead of Feature Type, the Name parameter label might be Feature Class, Layer, Sheet, Table, or whatever terminology is specific to the format of data you are writing. For example, for an Excel feature type the label is Sheet Name.
Reader feature types are not editable by default, since source attributes represent the physical schema of the data. If they were changed, the schema would no longer match the Reader dataset. (Note that it is possible to enable reader feature type editing, but this feature is only suggested for use in some advanced scenarios. See the FME Workbench help for more information.)
Schema Editing
As noted, initially the Writer schema in a workspace is a mirror image of the source. However, in many cases you will want the output to have a different data structure.
Schema Editing is the process of altering the destination schema to customize the structure of the output data. One good example is renaming an attribute field in the output. After editing, the source schema still represents what we have, but the destination schema now truly does represent what we want.
Editable Components
There are a number of edits that can be performed, including, but not limited to, the following.
- Attribute renaming: Attributes on the destination schema can be renamed.
- Attribute type changes: Any attribute on the writer schema can have a change of type; for example, changing a character field to a number field. The Type column for an attribute shows only values that match the permitted types for that data format. For example, an Oracle schema permits attribute types of varchar or clob. MapInfo does not support these data types, so they would never appear in a MapInfo schema.
- Feature type renaming: You can change any feature type name.
- Geometry type changes: This field is only available where the format requires a decision on geometry type.
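The edits listed above can be sketched as operations on a writer feature type. This is an illustration rather than FME's API: the feature type is a plain dictionary, the helper functions are hypothetical, and the permitted type sets are assumed subsets invented for the example.

```python
# Assumed, illustrative subsets of permitted types per format.
PERMITTED_TYPES = {
    "oracle_like": {"varchar", "clob", "number"},
    "mapinfo_like": {"char", "decimal", "integer"},
}

def rename_attribute(feature_type, old, new):
    """Rename an attribute while preserving attribute order."""
    feature_type["attributes"] = {
        (new if name == old else name): data_type
        for name, data_type in feature_type["attributes"].items()
    }

def change_type(feature_type, name, new_type, fmt):
    """Change an attribute type, allowing only types the format permits."""
    if new_type not in PERMITTED_TYPES[fmt]:
        raise ValueError(f"{new_type!r} is not a permitted type for {fmt}")
    feature_type["attributes"][name] = new_type

writer = {"name": "Roads", "attributes": {"NAME": "char", "LANES": "char"}}

rename_attribute(writer, "NAME", "road_name")            # attribute renaming
change_type(writer, "LANES", "integer", "mapinfo_like")  # attribute type change
writer["name"] = "MajorRoads"                            # feature type renaming
```

Asking for a type outside the format's permitted set, such as `clob` for `mapinfo_like` here, raises an error, which mirrors the way the Type column only offers types valid for the format.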
Handling Conflicts with FME Attributes and Format Attributes
Since both generic FME attributes and format-specific attributes exist on workspace features, it is important to note that the co-existence of the two types of attributes can sometimes cause a conflict. If this happens between a reader and a writer, the generic FME attribute will take precedence.
For example, if a feature contains a format-specific color specification, and the optional fme_color attribute is changed between the reader and the writer, fme_color will take precedence and the format-specific color specification will be deleted from the workspace. (However, if a feature within a writer contains a format-specific color specification, then that will supersede fme_color.)
This possible conflict also applies if you alter a feature’s geometry in a workspace that has the same source and destination format. If you alter the geometry from the reader to the writer, then the generic fme_type will be used, and the format-specific geometry type will be deleted.
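One way to read the precedence rule for color is sketched below. It is an interpretation, not FME's implementation: features are plain dictionaries, `resolve_color` is a hypothetical helper, and `exampleformat_color` is a made-up name standing in for any format-specific color attribute. The sketch assumes a format-specific value is dropped only when it is stale, that is, when `fme_color` changed but the format-specific value did not.

```python
FORMAT_COLOR = "exampleformat_color"  # hypothetical format-specific attribute name

def resolve_color(read_attrs, final_attrs):
    """Return (attributes reaching the writer, effective color) under the assumed rule."""
    attrs = dict(final_attrs)
    generic_changed = attrs.get("fme_color") != read_attrs.get("fme_color")
    specific_unchanged = attrs.get(FORMAT_COLOR) == read_attrs.get(FORMAT_COLOR)
    if generic_changed and specific_unchanged:
        # The generic attribute was edited, so the stale format-specific value is removed.
        attrs.pop(FORMAT_COLOR, None)
    # A format-specific color still present at the writer supersedes fme_color.
    effective = attrs.get(FORMAT_COLOR, attrs.get("fme_color"))
    return attrs, effective

as_read = {"fme_color": "1,0,0", FORMAT_COLOR: "1"}

# Case 1: fme_color changed in the workspace; the generic value wins.
attrs1, color1 = resolve_color(as_read, {"fme_color": "0,0,1", FORMAT_COLOR: "1"})

# Case 2: a format-specific color was set for the writer; it supersedes fme_color.
attrs2, color2 = resolve_color(as_read, {"fme_color": "1,0,0", FORMAT_COLOR: "5"})
```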
If this conflict produces unexpected results in your workspace, follow these steps to remove it:
- Expose the generic attributes that conflict with the format-specific attributes. This adjustment is made in the Feature Type parameters dialog of the reader, on the Format Attributes tab.
- Use the AttributeRemover transformer to remove the exposed attributes. For more information, see AttributeRemover in the FME Transformers help.
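The effect of the two steps above can be sketched on a feature modeled as a plain dictionary. This does not use the AttributeRemover itself; `remove_attributes` is a hypothetical helper and `exampleformat_color` is a made-up format-specific attribute name.

```python
def remove_attributes(feature, names):
    """Return a copy of the feature without the named attributes."""
    to_remove = set(names)
    return {key: value for key, value in feature.items() if key not in to_remove}

feature = {
    "fme_color": "0,0,1",        # generic attribute, exposed on the reader
    "exampleformat_color": "1",  # hypothetical format-specific attribute
    "road_name": "Main St",
}

# With the generic attribute removed, only the format-specific color remains.
cleaned = remove_attributes(feature, ["fme_color"])
```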
Note: For more information about schema mapping, feature mapping, and attribute mapping, see the FME Workbench help.