SchemaScanner

Produces a schema feature representing the feature type definition for each group of input data features.


Typical Uses

  • Generating a schema feature for dynamic writers

  • Generating schema features to compare for schema validation and schema drift detection

  • Generating schemas after merging or manipulating datasets

How does it work?

The SchemaScanner receives features and determines their schema by scanning for attribute names and data types, based on the features' structure and attribute values.

It will scan either all features or a specified number of them, and can exclude certain attributes based on names, such as format-specific or internal FME attributes.
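The scanning idea can be illustrated with a minimal Python sketch. This is not the transformer's implementation: features are modeled as plain dictionaries, and the function names and the type labels "int", "real", and "string" are hypothetical placeholders rather than FME data types. The sketch widens an attribute's type when a later feature requires it, and the limit argument stands in for scanning only a specified number of features.

```python
from typing import Any, Dict, Iterable, Optional

# Hypothetical type labels for this sketch, ordered from narrowest to widest.
_RANK = {"int": 0, "real": 1, "string": 2}


def classify(value: Any) -> str:
    """Return an illustrative type label for one attribute value."""
    if isinstance(value, int) and not isinstance(value, bool):
        return "int"
    if isinstance(value, float):
        return "real"
    return "string"


def scan_schema(
    features: Iterable[Dict[str, Any]], limit: Optional[int] = None
) -> Dict[str, str]:
    """Collect attribute names and the widest type label seen for each."""
    schema: Dict[str, str] = {}
    for index, feature in enumerate(features):
        if limit is not None and index >= limit:
            break  # stop once the requested number of features is scanned
        for name, value in feature.items():
            label = classify(value)
            if name not in schema or _RANK[label] > _RANK[schema[name]]:
                schema[name] = label
    return schema


features = [
    {"ID": 1, "LAT": 49},
    {"ID": 2, "LAT": 49.25, "NAME": "Vancouver"},
]
full_schema = scan_schema(features)              # scans every feature
partial_schema = scan_schema(features, limit=1)  # scans only the first
```

In this sketch, scanning only the first feature misses the NAME attribute and types LAT as an integer, which shows why a partial scan can produce a narrower schema than the data requires.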

The resulting schema is output via the <Schema> port as a new schema feature, which stores the schema in a specific form of list attribute. The schema feature also receives a special attribute and value, fme_schema_handling = schema_only, which tells a dynamic writer to use that feature as a schema and then remove it from the output.

The original input features are passed out via the Output port, unchanged.

The output order of the schema features relative to the data (input features) can be controlled using the Output Schema Features Before Data parameter. For use with dynamic writers, the schema features should be output first.

Attribute Generation

Schemas can be generated with unbounded or bounded attributes, according to the Numeric Type Format parameter:

  • Unbounded produces numeric types including int, uint, and real.

  • Bounded produces numeric types of fme_decimal(a,b) where a is the number of digits before the decimal, and b the number of digits after (precision). It is recommended to scan all features when using Bounded to ensure all existing attribute value lengths are considered.
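As a rough illustration of why scanning all features matters for Bounded, the following Python sketch derives the two numbers from a set of values, using the definition given above (digits before the decimal and digits after). The function name decimal_bounds is hypothetical, and the transformer's actual counting rules may differ.

```python
from decimal import Decimal
from typing import Iterable, Tuple, Union

Number = Union[int, float, str]


def decimal_bounds(values: Iterable[Number]) -> Tuple[int, int]:
    """Return (digits before the decimal, digits after) covering all values.

    Assumes finite numeric values; NaN and infinity are not handled.
    """
    before = after = 0
    for value in values:
        _sign, digits, exponent = Decimal(str(value)).as_tuple()
        after = max(after, max(0, -exponent))
        before = max(before, max(0, len(digits) + exponent))
    return before, after


all_values = [12.5, 3.125, 1000]
a, b = decimal_bounds(all_values)
bounded_type = f"fme_decimal({a},{b})"

# Scanning only the first value yields bounds too small for the rest.
first_only = decimal_bounds(all_values[:1])
```

Here the full scan gives bounds wide enough for every value, while the bounds from the first value alone would not fit 1000 or 3.125.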

String attributes are always bounded. SchemaScanner does not recognize date/time data types.

SchemaScanner does not maintain the original order of attributes.

Excluding Attributes

SchemaScanner processes all attributes on incoming features, including internal FME attributes and format-specific attributes. To ignore attributes, use the Ignore Attributes Containing parameter.

Enter a regular expression, and matching attributes will be ignored.

For example, if the source data is CSV, you could use the regular expression ^fme_|^multi_|^csv_ to ignore any attributes starting with fme_, multi_, or csv_.
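The effect of that expression can be checked outside FME with a short Python sketch. The attribute names below are made up for illustration, and the sketch assumes the pattern is applied as a search against each attribute name; the transformer's regular expression engine may differ in details.

```python
import re

# The expression from the example above.
IGNORE = re.compile(r"^fme_|^multi_|^csv_")


def kept_attributes(names):
    """Return the attribute names that the expression does not match."""
    return [name for name in names if not IGNORE.search(name)]


# Illustrative attribute names, not taken from a real dataset.
names = ["fme_type", "csv_line_number", "multi_reader_id", "ID", "name_fme_"]
kept = kept_attributes(names)
```

Because each alternative is anchored with ^, a name such as name_fme_ is kept: it contains fme_ but does not start with it.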

Schema Features

Schema features can be used to store or pass along schema structures - to dynamic writers, for example. The schema is stored in a list attribute named attribute.

Each attribute has a name and an fme_data_type. For example, the attribute LAT has a corresponding data type of fme_real64.

Data types are FME internal data types.
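To make the structure concrete, here is a minimal Python sketch that models a schema feature as a flat dictionary. It assumes list elements are addressed as attribute{0}.name and attribute{0}.fme_data_type, and it adds the fme_schema_handling attribute described earlier. The function name build_schema_feature and the ID entry with fme_int32 are illustrative assumptions, not output from the transformer.

```python
from typing import Dict


def build_schema_feature(schema: Dict[str, str]) -> Dict[str, str]:
    """Model a schema feature as a flat dictionary of attributes."""
    feature = {"fme_schema_handling": "schema_only"}
    for index, (name, data_type) in enumerate(schema.items()):
        # One list element per attribute in the schema.
        feature[f"attribute{{{index}}}.name"] = name
        feature[f"attribute{{{index}}}.fme_data_type"] = data_type
    return feature


# LAT / fme_real64 comes from the text above; the ID entry is illustrative.
schema_feature = build_schema_feature({"LAT": "fme_real64", "ID": "fme_int32"})
```

Each attribute in the schema occupies one list element, so a writer reading this feature can recover both the attribute names and their types by walking the list.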

Usage Notes

  • Schema features may also be generated manually, or by using the FeatureReader's schema options. Two readers also generate schemas - the Schema (Any Format) reader and the Schema (From Table) reader.
  • When using the SchemaScanner with a dynamic writer, the Output Schema Features Before Data parameter should be set to Yes, so that the schema arrives at the writer prior to the data features.

Configuration

Input Ports

Output Ports

Parameters

Editing Transformer Parameters

Using a set of menu options, transformer parameters can be assigned by referencing other elements in the workspace. More advanced functions, such as an advanced editor and an arithmetic editor, are also available in some transformers. To access a menu of these options, click beside the applicable parameter. For more information, see Transformer Parameter Menu Options.

Defining Values

There are several ways to define a value for use in a transformer. The simplest is to type in a value or string, which can include functions of various types such as attribute references, math and string functions, and workspace parameters. There are a number of tools and shortcuts that can assist in constructing values, generally available from the drop-down context menu adjacent to the value field.

Dialog Options - Tables

Transformers with table-style parameters have additional tools for populating and manipulating values.

Reference

Processing Behavior

Group-Based

Feature Holding

If Output Schema Features Before Data is set to Yes, the transformer will block all incoming data features. This is usually required if you are using the schema feature with a dynamic writer.

The Target Number of Features to Scan parameter will also block data features, up to the number of features specified (or all features, if left blank).

Dependencies None
Aliases  
History  

Examples may contain information licensed under the Open Government Licence – Vancouver and/or the Open Government Licence – Canada.