Syntax

   FACTORY_DEF {<ReaderKeyword>} SchemaScannerFactory
      [FACTORY_NAME <factory name>] 
      [INPUT FEATURE_TYPE <feature type>
         [<attribute name> <attribute value>]*
         [<feature function>]*]*
      [GROUP_BY [<attribute name>]+]*
      [FLUSH_WHEN_GROUPS_CHANGE (Yes|No)]
      [SCHEMA_FEATURES_FIRST (Yes|No)]
      [IGNORE_ATTRIBUTES_CONTAINING_REGEX <pattern>]*
      [IGNORE_ATTRIBUTES_CASE_SENSITIVE (Yes|No)]
      [COERCE_CHAR_TO_VARCHAR (Yes|No)]
      [FIX_DATE_ATTRS (Yes|No)]
      [INFER_EMPTY_SCHEMA (Yes|No)]
      [MAX_FEATURES_TO_SCAN { <max_feats> } ]
      [SCHEMA_DEF_NAME_ATTR <attrName>]
      [TYPE_OPTIONS { (<scannerOption>[=<value>])* }]
      [TEMPLATE_SCHEMA <attrName>,<attrType>,... ]

      [OUTPUT (SCANNED|SCHEMA) FEATURE_TYPE <feature type>
         [<attribute name> <attribute value>]*
         [<feature function>]*]*

Overview

   This factory scans the input features' attributes using FME's "Schema Scanner"
   functionality, generating a schema feature suitable for use as a schema source in
   a writer's dynamic schema with "Schema From Schema Feature".
   (https://community.safe.com/s/article/dynamic-workflow-tutorial-destination-schema-is-de-2)

   It might also be useful within a workspace which validates data quality.

   If the factory is configured with GROUP_BY, the feature type of the
   schema feature is derived from the group key by joining the key parts
   with colons (e.g. "Japan:Hydro" for keys "country" and "utility").

   If used to define dynamic schema, the schema feature should precede the data for
   which it is defining the schema. Its "fme_feature_type_name" attribute is joined
   to a specified attribute on the data feature, but in the case of more than
   a single attribute in the group key, there is unlikely to be an attribute which
   matches the colon-separated list. If the data is to be grouped by more than a
   single attribute, it is recommended that the grouping attributes are first
   combined to a single attribute value prior to calling the SchemaScanner,
   and that this attribute is used to configure the writer feature type to
   match the features back to their schema feature.
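
   The recommendation above can be illustrated with a short Python sketch. It
   is only a model of the naming rule described here, not FME code: features
   are plain dictionaries, and the helper names and the "schema_key" attribute
   are hypothetical.

```python
def group_key(feature, group_by):
    """Join the feature's GROUP_BY attribute values with a colon."""
    return ":".join(str(feature[name]) for name in group_by)


def add_combined_key(feature, group_by, target="schema_key"):
    """Return a copy of the feature carrying the combined key in one attribute.

    "schema_key" is a made-up attribute name used only for illustration.
    """
    combined = dict(feature)
    combined[target] = group_key(feature, group_by)
    return combined


feature = {"country": "Japan", "utility": "Hydro", "length": 12.5}

# Grouping by two attributes yields a colon-separated schema feature type...
print(group_key(feature, ["country", "utility"]))  # Japan:Hydro

# ...which no single attribute on the data feature matches.
print("Japan:Hydro" in feature.values())  # False

# Combining the attributes first gives the data feature a matching value,
# and grouping by that one attribute produces the same name.
prepared = add_combined_key(feature, ["country", "utility"])
print(group_key(prepared, ["schema_key"]))  # Japan:Hydro
```

   With the combined attribute in place, the schema feature's name and the
   value on each data feature agree, so a single attribute can be used for
   the match.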

   The TYPE_OPTIONS directive allows direct configuration of the STFSchemaScanner
   with specific properties. A <scannerOption> may be specified by itself (turning
   a feature ON) or with an explicit value, such as "SCAN_USE_UNBOUNDED_TYPES=No"
   to turn it OFF. If the option takes a string, it can be passed as an
   (FMEParsableText-encoded) value, such as "SCAN_FOR_DATE_ATTRIBUTE_FORMAT=%Y%m%d".
   (If not specified, all options will use STFSchemaScanner's defaults.)
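
   As a rough illustration of the option forms described above, the Python
   sketch below splits a list of option strings into name/value pairs. It is
   an assumption-laden model, not the STFSchemaScanner parser: representing a
   bare option as "Yes" is a choice made for this sketch, and
   FMEParsableText decoding is not modelled.

```python
def parse_type_options(options):
    """Map each scanner option to its value.

    A bare option (no "=") is recorded as "Yes", mirroring the rule that
    naming an option by itself turns it ON. This representation is an
    assumption of the sketch.
    """
    parsed = {}
    for option in options:
        name, sep, value = option.partition("=")
        parsed[name] = value if sep else "Yes"
    return parsed


opts = parse_type_options([
    "SCAN_FOR_DATE_ATTRIBUTES",               # bare: turned ON
    "SCAN_USE_UNBOUNDED_TYPES=No",            # explicit: turned OFF
    "SCAN_FOR_DATE_ATTRIBUTE_FORMAT=%Y%m%d",  # string-valued option
])
for name, value in opts.items():
    print(name, "->", value)
```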
   
   The TEMPLATE_SCHEMA directive is used to suggest an ordering for scanned
   attributes. Attributes found in this list will appear at the front
   of the generated schema, in the order specified. Its value contains
   a comma-delimited sequence of (name,type), where each part is FMEParsableText-encoded. 
   The names are used to define the attribute order, and the types are
   currently ignored in most cases.

   The types of the template schema are relevant when INFER_EMPTY_SCHEMA is set to Yes.
   In this case, any empty, null or missing attributes will inherit their types
   from the template schema instead of being discarded from the output schema.
   This option must be used with care, as it can cause many unintended
   attributes to end up on the output schema.
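
   The interaction between TEMPLATE_SCHEMA and INFER_EMPTY_SCHEMA can be
   modelled with the Python sketch below. It is not the factory's
   implementation: the function name, the use of None for "no type could be
   determined", and the sample attribute names and types are all assumptions
   made for illustration.

```python
def build_schema(scanned_types, template, infer_empty_schema=False):
    """Order scanned attributes according to a template schema.

    scanned_types: dict of attribute name -> scanned type, in scan order;
                   None means only empty/null values were seen.
    template:      list of (name, type) pairs, as in TEMPLATE_SCHEMA.
    """
    schema = []
    # Template attributes come first, in the order the template lists them.
    for name, template_type in template:
        scanned_type = scanned_types.get(name)
        if scanned_type is not None:
            schema.append((name, scanned_type))    # template type is ignored
        elif infer_empty_schema:
            schema.append((name, template_type))   # inherit the template type
    # Remaining scanned attributes follow in scan order.
    template_names = {name for name, _ in template}
    for name, scanned_type in scanned_types.items():
        if name not in template_names and scanned_type is not None:
            schema.append((name, scanned_type))
    return schema


scanned = {"height": "fme_real64", "id": "fme_int32",
           "notes": None, "name": "fme_varchar(20)"}
template = [("id", "fme_int32"), ("name", "fme_varchar(50)"),
            ("notes", "fme_varchar(100)"), ("owner", "fme_varchar(30)")]

print([n for n, _ in build_schema(scanned, template)])
# ['id', 'name', 'height']
print([n for n, _ in build_schema(scanned, template, infer_empty_schema=True)])
# ['id', 'name', 'notes', 'owner', 'height']
```

   In the second call, "owner" never appeared on any scanned feature yet
   still lands on the schema, which is the kind of unintended attribute the
   caution above refers to.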

   If date scanning is enabled and FIX_DATE_ATTRS is given a value of Yes,
   then the factory will attempt to "fix" the content of attributes on the
   emitted SCANNED features that scanned as fme_date so that they are of the
   form that writers will expect in an fme_date attribute.

   DEVELOPER NOTE: The values for <scannerOption>s are given directly to
   STFSchemaScanner to interpret. The only tested options are:
   SCAN_USE_UNBOUNDED_TYPES, SCAN_FOR_INTEGER_CONTAINER_TYPES,
   SCAN_PREFER_UNSIGNED_TYPES_FOR_INTEGERS, SCAN_USE_UNBOUNDED_STRINGS,
   SCAN_FOR_DATE_ATTRIBUTES, SCAN_FORCE_DATETIME, SCAN_FOR_DATE_ATTRIBUTE_FORMAT.
   Other "SCAN_*" options should also work, if STFSchemaScanner looks for
   them in STFSchemaScanner::processMetaDirective().
   (To see other options, consult <contdefs.h> or <schemscn.cpp>.)

Input Tags

   The SchemaScannerFactory has no named input tags. All input features are
   scanned to compute schema information.

Output Tags

   The SchemaScannerFactory supports the following output tags.
   
   SCANNED
   
      A feature which has been scanned to compute schema.  If
      SCHEMA_FEATURES_FIRST was not specified or was given a "No" value,
      the scanned features will be written out before the computed schema
      feature. Otherwise, they will be held back until their schema feature
      has been fully computed.

      SCHEMA_DEF_NAME_ATTR names an attribute (typically "fme_feature_type")
      which, if specified, is added to the output scanned features so that
      they can be associated back to their corresponding schema feature. Its
      value will match the "fme_feature_type_name" attribute in the schema
      feature.

   SCHEMA

      A schema feature computed for a group of input features. It will be
      emitted either before or after all of the scanned features it represents,
      depending on the value specified for SCHEMA_FEATURES_FIRST.

      The feature type will be formed from the GROUP_BY key. If this is
      empty (such as when GROUP_BY is not specified), the first scanned
      feature's "fme_feature_type" attribute value will be used, if present.

      The format of the schema feature will be suitable for use as the
      first feature sent to a writer that uses a dynamic schema.
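
   The output ordering and naming rules for the two tags can be summarised
   with the following Python sketch. It is a simplified model for one group
   of features, not the factory's code: the function name and dictionary
   representation are assumptions, and the schema feature here carries only
   its name rather than the scanned attribute definitions.

```python
def emit_group(features, type_name=None, schema_features_first=False,
               schema_def_name_attr=None):
    """Return (tag, feature) pairs for one group of input features.

    type_name stands in for the GROUP_BY key; when it is empty, the first
    feature's "fme_feature_type" value is used, if present.
    """
    if not type_name:
        type_name = features[0].get("fme_feature_type", "")
    schema = ("SCHEMA", {"fme_feature_type_name": type_name})

    scanned = []
    for feature in features:
        out = dict(feature)
        if schema_def_name_attr:
            # Lets each scanned feature be matched back to its schema feature.
            out[schema_def_name_attr] = type_name
        scanned.append(("SCANNED", out))

    # Scanned features are held back only when schema features go first.
    return [schema] + scanned if schema_features_first else scanned + [schema]


roads = [{"fme_feature_type": "Roads", "lanes": 2},
         {"fme_feature_type": "Roads", "lanes": 4}]

default_order = emit_group(roads)
print([tag for tag, _ in default_order])   # ['SCANNED', 'SCANNED', 'SCHEMA']

schema_first = emit_group(roads, type_name="Japan:Hydro",
                          schema_features_first=True,
                          schema_def_name_attr="schema_name")
print([tag for tag, _ in schema_first])    # ['SCHEMA', 'SCANNED', 'SCANNED']
```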