Syntax

FACTORY_DEF * XMLFormatterFactory
    [FACTORY_NAME ]
    [INPUT FEATURE_TYPE [ ]* []*]*
    [XML_FILE | XML_TEXT | XML_TEXT_ATTR ]
    [FORMATTING (LINEARIZE|PRETTY_PRINT|NONE)]
    [CLEANUP_NAMESPACE (YES|NO)]
    [INDENT_SIZE (0..9)]
    [REPLACE_TABS_WITH_SPACES (YES|NO)]
    [TEXT_INDENT (YES|NO)]
    [CLEANUP_SCHEMALOCATION (YES|NO)]
    [FILL_MISSING_NAMESPACE (YES|NO)]
    [REMOVE_ALL_COMMENT (YES|NO)]
    [REMOVE_EMPTY_ELEMENT (YES|NO)]
    [COLLAPSE_EMPTY_ELEMENT (YES|NO)]
    [SAMPLE_XML_FOR_NAMESPACE ]
    [DETECT_IGNORABLE_WHITESPACE (YES|REMOVE|NO)]
    [SCHEMA_LOC filePath]
    [LIST_ATTR ]
    [OUTPUT (PASSED|FAILED) FEATURE_TYPE [ ]* []*]*

Overview

This factory formats XML supplied as a file or as text. It can also remove extraneous namespace declarations, collapse empty elements, and remove xsi:schemaLocation attributes. Any warnings or errors that occur are recorded as a list attribute on the features.

There are three ways to specify the XML source to be validated: the XML_FILE clause gives the location of a file (local, network, or web URL), the XML_TEXT clause gives the encoded text directly, and the XML_TEXT_ATTR clause names the feature attribute that contains the text.

When the input fails validation, the factory attaches the errors and warnings to the feature as the list attribute named in the LIST_ATTR clause. Each list element describes one error or warning through the following components (where listName is the list attribute specified in this clause): listName{}.type, listName{}.file, listName{}.line, listName{}.col, and listName{}.desc.

XML Formatting
~~~~~~~~~~~~~~

The FORMATTING clause controls what kind of formatting is performed on the input. If NONE is specified, no formatting is performed. If LINEARIZE is specified, the output XML is written on a single line. If PRETTY_PRINT is specified, XML elements are indented and separated by new lines for improved readability.
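As an illustration of the clauses introduced so far, the fragment below is a minimal sketch of a factory definition that linearizes XML held in a feature attribute. It uses only clauses listed in the Syntax section. The factory name XMLLinearizer and the wildcard feature types are assumptions made for the example, the attribute names xml_string and _xml_error are placeholders, and the trailing backslashes are assumed to be mapping-file line continuations.

```
# Sketch only: XMLLinearizer, xml_string and _xml_error are illustrative names.
FACTORY_DEF * XMLFormatterFactory                 \
    FACTORY_NAME XMLLinearizer                    \
    INPUT FEATURE_TYPE *                          \
    XML_TEXT_ATTR xml_string                      \
    FORMATTING LINEARIZE                          \
    LIST_ATTR _xml_error                          \
    OUTPUT PASSED FEATURE_TYPE *                  \
    OUTPUT FAILED FEATURE_TYPE *
```

Under this sketch, a feature whose XML text could not be processed would be expected to carry list entries such as _xml_error{}.type and _xml_error{}.desc, following the list components described in the Overview.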
The DETECT_IGNORABLE_WHITESPACE and SCHEMA_LOC clauses determine whether whitespace in the XML document is significant, as defined by the inline xsi:schemaLocation or by an external schema. If DETECT_IGNORABLE_WHITESPACE is set to YES, the schema files specified in the xsi:schemaLocation attribute are used to determine the significance of whitespace. Optionally, an external schema file can be used instead by setting the SCHEMA_LOC clause to a valid file path. If DETECT_IGNORABLE_WHITESPACE is set to REMOVE, extra whitespace is removed. This includes blank lines and the whitespace at the beginning and end of the content between the start tag and end tag.

The INDENT_SIZE clause specifies the indentation size. Valid values are the numbers 0 to 9, and the default is 1. By default the Tab character is used for pretty-print indentation; set the REPLACE_TABS_WITH_SPACES clause to YES to use Space characters instead.

The TEXT_INDENT clause specifies whether text is also pretty-printed. The default value is NO, which leaves the text untouched.

Removal of Redundant Namespace Declarations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If the CLEANUP_NAMESPACE clause is specified and set to YES, the factory removes redundant and extraneous namespace declarations. For example, the following XML document has redundant namespace declarations:

    some text some text

Setting CLEANUP_NAMESPACE to YES returns the following result:

    some text some text

Remove Embedded xsi:schemaLocation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If the CLEANUP_SCHEMALOCATION clause is specified and set to YES, the factory removes the embedded xsi:schemaLocation attribute from every element except the root element.

Remove All Comments
~~~~~~~~~~~~~~~~~~~

If the REMOVE_ALL_COMMENT clause is set to YES, the factory removes all comments from the XML input.
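The cleanup clauses described above can be combined in one definition. The following is a minimal sketch, assuming that the clauses may be used together and that the XML source is a local file. The factory name XMLCleaner and the file name sample.xml are hypothetical, and the trailing backslashes are assumed to be mapping-file line continuations.

```
# Sketch only: XMLCleaner and sample.xml are illustrative names.
FACTORY_DEF * XMLFormatterFactory                 \
    FACTORY_NAME XMLCleaner                       \
    INPUT FEATURE_TYPE *                          \
    XML_FILE "sample.xml"                         \
    FORMATTING NONE                               \
    CLEANUP_NAMESPACE YES                         \
    CLEANUP_SCHEMALOCATION YES                    \
    REMOVE_ALL_COMMENT YES                        \
    OUTPUT PASSED FEATURE_TYPE *
```

FORMATTING NONE is spelled out here only to make clear that the sketch requests cleanup without any re-indentation.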
Remove Empty Elements
~~~~~~~~~~~~~~~~~~~~~

If the REMOVE_EMPTY_ELEMENT clause is set to YES, the factory removes elements that have no attributes and no content between the start tag and end tag. For example, an empty element with no attributes is removed, whereas an empty element that carries an attribute is kept.

Collapse Empty Elements
~~~~~~~~~~~~~~~~~~~~~~~

If the COLLAPSE_EMPTY_ELEMENT clause is set to YES, the factory replaces each element that has no content between its start tag and end tag with a single empty-element tag.

Clauses

XML_FILE
    The path of the XML file used in the validation. This can be a local file, a network file, or a web URL.
    Default: None
    Example: XML_FILE "someFile.xml"

XML_TEXT_ATTR
    Specifies the name of the input attribute that contains the XML text.
    Default: None
    Example: XML_TEXT_ATTR xml_string

XML_TEXT
    The XML text to be validated. It must be FMEParsableText-encoded.
    Default: None
    Example: XML_TEXT "?xmlversion=1.0encoding=UTF-8?"

PARSER_CHOICE (XERCES|LIBXML2)
    Specifies the underlying parser to use.
    Default: XERCES

FORMATTING (NONE|PRETTY_PRINT|LINEARIZE)
    Specifies the type of formatting to be performed.
    NONE: no formatting is performed.
    PRETTY_PRINT: XML elements are indented for improved readability.
    LINEARIZE: the output XML text is written on a single line.
    Default: NONE

INDENT_SIZE (0..9)
    Specifies the indentation size. Valid values are the numbers 0 to 9.
    Default: 1

REPLACE_TABS_WITH_SPACES (YES|NO)
    By default the Tab character is used for pretty-print indentation; set this clause to YES to use Space characters instead. The size of a single indentation is controlled by the INDENT_SIZE clause.
    Default: NO

TEXT_INDENT (YES|NO)
    Specifies whether text is also pretty-printed. The default value, NO, leaves the text untouched.
    Default: NO

CLEANUP_NAMESPACE (YES|NO)
    Removes redundant namespace declarations.
    Default: NO

CLEANUP_SCHEMALOCATION (YES|NO)
    Removes the xsi:schemaLocation attribute from all XML elements except the root element.
    Default: NO

REMOVE_ALL_COMMENT (YES|NO)
    Removes all comments from the XML document.
    Default: NO

REMOVE_EMPTY_ELEMENT (YES|NO)
    Removes all empty elements, that is, elements that contain neither content nor attributes.
    Default: NO

COLLAPSE_EMPTY_ELEMENT (YES|NO)
    Creates an empty tag for elements that have no content between the start tag and end tag.

LIST_ATTR
    Specifies the list attribute to be added to features that fail the validation. The elements of the list describe the errors or warnings in more detail.
    Default: None
    Example: LIST_ATTR _xml_error

DETECT_IGNORABLE_WHITESPACE (YES|REMOVE|NO)
    If schema files are specified in the xsi:schemaLocation attribute or the SCHEMA_LOC clause, this clause determines whether whitespace is treated as significant when formatting the XML document. When set to REMOVE, the factory attempts to remove excess whitespace, which includes blank lines and extra whitespace between the start and end tags.
    Default: NO
    Example: DETECT_IGNORABLE_WHITESPACE YES

SCHEMA_LOC
    Specifies the path of an external schema file.
    Default: None
    Example: SCHEMA_LOC c:\fme\schema_file.xsd

Output Tags

The XMLFormatterFactory supports the following output tags.

PASSED
    Features that pass the validation are output through this tag.

FAILED
    Features that fail the validation are output through this tag, with the LIST_ATTR list attribute added to describe the errors or warnings.

TO BE RESOLVED

The FILL_MISSING_NAMESPACE and SAMPLE_XML_FOR_NAMESPACE clauses have been added to the Syntax section above but are not yet documented.
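
To show how the documented clauses fit together, here is a minimal sketch of a pretty-printing definition that also uses an external schema to decide which whitespace can be dropped. It assumes these clauses can be combined as listed in the Syntax section. The factory name XMLPrettyPrinter and the wildcard feature types are hypothetical, the attribute names and schema path reuse the placeholder values from the clause examples, and the trailing backslashes are assumed to be mapping-file line continuations.

```
# Sketch only: XMLPrettyPrinter is an illustrative name; xml_string, _xml_error
# and the schema path are the placeholder values from the clause examples.
FACTORY_DEF * XMLFormatterFactory                 \
    FACTORY_NAME XMLPrettyPrinter                 \
    INPUT FEATURE_TYPE *                          \
    XML_TEXT_ATTR xml_string                      \
    FORMATTING PRETTY_PRINT                       \
    INDENT_SIZE 4                                 \
    REPLACE_TABS_WITH_SPACES YES                  \
    TEXT_INDENT NO                                \
    COLLAPSE_EMPTY_ELEMENT YES                    \
    DETECT_IGNORABLE_WHITESPACE REMOVE            \
    SCHEMA_LOC c:\fme\schema_file.xsd             \
    LIST_ATTR _xml_error                          \
    OUTPUT PASSED FEATURE_TYPE *                  \
    OUTPUT FAILED FEATURE_TYPE *
```

The undocumented FILL_MISSING_NAMESPACE and SAMPLE_XML_FOR_NAMESPACE clauses are deliberately left out of this sketch.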