XMLFragmenter
Maps elements from an XML document into XML fragments, and optionally flattens the content of the XML elements and the children further as feature attributes.
Configuration
Output Ports
Each fragment is output as a separate FME feature via the Fragments port. Each feature from the port will have an xml_fragment attribute holding the fragment. The fragment is a valid XML document that may be further processed via subsequent XML-based and/or XQuery-based transformers.
Four additional attributes are added to the Fragments features:
- xml_matched_element – records the element that was matched. This attribute can be used to identify which element matched the expression, if the last component of the matched expression is a wildcard character (*).
- xml_id – holds an ID for that element. This ID is unique within the input document, but it is not guaranteed to be globally unique.
- xml_parent_id – holds an ID for the parent of that element. If the parent of the element is not matched or it does not have any parent, then this attribute is empty.
- xml_parent_child_pos – holds the position of the element in relation to its parent. If the parent of the element is not matched or it doesn’t have any parent, then this attribute is empty. The xml_parent_child_pos starts its count at 0.
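As a rough illustration of how these attributes relate to one another, the sketch below walks a small document with Python's standard xml.etree.ElementTree module and records, for each matched element, its name, an ID, the ID of its parent, and its zero-based position among the parent's children. The function name describe_matches, the sample document, and the ID scheme are assumptions made for illustration only; the IDs that FME generates may look different, and counting positions across all child elements is likewise an assumption.

```python
import xml.etree.ElementTree as ET

DOC = """<root>
  <group>
    <item>a</item>
    <item>b</item>
  </group>
</root>"""


def describe_matches(xml_text, names):
    """Return one record per element whose tag is listed in `names`.

    The ID scheme is hypothetical: it only mimics the idea of an ID that
    is unique within a single input document.
    """
    root = ET.fromstring(xml_text)
    records = []

    def walk(elem, path, parent_id, pos):
        matched = elem.tag in names
        elem_id = ""
        if matched:
            elem_id = "id-%s-%s" % (elem.tag, ".".join(str(p) for p in path))
            records.append({
                "xml_matched_element": elem.tag,
                "xml_id": elem_id,
                # Both parent attributes stay empty when the parent was
                # not matched or when the element has no parent.
                "xml_parent_id": parent_id,
                "xml_parent_child_pos": pos if parent_id else "",
            })
        for index, child in enumerate(elem):
            # Child positions start counting at 0.
            walk(child, path + [index + 1], elem_id, index)

    walk(root, [1], "", "")
    return records


records = describe_matches(DOC, {"group", "item"})
for record in records:
    print(record)
```

Here the unmatched root leaves the group record with empty parent attributes, while both item records point back to the group's ID.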
If Flatten Options is enabled, then the Fragments features will have additional attributes related to the contents of the matched XML element.
If Reject Features With No Fragments is Yes, features that produce no fragments are output through the <Rejected> port.
Features that could not be successfully processed are also output through the <Rejected> port. Typically this happens when the attribute specified in the XML Attribute parameter does not have a value, or has a value that is not valid XML.
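A workflow that wants to screen attribute values before they reach the transformer could apply a check along these lines. This is a minimal sketch using Python's standard library, not the check FME performs; the helper name is_well_formed is hypothetical, and it tests only for well-formedness, which is assumed here to be a reasonable stand-in for "valid XML".

```python
import xml.etree.ElementTree as ET


def is_well_formed(value):
    """Return True if `value` is a non-empty string that parses as XML."""
    if not value:
        # Covers both a missing value (None) and an empty string.
        return False
    try:
        ET.fromstring(value)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<a><b/></a>"))   # a complete document
print(is_well_formed("<a><b></a>"))    # mismatched tags
print(is_well_formed(""))              # no value at all
```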
Parameters
XML Source Type: XML File/Attribute with XML Document |
The XML source type is either an XML file or a feature attribute whose value is the entire XML document. |
Elements to Match |
This parameter specifies which elements to map into fragments. The Feature Paths are xfMap match expressions, separated by whitespace or entered one per line. You can type the expressions directly in the text box, click the browse button to display the editor, or choose a feature attribute.
Example
<dc:metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:subject>Utah</dc:subject>
  <dc:subject>boundaries</dc:subject>
  <dc:subject>County</dc:subject>
  <dc:subject>Administrative</dc:subject>
  <dc:subject>geoscientificInformation</dc:subject>
  <dc:description>This data set represents county boundaries in Utah at 1:24,000 scale.</dc:description>
  <dc:date>2004-04-20T00:00:00.000</dc:date>
  <dc:type>dataset</dc:type>
  <dc:identifier xmlns:dc="http://purl.org/dc/elements/1.1/">{42AE2814-FCC1-4BC2-BAF4-CA3E55514997}</dc:identifier>
  <dc:language>en</dc:language>
  <dc:spatial>
    <dcmiBox:Box name="Geographic" projection="EPSG:4326" xmlns:dcmiBox="http://dublincore.org/documents/2000/07/11/dcmi-box/">
      <dcmiBox:northlimit units="decimal degrees">42.01</dcmiBox:northlimit>
      <dcmiBox:eastlimit units="decimal degrees">-109.21</dcmiBox:eastlimit>
      <dcmiBox:southlimit units="decimal degrees">36.98</dcmiBox:southlimit>
      <dcmiBox:westlimit units="decimal degrees">-114.1</dcmiBox:westlimit>
    </dcmiBox:Box>
  </dc:spatial>
  <dc:rights></dc:rights>
</dc:metadata>
These are a few Feature Paths xfMap expressions targeting the above <dc:metadata> input document:
Elements to Exclude |
If a feature path in ‘Elements to Match’ matches multiple elements, this parameter can be used to specify which of those elements are excluded from the results. The input to this parameter takes the same form as the feature path xfMap expressions described for the ‘Elements to Match’ parameter. Using the input document above, if ‘Elements to Match’ is set to ‘dcmiBox:Box/*’ and ‘Elements to Exclude’ is set to ‘dcmiBox:northlimit dcmiBox:eastlimit’, then only two fragment features are output, corresponding to the <dcmiBox:southlimit> and <dcmiBox:westlimit> elements. |
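The effect of combining a wildcard match with an exclusion list can be sketched in Python. This is only an emulation with the standard xml.etree.ElementTree module, not xfMap itself: the helper names local_name and match_children are hypothetical, and matching on local element names (ignoring the namespace prefix) is a simplification.

```python
import xml.etree.ElementTree as ET

BOX = (
    '<dcmiBox:Box name="Geographic" projection="EPSG:4326" '
    'xmlns:dcmiBox="http://dublincore.org/documents/2000/07/11/dcmi-box/">'
    '<dcmiBox:northlimit units="decimal degrees">42.01</dcmiBox:northlimit>'
    '<dcmiBox:eastlimit units="decimal degrees">-109.21</dcmiBox:eastlimit>'
    '<dcmiBox:southlimit units="decimal degrees">36.98</dcmiBox:southlimit>'
    '<dcmiBox:westlimit units="decimal degrees">-114.1</dcmiBox:westlimit>'
    '</dcmiBox:Box>'
)


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def match_children(xml_text, exclude=()):
    """Match every child of the root (like a trailing '*'), minus exclusions."""
    root = ET.fromstring(xml_text)
    return [
        local_name(child.tag)
        for child in root
        if local_name(child.tag) not in exclude
    ]


kept = match_children(BOX, exclude={"northlimit", "eastlimit"})
print(kept)  # ['southlimit', 'westlimit']
```

Of the four children matched by the wildcard, only the two that are not excluded remain, which mirrors the two fragment features described above.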
Reject Features With No Fragments |
If Yes, features that produce no fragments are output through the <Rejected> port. |
Merge Attributes From Input Feature |
Setting this parameter to Yes will merge the attributes from the input feature to the output features. |
Elements as XML Fragments |
Use this parameter to extract children of the matched elements as XML fragments.
Example
Given the same XML input as in the example above, with the Feature Paths xfMap expression set to “dcmiBox:Box”, the default options accepted in Flatten Options, and Elements as XML Fragments set to ‘dcmiBox:northlimit dcmiBox:southlimit’, the transformer produces the following feature. (Compared with the Flatten Options example, the xml_fragment_northlimit{0} and xml_fragment_southlimit{0} attributes are added.)
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Feature Type: `XMLFragmenter_FRAGMENTS'
Attribute(encoded: utf-16): `eastlimit' has value `-109.21'
Attribute(encoded: utf-16): `eastlimit.units' has value `decimal degrees'
Attribute(string) : `fme_type' has value `fme_no_geom'
Attribute(encoded: utf-16): `northlimit' has value `42.01'
Attribute(encoded: utf-16): `northlimit.units' has value `decimal degrees'
Attribute(encoded: utf-16): `southlimit' has value `36.98'
Attribute(encoded: utf-16): `southlimit.units' has value `decimal degrees'
Attribute(encoded: utf-16): `westlimit' has value `-114.1'
Attribute(encoded: utf-16): `westlimit.units' has value `decimal degrees'
Attribute(encoded: utf-16): `xml_fragment' has value `<?xml version="1.0" encoding="UTF-16"?><dcmiBox:Box name="Geographic" projection="EPSG:4326" xmlns:dcmiBox="http://dublincore.org/documents/2000/07/11/dcmi-box/">
<dcmiBox:northlimit units="decimal degrees">42.01</dcmiBox:northlimit>
<dcmiBox:eastlimit units="decimal degrees">-109.21</dcmiBox:eastlimit>
<dcmiBox:southlimit units="decimal degrees">36.98</dcmiBox:southlimit>
<dcmiBox:westlimit units="decimal degrees">-114.1</dcmiBox:westlimit>
</dcmiBox:Box>'
Attribute(encoded: utf-16): `xml_fragment_northlimit{0}' has value `<?xml version="1.0" encoding="UTF-16"?><dcmiBox:northlimit units="decimal degrees" xmlns:dcmiBox="http://dublincore.org/documents/2000/07/11/dcmi-box/">42.01</dcmiBox:northlimit>'
Attribute(encoded: utf-16): `xml_fragment_southlimit{0}' has value `<?xml version="1.0" encoding="UTF-16"?><dcmiBox:southlimit units="decimal degrees" xmlns:dcmiBox="http://dublincore.org/documents/2000/07/11/dcmi-box/">36.98</dcmiBox:southlimit>'
Attribute(encoded: utf-16): `xml_id' has value `id-Box-1.2.1.11.1'
Attribute(encoded: utf-16): `xml_matched_element' has value `Box'
Attribute(string) : `xml_type' has value `xml_no_geom'
Geometry Type: Unknown (0)
=================================================
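The naming pattern of the extra attributes, xml_fragment_<name>{<index>}, can be imitated with a few lines of Python. The sketch below is an approximation built on the standard xml.etree.ElementTree module: the helper child_fragments is hypothetical, the per-name index counter is an assumption inferred from the {0} suffix in the example, and the serialized text will not be byte-for-byte identical to what FME writes (for instance, no XML declaration is emitted).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = "http://dublincore.org/documents/2000/07/11/dcmi-box/"
# Keep the original prefix when serializing instead of ElementTree's ns0.
ET.register_namespace("dcmiBox", NS)

BOX = (
    '<dcmiBox:Box name="Geographic" xmlns:dcmiBox="' + NS + '">'
    '<dcmiBox:northlimit units="decimal degrees">42.01</dcmiBox:northlimit>'
    '<dcmiBox:eastlimit units="decimal degrees">-109.21</dcmiBox:eastlimit>'
    '<dcmiBox:southlimit units="decimal degrees">36.98</dcmiBox:southlimit>'
    '</dcmiBox:Box>'
)


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def child_fragments(xml_text, names):
    """Serialize each listed child into its own attribute value."""
    root = ET.fromstring(xml_text)
    counters = defaultdict(int)  # assumed: one running index per child name
    fragments = {}
    for child in root:
        name = local_name(child.tag)
        if name in names:
            key = "xml_fragment_%s{%d}" % (name, counters[name])
            fragments[key] = ET.tostring(child, encoding="unicode")
            counters[name] += 1
    return fragments


fragments = child_fragments(BOX, {"northlimit", "southlimit"})
for key, value in fragments.items():
    print(key, "->", value)
```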
Flatten Options |
The Options button opens the XML Flatten Options dialog. These options control the children of the matched elements to be flattened as attributes/attribute lists on the features produced. The default view is Basic mode, where several options are listed:
The Advanced button opens the Advanced Editor, which provides additional options for customizing the feature attributes. The functionality of each option is described in the table below. These options allow customization of the attributes and attribute lists of the matched XML subtree that will be added to FME features.
All the options have more detailed examples and descriptions in the FME Readers/Writers manual: XML (Extensible Markup Language) Reader/Writer.
Example
Given the same XML input as above, with the Feature Paths xfMap expression set to “dcmiBox:Box” and the default options in “Flatten Options”, the transformer produces the following feature:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Feature Type: `XMLFragmenter_FRAGMENTS'
Attribute(encoded: utf-16): `eastlimit' has value `-109.21'
Attribute(encoded: utf-16): `eastlimit.units' has value `decimal degrees'
Attribute(string) : `fme_type' has value `fme_no_geom'
Attribute(encoded: utf-16): `northlimit' has value `42.01'
Attribute(encoded: utf-16): `northlimit.units' has value `decimal degrees'
Attribute(encoded: utf-16): `southlimit' has value `36.98'
Attribute(encoded: utf-16): `southlimit.units' has value `decimal degrees'
Attribute(encoded: utf-16): `westlimit' has value `-114.1'
Attribute(encoded: utf-16): `westlimit.units' has value `decimal degrees'
Attribute(encoded: utf-16): `xml_fragment' has value `<?xml version="1.0" encoding="UTF-16"?><dcmiBox:Box name="Geographic" projection="EPSG:4326" xmlns:dcmiBox="http://dublincore.org/documents/2000/07/11/dcmi-box/">
<dcmiBox:northlimit units="decimal degrees">42.01</dcmiBox:northlimit>
<dcmiBox:eastlimit units="decimal degrees">-109.21</dcmiBox:eastlimit>
<dcmiBox:southlimit units="decimal degrees">36.98</dcmiBox:southlimit>
<dcmiBox:westlimit units="decimal degrees">-114.1</dcmiBox:westlimit>
</dcmiBox:Box>'
Attribute(encoded: utf-16): `xml_id' has value `id-Box-1.2.1.11.1'
Attribute(encoded: utf-16): `xml_matched_element' has value `Box'
Attribute(string) : `xml_type' has value `xml_no_geom'
Geometry Type: Unknown (0)
=================================================
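The attribute names in this example follow a simple pattern: each child element contributes an attribute named after the element, and each XML attribute on that child contributes one named <element>.<attribute>. The Python sketch below reproduces that pattern for the Box element with the standard xml.etree.ElementTree module. It is an approximation of the default behavior seen in the example, not FME's flattening logic; the helper flatten_children is hypothetical and ignores repeated children, nested elements, and every other Flatten Options setting.

```python
import xml.etree.ElementTree as ET

BOX = (
    '<dcmiBox:Box name="Geographic" projection="EPSG:4326" '
    'xmlns:dcmiBox="http://dublincore.org/documents/2000/07/11/dcmi-box/">'
    '<dcmiBox:northlimit units="decimal degrees">42.01</dcmiBox:northlimit>'
    '<dcmiBox:eastlimit units="decimal degrees">-109.21</dcmiBox:eastlimit>'
    '<dcmiBox:southlimit units="decimal degrees">36.98</dcmiBox:southlimit>'
    '<dcmiBox:westlimit units="decimal degrees">-114.1</dcmiBox:westlimit>'
    '</dcmiBox:Box>'
)


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified names."""
    return tag.rsplit("}", 1)[-1]


def flatten_children(xml_text):
    """Turn each child's text and XML attributes into flat name/value pairs."""
    root = ET.fromstring(xml_text)
    attributes = {}
    for child in root:
        name = local_name(child.tag)
        attributes[name] = child.text or ""
        for attr_name, attr_value in child.attrib.items():
            attributes["%s.%s" % (name, local_name(attr_name))] = attr_value
    return attributes


flat = flatten_children(BOX)
for key in sorted(flat):
    print(key, "=", flat[key])
```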
Ignore External DTD |
Setting this parameter to Yes instructs the underlying XML parser to disable the loading of an external DTD. |
Entity Resolver |
By default this parameter is Enabled and a custom entity resolver is used to resolve external entities such as DTDs. The installed custom entity resolver relies on a URI Map that maps incoming URIs to resolved URIs. The resolved URIs are usually URLs pointing to local copies of the resource. If an XML document does not have any external entities that need to be resolved, this parameter may be Disabled so that the URI Map is not loaded unnecessarily. |
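The general idea of a map-driven entity resolver can be sketched with Python's standard xml.sax.handler.EntityResolver interface. This is not FME's resolver, and the format of FME's URI Map is not shown here: the class name MapResolver, the dictionary layout, and both the URI and the local path are assumptions chosen purely for illustration.

```python
from xml.sax.handler import EntityResolver

# Hypothetical map from incoming URIs to local copies of the resource.
URI_MAP = {
    "http://example.com/dtd/sample.dtd": "/opt/xml/dtd/sample.dtd",
}


class MapResolver(EntityResolver):
    """Resolve external entities through a lookup table."""

    def __init__(self, uri_map):
        self.uri_map = uri_map

    def resolveEntity(self, publicId, systemId):
        # Fall back to the incoming URI when the map has no entry for it.
        return self.uri_map.get(systemId, systemId)


resolver = MapResolver(URI_MAP)
print(resolver.resolveEntity(None, "http://example.com/dtd/sample.dtd"))
print(resolver.resolveEntity(None, "http://example.com/dtd/other.dtd"))
```

A resolver like this would be attached to a SAX parser with its setEntityResolver method; documents without external entities never trigger a lookup.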
Attributes to Expose |
Exposes attributes so they can be used elsewhere in the workspace. Attribute names may be entered directly or provided in the Enter Values for Attributes to Expose dialog accessed via the ellipsis button, where data type may also be specified. For more information on exposed and unexposed attributes, see Understanding Feature Types and Attributes. |
Editing Transformer Parameters
Using a set of menu options, transformer parameters can be assigned by referencing other elements in the workspace. More advanced functions, such as an advanced editor and an arithmetic editor, are also available in some transformers. To access a menu of these options, click the drop-down button beside the applicable parameter. For more information, see Transformer Parameter Menu Options.
Defining Values
There are several ways to define a value for use in a Transformer. The simplest is to type in a value or string, which can include functions of various types such as attribute references, math and string functions, and workspace parameters. There are a number of tools and shortcuts that can assist in constructing values, generally available from the drop-down context menu adjacent to the value field.
Using the Text Editor
The Text Editor provides a convenient way to construct text strings (including regular expressions) from various data sources, such as attributes, parameters, and constants, where the result is used directly inside a parameter.
Using the Arithmetic Editor
The Arithmetic Editor provides a convenient way to construct math expressions from various data sources, such as attributes, parameters, and feature functions, where the result is used directly inside a parameter.
Conditional Values
Set values depending on one or more test conditions that either pass or fail.
Parameter Condition Definition Dialog
Content
Expressions and strings can include a number of functions, characters, parameters, and more.
When setting values - whether entered directly in a parameter or constructed using one of the editors - strings and expressions containing String, Math, Date/Time or FME Feature Functions will have those functions evaluated. Therefore, the names of these functions (in the form @<function_name>) should not be used as literal string values.
String Functions | These functions manipulate and format strings. |
Special Characters | A set of control characters is available in the Text Editor. |
Math Functions | Math functions are available in both editors. |
Date/Time Functions | Date and time functions are available in the Text Editor. |
Math Operators | These operators are available in the Arithmetic Editor. |
FME Feature Functions | These return primarily feature-specific values. |
FME and Workspace Parameters | FME and workspace-specific parameters may be used. |
Creating and Modifying User Parameters | Create your own editable parameters. |
Dialog Options - Tables
Transformers with table-style parameters have additional tools for populating and manipulating values.
Row Reordering
Enabled once you have clicked on a row item. Choices include:
Cut, Copy, and Paste
Enabled once you have clicked on a row item. Choices include:
Cut, copy, and paste may be used within a transformer, or between transformers.
Filter
Start typing a string, and the matrix will only display rows matching those characters. Searches all columns. This only affects the display of attributes within the transformer - it does not alter which attributes are output.
Import
Import populates the table with a set of new attributes read from a dataset. Specific application varies between transformers.
Reset/Refresh
Generally resets the table to its initial state, and may provide additional options to remove invalid entries. Behavior varies between transformers.
Note: Not all tools are available in all transformers.
FME Community
The FME Community is the place for demos, how-tos, articles, FAQs, and more. Get answers to your questions, learn from other users, and suggest, vote, and comment on new features.
Search for all results about the XMLFragmenter on the FME Community.
Keywords: XMLExploder