Structure Element
Every feature mapping rule may contain an optional <structure> element, which directs the XML reader to add the XML subtree rooted at the mapping rule's matched element to the FME feature under construction as attribute lists. FME attribute lists behave just like primitive attributes, except that their names may contain an index enclosed in braces that identifies an element of the list. Attribute list elements may contain primitives or other attribute lists.
The attribute list indices will not necessarily correspond to the order of the elements in the XML subtree. Consider the following XML subtree rooted at <a>:
<a> <b/><c/><b/><b/> </a>
Element <b> is a repeating child of <a>. Because the indices of a list attribute must increase without gaps, the relative ordering of the children of <a> is lost whenever repeating child elements are interleaved with other elements:
a{0}.b{0}
a{0}.b{1}
a{0}.b{2}
a{0}.c{0}
The structure instruction in a feature mapping rule may be specified by a single empty <structure/> element. This directs the XML reader to start constructing FME attribute lists from the subtree rooted at the matched element:
<mapping match="..."> <feature-type> ... </feature-type> <attributes> ... </attributes> <geometry> ... </geometry> <structure/> </mapping>
Here is the complete set of options that can be set on the <structure> element, all of which are optional:
<structure
map-empty-elements="yes|no"
attribute-identifier="..."
child-position-attribute="..."
structure-prefix="..."
separator="..."
open-list-brace="..."
close-list-brace="..."
matched-prefix="yes|no|children|attributes"
matched-attributes="yes|no"
cardinality="..."
use-namespace-prefix="yes|no"
except="..."
matched-ancestor-attributes="..."/>
Option Name | Description | Default Value | Possible Values |
---|---|---|---|
separator | Separator used in the naming of the attributes of the children | . | Any string |
open-list-brace | Open list index delimiter brace | { | Any String |
close-list-brace | Close list index delimiter brace | } | Any String |
map-empty-elements | Specifies whether empty elements will be added as empty feature attributes. | no | yes, no |
matched-prefix | Specifies whether feature attributes should be prefixed with the name of the matched element | yes | yes, no, children, attributes |
matched-attributes | Specifies whether XML attributes of the matched element should be mapped as feature attributes | yes | yes, no |
matched-ancestor-attributes | Specifies the ancestor elements of the matched element whose XML attributes should be mapped as feature attributes | "" | Space-separated list of any of: parent, grandparent, root, or an integer value |
cardinality | Controls whether attributes should be output as list attributes or not | +{?} | Refer to documentation below |
except | Feature path expressions to specify which children of the matched element should be excluded in the mapping | "" | Feature path xfMap expressions |
structure-prefix | Prefix for every attribute that is generated for this structure | "" | Non-empty string |
child-position-attribute | If set to non-empty string, each child element of the matched element generates an additional feature attribute whose value is the position of the child within its parent. | "" | Non-empty string |
attribute-identifier | Feature attribute names for XML attributes are prefixed with the value specified. | "" | Non-empty string |
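The matched-ancestor-attributes option is not exercised by the examples in this section, so here is a minimal sketch of how it might be set. The element names (library, shelf, book) and their XML attributes are hypothetical and chosen only for illustration; the option name and the values parent and root come from the table above. The names of the resulting feature attributes are not shown, because they are not described in this section.

```xml
<?xml version="1.0"?>
<!-- Hypothetical input, for illustration only:
     <library city="Vancouver">
       <shelf id="s1">
         <book><title>...</title></book>
       </shelf>
     </library>
-->
<xfMap>
  <feature-map>
    <mapping match="book">
      <feature-type><literal expr="book"/></feature-type>
      <!-- Besides the subtree of <book>, also map the XML attributes of
           its parent (<shelf>) and of the document root (<library>). -->
      <structure matched-ancestor-attributes="parent root"/>
    </mapping>
  </feature-map>
</xfMap>
```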
Consider the following XML document, a_items.xml:
a_items.xml:
<?xml version="1.0"?> <a-items> <a> <b>a0b0</b> <c x="first x-val" y="first y-val">a0c0</c> <b>a0b1</b> <d><e>a0e</e></d> <b>a0b2</b> <f></f> <g/> </a> </a-items>
The following xfMap document, a.xmp, maps each <a> element into an FME feature while turning the subtree that is rooted at <a> into attribute lists:
a.xmp:
<?xml version="1.0"?> <xfMap> <feature-map> <mapping match="a"> <feature-type><literal expr="a"/></feature-type> <structure/> </mapping> </feature-map> </xfMap>
Applying a.xmp to a_items.xml constructs the following feature:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Feature Type: `a'
Attribute(string): `a{0}.b{0}' has value `a0b0'
Attribute(string): `a{0}.b{1}' has value `a0b1'
Attribute(string): `a{0}.b{2}' has value `a0b2'
Attribute(string): `a{0}.c{0}' has value `a0c0'
Attribute(string): `a{0}.c{0}.x' has value `first x-val'
Attribute(string): `a{0}.c{0}.y' has value `first y-val'
Attribute(string): `a{0}.d{0}.e{0}' has value `a0e'
Attribute(string): `xml_type' has value `xml_no_geom'
Geometry Type: Unknown (0)
============================================================================
XML attributes in the FME attribute lists are represented without an index. Notice that the x and y attributes for the <c> element in the a_items.xml document do not have a list index in the FME feature.
Notice that the <f> and <g> elements in the above example were not mapped to FME feature attributes, because they have no character content. To make the XML reader create the corresponding list attributes a{0}.f{0} and a{0}.g{0} for the empty <f> and <g> elements, set the optional map-empty-elements attribute of the <structure> element to yes. For example:
<?xml version="1.0"?> <xfMap> <feature-map> <mapping match="a"> <feature-type><literal expr="a"/></feature-type> <structure map-empty-elements="yes"/> </mapping> </feature-map> </xfMap>
XML attributes can also be distinguished from leaf elements by having the XML reader prepend a prefix to their names. The <structure> element may have an optional attribute-identifier attribute whose value becomes the prefix for the XML attribute's name in the FME attribute list.
As noted earlier, the list indices in the generated attribute names only preserve the ordering of elements with the same name. The child-position-attribute attribute can be used on the <structure> element to preserve the ordering of all child elements, regardless of name. When this attribute is specified, each child element generates an additional feature attribute whose value is the position of the child element within its parent. The name of that attribute is the list prefix representing the path to the element, followed by the value of the child-position-attribute attribute. If the attribute-identifier attribute is also specified, its value is also prepended to the name of the position attribute.
Consider applying the following xfMap, a1.xmp, to the a_items.xml document. The xfMap prepends '@' to every list component whose name originated from an XML attribute of an element in the subtree rooted at <a>. In addition, each child element gets a 'pos' attribute containing its position within its parent.
a1.xmp:
<?xml version="1.0"?> <xfMap> <feature-map> <mapping match="a"> <feature-type><literal expr="a"/></feature-type> <structure attribute-identifier="@" child-position-attribute="pos"/> </mapping> </feature-map> </xfMap>
FME feature constructed:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Feature Type: `a'
Attribute(string): `a{0}.b{0}' has value `a0b0'
Attribute(string): `a{0}.b{1}' has value `a0b1'
Attribute(string): `a{0}.b{2}' has value `a0b2'
Attribute(string): `a{0}.c{0}' has value `a0c0'
Attribute(string): `a{0}.c{0}.@x' has value `first x-val'
Attribute(string): `a{0}.c{0}.@y' has value `first y-val'
Attribute(string): `a{0}.d{0}.e{0}' has value `a0e'
Attribute(string): `xml_type' has value `xml_no_geom'
Geometry Type: Unknown (0)
===========================================================================
A prefix may also be attached to every FME attribute list that a structure generates for a matched element. The xfMap <structure> element may have an optional structure-prefix attribute whose value is prepended to the name of every generated attribute list. The following a2.xmp xfMap document is based on a1.xmp and adds the "myStructurePrefix-" prefix to the attribute lists of the constructed feature.
a2.xmp:
<?xml version="1.0"?> <xfMap> <feature-map> <mapping match="a"> <feature-type><literal expr="a"/></feature-type> <structure structure-prefix="myStructurePrefix-" attribute-identifier="@"/> </mapping> </feature-map> </xfMap>
Applying the a2.xmp xfMap to the a_items.xml document makes the XML reader construct the following feature:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Feature Type: `a'
Attribute(string): `myStructurePrefix-a{0}.b{0}' has value `a0b0'
Attribute(string): `myStructurePrefix-a{0}.b{1}' has value `a0b1'
Attribute(string): `myStructurePrefix-a{0}.b{2}' has value `a0b2'
Attribute(string): `myStructurePrefix-a{0}.c{0}' has value `a0c0'
Attribute(string): `myStructurePrefix-a{0}.c{0}.@x' has value `first x-val'
Attribute(string): `myStructurePrefix-a{0}.c{0}.@y' has value `first y-val'
Attribute(string): `myStructurePrefix-a{0}.d{0}.e{0}' has value `a0e'
Attribute(string): `xml_type' has value `xml_no_geom'
Geometry Type: Unknown (0)
===========================================================================
In the examples above, the separator used in the attribute names is the period '.'. Thus, the element <e>, child of element <d>, child of element <a>, is represented as a{0}.d{0}.e{0}. Each <structure> element can have an optional separator attribute that replaces the period. The braces delimiting the list index may also be replaced, using the optional open-list-brace and close-list-brace attributes; the default values for these attributes are '{' and '}', respectively.
The following example changes the default separator and list braces:
a3.xmp:
<?xml version="1.0"?> <xfMap> <feature-map> <mapping match="a"> <feature-type><literal expr="a"/></feature-type> <structure attribute-identifier="@" separator="-" open-list-brace="_" close-list-brace=""/> </mapping> </feature-map> </xfMap>
FME feature constructed:
++++++++++++++++++++++++++++++++++++++++++++++++++++
Feature Type: `a'
Attribute(string): `a_0-b_0' has value `a0b0'
Attribute(string): `a_0-b_1' has value `a0b1'
Attribute(string): `a_0-b_2' has value `a0b2'
Attribute(string): `a_0-c_0' has value `a0c0'
Attribute(string): `a_0-c_0-@x' has value `first x-val'
Attribute(string): `a_0-c_0-@y' has value `first y-val'
Attribute(string): `a_0-d_0-e_0' has value `a0e'
Attribute(string): `xml_type' has value `xml_no_geom'
Geometry Type: Unknown (0)
====================================================
It is also possible to tell the XML reader not to add the matched element as a prefix when it constructs an attribute list. This is done by setting the optional matched-prefix attribute on the xfMap <structure> element to "no". The valid values for this attribute are "yes", "no", "children" and "attributes", and its default value is "yes". If it is set to "yes", both the matched element's attributes and its children are prefixed with the name of the matched element. If it is set to "children" or "attributes", only the children or only the attributes of the matched element are prefixed, respectively.
For example, applying the following xfMap, a4.xmp, to the a_items.xml document removes the a{0} component from the attribute lists of the constructed FME feature:
a4.xmp:
<?xml version="1.0"?> <xfMap> <feature-map> <mapping match="a"> <feature-type><literal expr="a"/></feature-type> <structure matched-prefix="no" attribute-identifier="@" child-position-attribute="pos"/> </mapping> </feature-map> </xfMap>
Notice that the a{0} component does not appear in the attribute lists:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Feature Type: `a'
Attribute(encoded: utf-16): `b{0}' has value `a0b0'
Attribute(encoded: utf-16): `b{0}.@pos' has value `0'
Attribute(encoded: utf-16): `b{1}' has value `a0b1'
Attribute(encoded: utf-16): `b{1}.@pos' has value `2'
Attribute(encoded: utf-16): `b{2}' has value `a0b2'
Attribute(encoded: utf-16): `b{2}.@pos' has value `4'
Attribute(encoded: utf-16): `c{0}' has value `a0c0'
Attribute(encoded: utf-16): `c{0}.@pos' has value `1'
Attribute(encoded: utf-16): `c{0}.@x' has value `first x-val'
Attribute(encoded: utf-16): `c{0}.@y' has value `first y-val'
Attribute(encoded: utf-16): `d{0}.@pos' has value `3'
Attribute(encoded: utf-16): `d{0}.e{0}' has value `a0e'
Attribute(encoded: utf-16): `d{0}.e{0}.@pos' has value `0'
Attribute(encoded: utf-16): `f{0}.@pos' has value `5'
Attribute(encoded: utf-16): `g{0}.@pos' has value `6'
Attribute(string): `xml_type' has value `xml_no_geom'
Geometry Type: Unknown (0)
===========================================================================
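The remaining matched-prefix values can be sketched in the same way. The fragment below is a hypothetical mapping on <c> (it is not one of the numbered example files) that sets matched-prefix="attributes". Going by the description above, only the XML attributes x and y of the matched <c> element would keep the matched element's name as a prefix, while child elements would not. The resulting attribute names are not listed here because they have not been checked against the reader.

```xml
<?xml version="1.0"?>
<xfMap>
  <feature-map>
    <mapping match="c">
      <feature-type><literal expr="c"/></feature-type>
      <!-- "attributes": prefix only the matched element's XML attributes
           with the matched element's name; "children" would do the
           opposite and prefix only its child elements. -->
      <structure matched-prefix="attributes" attribute-identifier="@"/>
    </mapping>
  </feature-map>
</xfMap>
```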
The optional matched-attributes attribute controls whether the XML attributes of the matched element are mapped as FME feature attributes. The valid values for this attribute are "yes" and "no", and its default value is "yes".
For example, applying the following xfMap, a5.xmp, to a_items.xml ignores the 'x' and 'y' attributes of the <c> element.
a5.xmp:
<?xml version="1.0"?> <xfMap> <feature-map> <mapping match="c"> <feature-type><literal expr="c"/></feature-type> <structure matched-prefix="no" matched-attributes="no"/> </mapping> </feature-map> </xfMap>
Notice that the attributes of element 'c' are ignored and only the text is mapped as an attribute:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Feature Type: `c'
Attribute(encoded: utf-16): `c' has value `a0c0'
Attribute(string): `xml_type' has value `xml_no_geom'
Geometry Type: Unknown (0)
===========================================================================
It is also possible to control the appearance of attribute names when the XML is known to allow only a single instance of an element. In the example above, for instance, the <c> and <d> elements might be constrained to occur only once. In these cases, the list index only clutters the attribute name. The cardinality attribute of the <structure> element provides a mini-language for defining the cardinalities of elements. An example is given below, followed by a more detailed discussion.
a6.xmp:
<?xml version="1.0"?> <xfMap> <feature-map> <mapping match="a"> <feature-type><literal expr="a"/></feature-type> <structure skip-matched="yes" attribute-identifier="@" cardinality="*/c */d{}/+ */+"/> </mapping> </feature-map> </xfMap>
FME feature constructed:
++++++++++++++++++++++++++++++++++++++++++++++++++++
Feature Type: `a'
Attribute(string): `b' has value `a0b0'
Attribute(string): `b{1}' has value `a0b1'
Attribute(string): `b{2}' has value `a0b2'
Attribute(string): `c' has value `a0c0'
Attribute(string): `c.@x' has value `first x-val'
Attribute(string): `c.@y' has value `first y-val'
Attribute(string): `d{0}.e' has value `a0e'
Attribute(string): `xml_type' has value `xml_no_geom'
Geometry Type: Unknown (0)
====================================================
The cardinality attribute is a space-separated list of cardinality directives. In the previous example, the directives are:
- */c
- */d{}/+
- */+
Each forward-slash-separated component of a directive corresponds to an element in the XML document. The asterisk acts as a wildcard, matching any element, while a literal string matches an element by name. Braces ({}) after a component indicate that the element should be treated as a list; no braces indicate that the element should be treated as singular if possible. Braces containing a question mark ({?}) indicate an optional list: the element is treated as a list if it has siblings with the same name, and as singular otherwise. Finally, a trailing '+', '+{}' or '+{?}' indicates that any further elements along this path through the XML document should be treated as non-list (+), list (+{}) or optional list (+{?}) elements.
In the above example, the first directive (*/c) matches the <a> element (the root) followed by the <c> element. This indicates that both the root and the <c> element should be treated as singular. Attributes are always singular.
The second directive (*/d{}/+) matches the <a> element, followed by the <d> element, which should be treated as a list, followed by any further elements along the path, all of which should be treated as non-list.
Finally, the third directive (*/+) matches the root <a>, followed by any other XML elements, all of which should be treated as singular. In this case, "*/+" fixes the cardinality of exactly the same set of attributes as "+" would.
These rules are applied in order to determine matches:
- Literal matches are preferred to wildcard matches. For example, a/+ is preferred to */+
- Literal matches occurring early in a cardinality expression are preferred to literal matches occurring late. For example, a/*/* is preferred to */b/*.
- Non-list elements are preferred to list elements. So /*/foo/ is preferred to /*/foo{}/.
- List elements are preferred to optional list elements. So /*/foo{} is preferred to /*/foo{?}.
This doesn't provide a total ordering on the cardinality expressions, since e.g. a/b/c should sort exactly the same way as d/e/f, but since these will not match the same elements, the order doesn't matter. The basic elements of a cardinality expression are: literal matches, consisting of characters matching the name of an xml-element; wildcard matches: "*{}", "*" matching exactly one element, and treating it as a list, or non-list (respectively); and an optional suffix: +, +{} or +{?} to indicate that any further matches should be treated as non-list, list or optional list (respectively).
If no directive matches, the default behavior is to assume that the cardinality is specified as "+{}". In order to match, a cardinality expression must be exactly as long as the attribute path (e.g. the path for <a><b><c/></b></a> is a.b.c). The only exception is that the suffixes "+", "+{}" and "+{?}" extend the cardinality expression as far as is necessary to match the path.
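To make the optional-list form concrete, here is a hedged sketch of a structure for a_items.xml that always treats <b> as a list and lets every other element be indexed only when needed. The directive set was written for this illustration and is not taken from an existing xfMap; the resulting attribute names are deliberately not listed.

```xml
<?xml version="1.0"?>
<xfMap>
  <feature-map>
    <mapping match="a">
      <feature-type><literal expr="a"/></feature-type>
      <!-- */b{}  : the matched element is singular, <b> is always a list.
           */+{?} : every other path is an optional list, so an element
                    gets an index only if it has same-named siblings. -->
      <structure cardinality="*/b{} */+{?}"/>
    </mapping>
  </feature-map>
</xfMap>
```

If the preference rules listed above are read literally, */b{} should be chosen for the <b> elements, because a literal match is preferred to a wildcard match.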
In addition to specifying element names, forward-slash separated elements can also include a namespace prefix and a colon. If a namespace prefix is specified, then the colon must also be specified. In all cases, an element name must also be specified (possibly as a wildcard). If no namespace prefix is given, the effect is the same as specifying a wildcard for the prefix. It is possible to specify a "blank" prefix, by having nothing before the colon. In this case, it will only match if the actual element's prefix is the empty string.
In other words:
- a/b/c/+ is the same as *:a/*:b/*:c/+
- a/:b/c will match a <b> element but not a <test:b> element (where <a> and <c> match)
The '+', '+{}' and '+{?}' suffixes do not currently take a namespace specifier.
If one wants to include the prefix in the name of the attribute (in order to treat elements from different namespaces as different attributes on their FME feature), one must set the attribute "use-namespace-prefix" on the structure element to "yes".
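As a sketch of how the namespace options might be combined, the fragment below sets use-namespace-prefix and uses prefixed components in the cardinality directives. The prefix "test" is the hypothetical prefix from the <test:b> example above, and the directives themselves are made up for illustration.

```xml
<?xml version="1.0"?>
<xfMap>
  <feature-map>
    <mapping match="a">
      <feature-type><literal expr="a"/></feature-type>
      <!-- "test" is a hypothetical namespace prefix used in the input.
           a/test:b{}/+ applies only to <test:b> children of <a>;
           a/:b/+       applies only to <b> children with an empty prefix;
           */+{}        covers everything else. -->
      <structure use-namespace-prefix="yes"
                 cardinality="a/test:b{}/+ a/:b/+ */+{}"/>
    </mapping>
  </feature-map>
</xfMap>
```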
Note that there is no interaction between the skip-matched attribute and the cardinality attribute. This means that even if skip-matched="yes", the first component of a cardinality expression must still match the matched element. For example, if we had specified the structure element in a6.xmp to be
<structure skip-matched="yes" cardinality="c d/+{} */+"/>
The result would be that the only matching directive is */+, since none of the other cardinality expressions match the <a> element. In normal use, this simply means that the first component of a cardinality expression should be either a wildcard or the name of the matched element. This can be useful if one wishes to match a number of different elements, some of which have different cardinality constraints, and it allows quite succinct xfMaps to be written.
a7.xmp:
<?xml version="1.0"?> <xfMap> <feature-map> <mapping match="a-list/*"> <feature-type><matched expr="local-name"/></feature-type> <structure cardinality="a/b{}/+ a/b{}/c/+{} z/c{}/d{}/e"/> </mapping> </feature-map> </xfMap>
The above example will match any element that is a child of the element "a-list", name the feature according to the element matched, and then use the cardinalities given to determine how to write the attributes out.
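For orientation, an input document that a7.xmp could be applied to might look like the following. This document is hypothetical: the element content and the <x> element are made up, and only the element names a-list, a, b, c, d, e and z come from the xfMap above.

```xml
<?xml version="1.0"?>
<!-- Hypothetical input for a7.xmp; element content is made up. -->
<a-list>
  <!-- Matched by a-list/*; the feature type is taken from the local
       name "a". The directives starting with "a" apply here. -->
  <a>
    <b><c><x>1</x></c></b>
    <b>second b</b>
  </a>
  <!-- Also matched by a-list/*; the feature type is "z". Only the
       directive z/c{}/d{}/e starts with this element's name. -->
  <z>
    <c><d><e>leaf</e></d></c>
  </z>
</a-list>
```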
Finally, we would often like to exclude some elements of an XML tree from conversion into FME attributes. An obvious case is the following, where we want to map all the XML leaf elements to FME attributes, except those that are used to construct the geometry of the feature.
a8.xml:
<?xml version="1.0"?> <features> <feature> <name>Downtown Harbour</name> <age>132 years </age> <lat>100</lat> <lon>54.2</lon> </feature> <feature> <name>EastSide Harbour</name> <age>38 years </age> <lat>101.2</lat> <lon>54.8</lon> </feature> </features>
a9.xmp:
<?xml version="1.0"?> <xfMap> <feature-map> <mapping match="feature"> <feature-type><literal expr="Harbour"/></feature-type> <geometry activate="xml-point"> <data name="data-string"> <extract expr="./lat"/> <literal expr=","/> <extract expr="./lon"/> </data> </geometry> <structure skip-matched="yes" cardinality="+" except="lat lon"/> </mapping> </feature-map> </xfMap>
FME feature constructed:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Feature Type: `Harbour'
Attribute(string): `name' has value `Downtown Harbour'
Attribute(string): `age' has value `132 years'
Attribute(string): `xml_type' has value `xml_point'
Geometry Type: IFMEPoint
(100, 54.2)
Feature Type: `Harbour'
Attribute(string): `name' has value `EastSide Harbour'
Attribute(string): `age' has value `38 years'
Attribute(string): `xml_type' has value `xml_point'
Geometry Type: IFMEPoint
(101.2, 54.8)
========================================================================
Here we explicitly exclude the xml elements <lat> and <lon> in order to extract them using the geometry tag (discussed elsewhere in the xfMap documentation). This avoids having attributes which mirror the geometry.
The except attribute accepts the same types of expressions as the match or except attribute of a mapping rule. For example, the expression except="parent/child{2}" could be used to exclude the second <child> element contained in a <parent> element from the output of the structure subrule.
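Placed in a complete mapping rule, the indexed expression above might be used as in the following sketch. The choice of match="parent" and the feature type are assumptions made for illustration; only the except expression itself comes from the text.

```xml
<?xml version="1.0"?>
<xfMap>
  <feature-map>
    <mapping match="parent">
      <feature-type><literal expr="parent"/></feature-type>
      <!-- Map the whole <parent> subtree except the <child> element
           selected by the index expression child{2}. -->
      <structure except="parent/child{2}"/>
    </mapping>
  </feature-map>
</xfMap>
```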
Note: Currently, <structure> elements cannot be constructed in parallel on a feature; only one can be constructed at a time.