FME Transformers: 2024.2

Categories
Database
Strings

TextDecoder

Decodes a string from a number of different text encodings into plain text. The following encoding types are supported:

URL (Percent Encoding)

This encoding is used to ensure that a string is valid for inclusion in a URL. All characters that are not a letter, digit, dash, period, underscore or tilde will be encoded. The TextDecoder converts an encoded string such as black%20%26%20white into its decoded form black & white.
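The percent-decoding described above can be reproduced outside FME for checking expected results. The following is a minimal sketch using Python's standard library, not the TextDecoder's own implementation; the variable names are illustrative only.

```python
from urllib.parse import quote, unquote

# Encoded sample taken from the description above.
encoded = "black%20%26%20white"

# unquote() replaces each %XX escape with the character it stands for.
decoded = unquote(encoded)
print(decoded)  # black & white

# Re-encoding with no extra "safe" characters leaves only letters, digits,
# dash, period, underscore and tilde unescaped.
reencoded = quote(decoded, safe="")
print(reencoded)  # black%20%26%20white
```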

Unicode

This encoding is used to represent non-ASCII characters in an ASCII string. The TextDecoder will decode from any of these code point representations, where the XX...X string represents the hexadecimal value of a Unicode code point:

\uXXXX

\UXXXX

\u{XX...X}

\UXXXXXXXX

U+XXXX

For example, the Cyrillic character Ӥ is represented as \U04E4 or U+04E4. The TextDecoder converts a string containing code point references to a UTF-8 string, with the code points dereferenced. Any characters that are not part of a Unicode code point reference are left unchanged. For example, the string ‘U+0F06 εA \U03A8’ will be decoded to ‘༆ εA Ψ’.
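To make the code point dereferencing concrete, here is a hedged Python sketch that recognizes the five notations listed above with a regular expression. The function name decode_code_points is hypothetical, and two behaviours are assumptions rather than documented TextDecoder behaviour: the eight-digit \U form is tried before the four-digit form, and values above U+10FFFF are left untouched.

```python
import re

# One alternative per notation; exactly one capture group holds the hex digits.
_CODE_POINT = re.compile(
    r"\\u\{([0-9A-Fa-f]+)\}"      # \u{XX...X}
    r"|\\U([0-9A-Fa-f]{8})"       # \UXXXXXXXX (assumed to win over \UXXXX)
    r"|\\[uU]([0-9A-Fa-f]{4})"    # \uXXXX or \UXXXX
    r"|U\+([0-9A-Fa-f]{4})"       # U+XXXX
)


def decode_code_points(text: str) -> str:
    """Replace code point references with the characters they name."""

    def _replace(match: re.Match) -> str:
        hex_digits = next(g for g in match.groups() if g is not None)
        value = int(hex_digits, 16)
        if value > 0x10FFFF:
            # Not a valid Unicode code point: keep the original text (assumption).
            return match.group(0)
        return chr(value)

    return _CODE_POINT.sub(_replace, text)


# Sample from the description above; everything else passes through unchanged.
result = decode_code_points(r"U+0F06 εA \U03A8")  # "༆ εA Ψ"
```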

XML

This encoding is used to ensure strings are acceptable for use in an XML document. Characters that have syntactic meaning in XML are escaped, using the following mapping:

Character    Encoded Value
<            &lt;
>            &gt;
"            &quot;
&            &amp;
'            &apos;

In addition, the XML encoding allows any character to be represented using the decimal or hexadecimal representation of its Unicode code point (for example, &#169; or &#xA9; for ©). The TextDecoder converts an XML-encoded string, such as black &amp; white, into its plain text representation, black & white.
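As an illustration of XML decoding, the sketch below lets Python's standard XML parser resolve the five named entities and any numeric character references. It is an approximation for checking results, not the TextDecoder's implementation; the function name decode_xml_text and the wrapper element are hypothetical, and the input is assumed to contain only escaped text with no raw markup.

```python
import xml.etree.ElementTree as ET


def decode_xml_text(encoded: str) -> str:
    """Resolve XML entities and character references in a text fragment."""
    # Wrapping the fragment in a throwaway element lets the parser do the work.
    # A raw "<" or "&" in the input would make it ill-formed and raise ParseError.
    element = ET.fromstring(f"<wrapper>{encoded}</wrapper>")
    return element.text or ""


plain = decode_xml_text("black &amp; white")     # "black & white"
markup = decode_xml_text("&lt;b&gt;bold&lt;/b&gt;")  # "<b>bold</b>"
numeric = decode_xml_text("&#169; &#xA9;")       # "© ©"
```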

HTML

This encoding is an extension of the XML encoding. The HTML encoding adds named entities for many characters that cannot be represented using a simple Latin character set, such as ♪, ± or ∞. The TextDecoder will convert an HTML-encoded string, such as this &plusmn; that, into its plain text representation, this ± that.
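For comparison, Python's standard html module resolves the same kind of named and numeric references. This is a sketch for experimentation only; the set of entity names it accepts (the HTML5 list) may differ from the set the TextDecoder recognizes.

```python
import html

# Named entities beyond the five XML ones are resolved to their characters.
plus_minus = html.unescape("this &plusmn; that")  # "this ± that"
infinity = html.unescape("0 to &infin;")          # "0 to ∞"

# The XML-style entities and numeric references still work.
ampersand = html.unescape("black &amp; white")    # "black & white"
copyright_sign = html.unescape("&#169;")          # "©"
```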

Base64

Base64 encoding is a method of storing arbitrary data as an ASCII string. The TextDecoder will convert Base64 encoded data into a text string. The Base64 data will be decoded into a sequence of bytes, which will then be interpreted using the character encoding given in the Character Encoding for Binary Data parameter.
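The two-step process described above (Base64 text to bytes, then bytes to text using a chosen character encoding) can be sketched in Python as follows. The function name decode_base64_text is hypothetical, and its encoding argument stands in for the Character Encoding for Binary Data parameter.

```python
import base64


def decode_base64_text(encoded: str, encoding: str = "utf-8") -> str:
    """Decode Base64 to bytes, then interpret the bytes as text."""
    raw_bytes = base64.b64decode(encoded)
    return raw_bytes.decode(encoding)


message = decode_base64_text("YmxhY2sgJiB3aGl0ZQ==")  # "black & white"

# The same bytes give different text under different character encodings,
# which is why the encoding choice matters.
as_utf8 = decode_base64_text("w6k=", "utf-8")      # "é"
as_latin1 = decode_base64_text("w6k=", "latin-1")  # "Ã©"
```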

HEX

HEX encoding is another method used to store arbitrary data as an ASCII string. HEX encoded data is not as compact as Base64 encoded data. The TextDecoder will convert HEX encoded data to a text string. The HEX data will be decoded into a sequence of bytes, which will then be interpreted using the character encoding given in the Character Encoding for Binary Data parameter.
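A comparable Python sketch for HEX data is shown below. It assumes the input is a plain run of two-digit hexadecimal byte values with no prefix; whether the TextDecoder accepts other layouts is not stated here. The function name decode_hex_text is hypothetical.

```python
def decode_hex_text(encoded: str, encoding: str = "utf-8") -> str:
    """Decode pairs of hex digits to bytes, then interpret the bytes as text."""
    raw_bytes = bytes.fromhex(encoded)
    return raw_bytes.decode(encoding)


message = decode_hex_text("626c61636b2026207768697465")  # "black & white"

# HEX needs two characters per byte, so it is longer than the Base64 form
# of the same 13 bytes ("YmxhY2sgJiB3aGl0ZQ==", 20 characters).
hex_length = len("626c61636b2026207768697465")  # 26
```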

Octal

Octal encoding is another method used to store arbitrary data as an ASCII string. Octal encoded data is not as compact as HEX or Base64 encoded data. The TextDecoder will convert Octal encoded data to a text string. The Octal data will be decoded into a sequence of bytes, which will then be interpreted using the character encoding given in the Character Encoding for Binary Data parameter.
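The exact textual layout of Octal data is not specified above, so the following Python sketch rests on an assumption: each byte is written as three octal digits, optionally preceded by a backslash (for example \142 for the letter b). The function name decode_octal_text is hypothetical.

```python
def decode_octal_text(encoded: str, encoding: str = "utf-8") -> str:
    """Decode three-digit octal byte values, then interpret the bytes as text."""
    # Assumed layout: optional backslashes and whitespace between byte values.
    digits = "".join(encoded.replace("\\", "").split())
    if len(digits) % 3 != 0:
        raise ValueError("expected three octal digits per byte")
    # int(..., 8) rejects non-octal digits; bytes() rejects values above 255.
    values = [int(digits[i:i + 3], 8) for i in range(0, len(digits), 3)]
    return bytes(values).decode(encoding)


word = decode_octal_text(r"\142\154\141\143\153")  # "black"
```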

Configuration

Parameters

Editing Transformer Parameters

Transformer parameters can be set by directly entering values, using expressions, or referencing other elements in the workspace such as attribute values or user parameters. Various editors and context menus are available to assist. To see what is available, open the menu beside the applicable parameter.

For more information, see Transformer Parameter Menu Options.


Keywords: URLDecoder decode encode