AzureAIVisionConnector

Connects to the Azure AI Vision service to detect faces, objects, and text in images.

Typical Uses

Detecting faces, objects, or text in images

How does it work?

The AzureAIVisionConnector uses your Azure account credentials to connect to Azure Vision and submit images for analysis.

Images may be provided as files, URLs, or as raster geometry, and each one may produce multiple output features.

Services supported are:

Detection Type	Analysis
Face	Successfully identified faces will result in output features with attributes describing the face. If the location of a face in the image can be identified, a bounding box geometry will also be returned with a separate confidence value. Facial landmarks may be optionally detected.
Object	Successfully identified objects will result in output features with attributes describing the objects. If the location of an object in the image can be identified, a bounding box geometry will also be returned with a separate confidence value.
Text	Successfully identified text will result in output features with attributes describing the text. Features will be output for lines of text and for individual words in each line. Each feature will have a bounding box for the line or word.

Bounding boxes are in pixel units, and align with the input.
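
Because bounding boxes are expressed in pixel units of the input image, it can be useful to convert them to fractions of the image size, for example to compare detections across images of different resolutions. The following is a minimal Python sketch and not part of the transformer. The `PixelBox` type and `normalize_box` function are hypothetical names, and the box is assumed to be described by its minimum and maximum x and y pixel coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PixelBox:
    """Hypothetical axis-aligned bounding box (pixel units or fractions)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def normalize_box(box: PixelBox, image_width: int, image_height: int) -> PixelBox:
    """Scale a pixel-unit box to fractions (0..1) of the image size."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    return PixelBox(
        box.min_x / image_width,
        box.min_y / image_height,
        box.max_x / image_width,
        box.max_y / image_height,
    )


# Illustrative values only: a 400 x 500 pixel image with one detection.
box = PixelBox(min_x=100, min_y=50, max_x=300, max_y=250)
fractional = normalize_box(box, image_width=400, image_height=500)
print(fractional)
```

The conversion only divides by the image dimensions, so it does not depend on which corner of the image is treated as the origin.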

Optional Input Port

This transformer has two modes, depending on whether a connector is attached to the Input port or not:

Input-driven: When input features are connected, the transformer runs once for each feature it receives in the Input port.
Run Once: When no input features are connected, the transformer runs one time.

When the Input port is in use, the Initiator output port is also enabled.

Usage Notes

For better performance, requests to the service are made in parallel, and are returned as soon as they complete. Consequently, detection results will not be returned in the same order as their associated requests.
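
The ordering effect described above can be illustrated with a small Python sketch. It does not reproduce the transformer's implementation: `fake_detect` is a hypothetical stand-in for a detection request, and the random delay simulates varying response times. The point it shows is that results collected as they complete may arrive in any order, and that carrying an identifier with each request lets you restore the original order afterwards.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_detect(request_id: int) -> dict:
    """Hypothetical stand-in for a detection request with a variable delay."""
    time.sleep(random.uniform(0.0, 0.05))
    return {"request_id": request_id, "result": f"result for {request_id}"}


def run_requests(request_ids: list[int]) -> list[dict]:
    """Submit all requests in parallel and collect results as they complete."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(fake_detect, rid) for rid in request_ids]
        return [future.result() for future in as_completed(futures)]


completed = run_requests([1, 2, 3, 4, 5])

# Arrival order is not guaranteed; sort on the carried identifier to restore it.
restored = sorted(completed, key=lambda r: r["request_id"])

print([r["request_id"] for r in completed])
print([r["request_id"] for r in restored])
```

In a workspace, the equivalent approach would be to make sure each input feature carries an identifying attribute before it reaches the transformer, and to sort on that attribute downstream if order matters.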

Configuration

Input Ports

Output Ports

Output

Features with added attributes, as specified in parameters and according to Detection Type.

Detection Type	Output - Input-Driven	Output - Run Once
Face	Input feature(s), one copy for each face identified, with details about the face.	New feature(s), one for each face identified, with details about the face.
Object	Input feature(s), one copy for each object identified, with details about the object.	New feature(s), one for each object identified, with details about the object.
Text	Input features, one copy for each line of text and individual word, with details about the text.	New features, one for each line of text and individual word, with details about the text.
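
The difference between the two output modes in the table above can be sketched in Python. This is an illustrative model only: features are represented as plain dictionaries, `build_output_features` is a hypothetical helper, and the attribute values are invented for the example.

```python
from typing import Optional


def build_output_features(
    detections: list[dict], input_feature: Optional[dict] = None
) -> list[dict]:
    """Return one output feature per detection.

    Input-driven: each output is a copy of the input feature plus detection details.
    Run Once: each output is a new feature holding only the detection details.
    """
    outputs = []
    for detection in detections:
        base = dict(input_feature) if input_feature is not None else {}
        base.update(detection)
        outputs.append(base)
    return outputs


# Illustrative values only; not real service output.
detections = [
    {"_label": "Tree", "_confidence": 0.91},
    {"_label": "Car", "_confidence": 0.78},
]

input_driven = build_output_features(detections, {"photo_id": 17})
run_once = build_output_features(detections)

print(input_driven)
print(run_once)
```

In the input-driven case, every output feature keeps the original `photo_id` attribute, which makes it possible to relate detections back to the feature that triggered the request.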

Summary

One feature per image is output here, with added attributes describing detection result success.

Detection Type	Summary Feature Attributes

Face

_detected_faces	The number of faces that were detected in the image.

Object

_labels{}.confidence	A number between 0 and 1 that indicates the probability that a given label is correct.
_labels{}.name	A word or phrase describing the content of the image.

Text

_detected_words	The number of words that were detected in the image.
_detected_lines	The number of lines of text that were detected in the image.
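
As an illustration of how the `_labels{}` list on an Object summary feature might be used downstream, the Python sketch below keeps only labels whose confidence meets a threshold. The list attribute is modeled as a list of dictionaries with `name` and `confidence` keys; `confident_labels`, the threshold of 0.8, and the sample values are assumptions made for the example.

```python
def confident_labels(labels: list[dict], threshold: float = 0.8) -> list[str]:
    """Return label names whose confidence meets the threshold, highest first."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    kept = [label for label in labels if label["confidence"] >= threshold]
    kept.sort(key=lambda label: label["confidence"], reverse=True)
    return [label["name"] for label in kept]


# Illustrative values only; not real service output.
summary_labels = [
    {"name": "Vegetation", "confidence": 0.97},
    {"name": "Building", "confidence": 0.62},
    {"name": "Sky", "confidence": 0.88},
]

print(confident_labels(summary_labels))
```

A suitable threshold depends on the imagery and on how costly false positives are, so it is worth tuning against a sample of your own data.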

Parameters

Authentication

To use the AzureTextAnalyticsConnector or the AzureAIVisionConnector, you need a Cognitive Services account. From that account, generate an endpoint and key to authenticate.

Credential Source

Select the type of credentials to use:

Web Connection (Recommended): Use an Azure Cognitive Services web connection.
Embedded: Embed an endpoint and secret key as parameters in the transformer.

Account

When Credential Source is Web Connection, select or create a Web Connection connecting to an Azure Cognitive Services Web Service.

See Using Web Connections.

Embedded Credentials

When Credential Source is Embedded:

The required endpoint URL and access keys can be found in the Microsoft Azure Portal under Resource Management > Keys and Endpoint, after creating or selecting the appropriate resource based on the Detection Type:

Endpoint	Provide an endpoint URL.
Key	Provide an access key.
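
If you prepare the endpoint and key outside the workspace (for example, to supply them through user parameters rather than typing them into the transformer), a small sanity check can catch common mistakes before a request is made. The Python sketch below is one possible approach and not part of FME. The names `AZURE_VISION_ENDPOINT` and `AZURE_VISION_KEY` and the `load_credentials` helper are hypothetical, and the sample endpoint and key are placeholders.

```python
from collections.abc import Mapping
from urllib.parse import urlparse


def load_credentials(env: Mapping[str, str]) -> tuple[str, str]:
    """Read and validate an endpoint URL and access key from a mapping."""
    endpoint = env.get("AZURE_VISION_ENDPOINT", "").strip()
    key = env.get("AZURE_VISION_KEY", "").strip()

    parsed = urlparse(endpoint)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("endpoint must be an https URL")
    if not key:
        raise ValueError("access key must not be empty")

    # Drop any trailing slash so paths can be appended consistently.
    return endpoint.rstrip("/"), key


# Placeholder values only; substitute the values shown in the Azure Portal.
settings = {
    "AZURE_VISION_ENDPOINT": "https://example-resource.cognitiveservices.azure.com/",
    "AZURE_VISION_KEY": "not-a-real-key",
}

endpoint, key = load_credentials(settings)
print(endpoint)
```

Passing `os.environ` instead of the `settings` dictionary would read the same names from environment variables, which keeps the key out of source files.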

Request

Image Source	Select the source of the image. Local File: a JPEG or PNG file on disk. URL: an image located at a URL. Raster Geometry: raster geometry on a feature.
Input Filename	When Image Source is Local File, provide the path to a JPEG or PNG file.
URL	When Image Source is URL, provide the image URL.
Detection Type	Select the type of detection to perform: Face, Object, or Text.

Detection Type > Face

Face Detection Options

Output Facial Landmark Points

Select a landmarks option:

No (default): Do not output landmark points.
Yes: Output landmark points.

Added Attributes

Output features will receive these attributes:

_head_pose_pitch
_head_pose_roll
_head_pose_yaw
_glasses
_blur_level
_blur_value
_exposure_level
_exposure_value
_noise_level
_noise_value

Detection Type > Object

Object Detection Options

Object detection has no parameters to configure.

Added Attributes

Output features will receive these attributes:

_label

A word or short phrase describing the content of the image. Labels may be general descriptors for the image, or may refer to identifiable instances in the image.

For example, a label of Vegetation with no bounding box indicates that there are plants somewhere in the image. A label of Abies with a bounding box might indicate that there is a fir tree at that location.

_confidence

A number between 0 and 1 that indicates the probability that a given prediction is correct.

Detection Type > Text

Text Detection Options

Text Analysis Language

Select or provide the language to be detected.

If providing a value, use the language code, such as en for English, or unk (Unknown) for automatic language detection.

See Language support for Language features.

Added Attributes

Output features will receive these attributes:

_text	The detected text in the line or word.
_type	The type of detected text. Lines are sections of text that are aligned along the same horizontal axis. Sentences may be split across multiple lines. Words are sections of text separated by whitespace, and are associated with parent lines. Options: LINE, WORD
_id	The number identifying the feature. If the feature represents a line of text, the identifier is unique within the image. If the feature represents a word, the identifier is unique within the parent line.
_parent_id	The _id value of the line the word is in. This value will be null for lines.
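
The relationship between `_id` and `_parent_id` can be illustrated with a short Python sketch that groups WORD features under their parent LINE. Features are modeled as plain dictionaries, `words_by_line` is a hypothetical helper, and the sample values are invented. The sketch also assumes that sorting words by `_id` gives their reading order, which this document does not state.

```python
from collections import defaultdict


def words_by_line(features: list[dict]) -> dict[int, str]:
    """Map each LINE feature's _id to its WORD texts joined in _id order."""
    words = defaultdict(list)
    for feature in features:
        if feature["_type"] == "WORD":
            # A word's _id is only unique within its parent line,
            # so words are grouped by _parent_id first.
            words[feature["_parent_id"]].append(feature)

    result = {}
    for feature in features:
        if feature["_type"] == "LINE":
            ordered = sorted(words[feature["_id"]], key=lambda w: w["_id"])
            result[feature["_id"]] = " ".join(w["_text"] for w in ordered)
    return result


# Illustrative values only; not real service output.
features = [
    {"_type": "LINE", "_id": 1, "_parent_id": None, "_text": "SPEED LIMIT"},
    {"_type": "WORD", "_id": 2, "_parent_id": 1, "_text": "LIMIT"},
    {"_type": "WORD", "_id": 1, "_parent_id": 1, "_text": "SPEED"},
    {"_type": "LINE", "_id": 2, "_parent_id": None, "_text": "50"},
    {"_type": "WORD", "_id": 1, "_parent_id": 2, "_text": "50"},
]

print(words_by_line(features))
```

Note that two words in the sample share `_id` 1 because they belong to different lines; a word is identified by the pair of `_parent_id` and `_id`.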

Editing Transformer Parameters

Transformer parameters can be set by directly entering values, using expressions, or referencing other elements in the workspace such as attribute values or user parameters. Various editors and context menus are available to assist. To see what is available, click beside the applicable parameter.

How to Set Parameter Values

Defining Values

There are several ways to define a value for use in a Transformer. The simplest is to type in a value or string, which can include functions of various types such as attribute references, math and string functions, and workspace parameters.

Using the Text Editor

The Text Editor provides a convenient way to construct text strings (including regular expressions) from various data sources, such as attributes, parameters, and constants, where the result is used directly inside a parameter.

Text Editor

Using the Arithmetic Editor

The Arithmetic Editor provides a convenient way to construct math expressions from various data sources, such as attributes, parameters, and feature functions, where the result is used directly inside a parameter.

Arithmetic Editor

Conditional Values

Set values depending on one or more test conditions that either pass or fail.

Parameter Condition Definition Dialog

Content

Expressions and strings can include a number of functions, characters, parameters, and more.

When setting values - whether entered directly in a parameter or constructed using one of the editors - strings and expressions containing String, Math, Date/Time or FME Feature Functions will have those functions evaluated. Therefore, the names of these functions (in the form @<function_name>) should not be used as literal string values.

Content Types

String Functions	These functions manipulate and format strings.
Special Characters	A set of control characters is available in the Text Editor.
Math Functions	Math functions are available in both editors.
Date/Time Functions	Date and time functions are available in the Text Editor.
Math Operators	These operators are available in the Arithmetic Editor.
FME Feature Functions	These return primarily feature-specific values.
FME Parameters	FME and workspace-specific parameters may be used.
Creating and Modifying User Parameters	Create your own editable parameters.

Dialog Options - Tables

Table Tools

Transformers with table-style parameters have additional tools for populating and manipulating values.

Row Reordering

Enabled once you have clicked on a row item. Choices include:

Add a row
Remove a row
Move current row up one
Move current row down one
Move current row to top
Move current row to bottom

Cut, Copy, and Paste

Enabled once you have clicked on a row item. Choices include:

Cut a row - delete and copy to clipboard
Copy a row to the clipboard
Paste a row from the clipboard

Cut, copy, and paste may be used within a transformer, or between transformers.

Filter

Start typing a string, and the table will only display rows matching those characters. All columns are searched. This only affects the display of attributes within the transformer - it does not alter which attributes are output.

Import

Import populates the table with a set of new attributes read from a dataset. Specific application varies between transformers.

Reset/Refresh

Generally resets the table to its initial state, and may provide additional options to remove invalid entries. Behavior varies between transformers.

Note: Not all tools are available in all transformers.

For more information, see Transformer Parameter Menu Options.

Reference

Processing Behavior	Feature-Based
Feature Holding	No
Dependencies	Azure Cognitive Services Account
Aliases
History	Released FME 2019.2

FME Online Resources

The FME Community and Support Center Knowledge Base have a wealth of information, including active forums with 35,000+ members and thousands of articles.

Search for all results about the AzureAIVisionConnector on the FME Community.

Examples may contain information licensed under the Open Government Licence – Vancouver, Open Government Licence - British Columbia, and/or Open Government Licence – Canada.