AzureComputerVisionConnector

Connects to the Azure Computer Vision Service to detect objects in images.

Typical Uses

Submitting images to the Azure Computer Vision service to

detect individual objects
describe the general contents

How does it work?

The AzureComputerVisionConnector uses your Azure Cognitive Services account credentials (either via a previously defined FME web connection, or by setting up a new FME web connection right from the transformer) to access the service.

It will submit images to the service, and return features with attributes that describe the contents of the image. Services supported are object detection, text detection, and face detection.

For object detection, if the service is able to identify the exact location of an object in the image, a bounding box geometry will also be returned.
Text detection will always return bounding boxes around the detected text.
For face detection, if the service is able to identify the exact location of a face in the image, a bounding box geometry will also be returned. There is also the option to detect and locate facial landmarks.

Usage Notes

For better performance, requests to the Computer Vision service are made in parallel, and are returned as soon as they complete. Consequently, detection results will not be returned in the same order as their associated requests.
While powerful, the use of AI has important legal and ethical implications. Consult your local AI legislation and ethical guidelines before applying the AzureComputerVisionConnector in a production environment. For information about privacy and compliance with respect to Azure Cognitive Services, please see https://azure.microsoft.com/en-ca/support/legal/cognitive-services-compliance-and-privacy.
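
The out-of-order behavior described in the first note can be illustrated outside FME with a small Python sketch. It does not call the Computer Vision service: `fake_detect` is a hypothetical stand-in that sleeps for a random interval, and `request_id` is an assumed identifier carried with each request. The point is only that results collected as they complete may arrive in a different order than they were submitted, and that an identifier carried through makes the original order recoverable.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_detect(request_id):
    """Hypothetical stand-in for one detection request (no network call)."""
    time.sleep(random.uniform(0.01, 0.05))  # simulate variable response time
    return {"request_id": request_id, "_label": f"label-{request_id}"}


def run_parallel(request_ids, max_workers=4):
    """Submit all requests and collect each result as soon as it completes."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fake_detect, rid) for rid in request_ids]
        for future in as_completed(futures):
            results.append(future.result())
    return results


arrived = run_parallel(range(8))
print("arrival order: ", [r["request_id"] for r in arrived])

# Sorting on the carried identifier restores the submission order.
restored = sorted(arrived, key=lambda r: r["request_id"])
print("restored order:", [r["request_id"] for r in restored])
```

In a workspace, the equivalent approach would be to sort the results on an identifying attribute, assuming that attribute is carried through from the input features.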

Configuration

Input Ports

Output Ports

Output

Output will depend on the analysis chosen. Each input feature may result in multiple output features. For example, a single image is likely to contain multiple detectable objects.

Object Detection

Successfully identified objects will result in output features with attributes describing the objects. A bounding box geometry may also be returned, with a separate confidence value. Each input image may result in several detection features. Bounding boxes are in pixel units, and will align with the input.

Attributes

_label

A word or short phrase describing the content of the image. Labels may be general descriptors for the image, or may refer to identifiable instances in the image.

For example, a label of "Vegetation" with no bounding box indicates that there are plants somewhere in the image. A label of "Abies" with a bounding box might indicate that there is a fir tree in the top left corner.

_confidence

A number between 0 and 100 that indicates the probability that a given prediction is correct.
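
As a rough illustration of how `_label` and `_confidence` might be used together, the sketch below filters detections by a minimum confidence and separates those that have a bounding box from those that describe the image in general. Detections are modeled as plain dictionaries; the `bbox` key is a hypothetical stand-in for the feature geometry (which is not an attribute in FME), and the sample values and the threshold of 50 are invented for the example.

```python
def split_detections(detections, min_confidence=50.0):
    """Return (located, general) detections at or above min_confidence.

    `located` holds detections with a bounding box; `general` holds
    detections that describe the image as a whole.
    """
    located, general = [], []
    for det in detections:
        if det["_confidence"] < min_confidence:
            continue  # drop low-confidence predictions
        if det.get("bbox") is not None:
            located.append(det)
        else:
            general.append(det)
    return located, general


# Invented sample data; bbox is (x, y, width, height) in pixels.
sample = [
    {"_label": "Vegetation", "_confidence": 91.0, "bbox": None},
    {"_label": "Abies", "_confidence": 74.5, "bbox": (12, 8, 140, 220)},
    {"_label": "Bench", "_confidence": 31.2, "bbox": (300, 410, 90, 60)},
]

located, general = split_detections(sample, min_confidence=50.0)
print("located:", [d["_label"] for d in located])
print("general:", [d["_label"] for d in general])
```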

Text Detection

Successfully identified text will result in output features with attributes describing the text. Features will be output for lines of text, and for individual words in each line. Each feature will have a bounding box for the line or word.

When using a local file, raster geometry, or URL as input, the bounding box is in pixel units, and will align with the input.

Attributes

_text	The detected text in the line or word.
_type	The type of detected text. Lines are sections of text that are aligned along the same horizontal axis. Sentences may be split across multiple lines. Words are sections of text separated by whitespace, and are associated with parent lines. Options: LINE, WORD
_id	The number identifying the feature. If the feature represents a line of text, the identifier is unique within the image. If the feature represents a word, the identifier is unique within the parent line.
_parent_id	The _id value of the line the word is in. This value will be null for lines.
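
The relationship between `_id` and `_parent_id` can be sketched as follows. The example models the text detection features of a single image as plain dictionaries and regroups WORD features under their parent LINE. The sample values, including the choice to start identifiers at 0, are assumptions made for illustration. Because word identifiers are only unique within their parent line, words are keyed by `_parent_id` first.

```python
from collections import defaultdict


def group_words_by_line(features):
    """Group WORD features under their parent LINE for one image.

    Returns {line_id: {"text": line_text, "words": [word_text, ...]}},
    with words ordered by their _id within the line.
    """
    lines = {}
    words = defaultdict(list)
    for feature in features:
        if feature["_type"] == "LINE":
            lines[feature["_id"]] = feature["_text"]
        elif feature["_type"] == "WORD":
            words[feature["_parent_id"]].append((feature["_id"], feature["_text"]))
    return {
        line_id: {"text": text, "words": [w for _, w in sorted(words[line_id])]}
        for line_id, text in lines.items()
    }


# Invented sample data for a two-line sign.
features = [
    {"_type": "LINE", "_id": 0, "_parent_id": None, "_text": "NO PARKING"},
    {"_type": "WORD", "_id": 1, "_parent_id": 0, "_text": "PARKING"},
    {"_type": "WORD", "_id": 0, "_parent_id": 0, "_text": "NO"},
    {"_type": "LINE", "_id": 1, "_parent_id": None, "_text": "ANYTIME"},
    {"_type": "WORD", "_id": 0, "_parent_id": 1, "_text": "ANYTIME"},
]

grouped = group_words_by_line(features)
for line_id, entry in grouped.items():
    print(line_id, entry["text"], entry["words"])
```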

Face Detection

Successfully identified faces will result in output features with attributes describing the face. A bounding box geometry will also be returned, with a separate confidence value. Each input image may result in several detection features. Bounding boxes are in pixel units, and will align with the input.

Attributes

The following attributes are returned:

_head_pose_pitch
_head_pose_roll
_head_pose_yaw
_glasses
_blur_level
_blur_value
_exposure_level
_exposure_value
_noise_level
_noise_value

Summary

Output will depend on the analysis chosen. Only one output summary feature will be produced per input feature.

Object Detection

A summary feature with the original geometry and attributes preserved will always be output through the Summary port. Attributes will be added to indicate the labels that apply to the image in general, and not to a specific area.

Attributes

_labels{}.confidence	A number between 0 and 100 that indicates the probability that a given label is correct.
_labels{}.name	A word or phrase describing the content of the image.
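
The `_labels{}` list can be read back into ordered records with a short sketch like the one below. It assumes the usual FME list-attribute naming, in which `_labels{}` expands to indexed names such as `_labels{0}.name` and `_labels{0}.confidence`, and it models the summary feature's attributes as a plain dictionary. The `source_image` attribute and all values are invented for the example.

```python
import re

# Matches flattened list-attribute names such as "_labels{0}.name".
_LIST_ATTR = re.compile(r"^_labels\{(\d+)\}\.(name|confidence)$")


def collect_labels(attributes):
    """Rebuild the _labels{} list as [{"name": ..., "confidence": ...}, ...]."""
    by_index = {}
    for key, value in attributes.items():
        match = _LIST_ATTR.match(key)
        if match:
            index, field = int(match.group(1)), match.group(2)
            by_index.setdefault(index, {})[field] = value
    return [by_index[i] for i in sorted(by_index)]


# Invented summary attributes for one image.
summary = {
    "source_image": "park.jpg",
    "_labels{0}.name": "Vegetation",
    "_labels{0}.confidence": 91.0,
    "_labels{1}.name": "Outdoor",
    "_labels{1}.confidence": 88.3,
}

labels = collect_labels(summary)
for label in labels:
    print(label["name"], label["confidence"])
```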

Text Detection

A summary feature with the original geometry and attributes preserved will always be output through the Summary port. Attributes will be added to indicate the number of lines and words detected.

Attributes

_detected_words	The number of words that were detected in the image.
_detected_lines	The number of lines of text that were detected in the image.

Face Detection

A summary feature is output with the number of faces that were detected in the image.

Attributes

_detected_faces

The number of faces that were detected in the image.

Parameters

Authentication

To use the AzureTextAnalyticsConnector or the AzureComputerVisionConnector, you will need a Cognitive Services account. From that account, generate an endpoint and key to authenticate through the connectors.

Credential Source

The AzureComputerVisionConnector can use credentials from different sources. Using a web connection integrates best with FME, but in some cases, you may wish to use one of the other sources.

Web Connection - use an Azure Cognitive Services web connection stored in the FME web connections database
Embedded - embed an endpoint and secret key as parameters in the transformer

Account

Available when the credential source is Web Connection. To create an Azure Cognitive Services connection, click the 'Account' drop-down box and select 'Add Web Connection...'.

The connection can then be managed via Tools -> FME Options... -> Web Connections.

Endpoint and Secret Key

Available when the credential source is Embedded. An endpoint and secret key can be specified directly in the transformer instead of in a web connection.
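
For context on what an endpoint and secret key represent, the sketch below assembles (but does not send) a request in the style of the Azure Computer Vision REST API, where the key is passed in an `Ocp-Apim-Subscription-Key` header and the path is appended to the resource endpoint. The `v3.2` version path and the `visualFeatures=Objects` query are assumptions chosen for illustration; the requests the transformer actually makes are not documented here. The endpoint and key shown are placeholders.

```python
from urllib.parse import urlencode


def build_analyze_request(endpoint, secret_key, features=("Objects",)):
    """Return (url, headers) for an image analysis call; nothing is sent.

    The API version path and feature names are assumptions for illustration.
    """
    base = endpoint.rstrip("/")  # tolerate a trailing slash on the endpoint
    query = urlencode({"visualFeatures": ",".join(features)})
    url = f"{base}/vision/v3.2/analyze?{query}"
    headers = {
        "Ocp-Apim-Subscription-Key": secret_key,
        "Content-Type": "application/octet-stream",  # raw image bytes in the body
    }
    return url, headers


# Placeholder credentials; substitute the values from your own account.
url, headers = build_analyze_request(
    "https://example.cognitiveservices.azure.com/", "<your-secret-key>"
)
print(url)
print(sorted(headers))
```

A web connection stores the same two values centrally, which avoids embedding the key in each workspace.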

Request

Image Source	Where to get the input image for detection. Options are: Local File (specify a path to a JPEG or PNG file on disk); URL (specify an image URL); Raster Geometry (use the raster geometry of incoming features).
Input Filename	If Local File is selected for the image source, the path to the JPEG or PNG file to use.
URL	If URL is selected for the image source, the source URL to use.
Detection Type	The type of operation to perform. Choices are: Object Detection (detects labels for the given image and, where possible, outputs a bounding box for each identified instance of a label); Text Detection (detects text in the given image and outputs a bounding box for each detected line and for each detected word within the line); Face Detection (detects faces in the given image and outputs bounding boxes and facial landmarks for each detected face).

The remaining parameters available depend on the value of the Request > Detection Type parameter. Parameters for each Detection Type are detailed below.

Editing Transformer Parameters

Transformer parameters can be set by directly entering values, using expressions, or referencing other elements in the workspace such as attribute values or user parameters. Various editors and context menus are available to assist. To see what is available, click beside the applicable parameter.

How to Set Parameter Values

Defining Values

There are several ways to define a value for use in a transformer. The simplest is to type in a value or string, which can include functions of various types such as attribute references, math and string functions, and workspace parameters.

Using the Text Editor

The Text Editor provides a convenient way to construct text strings (including regular expressions) from various data sources, such as attributes, parameters, and constants, where the result is used directly inside a parameter.

Using the Arithmetic Editor

The Arithmetic Editor provides a convenient way to construct math expressions from various data sources, such as attributes, parameters, and feature functions, where the result is used directly inside a parameter.

Conditional Values

Set values depending on one or more test conditions that either pass or fail.

Content

Expressions and strings can include a number of functions, characters, parameters, and more.

When setting values - whether entered directly in a parameter or constructed using one of the editors - strings and expressions containing String, Math, Date/Time or FME Feature Functions will have those functions evaluated. Therefore, the names of these functions (in the form @<function_name>) should not be used as literal string values.

Content Types

String Functions	These functions manipulate and format strings.
Special Characters	A set of control characters is available in the Text Editor.
Math Functions	Math functions are available in both editors.
Date/Time Functions	Date and time functions are available in the Text Editor.
Math Operators	These operators are available in the Arithmetic Editor.
FME Feature Functions	These return primarily feature-specific values.
FME Parameters	FME and workspace-specific parameters may be used.
Creating and Modifying User Parameters	Create your own editable parameters.

Dialog Options - Tables

Table Tools

Transformers with table-style parameters have additional tools for populating and manipulating values.

Row Reordering

Enabled once you have clicked on a row item. Choices include:

Add a row
Remove a row
Move current row up one
Move current row down one
Move current row to top
Move current row to bottom

Cut, Copy, and Paste

Enabled once you have clicked on a row item. Choices include:

Cut a row - delete and copy to clipboard
Copy a row to the clipboard
Paste a row from the clipboard

Cut, copy, and paste may be used within a transformer, or between transformers.

Filter

Start typing a string, and the matrix will only display rows matching those characters. Searches all columns. This only affects the display of attributes within the transformer - it does not alter which attributes are output.

Import

Import populates the table with a set of new attributes read from a dataset. Specific application varies between transformers.

Reset/Refresh

Generally resets the table to its initial state, and may provide additional options to remove invalid entries. Behavior varies between transformers.

Note: Not all tools are available in all transformers.

For more information, see Transformer Parameter Menu Options.

Reference

Processing Behavior	Feature-Based
Feature Holding	No
Dependencies	Azure Cognitive Services Account
Aliases
History	Released FME 2019.2

FME Community

The FME Community has a wealth of FME knowledge with over 20,000 active members worldwide. Get help with FME, share knowledge, and connect with users globally.

Search for all results about the AzureComputerVisionConnector on the FME Community.

Examples may contain information licensed under the Open Government Licence – Vancouver, Open Government Licence - British Columbia, and/or Open Government Licence – Canada.
