RekognitionConnector
Accesses the Amazon Rekognition computer vision service to detect objects, faces, and text in images, and to describe image contents and faces.
Typical Uses
Submitting images to the Amazon Rekognition service to:
- detect individual objects
- describe the general contents
- extract small amounts of text
- detect faces and facial descriptions in an image
- detect explicit or suggestive adult content in an image
How does it work?
The RekognitionConnector uses your Amazon AWS account credentials (either via a previously defined FME web connection, or by setting up a new FME web connection right from the transformer) to access the computer vision service.
It will submit images to the service, and return features with attributes that describe the contents of the image. Object detection, text detection, face detection, and image moderation can be performed.
- For object detection, if the service is able to identify the exact location of an object in the image, a bounding box geometry will also be returned.
- Text detection and face detection always return bounding boxes around the detected text or face.
- Image moderation will return the input feature, with moderation labels added when unsafe content is detected.
Each input image may result in several output features.
Usage Notes
- For better performance, requests to the Rekognition service are made in parallel, and are returned as soon as they complete. Consequently, detection results will not be returned in the same order as their associated requests.
- The RekognitionConnector is able to extract up to 50 words per image when performing text detection. This makes it unsuitable for detecting full pages of text, such as a scanned document.
- While powerful, the use of AI has important legal and ethical implications. Consult your local AI legislation and ethical guidelines before applying the RekognitionConnector in a production environment.
- For an overview of Rekognition's capabilities and limitations, please see Amazon's FAQ on the topic: https://aws.amazon.com/rekognition/the-facts-on-facial-recognition-with-artificial-intelligence
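The note above about parallel requests can be illustrated with a minimal sketch. It does not call Rekognition and is not the transformer's implementation: `fake_detect` is a hypothetical stand-in for one request, and the file names and delays are invented. It shows why results arrive in completion order, and how an identifier carried with each request lets you restore the original order afterwards.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time


def fake_detect(image_id, delay):
    """Hypothetical stand-in for one detection request (no network call)."""
    time.sleep(delay)
    return {"image_id": image_id, "labels": ["placeholder"]}


def run_parallel(jobs):
    """Submit every job at once and collect results as each one completes."""
    results = []
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = [pool.submit(fake_detect, image_id, delay)
                   for image_id, delay in jobs]
        for future in as_completed(futures):
            results.append(future.result())
    return results


# Invented inputs: the first request is the slowest to finish.
jobs = [("a.jpg", 0.3), ("b.jpg", 0.1), ("c.jpg", 0.2)]
results = run_parallel(jobs)

# Completion order is not guaranteed to match submission order, so the
# identifier carried in each result is used to restore the request order.
by_id = {result["image_id"]: result for result in results}
ordered = [by_id[image_id] for image_id, _ in jobs]
```

In a workspace, the equivalent approach is to keep an identifying attribute on each input feature and sort on it downstream if order matters.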
Configuration
Input Ports
This transformer accepts any feature. Raster geometries may be used as input if "Raster Geometry" is selected as the image source. The raster will be uploaded to the service as JPEG if possible, otherwise PNG will be used. The service does not support any other raster formats directly. Note that the RekognitionConnector does not support coordinate systems.
Output Ports
Output
Output will depend on the analysis chosen. Each input feature may result in multiple output features. For example, a single image is likely to contain multiple detectable objects.
Object Detection
Successfully identified objects will result in output features with attributes describing the objects. Where the service can locate an object in the image, a bounding box geometry will also be returned, with a separate confidence value. Each input image may result in several detection features.
When using a local file or raster geometry as input, the bounding box is in pixel units, and will align with the input. When using a file on S3, the size of the image is not known, so the output bounding box will be expressed in terms of a ratio of the original image. For example, if an object takes up a quarter of the image, the bounding box will be 0.5 by 0.5 in size.
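If you know the dimensions of an image stored on S3, a ratio-based bounding box can be converted to pixel units yourself. The following is a minimal sketch, not part of the transformer. The function name is hypothetical, and the dictionary keys (`Left`, `Top`, `Width`, `Height`) are an assumption modeled on the ratio-based bounding box that the Rekognition service reports; the image size is invented.

```python
def ratio_box_to_pixels(box, image_width, image_height):
    """Convert a ratio-based box to pixel corners (min_x, min_y, max_x, max_y).

    `box` is assumed to hold Left, Top, Width, and Height as ratios
    (0 to 1) of the image dimensions, measured from the top-left corner.
    """
    left = box["Left"] * image_width
    top = box["Top"] * image_height
    width = box["Width"] * image_width
    height = box["Height"] * image_height
    return (left, top, left + width, top + height)


# A box that is 0.5 by 0.5 covers a quarter of the image area.
quarter_box = {"Left": 0.25, "Top": 0.25, "Width": 0.5, "Height": 0.5}
pixel_box = ratio_box_to_pixels(quarter_box, 800, 600)
# pixel_box == (200.0, 150.0, 600.0, 450.0)
```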
Attributes
_detection_label | A word or short phrase describing the content of the image. Labels may be general descriptors for the image, or may refer to identifiable instances in the image. For example, a label of "Vegetation" with no bounding box indicates that there are plants somewhere in the image. A label of "Abies" with a bounding box might indicate that there is a fir tree in the top left corner. |
_confidence | A number between 0 and 1 that indicates the probability that a given prediction is correct. For more information about confidence, see the Amazon Rekognition FAQs: https://aws.amazon.com/rekognition/faqs |
_bounding_box_confidence | A number between 0 and 1 that indicates the probability that the specific bounding box for a detected object is correct. |
_parent_label{} | A list of higher hierarchical labels. For example, if "Helicopter" is detected, the parent labels might be "Aircraft," "Vehicle," and "Transportation." |
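To make the relationship between labels, instances, and the attributes above more concrete, here is a hedged sketch of how a label-detection response might be flattened into one record per detected object. This is an illustration only and not the transformer's actual implementation. The response layout (`Labels`, `Instances`, `Parents`, confidences on a 0 to 100 scale) is an assumption modeled on the Rekognition service's label detection response, the sample values are invented, and the function name and the `bounding_box` key are hypothetical.

```python
def flatten_labels(response):
    """Produce one record per detected instance, or one per label when the
    label has no located instances. Confidences are rescaled to 0-1."""
    records = []
    for label in response.get("Labels", []):
        base = {
            "_detection_label": label["Name"],
            "_confidence": label["Confidence"] / 100.0,
            "_parent_label": [p["Name"] for p in label.get("Parents", [])],
        }
        instances = label.get("Instances", [])
        if not instances:
            # General descriptor for the whole image: no bounding box.
            records.append({**base,
                            "_bounding_box_confidence": None,
                            "bounding_box": None})
        for instance in instances:
            records.append({**base,
                            "_bounding_box_confidence": instance["Confidence"] / 100.0,
                            "bounding_box": instance["BoundingBox"]})
    return records


# Invented sample response for illustration.
sample_response = {
    "Labels": [
        {"Name": "Helicopter", "Confidence": 98.0,
         "Instances": [{"BoundingBox": {"Left": 0.1, "Top": 0.2,
                                        "Width": 0.4, "Height": 0.3},
                        "Confidence": 95.0}],
         "Parents": [{"Name": "Aircraft"}, {"Name": "Vehicle"},
                     {"Name": "Transportation"}]},
        {"Name": "Vegetation", "Confidence": 80.0,
         "Instances": [], "Parents": []},
    ]
}
records = flatten_labels(sample_response)
```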
Text Detection
Successfully identified text will result in output features with attributes describing the text. Features will be output for lines of text, and for individual words in each line. Each feature will have a bounding box for the line or word.
When using a local file or raster geometry as input, the bounding box is in pixel units, and will align with the input. When using a file on S3, the size of the image is not known, so the output bounding box will be expressed in terms of a ratio of the original image. For example, if a line of text takes up 80% of the width of the image, the bounding box will have a width of 0.8.
Attributes
_text | The detected text in the line or word. |
_confidence | A number between 0 and 1 that indicates the probability that a given text string is correct. For more information about confidence, see the Amazon Rekognition FAQs: https://aws.amazon.com/rekognition/faqs |
_type | The type of detected text. Lines are sections of text that are aligned along the same horizontal axis. Sentences may be split across multiple lines. Words are sections of text separated by whitespace, and are associated with parent lines. Options: LINE, WORD |
_id | The number identifying the feature. If the feature represents a line of text, the identifier is unique within the image. If the feature represents a word, the identifier is unique within the parent line. |
_parent_id | The _id value of the line the word is in. This value will be null for lines. |
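The _id and _parent_id attributes can be used to reassemble words under the line they belong to. The sketch below assumes, purely for illustration, that the output features have been exported as a list of dictionaries keyed by attribute name; the function name and the sample text are hypothetical.

```python
from collections import defaultdict


def words_by_line(features):
    """Group WORD features under their parent LINE using _parent_id."""
    lines = {f["_id"]: f["_text"] for f in features if f["_type"] == "LINE"}
    grouped = defaultdict(list)
    for feature in features:
        if feature["_type"] == "WORD":
            grouped[feature["_parent_id"]].append(feature["_text"])
    return {line_id: {"text": text, "words": grouped.get(line_id, [])}
            for line_id, text in lines.items()}


# Invented features. Word identifiers repeat across lines because they
# are only unique within their parent line.
text_features = [
    {"_type": "LINE", "_id": 0, "_parent_id": None, "_text": "MAIN ST"},
    {"_type": "WORD", "_id": 0, "_parent_id": 0, "_text": "MAIN"},
    {"_type": "WORD", "_id": 1, "_parent_id": 0, "_text": "ST"},
    {"_type": "LINE", "_id": 1, "_parent_id": None, "_text": "EXIT"},
    {"_type": "WORD", "_id": 0, "_parent_id": 1, "_text": "EXIT"},
]
lines_with_words = words_by_line(text_features)
```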
Face Detection
Successfully identified faces will result in output features with attributes describing the face. Each feature will have a bounding box for the face. Optionally, facial "landmarks," such as "nose," "chinBottom," or "midJawlineLeft" will also be added as additional point geometries on the feature.
When using a local file or raster geometry as input, the bounding box is in pixel units, and will align with the input. When using a file on S3, the size of the image is not known, so the output bounding box will be expressed in terms of a ratio of the original image. For example, if a face takes up a quarter of the image, the bounding box will be 0.5 by 0.5 in size.
Attributes
Many of the attributes for face detection have a value and an associated confidence. This is denoted with two attributes, in the form _characteristic.value and _characteristic.confidence. Confidence ranges from 0 to 1. Unless otherwise specified, all detected characteristics take this form.
_age_range | The estimated age of the subject. There is no explicit confidence value for the age range. A smaller range indicates greater confidence in the estimate. |
_beard | Whether or not the subject has a beard. |
_confidence | The overall confidence that the detected entity is a human face. |
_eyeglasses | Whether the subject is wearing glasses. |
_eyes_open | Whether the subject's eyes are open. |
_gender | Whether the subject is male or female. |
_mouth_open | Whether the subject's mouth is open. |
_mustache | Whether the subject has a mustache. |
_pose | There are no confidence values associated with this characteristic. |
_quality | There are no confidence values associated with this characteristic. |
_smile | Whether the subject is smiling. |
_sunglasses | Whether the subject is wearing sunglasses. |
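Because most face characteristics come as a pair of _characteristic.value and _characteristic.confidence attributes, downstream logic often needs to keep only the characteristics it can trust. The following sketch assumes a face feature exported as a dictionary of attribute names to values. The function name is hypothetical, and the sample values, including the use of Python booleans for the values, are invented for illustration.

```python
def confident_characteristics(feature, min_confidence):
    """Return {characteristic: value} for every value/confidence pair whose
    confidence meets the threshold. Attributes without a matching
    .confidence partner are skipped."""
    suffix = ".value"
    selected = {}
    for name, value in feature.items():
        if not name.endswith(suffix):
            continue
        stem = name[: -len(suffix)]
        confidence = feature.get(stem + ".confidence")
        if confidence is not None and confidence >= min_confidence:
            selected[stem] = value
    return selected


# Invented face feature for illustration.
face = {
    "_confidence": 0.99,
    "_smile.value": True, "_smile.confidence": 0.97,
    "_beard.value": False, "_beard.confidence": 0.62,
    "_eyeglasses.value": True, "_eyeglasses.confidence": 0.88,
}
trusted = confident_characteristics(face, 0.85)
# trusted == {"_smile": True, "_eyeglasses": True}
```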
Image Moderation
No features will be returned through the Output port when performing image moderation.
Summary
Output will depend on the analysis chosen. Only one output summary feature will be produced per input feature.
Object Detection
A summary feature with the original geometry and attributes preserved will always be output through the Summary port. Attributes will be added to indicate the labels that apply to the image in general, and not to a specific area.
Attributes
_detection_labels{}.confidence | A number between 0 and 1 that indicates the probability that a given label is correct. For more information about confidence, see the Amazon Rekognition FAQs: https://aws.amazon.com/rekognition/faqs |
_detection_labels{}.name | A word or phrase describing the content of the image. |
_detection_labels{}.parent | For the descriptor in _detection_labels{n}.name, _detection_labels{n}.parent will contain a comma-separated list of parent descriptors. |
Text Detection
A summary feature with the original geometry and attributes preserved will always be output through the Summary port. Attributes will be added to indicate the number of lines and words detected.
Attributes
_detected_words | The number of words that were detected in the image. |
_detected_lines | The number of lines of text that were detected in the image. |
Face Detection
A summary feature with the original geometry and attributes preserved will always be output through the Summary port. Attributes will be added to indicate the number of faces detected.
Attributes
_detected_faces | The number of faces that were detected in the image. |
Image Moderation
A summary feature with the original geometry and attributes preserved will always be output through the Summary port. If unsafe content is detected, list attributes will be added indicating what type of unsafe content was found.
Attributes
_safe_image | A boolean value (yes/no) indicating whether the image is deemed to be safe, within the requested confidence. For example, if the confidence level was set to 0.95, and the image is deemed to be safe, it means that no moderation labels could be detected with a confidence score of at least 0.95. |
_moderation_labels{}.confidence | A number between 0 and 1 that indicates the probability that a given moderation label is correct. For more information about confidence, see the Amazon Rekognition FAQs: https://aws.amazon.com/rekognition/faqs |
_moderation_labels{}.name | A phrase indicating the type of unsafe content. There are several top-level categories that the service detects. Within each category, there are subcategories which may also appear in this list. |
_moderation_labels{}.parent | If the list item for _moderation_labels{n}.name is a subcategory, _moderation_labels{n}.parent will contain its parent category. |
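The rule described for _safe_image can be expressed as a short check. This is a sketch of the logic as the attribute description states it, not the transformer's code. The function name is hypothetical, the moderation list is assumed to be exported as a list of dictionaries with `name`, `parent`, and `confidence` keys, and the label names and scores are placeholders.

```python
def is_safe(moderation_labels, min_confidence):
    """An image is deemed safe when no moderation label reaches the
    requested confidence."""
    return not any(label["confidence"] >= min_confidence
                   for label in moderation_labels)


# Placeholder labels for illustration.
detected = [
    {"name": "Subcategory A", "parent": "Category A", "confidence": 0.97},
    {"name": "Category A", "parent": "", "confidence": 0.97},
]
borderline = [
    {"name": "Category B", "parent": "", "confidence": 0.80},
]

unsafe_result = is_safe(detected, 0.95)    # False: a label reaches 0.95
safe_result = is_safe(borderline, 0.95)    # True: nothing reaches 0.95
empty_result = is_safe([], 0.95)           # True: no labels at all
```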
The incoming feature is output through this port.
Features that cause the operation to fail are output through this port. An fme_rejection_code attribute, having the value ERROR_DURING_PROCESSING, will be added, along with a more descriptive fme_rejection_message attribute which contains more specific details as to the reason for the failure.
Note: If a feature comes in to the RekognitionConnector already having a value for fme_rejection_code, this value will be removed.
Rejected Feature Handling: can be set to either terminate the translation or continue running when it encounters a rejected feature. This setting is available both as a default FME option and as a workspace parameter.
Parameters
Credential Source | The RekognitionConnector can use credentials from different sources. Using a web connection integrates best with FME, but in some cases, you may wish to use one of the other sources. |
Account |
Available when the credential source is Web Connection. To create a Rekognition connection, click the 'Account' drop-down box and select 'Add Web Connection...'. The connection can then be managed via Tools -> FME Options... -> Web Connections. |
Region | The AWS Region through which to access Rekognition. To optimize latency, it is best practice to specify the correct region. |
Access Key and Secret Access Key | Available when the credential source is Embedded. An access key ID and secret access key can be specified directly in the transformer instead of in a web connection. |
Image Source | Where to get the input image for detection. Options are: File, File on S3, and Raster Geometry. |
Input Filename | If File is selected for the image source, the path to the JPEG or PNG file to use |
Bucket/Key | If File on S3 is selected for the image source, the path on S3 to use |
Action | The type of operation to perform. Choices are: Object Detection, Text Detection, Face Detection, and Image Moderation. |
The remaining parameters available depend on the value of the Request > Action parameter. Parameters for each Action are detailed below.
Object Detection Options
Minimum Confidence (0.0-1.0) | The lowest detection confidence level to include in results. |
Maximum Number of Objects | The maximum number of objects (labels) to detect per supplied image. |
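The combined effect of these two parameters can be pictured as a filter followed by a cap. The sketch below is an assumption about how the limits interact (highest-confidence labels kept first), not a description of the service's internals. The function name is hypothetical and the labels and scores are invented.

```python
def select_labels(labels, min_confidence, max_objects):
    """Drop labels below the confidence floor, then keep at most
    max_objects of the highest-confidence labels that remain."""
    kept = [label for label in labels if label["confidence"] >= min_confidence]
    kept.sort(key=lambda label: label["confidence"], reverse=True)
    return kept[:max_objects]


# Invented candidate labels for illustration.
candidates = [
    {"name": "Tree", "confidence": 0.91},
    {"name": "Car", "confidence": 0.55},
    {"name": "Person", "confidence": 0.99},
    {"name": "Dog", "confidence": 0.72},
]
selected = select_labels(candidates, min_confidence=0.6, max_objects=2)
selected_names = [label["name"] for label in selected]
# selected_names == ["Person", "Tree"]
```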
Text Detection Options
Text detection does not require any additional parameters.
Face Detection Options
Output Facial Landmark Points | Yes: the output features will have aggregate geometry, consisting of the bounding box polygon and points indicating facial "landmarks." For an illustration of the available landmarks, see https://docs.aws.amazon.com/rekognition/latest/dg/faces-detect-images.html. The geometry name for these points will be a descriptor of the landmark, such as "nose" or "chinBottom." No: the output features will have polygon geometry for the bounding box only. |
Image Moderation Options
Minimum Confidence (0.0-1.0) | The lowest detection confidence level to include in results. |
Editing Transformer Parameters
Using a set of menu options, transformer parameters can be assigned by referencing other elements in the workspace. More advanced functions, such as an advanced editor and an arithmetic editor, are also available in some transformers. To access a menu of these options, click beside the applicable parameter. For more information, see Transformer Parameter Menu Options.
Defining Values
There are several ways to define a value for use in a transformer. The simplest is to type in a value or string, which can include functions of various types such as attribute references, math and string functions, and workspace parameters. There are a number of tools and shortcuts that can assist in constructing values, generally available from the drop-down context menu adjacent to the value field.
Using the Text Editor
The Text Editor provides a convenient way to construct text strings (including regular expressions) from various data sources, such as attributes, parameters, and constants, where the result is used directly inside a parameter.
Using the Arithmetic Editor
The Arithmetic Editor provides a convenient way to construct math expressions from various data sources, such as attributes, parameters, and feature functions, where the result is used directly inside a parameter.
Conditional Values
Set values depending on one or more test conditions that either pass or fail.
Parameter Condition Definition Dialog
Content
Expressions and strings can include a number of functions, characters, parameters, and more.
When setting values - whether entered directly in a parameter or constructed using one of the editors - strings and expressions containing String, Math, Date/Time or FME Feature Functions will have those functions evaluated. Therefore, the names of these functions (in the form @<function_name>) should not be used as literal string values.
String Functions | These functions manipulate and format strings. |
Special Characters | A set of control characters is available in the Text Editor. |
Math Functions | Math functions are available in both editors. |
Date/Time Functions | Date and time functions are available in the Text Editor. |
Math Operators | These operators are available in the Arithmetic Editor. |
FME Feature Functions | These return primarily feature-specific values. |
FME and Workspace Parameters | FME and workspace-specific parameters may be used. |
Creating and Modifying User Parameters | Create your own editable parameters. |
Reference
Processing Behavior |
Feature Holding | No |
Dependencies | Amazon AWS Account with Rekognition access |
FME Licensing Level | FME Base Edition and above |
Aliases | AmazonAWSRekognitionConnector |
History | Released FME 2019.2 |
Categories |
Examples may contain information licensed under the Open Government Licence – Vancouver