PythonCaller
Executes a user-supplied Python script to manipulate features.
Typical Uses
- Tasks where a transformer is not available
- Using external modules for processing
- Performing complex manipulations on list attributes
How does it work?
The PythonCaller runs a user-supplied Python script against the features that enter the transformer.
When a task is too specialized for any Workbench transformer, such as custom statistical analysis of an attribute, a Python script can perform the operation directly on a feature's geometry, attributes, and coordinate system.
The script accesses features through the FME Objects Python API.
Note: Python is a programming language external to FME. For documentation on writing Python scripts, see the Python Software Foundation website (python.org).
Using Python to perform arbitrary operations on features is a powerful aspect of Workbench. However, the logic introduced into a workspace is less visible and can therefore be more difficult to maintain than logic built using Workbench’s built-in transformers. It is recommended that other transformers be used when possible instead of Python scripts.
Class Interface
The PythonCaller can interface with a class defined in a Python script. The order in which the PythonCaller calls the methods of that class depends on the mode it is in. There are two modes: Standard and Group By.
Standard Mode
This mode applies when no attributes are set in the Group By parameter. It is the most common mode, and in it the PythonCaller calls the class methods in the following order:
- __init__() - Called once, whether or not any features are processed.
- input() - Called for each FMEFeature that comes into the input port.
- close() - Called once, after all features are processed (when no more FMEFeatures remain). If no features are processed, the close() method will still be called.
Features that need to continue through the workspace for further processing must be explicitly written out by calling the pyoutput() method.
When the class processes incoming FMEFeatures one at a time, call pyoutput() once per incoming FMEFeature from the input() method. Conversely, when the class operates on a group of FMEFeatures, store the incoming FMEFeatures in a list in input(), then process them and write them out through pyoutput() in the close() method.
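The one-at-a-time case described above can be sketched as follows. This is a minimal illustration, not FME's implementation: StubFeature is a hypothetical stand-in for an FMEFeature so that the class can be run outside FME, the attribute names "name" and "name_upper" are assumptions, and pyoutput() is attached by hand because it is normally supplied by the PythonCaller.

```python
class StubFeature(object):
    """Hypothetical stand-in for an FMEFeature, for running outside FME."""

    def __init__(self, **attributes):
        self._attributes = dict(attributes)

    def getAttribute(self, name):
        return self._attributes.get(name)

    def setAttribute(self, name, value):
        self._attributes[name] = value


class UppercaseName(object):
    """Per-feature processing: one pyoutput() call per input() call."""

    def __init__(self):
        self.count = 0

    def input(self, feature):
        name = feature.getAttribute("name")  # "name" is an assumed attribute
        if name is not None:
            feature.setAttribute("name_upper", name.upper())
        self.count += 1
        self.pyoutput(feature)  # pass the feature on immediately

    def close(self):
        pass  # nothing is held back, so there is nothing to flush


# Imitate the Standard mode calling sequence outside FME.
processor = UppercaseName()
emitted = []
processor.pyoutput = emitted.append  # stand-in for the method FME provides
for stub in [StubFeature(name="road"), StubFeature(name="river")]:
    processor.input(stub)
processor.close()
```

Because nothing is stored between calls, a class written this way does not hold features back until close().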
The example below calculates the total area of all the features processed and then outputs all the features with a new attribute containing the total area:
import fme
import fmeobjects

class FeatureProcessor(object):
    def __init__(self):
        self.feature_list = []
        self.total_area = 0.0

    def input(self, feature):
        self.feature_list.append(feature)
        self.total_area += feature.getGeometry().getArea()

    def close(self):
        for feature in self.feature_list:
            feature.setAttribute("total_area", self.total_area)
            self.pyoutput(feature)

    def process_group(self):
        pass
Group By Mode
This mode applies when one or more attributes are set in the Group By parameter. In this mode, the PythonCaller calls the class methods in the following order:
- __init__() - Called once, whether or not any features are processed.
- input() - Called for each FMEFeature in a group.
- process_group() - Called after all FMEFeatures in a group have been sent to input(). After this is called and executed, PythonCaller will send the next group of FMEFeatures to input() and then call process_group() again. This is repeated until all groups have been exhausted.
- close() - Called once, after all rounds of input() and process_group() have been called to process all incoming features (when no more FMEFeatures remain). If no features are processed, the close() method will still be called.
In this mode, the class interface will be working on groups of FMEFeatures. The incoming FMEFeatures must be stored in a list class member variable, then processed and written out through pyoutput() in the process_group() method. After processing is complete in process_group(), all class member variables should be cleared for the next round of group by handling. In the next round, FMEFeatures of the next group are passed through input() calls followed again by process_group(). This is repeated until all the groups are exhausted.
The example below calculates the total area of all features grouped by the _shape attribute and then outputs all the features with a new attribute containing the total area for each group:
import fme
import fmeobjects

class FeatureProcessor(object):
    def __init__(self):
        self.feature_list = []
        self.total_area = 0.0

    def input(self, feature):
        self.feature_list.append(feature)
        self.total_area += feature.getGeometry().getArea()

    def close(self):
        pass

    def process_group(self):
        for feature in self.feature_list:
            feature.setAttribute("group_total_area", self.total_area)
            self.pyoutput(feature)
        self.feature_list = []
        self.total_area = 0.0
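To try a Group By class outside FME, the calling sequence described above can be imitated with a small harness. The sketch below is an assumption-laden stand-in, not FME's implementation: StubGeometry, StubFeature, and run_group_by() are hypothetical names, the harness collects all input before processing any group, and the order in which groups are processed here (first appearance) is an assumption.

```python
from collections import OrderedDict


class StubGeometry(object):
    """Hypothetical geometry exposing only getArea()."""

    def __init__(self, area):
        self._area = area

    def getArea(self):
        return self._area


class StubFeature(object):
    """Hypothetical stand-in for an FMEFeature, for running outside FME."""

    def __init__(self, shape, area):
        self._attributes = {"_shape": shape}
        self._geometry = StubGeometry(area)

    def getAttribute(self, name):
        return self._attributes.get(name)

    def setAttribute(self, name, value):
        self._attributes[name] = value

    def getGeometry(self):
        return self._geometry


class FeatureProcessor(object):
    """Same logic as the Group By example, without the FME imports."""

    def __init__(self):
        self.feature_list = []
        self.total_area = 0.0

    def input(self, feature):
        self.feature_list.append(feature)
        self.total_area += feature.getGeometry().getArea()

    def close(self):
        pass

    def process_group(self):
        for feature in self.feature_list:
            feature.setAttribute("group_total_area", self.total_area)
            self.pyoutput(feature)
        # Reset state so the next group starts clean.
        self.feature_list = []
        self.total_area = 0.0


def run_group_by(processor_class, features, group_attribute):
    """Imitate the Group By calling sequence on an in-memory feature list."""
    processor = processor_class()            # __init__(): called once
    emitted = []
    processor.pyoutput = emitted.append      # stand-in for the method FME provides
    groups = OrderedDict()
    for feature in features:
        key = feature.getAttribute(group_attribute)
        groups.setdefault(key, []).append(feature)
    for members in groups.values():
        for feature in members:
            processor.input(feature)         # input(): once per feature in the group
        processor.process_group()            # process_group(): once per group
    processor.close()                        # close(): once, after all groups
    return emitted


features = [
    StubFeature("square", 4.0),
    StubFeature("circle", 3.0),
    StubFeature("square", 6.0),
]
results = run_group_by(FeatureProcessor, features, "_shape")
totals = [(f.getAttribute("_shape"), f.getAttribute("group_total_area")) for f in results]
```

If the reset at the end of process_group() is removed, the second group's total also includes the first group's area, which is why the member variables should be cleared after each group.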
Script Editing
A PythonCaller transformer can call scripts that are stored in the transformer itself or scripts that are stored globally for the entire workspace:
- To store a Python script with a specific PythonCaller transformer, use the Python Script parameter in the transformer.
- To store a Python script globally, click the Advanced Workspace Parameter in the Navigator, and double-click Startup Python Script. Storing scripts globally has the advantage of keeping the Python logic centralized, which makes editing and maintenance easier. This is useful when there are multiple PythonCaller transformers throughout the workspace that use the same script. For more information, see Startup and Shutdown Python Scripts in the FME Workbench help.
FME can access .py modules that are stored on the file system, including modules in external Python libraries. Use the Python "import" command to load these modules. FME will search both the standard Python module locations and the workspace location to find the module to be imported.
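As a sketch of using an imported module inside a class, the example below uses the standard library statistics module to add the mean of an attribute to every feature. Several things here are assumptions: the attribute names "population" and "population_mean" are illustrative, statistics requires a Python 3 interpreter, and StubFeature is a hypothetical stand-in so the class can be run outside FME. A module stored beside the workspace would be imported with the same import statement.

```python
import statistics  # standard library module, available in Python 3.4 and later


class AttributeMean(object):
    """Adds the mean of an attribute, taken over all features, to each feature."""

    def __init__(self):
        self.features = []
        self.values = []

    def input(self, feature):
        self.features.append(feature)
        value = feature.getAttribute("population")  # assumed attribute name
        if value is not None:
            self.values.append(float(value))

    def close(self):
        # The mean is only known once every feature has been seen.
        mean = statistics.mean(self.values) if self.values else None
        for feature in self.features:
            if mean is not None:
                feature.setAttribute("population_mean", mean)
            self.pyoutput(feature)


class StubFeature(object):
    """Hypothetical stand-in for an FMEFeature, for running outside FME."""

    def __init__(self, **attributes):
        self._attributes = dict(attributes)

    def getAttribute(self, name):
        return self._attributes.get(name)

    def setAttribute(self, name, value):
        self._attributes[name] = value


processor = AttributeMean()
emitted = []
processor.pyoutput = emitted.append  # stand-in for the method FME provides
for value in (10, 20, 30):
    processor.input(StubFeature(population=value))
processor.close()
means = [f.getAttribute("population_mean") for f in emitted]
```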
Configuration
Input Ports
Features to be manipulated.
Output Ports
Features, including any modifications made.
Parameters
Group By |
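When one or more attributes are selected, features that share the same values for those attributes are processed together as a group, and the script's process_group() method is called once per group. |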
Group By Mode |
Process At End (Blocking): This is the default behavior. Processing will only occur in this transformer once all input is present.
Process When Group Changes (Advanced): This transformer will process input groups in order. Changes of the value of the Group By parameter on the input stream will trigger processing on the currently accumulating group. This may improve overall speed (particularly with multiple, equally-sized groups), but could cause undesired behavior if input groups are not truly ordered.
Considerations for Using Group By
There are two typical reasons for using Process When Group Changes (Advanced).
- The first is incoming data that is intended to be processed in groups (and is already so ordered). In this case, the structure dictates Group By usage, not performance considerations.
- The second possible reason is potential performance gains. Performance gains are most likely when the data is already sorted (or read using a SQL ORDER BY statement) since less work is required of FME. If the data needs ordering, it can be sorted in the workspace (though the added processing overhead may negate any gains).
Sorting becomes more difficult according to the number of data streams. Multiple streams of data could be almost impossible to sort into the correct order, since all features matching a Group By value need to arrive before any features (of any feature type or dataset) belonging to the next group. In this case, using Group By with Process At End (Blocking) may be the equivalent and simpler approach.
Note: Multiple feature types and features from multiple datasets will not generally naturally occur in the correct order.
As with many scenarios, testing different approaches in your workspace with your data is the only definitive way to identify performance gains. |
Class to Process Features | The name of the Python Class within the script that PythonCaller will use to begin execution. For the above examples, set this parameter to FeatureProcessor. |
Python Script | The Python script to be executed. When the Python script is stored as the Startup Python Script for the Workspace, leave this parameter blank. |
Attributes to Expose | Exposes any attributes that are created by the Python script being executed so they can be used by other transformers. |
Attributes to Hide | Hides any attributes that may be removed by the Python script being executed. Other transformers will not be able to use these attributes. |
Lists to Hide |
Hides any lists that may be removed by the Python script being executed. Other transformers will not be able to use these lists. Note that if you select to hide a list, your selection will include any list attributes or nested lists. For example, if you select to hide a list called list{} then list{}.attr or list{}.sublist{} will also be hidden. |
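The warning about unordered input under Process When Group Changes (Advanced) can be illustrated by analogy with itertools.groupby, which also starts a new group every time the key changes. This is only an analogy under that assumption, not FME's implementation, and groups_on_change() is a hypothetical helper.

```python
from itertools import groupby


def groups_on_change(keys):
    """Start a new group whenever the key changes; return (key, size) pairs."""
    return [(key, len(list(run))) for key, run in groupby(keys)]


unsorted_keys = ["A", "B", "A", "B"]
sorted_keys = sorted(unsorted_keys)

# Unordered input: each change of key closes the current group early.
fragmented = groups_on_change(unsorted_keys)

# Ordered input: each key forms exactly one complete group.
complete = groups_on_change(sorted_keys)
```

With unordered keys, group "A" is processed twice with one member each time instead of once with two members, which is the kind of undesired behavior the parameter description refers to.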
Editing Transformer Parameters
Using a set of menu options, transformer parameters can be assigned by referencing other elements in the workspace. More advanced functions, such as an advanced editor and an arithmetic editor, are also available in some transformers. To access a menu of these options, click beside the applicable parameter. For more information, see Transformer Parameter Menu Options.
Defining Values
There are several ways to define a value for use in a Transformer. The simplest is to simply type in a value or string, which can include functions of various types such as attribute references, math and string functions, and workspace parameters. There are a number of tools and shortcuts that can assist in constructing values, generally available from the drop-down context menu adjacent to the value field.
Using the Text Editor
The Text Editor provides a convenient way to construct text strings (including regular expressions) from various data sources, such as attributes, parameters, and constants, where the result is used directly inside a parameter.
Using the Arithmetic Editor
The Arithmetic Editor provides a convenient way to construct math expressions from various data sources, such as attributes, parameters, and feature functions, where the result is used directly inside a parameter.
Conditional Values
Set values depending on one or more test conditions that either pass or fail.
Parameter Condition Definition Dialog
Content
Expressions and strings can include a number of functions, characters, parameters, and more.
When setting values - whether entered directly in a parameter or constructed using one of the editors - strings and expressions containing String, Math, Date/Time or FME Feature Functions will have those functions evaluated. Therefore, the names of these functions (in the form @<function_name>) should not be used as literal string values.
String Functions | These functions manipulate and format strings. |
Special Characters | A set of control characters is available in the Text Editor. |
Math Functions | Math functions are available in both editors. |
Date/Time Functions | Date and time functions are available in the Text Editor. |
Math Operators | These operators are available in the Arithmetic Editor. |
FME Feature Functions | These return primarily feature-specific values. |
FME and Workspace Parameters | FME and workspace-specific parameters may be used. |
Creating and Modifying User Parameters | Create your own editable parameters. |
Dialog Options - Tables
Transformers with table-style parameters have additional tools for populating and manipulating values.
Row Reordering | Enabled once you have clicked on a row item. Choices include: |
Cut, Copy, and Paste | Enabled once you have clicked on a row item. Choices include:
Cut, copy, and paste may be used within a transformer, or between transformers. |
Filter | Start typing a string, and the matrix will only display rows matching those characters. Searches all columns. This only affects the display of attributes within the transformer - it does not alter which attributes are output. |
Import | Import populates the table with a set of new attributes read from a dataset. Specific application varies between transformers. |
Reset/Refresh | Generally resets the table to its initial state, and may provide additional options to remove invalid entries. Behavior varies between transformers. |
Note: Not all tools are available in all transformers.
Reference
Processing Behavior | Feature-Based or Group-Based, conditional on Python script |
Feature Holding | Conditional on Python script |
Dependencies |
Specifying a Python Interpreter
An FME installation includes a Python version 2.7 and Python version 3.5 interpreter. The default Python interpreter used for Python processing is the Python 2.7 interpreter. The FME Objects Python API supports Python 2.7, Python 3.4, and Python 3.5.
The Python interpreter used by FME to execute Python scripts is controlled by the Python Compatibility workspace parameter and the Preferred Python Interpreter setting under Tools > FME Options > Translation. Python Compatibility specifies the version of Python with which Python scripts are compatible. FME loads the Preferred Python Interpreter if it is compatible with the Python Compatibility. If not, FME loads an appropriate Python interpreter matching Python Compatibility. For more information, see the FME Workbench help.
Installing Python Packages
If you would like to install a third-party package for use by Python in FME, see Installing Python Packages to FME Desktop in the FME Workbench help. |
Aliases | |
History |
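Because the interpreter that runs a script depends on the settings described under Specifying a Python Interpreter, a script can check which version it is running on before relying on version-specific features. The sketch below uses only the standard library; interpreter_summary() and is_python3() are hypothetical helper names, and the code is written to run on both Python 2.7 and Python 3.

```python
import sys


def interpreter_summary():
    """Return the running interpreter version as a short string."""
    return "Python {0}.{1}.{2}".format(*sys.version_info[:3])


def is_python3():
    """True when the script is running on a Python 3 interpreter."""
    return sys.version_info[0] >= 3


summary = interpreter_summary()
running_python3 = is_python3()
```

A class could call is_python3() in __init__() and choose a fallback code path when a Python 3 only module is unavailable.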
FME Community
The FME Community is the place for demos, how-tos, articles, FAQs, and more. Get answers to your questions, learn from other users, and suggest, vote, and comment on new features.
Search for all results about the PythonCaller on the FME Community.
Examples may contain information licensed under the Open Government Licence – Vancouver and/or the Open Government Licence – Canada.