FME Server includes a number of failover mechanisms that allow various components to connect to backup components in the event of failure. However, FME Server administrators must understand that a standard FME Server cluster does not provide complete fault tolerance. Configuring a fault tolerant cluster requires the following components:
Note: To ensure failover capability of any source data files or databases used in your FME workspaces, the following additional components are required:
The following diagram shows the basic structure of FME Server failover.
Note: To limit hardware costs, two instances of Engine 1 can be on the same machine, with one instance connected to the Primary host and the other to the Failover host. See Configuring FME Engines for Failover below.
When the Core A machine fails, clients connecting to Core A fail over to Core B. Because the Failover Core is now the “active” Core of the cluster, clients subsequently always connect to it so that processing continues. When a heartbeat failure is detected, the Failover Core takes over Jobs and Schedules.
When the Core A machine and associated Engines are restored, Core B remains the Active Core, while Core A is now online as the Failover Core.
Note: Notification Service publishers (including UDP, Email and JMS clients) do not fail over. These clients must be manually reconfigured to connect to the active Core.
The first FME Server Core that starts up in a fault tolerant cluster automatically becomes the Primary “active” Core. The second FME Server Core to start up becomes the Failover Core. Once configured, the fault tolerant cluster automatically reconfigures itself according to the state of the system.
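The role behaviour described above (first Core to start is active, roles swap on failure, and no automatic fail back on recovery) can be illustrated with a small toy model. This is only a sketch for reasoning about the cluster states; the `Core` and `Cluster` classes and the `heartbeat` method are hypothetical names invented for illustration and are not part of FME Server.

```python
from dataclasses import dataclass


@dataclass
class Core:
    """Hypothetical stand-in for an FME Server Core machine."""
    name: str
    online: bool = True


class Cluster:
    """Toy model: the first Core to start is active, the second is the failover."""

    def __init__(self, first_started: Core, second_started: Core) -> None:
        self.active = first_started
        self.standby = second_started

    def heartbeat(self) -> None:
        # Swap roles only when the active Core is down and the standby is up.
        # A recovered Core does not reclaim the active role on its own.
        if not self.active.online and self.standby.online:
            self.active, self.standby = self.standby, self.active


core_a = Core("Core A")
core_b = Core("Core B")
cluster = Cluster(core_a, core_b)

core_a.online = False   # Core A machine fails
cluster.heartbeat()     # Core B becomes the active Core

core_a.online = True    # Core A is restored
cluster.heartbeat()     # Core B stays active; Core A is now the failover
print(cluster.active.name, "/", cluster.standby.name)
```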
Prior to installation, you must create and share the directory where the common FME Server files will exist (repository directory), so that each FME Server Core has read and write access to it. The repository directory should be on its own redundant file server, not the same machine as the FME Server Core installation. Then, during installation, provide the UNC/mount path to the repository directory, ensuring that you use the same directory for each FME Server Core installation.
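Before running the installer, it may help to confirm from each Core machine that the shared repository directory is actually readable and writable. The following is a minimal sketch using only the Python standard library; `can_read_write` is a hypothetical helper, and the path passed to it below is a placeholder rather than a real FME Server location.

```python
import tempfile


def can_read_write(directory: str) -> bool:
    """Return True if a probe file can be created, written, and read back in directory."""
    try:
        # The probe file is deleted automatically when the block exits.
        with tempfile.NamedTemporaryFile(dir=directory, prefix="fme_rw_check_") as probe:
            probe.write(b"ok")
            probe.flush()
            probe.seek(0)
            return probe.read() == b"ok"
    except OSError:
        # Covers a missing directory, denied permissions, and unreachable shares.
        return False


if __name__ == "__main__":
    # Placeholder: substitute the UNC or mount path you plan to give the installer.
    print(can_read_write(tempfile.gettempdir()))
```

Run the check on every machine that will host an FME Server Core, under the account that the FME Server services will use, since share permissions are usually granted per account.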
When you bring the primary server back online, it is not necessary to shut down the secondary server unless you want clients to fail back to the primary server. You might want to fail back if the primary server is a more capable system than the failover server.
The following instructions apply to both the primary and failover servers. Additional parameters to configure a failover cluster can be found in the FME Server configuration file.
<FMEServerDir>\Server\fmeServerConfig.txt
#CLUSTER_TYPE=DEFAULT
CLUSTER_TYPE=FAILOVER
The host name value corresponds to the FME_SERVER_HOST_NAME setting of the monitored host. This value is case sensitive and is typically all uppercase in a default installation. Confirm it by checking the FME_SERVER_HOST_NAME value of the monitored host.
FAILOVER_MONITOR_HOST=<FME_SERVER_HOST_NAME>
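The configuration change above can also be scripted when several servers need the same edit. The sketch below assumes that fmeServerConfig.txt consists of plain KEY=VALUE lines with `#` marking comments, as in the excerpts shown here, and that appending the new settings at the end of the file is acceptable. `enable_failover` is a hypothetical helper, and `OTHER_SETTING` and `COREB` are made-up sample values.

```python
KEYS = ("CLUSTER_TYPE=", "FAILOVER_MONITOR_HOST=")


def enable_failover(config_text: str, monitor_host: str) -> str:
    """Comment out active cluster settings and append the failover ones."""
    out = []
    for line in config_text.splitlines():
        # Only active (uncommented) lines that set one of our keys are changed.
        if line.strip().startswith(KEYS):
            line = "#" + line.strip()
        out.append(line)
    out.append("CLUSTER_TYPE=FAILOVER")
    # monitor_host must match the monitored host's FME_SERVER_HOST_NAME exactly,
    # including case.
    out.append(f"FAILOVER_MONITOR_HOST={monitor_host}")
    return "\n".join(out) + "\n"


# Made-up sample content; a real configuration file contains many more settings.
sample = "CLUSTER_TYPE=DEFAULT\nOTHER_SETTING=1\n"
result = enable_failover(sample, "COREB")
print(result)
```

Back up the original file before writing any generated text over it, and review the result by hand, since the real file may contain settings this sketch does not anticipate.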
You can configure individual FME Server services to switch to a backup FME Server Core when the connection to the primary FME Server Core is lost. This capability is called "service failover".
When a service loses its connection to the primary FME Server Core, the service attempts to connect to a backup FME Server Core you define. In a fault tolerant cluster, a primary FME Server Core can only have one backup FME Server Core.
Open each web application properties file located at:
<WebAppDir>\<fmeServiceName>\WEB-INF\conf\propertiesFile.properties
Find the following line in this file:
#FAILOVER_SERVER_NAMES=<failoverServerName>
Uncomment this line and replace <failoverServerName> with a host name.
The named host must be running FME Server, for example:
FAILOVER_SERVER_NAMES=red
This example defines a backup host named red, which must be running FME Server.
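Because the same edit has to be repeated in each web application's properties file, a small script can reduce the chance of missing one. The following sketch operates on the text of a single properties file and assumes the `FAILOVER_SERVER_NAMES` line is present, commented or not, as shown above. `set_failover_server` is a hypothetical helper, and `A` and `B` in the sample are made-up settings.

```python
import re

# Matches the setting whether or not it is commented out with a leading '#'.
_PATTERN = re.compile(r"^[ \t]*#?[ \t]*FAILOVER_SERVER_NAMES[ \t]*=.*$", re.MULTILINE)


def set_failover_server(properties_text: str, host: str) -> str:
    """Uncomment FAILOVER_SERVER_NAMES and point it at the given host."""
    replacement = f"FAILOVER_SERVER_NAMES={host}"
    # A function replacement avoids any backslash interpretation in the host string.
    new_text, count = _PATTERN.subn(lambda match: replacement, properties_text)
    if count == 0:
        raise ValueError("FAILOVER_SERVER_NAMES line not found")
    return new_text


# Made-up sample content standing in for one propertiesFile.properties file.
sample_properties = "A=1\n#FAILOVER_SERVER_NAMES=<failoverServerName>\nB=2\n"
updated = set_failover_server(sample_properties, "red")
print(updated)
```

The function raises an error instead of appending the setting when the line is missing, so that an unexpected file layout is noticed rather than silently modified.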
We recommend configuring redundant FME Engines for both the primary FME Server Core and the secondary FME Server Core. If you wish to limit the number of machines used, you can start up FME Engines with the same name on the same machine, with one engine connected to the primary FME Server Core and the other connected to the failover FME Server Core.
System notifications, such as an email or a mobile client subscription, can be configured to alert you to the status of the failover cluster. Status notifications can include when the FME Server starts up, when a host fails to detect the host it is monitoring, and when a failover operation occurs.
Perform the following steps on both the primary and failover FME Server. We recommend using a different topic name on each server so that you can tell their notification messages apart.
<FMEServerDir>\Server\fmeServerConfig.txt
NOTIFY_FAILOVER=<topicName>
For more information about configuring notifications, see the Notification Service section of the FME Server Reference Manual.