Apache Hive (Hadoop) Reader

Licensing options for this format begin with FME Desktop Professional Edition.

The Apache Hive (Hadoop) reader module provides FME access to databases and file systems within Hadoop via Hive.

Overview

FME supports HiveServer2, which was introduced in Hive 0.11.0. FME is not compatible with HiveServer1, which was removed in Hive 1.0.0.

To start using the Hive reader, you first need to obtain a Hive JDBC client driver.

Finding the JDBC Driver in Your Hive Server Installation

The Hive installation within your Hadoop cluster typically includes a compatible Hive JDBC client driver (a .jar file).

Note: For best results, use the Hive JDBC driver that matches your Hive installation. Apache Hive JDBC client drivers are not backwards compatible: newer client versions cannot connect to older Hive servers. Standalone Hive JDBC drivers were introduced in Hive 0.14.0. We do not recommend using versions older than 0.14.0.

Here are some typical locations:

Hortonworks distribution of Hadoop

  • /usr/hdp/current/hive-client/lib/hive-jdbc-<version>-standalone.jar

or

Cloudera distribution of Hadoop

  • /usr/lib/hive/lib/hive-jdbc-<version>-standalone.jar

You can also use the basic Apache Hive JDBC Driver.

The .jar filename should resemble this string: hive-jdbc-<version>-standalone.jar.
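As a rough illustration of the search described above, the following Python sketch scans a list of directories for files that match the hive-jdbc-<version>-standalone.jar pattern and checks the version against the 0.14.0 recommendation. The function names and the version-parsing rule (the first three numeric components of the version string) are assumptions made for this example and are not part of FME or Hive.

```python
import re
from pathlib import Path

# Filename shape described above: hive-jdbc-<version>-standalone.jar
JAR_PATTERN = re.compile(r"^hive-jdbc-(?P<version>.+)-standalone\.jar$")

# Typical locations listed above; adjust these for your own cluster.
DEFAULT_DIRS = ["/usr/hdp/current/hive-client/lib", "/usr/lib/hive/lib"]


def find_hive_jdbc_jars(directories=DEFAULT_DIRS):
    """Return (path, version) pairs for standalone Hive JDBC jars found."""
    found = []
    for directory in directories:
        base = Path(directory)
        if not base.is_dir():
            continue  # skip locations that do not exist on this machine
        for candidate in sorted(base.iterdir()):
            match = JAR_PATTERN.match(candidate.name)
            if match and candidate.is_file():
                found.append((candidate, match.group("version")))
    return found


def meets_minimum(version, minimum=(0, 14, 0)):
    """True if the first three numeric parts of version are >= minimum."""
    numbers = [int(n) for n in re.findall(r"\d+", version)[:3]]
    numbers += [0] * (3 - len(numbers))  # pad short versions such as "1.2"
    return tuple(numbers) >= minimum


if __name__ == "__main__":
    for path, version in find_hive_jdbc_jars():
        note = "" if meets_minimum(version) else " (older than 0.14.0)"
        print(f"{path}: version {version}{note}")
```

Vendor distributions may append their own build numbers to the version string, so treat the version check as a hint rather than a guarantee of compatibility.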

Installing the JDBC Driver

For instructions on installing the JDBC driver, see Getting Started with JDBC.

Usage Notes

The performance of this format depends on the amount of memory allocated to the Java Virtual Machine (JVM). The following environment variables allow you to specify the memory available to Java plug-ins:

  • FME_JVM_MIN_HEAP_SIZE: Initial heap size for the JVM. If unset, the default value is 1024K.
  • FME32_JVM_MIN_HEAP_SIZE: Same as FME_JVM_MIN_HEAP_SIZE, but applies to 32-bit FME Desktop on Windows, and if set, takes precedence over FME_JVM_MIN_HEAP_SIZE.

These variables must be set to a multiple of 1024 and to a value greater than 1 MB. To indicate kilobytes, megabytes, or gigabytes, append k or K, m or M, or g or G respectively. For example, the following values are all acceptable and equivalent (6 MB):

6291456

6144k

6m

  • FME_JVM_MAX_HEAP_SIZE: Maximum heap size for the JVM. If unset, the default value is 16384K.
  • FME32_JVM_MAX_HEAP_SIZE: Same as FME_JVM_MAX_HEAP_SIZE, but applies to 32-bit FME Desktop on Windows, and if set, takes precedence over FME_JVM_MAX_HEAP_SIZE.

These variables must be set to a multiple of 1024 and to a value greater than 2 MB. To indicate kilobytes, megabytes, or gigabytes, append k or K, m or M, or g or G respectively. For example, the following values are all acceptable and equivalent (80 MB):

83886080

81920k

80m
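The value rules above can be checked before the variables are set. The following Python sketch converts a heap size string to bytes and applies the two stated constraints (a multiple of 1024, and greater than the stated minimum). The function names are hypothetical, and the sketch assumes that a value without a suffix is a byte count, as the equivalent example values suggest. It does not reproduce FME's own validation.

```python
import re

# Multipliers for the optional suffix (case-insensitive).
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}

MB = 1024 ** 2


def parse_heap_size(value):
    """Convert a value such as '6291456', '6144k' or '6m' to bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", value.strip())
    if match is None:
        raise ValueError(f"not a valid heap size: {value!r}")
    number, suffix = match.groups()
    return int(number) * _UNITS[suffix.lower()]


def check_heap_size(value, minimum_bytes):
    """Return the size in bytes, or raise if it breaks the stated rules."""
    size = parse_heap_size(value)
    if size % 1024 != 0:
        raise ValueError(f"{value!r} is not a multiple of 1024")
    if size <= minimum_bytes:
        raise ValueError(f"{value!r} is not greater than {minimum_bytes} bytes")
    return size


if __name__ == "__main__":
    # Minimum heap variables: greater than 1 MB.
    print(check_heap_size("6m", 1 * MB))
    # Maximum heap variables: greater than 2 MB.
    print(check_heap_size("80m", 2 * MB))
```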

Note: To pass additional parameters to the Java Virtual Machine used by FME, use the JAVA_TOOL_OPTIONS environment variable.
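One way to apply these settings to a single run, rather than system-wide, is to start FME from a script with a modified environment. The Python sketch below is only an illustration: the function names, the example JVM option, and the executable and workspace paths passed to run_workspace are assumptions, not values taken from the FME documentation.

```python
import os
import subprocess


def build_fme_environment(min_heap="6m", max_heap="80m", jvm_options=None):
    """Return a copy of the current environment with the JVM settings added."""
    env = dict(os.environ)
    env["FME_JVM_MIN_HEAP_SIZE"] = min_heap
    env["FME_JVM_MAX_HEAP_SIZE"] = max_heap
    if jvm_options:
        # JAVA_TOOL_OPTIONS takes a space-separated list of JVM options.
        env["JAVA_TOOL_OPTIONS"] = " ".join(jvm_options)
    return env


def run_workspace(fme_executable, workspace, env):
    """Launch FME with the prepared environment (both paths are hypothetical)."""
    return subprocess.run([fme_executable, workspace], env=env, check=True)


if __name__ == "__main__":
    env = build_fme_environment(jvm_options=["-Dfile.encoding=UTF-8"])
    for name in ("FME_JVM_MIN_HEAP_SIZE", "FME_JVM_MAX_HEAP_SIZE", "JAVA_TOOL_OPTIONS"):
        print(name, "=", env[name])
```

Because the settings are applied to a copy of the environment, other processes on the machine keep their existing JVM configuration.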