Apache Hive (Hadoop) Reader
FME can read databases and file systems within Hadoop via Hive.
Apache Hive (Hadoop) Product and System Requirements
Format | Product | | | Operating System | |
---|---|---|---|---|---|---
Reader/Writer | FME Form | FME Flow | FME Flow Hosted | Windows 64-bit | Linux | Mac
Reader | Yes | Yes | Yes | Yes | Yes | Yes
FME supports HiveServer2, which was introduced in Hive 0.11.0. FME is not compatible with HiveServer1, which was removed in Hive 1.0.0.
To start using the Hive reader, first obtain a Hive JDBC client driver.
Finding the JDBC Driver in Your Hive Server Installation
The Hive installation within your Hadoop cluster typically includes a compatible Hive JDBC client driver (a .jar file).
Here are some typical locations:
Hortonworks distribution of Hadoop
- /usr/hdp/current/hive-client/lib/hive-jdbc-<version>-standalone.jar
or
- Download from Hortonworks.
Cloudera distribution of Hadoop
- /usr/lib/hive/lib/hive-jdbc-<version>-standalone.jar
You can also use the basic Apache Hive JDBC Driver.
The .jar filename should resemble this string: hive-jdbc-<version>-standalone.jar.
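As a rough illustration, the typical locations above can be checked with a short script. The following Python sketch assumes the two directories listed in this section and the hive-jdbc-<version>-standalone.jar naming pattern; the function name find_hive_jdbc_jars is hypothetical, and your cluster may keep the driver elsewhere.

```python
from pathlib import Path

# Typical driver directories named in this section; adjust for your cluster.
DEFAULT_DIRS = [
    "/usr/hdp/current/hive-client/lib",  # Hortonworks distribution
    "/usr/lib/hive/lib",                 # Cloudera distribution
]


def find_hive_jdbc_jars(search_dirs=DEFAULT_DIRS):
    """Return standalone Hive JDBC driver jars found in the given directories."""
    found = []
    for directory in search_dirs:
        base = Path(directory)
        if not base.is_dir():
            continue  # skip locations that do not exist on this machine
        # Match only the standalone jar, e.g. hive-jdbc-<version>-standalone.jar
        found.extend(sorted(base.glob("hive-jdbc-*-standalone.jar")))
    return found


if __name__ == "__main__":
    for jar in find_hive_jdbc_jars():
        print(jar)
```

Run on a cluster node, the script prints each matching .jar path; no output means the driver is not in either of the assumed directories.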
Installing the JDBC Driver
To view instructions for installing the JDBC driver, please see Getting Started with JDBC.
Usage Notes
The performance of this format depends on the amount of memory allocated to the Java Virtual Machine (JVM). The following environment variables set the memory available to Java plugins:
FME_JVM_MIN_HEAP_SIZE
Initial heap size for initializing the JVM. If unset, the default value is 1024K.
This value must be a multiple of 1024 and greater than 1 MB. A value without a suffix is in bytes. To indicate kilobytes, megabytes, or gigabytes, append k or K, m or M, or g or G, respectively. For example, each of these equivalent values is acceptable:
6291456
6144k
6m
FME_JVM_MAX_HEAP_SIZE
Maximum heap size for the JVM. If unset, the default value is 16384K.
This value must be a multiple of 1024 and greater than 2 MB. A value without a suffix is in bytes. To indicate kilobytes, megabytes, or gigabytes, append k or K, m or M, or g or G, respectively. For example, each of these equivalent values is acceptable:
83886080
81920k
80m
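The value rules for both variables can be checked before a value is set. The Python sketch below is an illustration only: it assumes that unsuffixed values are bytes and that the k, m, and g suffixes are powers of 1024, which is consistent with the example values above. The helper names parse_heap_size and check_heap_size are hypothetical, and this is not how FME itself validates the variables.

```python
import re

# Multipliers for the optional suffix (assumed to be powers of 1024).
_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}

MIN_HEAP_FLOOR = 1 * 1024**2  # FME_JVM_MIN_HEAP_SIZE must be greater than 1 MB
MAX_HEAP_FLOOR = 2 * 1024**2  # FME_JVM_MAX_HEAP_SIZE must be greater than 2 MB


def parse_heap_size(value):
    """Convert a heap-size string such as '6144k' or '6m' to bytes."""
    match = re.fullmatch(r"([0-9]+)([kKmMgG]?)", value.strip())
    if match is None:
        raise ValueError(f"not a valid heap size: {value!r}")
    number, suffix = match.groups()
    return int(number) * _UNITS[suffix.lower()]


def check_heap_size(value, floor_bytes):
    """Return the size in bytes, or raise ValueError if a rule is broken."""
    size = parse_heap_size(value)
    if size % 1024 != 0:
        raise ValueError(f"{value!r} is not a multiple of 1024")
    if size <= floor_bytes:
        raise ValueError(f"{value!r} must be greater than {floor_bytes} bytes")
    return size


if __name__ == "__main__":
    # The three example spellings for each variable resolve to the same size.
    for text in ("6291456", "6144k", "6m"):
        print(text, check_heap_size(text, MIN_HEAP_FLOOR))
    for text in ("83886080", "81920k", "80m"):
        print(text, check_heap_size(text, MAX_HEAP_FLOOR))
```

A value that passes the check can then be assigned to FME_JVM_MIN_HEAP_SIZE or FME_JVM_MAX_HEAP_SIZE in the environment from which FME is started.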