Amazon Redshift Reader/Writer

Licensing options for this format begin with FME Professional Edition.

Overview

Redshift is a petabyte-scale managed database, provided by Amazon Web Services. Redshift is based on PostgreSQL 8.0.2, an Object-Relational Database Management System (ORDBMS) that stores attribute information. The Redshift reader/writer module enables FME to read and write Redshift attribute data.

The Redshift reader/writer is specifically designed to handle the attribute portion of the data in the database. For maximum throughput, the Redshift reader/writer communicates directly with the Redshift database via the libpq interface.

This chapter assumes familiarity with Redshift, SQL, the attribute types supported, and its indexing mechanisms.

For more information, please see the Redshift home at:

http://aws.amazon.com/redshift/

Reader Overview

FME considers a Redshift dataset to be a database containing a collection of relational tables together with their corresponding geometries. The tables must be defined before they can be read.

Arbitrary WHERE clauses and joins are fully supported, as well as an entire arbitrary SQL SELECT statement; however, the user then assumes responsibility for the correctness of the statement or clause, including quoting where necessary.

When reading from the Redshift database, each table is considered a feature type in FME and each row of a table at least one feature in FME.

The basic reading process involves opening a connection to the database, querying metadata, and querying data. The data is read using a series of select statements and rows are fetched to the client machine in batches of 10000 by default.

Table and column names are truncated at 64 characters. If duplicate names are produced by truncation, the behavior is undetermined. Please ensure that table names comply with Redshift naming conventions.

Spaces and special characters are permissible in both table and column names. Case sensitivity has also been implemented, so table and column names are no longer changed to lowercase.

UNICODE support works with a client’s system encoding. Although there is no way to specify the encoding explicitly, the client is assumed to have entered data and created tables and columns in the encoding of their operating system. Multiple system encodings are now supported via the native PostgreSQL conversions between client and server, particularly if the server encoding is set to UNICODE.

Qualified table naming is supported in the form <schemaname>.<tablename>. The schema search path is read and interpreted to determine a user’s default schema when writing and the available schema to read from when reading. Failing a valid schema search path, the default public schema will be used for newer databases.

Writer Overview

The Redshift writer module stores attributes into a Redshift database. Note that attributes are always written as lowercase. The Redshift writer provides the following capabilities:

  • Transaction Support: The Redshift writer provides transaction support that eases the data loading process. Occasionally, a data load operation terminates prematurely due to data difficulties. The transaction support provides a mechanism for reloading corrected data without data loss or duplication.
  • Bulk Loading: The Redshift writer uses a multi-row insert technique to ensure speedy data load.