Databricks Writer Parameters

Version 2.1.0 of this package added Database Connections support for Databricks Credentials and Cloud Upload Credentials.

To define a new connection from the Connection parameter in a Databricks format, select Add Database Connection and scroll to Databricks.

See database-specific parameters below, as well as the section Adding a Database Connection in a workspace in Using Database Connections.

The new connection can be made visible only to the current user, or can be shared among multiple users. To select an existing, previously defined connection, see the section Reusing a Database Connection in Using Database Connections.

Databricks Credentials

Server Hostname: The URL of the Databricks workspace, in the format https://<workspace_id>.cloud.databricks.com/ (AWS) or https://adb-<id>.azuredatabricks.net/ (Azure).
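As an illustration of the two documented URL formats, the hypothetical helper below (not part of FME) checks that a Server Hostname value matches one of them:

```python
import re

# Patterns for the two workspace URL formats documented above.
_PATTERNS = [
    re.compile(r"^https://[\w-]+\.cloud\.databricks\.com/?$"),   # AWS workspaces
    re.compile(r"^https://adb-[\d.]+\.azuredatabricks\.net/?$"), # Azure workspaces
]

def is_valid_workspace_url(url: str) -> bool:
    """Return True if url looks like a Databricks workspace hostname."""
    return any(p.match(url) for p in _PATTERNS)
```

For example, `is_valid_workspace_url("https://dbc-a1b2c3d4-e5f6.cloud.databricks.com/")` returns True, while a plain `http://` URL does not validate.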

Cluster ID: The ID of the cluster on which to run Databricks commands. This cluster should be configured with access to the cloud storage location specified in the writer parameters.

Authentication Method: Personal Access Token (default) or Databricks Sign In. Select whether to authenticate Databricks access with a Personal Access Token or with a username and password.

Username: Used only when the Authentication Method parameter is set to Databricks Sign In. The Databricks username to use.

Password: Used only when the Authentication Method parameter is set to Databricks Sign In. The password of the Databricks account.

Generated Token Lifetime (Days): Used only when the Authentication Method parameter is set to Databricks Sign In. How long the generated token remains valid. If this parameter is left blank, the generated Personal Access Token will not have an expiry date.

Personal Access Token: Used only when the Authentication Method parameter is set to Personal Access Token. A valid Personal Access Token which can be used to authenticate access to the Databricks workspace.

Catalog: The catalog to write to. Each writer can only write to a single catalog. Click the [...] button to see a list of accessible catalogs in the workspace.
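The catalog list behind the [...] button can also be retrieved directly from the workspace. The sketch below assumes the Unity Catalog REST API's list-catalogs endpoint (GET /api/2.1/unity-catalog/catalogs); it is an illustration, not FME's code, and the `catalog_names` helper is hypothetical:

```python
import json
import urllib.request

def catalog_names(response_json: dict) -> list[str]:
    # Pure helper: pull catalog names out of the API response body.
    return [c["name"] for c in response_json.get("catalogs", [])]

def list_catalogs(host: str, token: str) -> list[str]:
    # Query the workspace for all catalogs visible to the token's user.
    req = urllib.request.Request(
        f"{host.rstrip('/')}/api/2.1/unity-catalog/catalogs",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return catalog_names(json.load(resp))
```

Only catalogs the authenticated principal can access are returned, which is consistent with the "accessible catalogs" wording above.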

Cloud Upload Credentials

Cloud Storage Type: Amazon S3 | Azure Data Lake Gen 2 (no default value)

Select the form of cloud storage to use as a staging area. The selected Databricks cluster should be configured with its own access to this location. The cloud storage credentials provided here are used by FME only to upload data to the cloud storage location and are not passed through in any Databricks commands.
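To illustrate the staging flow for the Amazon S3 case (a hedged sketch, not FME internals): the writer uploads data files to the staging location with its own cloud credentials, and the cluster later reads them with the cluster's credentials. The `staging_key` helper and bucket/prefix names are hypothetical; the upload uses boto3 (third-party, assumed available where the upload actually runs):

```python
from datetime import datetime, timezone

def staging_key(prefix: str, filename: str) -> str:
    # Build a unique, timestamped object key under the staging prefix.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")
    return f"{prefix.strip('/')}/{stamp}/{filename}"

def upload_to_s3(bucket: str, prefix: str, local_path: str, filename: str) -> str:
    import boto3  # imported lazily; only needed for the actual upload step
    key = staging_key(prefix, filename)
    boto3.client("s3").upload_file(local_path, bucket, key)
    return key
```

Because these credentials never reach Databricks, the cluster must have its own, independent access to the same bucket and prefix.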

Advanced