Databricks Writer Parameters

Version 2.1.0 of this package added Database Connection support for Databricks Credentials and Cloud Upload Credentials.

To define a new connection from the Connection parameter of a Databricks format, select Add Database Connection and scroll to Databricks.

See database-specific parameters below, as well as the section Adding a Database Connection in a workspace in Using Database Connections.

The new connection can be made visible only to the current user, or can be shared among multiple users. To select an existing, previously defined connection, see the section Reusing a Database Connection in Using Database Connections.

Databricks Credentials

Server Hostname: The URL of the Databricks workspace. This will take the format https://<workspace_id>.cloud.databricks.com/ or https://adb-<id>.azuredatabricks.net/.
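The two documented hostname formats can be checked with a simple pattern match. A minimal sketch, purely illustrative (the parameter itself is just a URL string; this helper is not part of the writer):

```python
import re

# Hypothetical helper: checks that a Server Hostname matches one of the two
# documented Databricks workspace URL patterns (AWS-hosted or Azure-hosted).
_PATTERNS = (
    r"^https://[\w.-]+\.cloud\.databricks\.com/?$",    # AWS-hosted workspace
    r"^https://adb-[\w.-]+\.azuredatabricks\.net/?$",  # Azure-hosted workspace
)

def looks_like_workspace_url(url: str) -> bool:
    return any(re.match(p, url) for p in _PATTERNS)
```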

Cluster ID: The ID of the cluster to run Databricks commands with. This cluster should be configured with access to the cloud storage location specified in the writer parameters.

Authentication Method: The method used for authentication.

Personal Access Token

Connects using a personal access token from Databricks.

Personal Access Token

The personal access token to connect to the specified cluster. This parameter is enabled when you select the Personal Access Token authentication method.
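Under the hood, a personal access token is typically presented to Databricks as a standard HTTP bearer credential. A minimal sketch, assuming the usual `Authorization: Bearer` convention (the token value below is a placeholder, not a real credential):

```python
# Hypothetical illustration: a Databricks personal access token is sent as an
# HTTP "Authorization: Bearer <token>" header on requests to the workspace.
def bearer_header(personal_access_token: str) -> dict:
    return {"Authorization": f"Bearer {personal_access_token}"}

headers = bearer_header("dapi0123456789abcdef")  # placeholder token value
```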

OAuth (Web Connection)

Uses an OAuth web connection that accesses the Databricks service. This option enables the Databricks OAuth Connection parameter, from which you can pick a saved connection, or add a Databricks OAuth Web Connection.

Adding a Databricks OAuth Connection

The OAuth web connection used to connect as the service principal that accesses the specified cluster. These parameters are enabled when you select the OAuth (Web Connection) authentication method and click Add Web Connection:

  • Token Endpoint URL – The URL used to request an OAuth connection token.
  • Client ID – The client ID for the OAuth secret for the service principal.
  • Client Secret – The value of the OAuth secret for the service principal.
  • Scope – The scope of the requested OAuth access token. For Microsoft Azure-hosted Databricks clusters, the scope value must be set to 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default. See Get a Microsoft Entra ID access token with the Microsoft identity platform REST API from the Azure Databricks documentation website for more information on the required scope value.
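Taken together, these parameters map onto a standard OAuth 2.0 client-credentials token request. A minimal sketch of the form body such a request would carry (client ID and secret are placeholders; FME assembles and sends the real request for you):

```python
from urllib.parse import urlencode

# Hypothetical illustration of the client-credentials grant that the Token
# Endpoint URL, Client ID, Client Secret, and Scope parameters correspond to.
def client_credentials_body(client_id: str, client_secret: str, scope: str) -> str:
    return urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })

body = client_credentials_body(
    "my-sp-client-id",                                # placeholder
    "my-sp-secret",                                   # placeholder
    "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default",  # Azure scope from above
)
```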

Adding a Databricks OAuth U2M Connection

The OAuth web connection used to connect to the specified cluster using the Authorization Code flow. These parameters are enabled when you select the OAuth (Web Connection) authentication method and click Add Web Connection:

  • Client ID – The client ID for the OAuth request. This value can typically be set to databricks-cli.
  • Client Secret – The value of the OAuth secret for the service principal. This is optional.
  • Redirect URI – The redirect URI for the Authorization Code request. This value is typically set to http://localhost in FME Form and https://<your-fme-flow-url>/fmeoauth in FME Flow. This should match the configuration in Databricks.
  • Workspace URL – The URL of the Databricks workspace. This is typically the same as the Server Hostname in the Connection Parameters.
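In an Authorization Code flow, these values are combined into the URL the browser is sent to for login. A minimal sketch of how such a URL is typically assembled (the endpoint path and parameter names follow the generic OAuth 2.0 pattern and are illustrative, not the writer's internal implementation):

```python
from urllib.parse import urlencode

# Hypothetical sketch: build the browser authorization URL from the
# Workspace URL, Client ID, and Redirect URI parameters described above.
def authorization_url(workspace_url: str, client_id: str, redirect_uri: str) -> str:
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
    })
    return f"{workspace_url.rstrip('/')}/oidc/v1/authorize?{query}"

url = authorization_url(
    "https://adb-1234567890123456.7.azuredatabricks.net",  # placeholder workspace
    "databricks-cli",
    "http://localhost",
)
```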

OAuth U2M

Connects using user-to-machine (U2M) OAuth through the Databricks JDBC driver. The first time you connect, a new browser tab opens so that you can log in with your credentials.

Token Cache Pass Phrase

The pass phrase used to enable persistent token caching.

Setting a value here enables the driver to cache the access token for use in subsequent connections, preventing repeated browser pop-ups. This parameter is enabled when you select the OAuth U2M authentication method.
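The cache format itself is internal to the Databricks JDBC driver, but the role of the pass phrase can be illustrated: it is stretched into an encryption key, so the cached token is unusable without the same pass phrase on the next connection. A minimal sketch using PBKDF2 (the algorithm, salt, and iteration count are illustrative assumptions, not the driver's actual scheme):

```python
import hashlib

# Hypothetical sketch: derive a symmetric key from the Token Cache Pass Phrase.
# The driver's real cache format is internal; this only shows why the same
# pass phrase must be supplied to reuse a cached token.
def derive_cache_key(pass_phrase: str, salt: bytes = b"token-cache") -> bytes:
    return hashlib.pbkdf2_hmac("sha256", pass_phrase.encode(), salt, 100_000)

key = derive_cache_key("my pass phrase")       # placeholder pass phrase
same = derive_cache_key("my pass phrase")      # same phrase -> same key
other = derive_cache_key("different phrase")   # different phrase -> different key
```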

More Information

  • OAuth for AWS-hosted Databricks clusters
      ◦ M2M OAuth connections – See Authorize service principal access to Databricks with OAuth on the Databricks documentation website.
      ◦ U2M OAuth connections – See Authorize user access to Databricks with OAuth on the Databricks documentation website.
  • OAuth for Microsoft Azure-hosted Databricks clusters
      ◦ M2M OAuth connections – See OAuth 2.0 client credentials grant on the Microsoft documentation website.
      ◦ U2M OAuth connections – See Authorize user access to Azure Databricks with OAuth on the Microsoft documentation website.

Catalog: The catalog to write to. Each writer can only write to a single catalog. Click the [...] button to see a list of accessible catalogs in the workspace.

Cloud Upload Credentials

Storage Type: Amazon S3 | Microsoft Azure Data Lake Gen 2 | Databricks Unity Catalog Volume

Select the form of cloud storage to use as a staging area. The selected Databricks cluster should be configured with its own access to this location. The cloud storage credentials provided here are used only by FME to upload data to the cloud storage location and are not passed through in any Databricks commands.
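The staging pattern described above is two-sided: FME uploads files with the credentials given here, and the cluster then loads the staged files using its own storage access. A minimal sketch of the cluster-side step, assuming a COPY INTO-style load (catalog, table, and path are placeholders; the writer generates the actual commands):

```python
# Hypothetical sketch of the cluster-side half of the staging pattern: after
# FME uploads to the staging location, the cluster reads the files with its
# own storage access, e.g. via a COPY INTO-style SQL command.
def copy_into_sql(catalog: str, schema: str, table: str, staging_path: str) -> str:
    return (
        f"COPY INTO {catalog}.{schema}.{table} "
        f"FROM '{staging_path}' FILEFORMAT = PARQUET"
    )

sql = copy_into_sql("main", "default", "roads", "s3://my-bucket/fme-staging/")  # placeholders
```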

Advanced