The DATALAKE object encapsulates all the information needed to connect to an OTF data lake: the connection details and the Authorization information used to connect to the Catalog and to Object storage.
The example after the table below shows how to create a DATALAKE that connects to a Databricks Unity Catalog and to ADLS Gen2 storage. The same Authorization object can be used for both the Catalog and the Storage.
The Name-Value pairs in the USING clause are as follows:
Name | Required | Value |
---|---|---|
catalog_type | Y | Unity |
catalog_location | Y | Endpoint must be specified as /api/2.1/unity-catalog/iceberg |
unity_catalog_name | Y | Name of a catalog in the metastore. A catalog is the first layer of the object hierarchy and is used to organize data assets such as schemas (databases), tables, views, and volumes. |
storage_account_name | Y | Name of the ADLS Gen2 storage account. |
tenant_id | Y | The Azure tenant ID. |
default_cluster_id | Y | For Iceberg Write on Unity, OTF requires a running Databricks Spark cluster to execute the SYNC METADATA Spark SQL statement. If the cluster is not running, OTF starts it, and the operation times out with an error if the cluster does not reach the RUNNING state within 5 minutes. |
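
The sketch below puts the Name-Value pairs from the table into a CREATE DATALAKE statement. It is a minimal, non-authoritative example: the object names (`unity_datalake`, `unity_adls_auth`), the workspace URL, and all literal values are placeholders, the Authorization object is assumed to already exist, and the surrounding clause keywords (EXTERNAL SECURITY ... AUTHORIZATION, TABLE FORMAT) should be verified against the CREATE DATALAKE reference for your release. Note that the same Authorization object is supplied for both the Catalog and the Storage connection.

```sql
-- Minimal sketch only; names, URL, and IDs are placeholders, and the
-- Authorization object unity_adls_auth is assumed to exist already.
CREATE DATALAKE unity_datalake
EXTERNAL SECURITY CATALOG AUTHORIZATION unity_adls_auth,
EXTERNAL SECURITY STORAGE AUTHORIZATION unity_adls_auth  -- same object for both
USING
  catalog_type         ('Unity')
  -- The endpoint path must be /api/2.1/unity-catalog/iceberg:
  catalog_location     ('https://adb-1234567890123456.7.azuredatabricks.net/api/2.1/unity-catalog/iceberg')
  unity_catalog_name   ('main')
  storage_account_name ('mystorageaccount')
  tenant_id            ('00000000-0000-0000-0000-000000000000')
  default_cluster_id   ('0123-456789-abcdefgh')
TABLE FORMAT iceberg;
```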
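
As background on default_cluster_id: the SYNC METADATA step that OTF runs on the Databricks cluster corresponds to a Spark SQL statement along the lines of the sketch below. This is for illustration only; OTF issues the statement automatically after an Iceberg write, the three-part table name is a placeholder, and the exact statement OTF runs may differ.

```sql
-- Illustrative only: OTF runs the equivalent of this on the cluster
-- identified by default_cluster_id; main.sales.orders is a placeholder.
MSCK REPAIR TABLE main.sales.orders SYNC METADATA;
```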