The following example shows how to create an Iceberg DATALAKE that connects to an Apache Hive catalog with data in AWS S3 object storage.
Define the authorization for catalog access:
CREATE AUTHORIZATION hive_catalog_auth AS INVOKER TRUSTED USER 'xxx' PASSWORD 'yyy';
Define the authorization for storage access:
CREATE AUTHORIZATION s3_storage_auth AS INVOKER TRUSTED USER 'abc' PASSWORD 'def';
Create an Iceberg DATALAKE object referencing the two AUTHORIZATION objects:
CREATE DATALAKE datalake_iceberg_hive
EXTERNAL SECURITY INVOKER TRUSTED CATALOG hive_catalog_auth,
EXTERNAL SECURITY INVOKER TRUSTED STORAGE s3_storage_auth
USING
    catalog_type ('hive')
    catalog_location ('thrift://<hostname>:<port>')
    storage_location ('s3://<folder>/')
    storage_region ('us-west-2')
    s3_max_task ('1000')
    s3_max_threads ('1000')
    s3_max_connections ('5000')
    vectorized_read_scans_batch_size ('1')
TABLE FORMAT iceberg;
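Once the DATALAKE object exists, Iceberg tables in the lake are typically addressed with a three-part name of the form datalake.database.table. The sketch below assumes a hypothetical database mydb and table sales, which are not part of the example above:

```sql
-- Hypothetical read through the DATALAKE object; mydb and sales
-- are placeholder database and table names.
SELECT *
FROM datalake_iceberg_hive.mydb.sales;
```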
The following example shows how to create an Iceberg DATALAKE that connects to an Apache Hive catalog with data in ADLS Gen2 storage.
Define the authorization for catalog and storage access:
CREATE AUTHORIZATION hive_catalog_auth AS INVOKER TRUSTED
USER '<azure_principal_clientid>'   -- Azure AD service principal client ID
PASSWORD '<client_secret_key>';     -- Azure AD service principal client secret key
Create the DATALAKE object, reusing the same AUTHORIZATION for both catalog and storage access:
CREATE DATALAKE database_iceberg_hive
EXTERNAL SECURITY INVOKER TRUSTED CATALOG hive_catalog_auth,
EXTERNAL SECURITY INVOKER TRUSTED STORAGE hive_catalog_auth
USING
    catalog_type ('hive')
    catalog_location ('thrift://<hostname>:<port>')
    storage_location ('abfss://<folder>/')
    container_name ('<container-name>')
    storage_region ('East US 2')
    storage_account_name ('<account-name>')
    tenant_id ('<tenant-id>')
TABLE FORMAT iceberg;
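When a lake is no longer needed, the object can presumably be removed with a matching DROP statement; the exact syntax should be verified for your release:

```sql
-- Assumed cleanup statement mirroring CREATE DATALAKE.
DROP DATALAKE database_iceberg_hive;
```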