The following example shows how to create an Iceberg DATALAKE that connects to an Apache Hive catalog with data stored in AWS S3 object storage.
Define the authorization for catalog access:
CREATE AUTHORIZATION hive_catalog_auth
AS INVOKER TRUSTED
USER 'xxx'
PASSWORD 'yyy';
Define the authorization for storage access:
CREATE AUTHORIZATION s3_storage_auth
AS INVOKER TRUSTED
USER 'abc'
PASSWORD 'def';
Create an Iceberg DATALAKE object that references the two AUTHORIZATION objects:
CREATE DATALAKE datalake_iceberg_hive
EXTERNAL SECURITY INVOKER TRUSTED CATALOG hive_catalog_auth,
EXTERNAL SECURITY INVOKER TRUSTED STORAGE s3_storage_auth
USING
catalog_type ('hive')
catalog_location ('thrift://<hostname>:<port>')
storage_location ('s3://<folder>/')
storage_region('us-west-2')
s3_max_task ('1000')
s3_max_threads ('1000')
s3_max_connections ('5000')
vectorized_read_scans_batch_size ('1')
TABLE FORMAT iceberg;
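Once the DATALAKE exists, tables in the connected Hive catalog can typically be queried with three-part naming (datalake.database.table). The database and table names below are placeholders for illustration, not objects defined in the example above:

```sql
-- Usage sketch: query an Iceberg table through the DATALAKE using
-- datalake.database.table naming. 'sales_db' and 'orders' are
-- hypothetical placeholder names.
SELECT *
FROM datalake_iceberg_hive.sales_db.orders;
```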
The following example shows how to create an Iceberg DATALAKE that connects to an Apache Hive catalog with data stored in Azure ADLS Gen2 storage.
Define the authorization for catalog and storage access:
CREATE AUTHORIZATION hive_catalog_auth
AS INVOKER TRUSTED
USER '<azure_principal_clientid>'   -- Azure AD service principal client ID
PASSWORD '<client_secret_key>';     -- Azure AD service principal client secret key
Create the DATALAKE object:
CREATE DATALAKE database_iceberg_hive
EXTERNAL SECURITY INVOKER TRUSTED CATALOG hive_catalog_auth,
EXTERNAL SECURITY INVOKER TRUSTED STORAGE hive_catalog_auth
USING
catalog_type ('hive')
catalog_location ('thrift://<hostname>:<port>')
storage_location ('abfss://<folder>/')
container_name ('<container-name>')
storage_region ('East US 2')
storage_account_name ('<account-name>')
tenant_id('<tenant-id>')
TABLE FORMAT iceberg;
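When the DATALAKE is no longer needed, the objects can be removed. This is a sketch that assumes DROP DATALAKE and DROP AUTHORIZATION statements mirroring the CREATE statements above; drop the DATALAKE before the AUTHORIZATION it references:

```sql
-- Cleanup sketch (assumed DROP syntax mirroring the CREATE statements above).
-- The DATALAKE references the AUTHORIZATION, so it is dropped first.
DROP DATALAKE database_iceberg_hive;
DROP AUTHORIZATION hive_catalog_auth;
```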