Creating a Flow | VantageCloud Lake Console - Creating a Flow Using the Console (AWS) - Teradata Vantage

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Language: English (United States)
Last Update: 2024-04-03
At any time during the following procedure, you can select Save to save your progress. The status of a saved flow is PENDING CREATE. To continue setting up the flow, select Continue from its menu. To delete the flow, select Delete from its menu. If you log off the current session, you lose every flow whose status is PENDING CREATE. You do not lose flows that have already been created.
  1. Select Create flows or +.
  2. Enter the New flow information:
    Item Description
    Load Options When to load the source data:
    • Once

      You can rerun Once flows to load additional data in the specified sources.

    • Schedule

      Daily at a specified time or weekly on a specified day at a specified time.

    • Advanced Schedule

      At the times you specify with a cron expression. See AWS Cron Expressions.

    • Continuous Rate

      Every x hours, minutes, or days.

    AWS Role ARN Amazon Resource Name (ARN) that uniquely identifies the AWS IAM role trust policy. See Amazon Resource Names.

    The required permissions are included in the IAM Policy Template.

    AWS External ID External ID that designates who can assume the AWS role. See AWS Granting Access using External ID.
  3. [Optional] Select advanced options if you do not want Flow to determine which values to use.
  4. Select Add source or +.
  5. Enter the Source options information:
    Item Description
    Foreign table name Name of the foreign table that Flow creates for the source file. Must not be the name of an existing table.
    S3 bucket path URI URI of source file.

    URI scheme must be 's3://'.

    Flow needs read permission on the URI path.

    Flow needs the following permissions to access your source bucket:
    • s3:ListBucket
    • s3:GetObject
    • s3:GetBucketLocation

    The preceding permissions are included in the IAM Policy Template.

    S3 Manifest bucket path URI URI of manifest bucket and optional key prefix.

    A manifest bucket is a path to the location where Flow writes the manifest file that the foreign table uses to select files to read.

    You must specify this location because Flow uses the same authorization object to read the manifest file and the selected files.

    Flow needs the following permissions to access your manifest bucket:
    • s3:ListBucket
    • s3:PutObject
    • s3:GetObject
    • s3:DeleteObject
    • s3:GetBucketLocation

    The preceding permissions are included in the IAM Policy Template.

    Format Accept the preselected value, CSV, or choose Parquet from the menu.
    Headers Option is only for CSV. Select if source file has header row.
    Quoted Option is only for CSV. Select if source file has quoted fields.

    Selecting Quoted has no adverse effect if the source file has no quoted fields.

    If source file has quoted fields and you do not select Quoted, quotes appear in loaded field values.

    Delimiter Delimiter that separates source fields. Default: comma (,).
    Compression For CSV: Choose None or Gzip from menu.

    For Parquet: Choose None or Snappy from menu.

  6. Select Add Target or +.
  7. Under Target load, enter the name of the target table and select Table type.
    Flow creates the target table if it does not exist.
  8. [Optional] Return to step 4 to create another source-target pair.
  9. Select Create.
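
The S3 permissions listed for the source and manifest buckets in step 5 can be sketched as an IAM permissions policy. The following Python snippet is only an illustration: the bucket names and the helper functions bucket_statements and build_policy are hypothetical, and the split into bucket-level and object-level statements follows general AWS IAM conventions, not the contents of the IAM Policy Template. Prefer the IAM Policy Template itself when setting up the role.

```python
import json

# Actions copied from the permission lists in step 5 of the procedure.
SOURCE_ACTIONS = ["s3:ListBucket", "s3:GetObject", "s3:GetBucketLocation"]
MANIFEST_ACTIONS = [
    "s3:ListBucket",
    "s3:PutObject",
    "s3:GetObject",
    "s3:DeleteObject",
    "s3:GetBucketLocation",
]

# In IAM, these two actions apply to the bucket ARN; the others apply to object ARNs.
BUCKET_LEVEL = {"s3:ListBucket", "s3:GetBucketLocation"}


def bucket_statements(bucket, actions):
    """Return two Allow statements for one bucket (hypothetical helper)."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    bucket_actions = [a for a in actions if a in BUCKET_LEVEL]
    object_actions = [a for a in actions if a not in BUCKET_LEVEL]
    return [
        {"Effect": "Allow", "Action": bucket_actions, "Resource": bucket_arn},
        {"Effect": "Allow", "Action": object_actions, "Resource": f"{bucket_arn}/*"},
    ]


def build_policy(source_bucket, manifest_bucket):
    """Assemble a policy document covering both buckets (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": bucket_statements(source_bucket, SOURCE_ACTIONS)
        + bucket_statements(manifest_bucket, MANIFEST_ACTIONS),
    }


if __name__ == "__main__":
    # Placeholder bucket names; substitute your own.
    policy = build_policy("example-source-bucket", "example-manifest-bucket")
    print(json.dumps(policy, indent=2))
```

Running the script prints a JSON document that you can compare against the IAM Policy Template before attaching anything to the role.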
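
Several of the Source options in step 5 have constraints that can be checked before you open the console: the URI scheme must be s3://, Headers and Quoted apply only to CSV, and the Compression choices depend on the format. The following sketch encodes only those documented rules. The function check_source_options and its parameter names are hypothetical and are not part of any Flow API; the console performs its own validation.

```python
from urllib.parse import urlparse

# Compression choices per format, as listed in the Source options table.
ALLOWED_COMPRESSION = {"CSV": {"None", "Gzip"}, "Parquet": {"None", "Snappy"}}


def check_source_options(uri, fmt="CSV", compression="None", headers=False, quoted=False):
    """Return a list of problems found in the planned source options (hypothetical helper)."""
    problems = []

    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        problems.append("URI scheme must be s3://")
    elif not parsed.netloc:
        problems.append("URI has no bucket name")

    if fmt not in ALLOWED_COMPRESSION:
        problems.append(f"unknown format: {fmt}")
    else:
        if compression not in ALLOWED_COMPRESSION[fmt]:
            problems.append(f"{compression} is not offered for {fmt}")
        if fmt != "CSV" and (headers or quoted):
            problems.append("Headers and Quoted apply only to CSV")

    return problems


if __name__ == "__main__":
    # Placeholder URI; substitute your own source file.
    print(check_source_options("s3://example-source-bucket/data/file.csv", headers=True))
```

An empty list means the planned values are consistent with the table in step 5. It does not confirm that the bucket exists or that the role has the required permissions.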