CSV Schema | DATASET Data Type | VantageCloud Lake - CSV Schema

Teradata® VantageCloud Lake

Deployment: VantageCloud

Edition: Lake

Product: Teradata Vantage

Published: January 2023

Language: English (United States)

Last Update: 2024-04-03

The DATASET type supports the CSV storage format.

To specify certain optional attributes of the CSV data, provide a simple schema. The schema is a JSON document composed of specific key-value pairs.

Example: Using a CSV Schema

The following template shows the optional attributes as key-value pairs. The values are placeholders.

{
      "field_names" : JSON_array_with_column_names,
      "field_delimiter" : field_delimiter_character,
      "record_delimiter" : record_delimiter_character
}

The following schema supplies concrete values:

{
      "field_names" : ["name","dob","phone","address"],
      "field_delimiter" : ",",
      "record_delimiter" : "\r\n"
}

The key names field_names, field_delimiter, and record_delimiter are case-sensitive and must be specified exactly as shown to be interpreted correctly.
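Because the key names are case-sensitive, it can help to check a schema on the client side before using it. The following Python sketch is one way to do that. The helper name unknown_schema_keys is hypothetical and is not part of any Teradata API, and how the database itself treats an unrecognized key is not covered here.

```python
import json

# The three key names exactly as the schema format spells them.
ALLOWED_KEYS = {"field_names", "field_delimiter", "record_delimiter"}

def unknown_schema_keys(schema_text):
    """Return the keys that do not exactly match an allowed key name.

    Hypothetical client-side helper; the comparison is case-sensitive.
    """
    schema = json.loads(schema_text)
    return sorted(set(schema) - ALLOWED_KEYS)

# Build the schema with json.dumps so that "\r\n" is escaped correctly.
schema_text = json.dumps({
    "field_names": ["name", "dob", "phone", "address"],
    "field_delimiter": ",",
    "record_delimiter": "\r\n",
})

print(unknown_schema_keys(schema_text))              # → []
print(unknown_schema_keys('{"Field_Names": null}'))  # → ['Field_Names']
```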

The schema offers three ways to describe the header of the CSV data:

- Omit the field_names key/value pair. This tells the database that the CSV data contains a header record.
- Include the field_names key/value pair with a JSON null as its value. This tells the database that the CSV data has no header and that you are not providing one. The database auto-generates names for the fields of the CSV file in the following format:
  ```
  csv_fld1, csv_fld2, csv_fld3, ... , csv_fldN
  ```
- Include the field_names key/value pair with a JSON array of strings as its value. This tells the database that the CSV data has no header and that the schema specifies the names of the fields.
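The three header cases can be sketched in Python as follows. This is only an illustration of the rules described above, not the database's implementation. The function name describe_header and the field_count parameter are assumptions made for this example.

```python
import json

def describe_header(schema_text, field_count):
    """Classify a schema by how it treats the CSV header.

    Hypothetical helper. field_count is the number of fields per record
    and is only needed to generate the csv_fldN names.
    """
    schema = json.loads(schema_text)
    if "field_names" not in schema:
        # Key omitted: the first record of the data is the header.
        return {"header_in_data": True, "names": None}
    if schema["field_names"] is None:
        # JSON null: no header anywhere, so names are auto-generated.
        names = [f"csv_fld{i}" for i in range(1, field_count + 1)]
        return {"header_in_data": False, "names": names}
    # JSON array of strings: the schema supplies the names.
    return {"header_in_data": False, "names": list(schema["field_names"])}

print(describe_header('{"field_delimiter": ","}', 3))
print(describe_header('{"field_names": null}', 3))
print(describe_header('{"field_names": ["name", "dob"]}', 2))
```

Note that the check uses key presence rather than schema.get("field_names"), because an omitted key and a JSON null value mean different things in this format.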

Given the schema, the database expects the data to be a string of characters in which the field_delimiter separates fields and the record_delimiter separates records. The following example shows CSV data that uses '&' as the field delimiter and '#' as the record delimiter, and that contains a header row:

Schema

{
      "field_delimiter" : "&",
      "record_delimiter" : "#"
}

Data

Item ID&Item Name&Item Color&Item Style&Quantity Purchased&Item Price&Total Price#55&bicycle&red&boys&1&100.00&100.00#88&toy boat&pink&&1&15.10&15.10#105&soap&&&1&0.99&0.99
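To see how this schema divides the data, the following Python sketch splits the sample on the two delimiters and pairs each record with the header row. It is a naive illustration: it assumes both delimiter keys are present, that field_names is omitted (so the first record is the header), and that the data contains no quoted or escaped delimiters. The function name parse_with_schema is hypothetical.

```python
import json

def parse_with_schema(data, schema_text):
    """Split CSV text into rows keyed by the header record.

    Hypothetical helper; assumes field_names is omitted from the schema
    and that delimiters never appear inside field values.
    """
    schema = json.loads(schema_text)
    records = [
        record.split(schema["field_delimiter"])
        for record in data.split(schema["record_delimiter"])
        if record  # ignore an empty trailing record, if any
    ]
    header, rows = records[0], records[1:]
    return [dict(zip(header, row)) for row in rows]

schema_text = '{"field_delimiter": "&", "record_delimiter": "#"}'
data = (
    "Item ID&Item Name&Item Color&Item Style&Quantity Purchased&Item Price&Total Price"
    "#55&bicycle&red&boys&1&100.00&100.00"
    "#88&toy boat&pink&&1&15.10&15.10"
    "#105&soap&&&1&0.99&0.99"
)

rows = parse_with_schema(data, schema_text)
print(len(rows))               # → 3
print(rows[1]["Item Name"])    # → toy boat
print(repr(rows[2]["Item Color"]))  # → ''
```

Adjacent delimiters, as in `pink&&1`, produce an empty string for the field between them, which is how the sketch represents the missing Item Style and Item Color values.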