Specifies the name and data type of a column in the foreign table for Parquet-formatted data. The data_column_definition has the format column_name column_data_type
You cannot specify data column definitions for CSV or JSON formatted external data. Vantage creates the data columns automatically from the information in the data itself or from the column specification in a schema object associated with the CSV data.
- Define the columns in the same order as they appear in the Parquet file.
- Use the Vantage column data types that correspond to the Parquet data types listed in the Parquet schema.
Parquet Logical Data Type | Corresponding Vantage Data Type | Description |
---|---|---|
UINT_8 | SMALLINT | Unsigned integer with range 0 to 255. |
UINT_16 | INTEGER | Unsigned integer with range 0 to 65535. |
UINT_32 | BIGINT | Unsigned integer with range 0 to 4,294,967,295. |
UINT_64 | DECIMAL(20,0) | Unsigned integer with range 0 to 18,446,744,073,709,551,615. |
INT_8 | BYTEINT | Signed integer with range -128 to 127. |
INT_16 | SMALLINT | Signed integer with range -32768 to 32767. |
INT_32 | INTEGER | Signed integer with range -2,147,483,648 to 2,147,483,647. |
INT_64 | BIGINT | Signed integer with range -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. |
INT_96 (in UTC format) | TIMESTAMP(6) | Used to represent timestamp. First 8 bytes has number of nanoseconds and next 4 bytes has number of days since Julian day. Deprecated in future Parquet releases. |
DECIMAL(p,s) | DECIMAL(p,s) | Decimal where precision is between 1 and 38 (inclusive) and scale is between 0 and 38 (inclusive). |
VARBYTE(p+1), where p is the precision of the DECIMAL. | Precision and scale is greater than 38. | |
FLOAT | REAL | IEEE 32-bit floating point. |
DOUBLE | REAL | IEEE 64-bit floating point. |
DATE | DATE | Number of days from the Unix epoch, 1 January 1970 |
TIME_MILLIS (in UTC format) | TIME(3) or TIME(6) | Stores time in milliseconds. |
TIME_MICROS (in UTC format) | TIME(6) | Stores time in microseconds. |
TIMESTAMP_MILLIS (in UTC format) | TIMESTAMP(3) or TIMESTAMP(6) | Stores timestamp in milliseconds. |
TIMESTAMP_MICROS (in UTC format) | TIMESTAMP(6) | Stores timestamp in microseconds. |
INTERVAL | VARCHAR CHARACTER SET UNICODE | Stores interval in months, days and milliseconds. |
BOOL | BYTEINT | Boolean (True or False). |
STRING | VARCHAR or CLOB CHARACTER SET UNICODE | Encoded as a UTF8 byte array. |
BSON | JSON STORAGE FORMAT BSON | BSON data. |
JSON | JSON CHARACTER SET UNICODE | JSON data. |
ENUM | VARCHAR or CLOB CHARACTER SET UNICODE | Store enumerated values encoded as a UTF8 string. |
ARRAY | VARCHAR or CLOB CHARACTER SET UNICODE | Stored as repeated fields. Can be an array of a single value or multiple values. |