Usage Notes - Advanced SQL Engine - Teradata Database

JSON Data Type

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Published
July 2021
Language
English (United States)
Last Update
2021-07-27
dita:mapPath
uwa1591040057999.ditamap
dita:ditavalPath
uwa1591040057999.ditaval
dita:id
B035-1150
lifecycle
previous
Product Category
Teradata Vantageā„¢
When the character set of the data to be shredded is:
  • LATIN, use the JSON_SHRED_BATCH procedure
  • UNICODE, use the JSON_SHRED_BATCH_U procedure

Other than the difference regarding the character set of the data, the functionality of the two procedures is identical.

When a JSON value is shredded to populate a CHAR, VARCHAR, or VARBYTE column, if the size of the value is larger than the size of the target column, the value is truncated to fit the column.

The JSON_SHRED_BATCH query provides flexibility between the source JSON instance and the table(s) the source data is loaded into. This flexibility allows for efficient and non-efficient queries, depending on the query itself and how the mapping (shred statement) is performed.

The following guidelines assist in achieving the optimal performance with these procedures.
  • For each shred statement, a JSON_TABLE function call is made, to shred the JSON object into a temporary table based on the row expression and column expressions. The resulting temporary table may be used to assign values to any column of any table for which the user has the proper privileges. The best performing queries optimize the mapping such that each shred statement updates the maximum possible number of tables. Only if complications of the mapping (such as hierarchical relationships) make it impossible to map a shredding to an actual column should another shred statement be included in the query.
  • The performance is largely dependent upon the usage of the procedure. If the mapping minimizes the number of separate queries needed, it will perform best. It is not always the case that everything can fit into one shred statement; for this reason multiple statements are allowed.
  • This procedure allows INSERT, UPDATE, MERGE and DELETE operations, which can be specified in the operation portion of the metadata portion of the statement. The keys in the metadata statement are used to perform the join between the temporary table created by the row/column expressions and the target table. This should be used carefully as it can drastically affect performance. In a MERGE operation, the target table must have a primary index, and the primary index has to be a member of the keys in the metadata.
  • In order to avoid a full table join, we require an ID column to be specified in the input query parameter, so that a join condition can be built off that column.

Columns of a target table may be assigned values in the temporary table created by the row and column expressions, constants, or the results of SQL expressions. The use of an SQL expression requires the user to submit a proper SQL statement (in terms of syntax and actual results of the query). This is a powerful and flexible way to manipulate the data in a target table, but can cause a problem if queries are not structured properly. Any errors reported by the DBS based on an SQL expression will be reported to the user and cause the query to fail. Columns of the temporary table created by the row and column expressions and the extra columns created by the input query may be used in the SQL expression.

In the process of shredding, a volatile table is created for each shred statement. A table can have a maximum of 2048 columns, so all the columns together from all the table mappings should not exceed 2044 columns (there are four internal columns). You can have 1 to N target tables, which can each have 1 to N columns, but the total number of all columns must not exceed 2044.

All keywords in the shred statement must be specified in lowercase.

The names assigned to the temporary columns (temp_column_name) and the names of extra columns created by the input query must be unique. They can be referenced in the table expression clause, so there cannot be any ambiguity. Note, names are not case sensitive. If a non-unique name is detected, an error is reported. For example, col1 and COL1 will fail because they are used in internal queries and are not unique.
All the names given in the keys clause must be present in the column assignment clause.

You must specify the data type of the temporary column in the output table in the column expression. It is necessary to provide this information so that the data may be correctly interpreted and used with the target table(s).

JSONID and ROWINDEX (uppercase or lowercase) are keywords. They are used to track the input JSON document ID value (the first column of the input query) and the index number for an input row, respectively. JSONID and ROWINDEX are not allowed in colexpr and queryexpr because they are fixed temporary column names. A syntax error is reported if they are used in those clauses. However, they may be referenced in the table expression clause as a source value for the shredding operation.