Storage Format Conversions - Advanced SQL Engine - Teradata Database

JSON Data Type

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Published
July 2021
Language
English (United States)
Last Update
2021-07-27
dita:mapPath
uwa1591040057999.ditamap
dita:ditavalPath
uwa1591040057999.ditaval
dita:id
B035-1150
lifecycle
previous
Product Category
Teradata Vantageā„¢

A table column that is defined to store JSON data in a particular storage format (text, BSON, or UBJSON) can accept JSON data in any of the other storage formats. Vantage automatically converts the data to the storage format of the column before storing it.

The data is converted based on the following rules.
Conversion of Storage Format Description
From BSON to Text

The BSON data is converted to the character set of the column. In BSON, strings are stored in UTF8, so if the data contains a character that is not in the repertoire of the target character set, a translation error is reported.

BSON contains extensions to the base JSON data types (such as binary data and dates). These extended data types are treated as strings in the resulting JSON text document.

From UBJSON to Text The UBJSON data is converted to the character set of the column. In UBJSON, strings are stored in UTF8, so if the data contains a character that is not in the repertoire of the target character set, a translation error is reported.
From Text to BSON JSON text data is converted to the BSON format. All strings are encoded in UTF8, even if the source JSON text was in the LATIN character set. No extended data types are used in the resulting BSON document.
From UBJSON to BSON UBJSON data is converted to the BSON format. No extended data types are used in the resulting BSON document.
From Text to UBJSON JSON text data is converted to the UBJSON format. All strings are encoded in UTF8, even if the source JSON text was in the LATIN character set.
From BSON to UBJSON BSON data is converted to the UBJSON format. BSON contains extensions to the base JSON data types (such as binary data and dates). These extended data types are treated as strings in the resulting UBJSON document.

Conversions Involving BSON

When converting from BSON, if the BSON document contains some data in one of the extended data type formats and this data is not supported in the target format, the data will be enclosed in double quotation marks.

When converting to BSON, the source formats will not contain any data in the extended data type formats. Vantage does not interpret data enclosed in double quotation marks to determine if it qualifies as one of the extended data types. Therefore, in all cases of converting to BSON, the result is a BSON document that has no data in any extended data type.

The root of a JSON document when stored as BSON is always assumed to be an object. Furthermore, arrays are stored as key/value pairs where the key is the array index of the element.

For example, consider the following JSON document:

[ "a" , "b" ]

In BSON, that document is stored similar to the following:

{ "0" : "a" , "1" : "b" }

When this BSON document is converted to text, you will get that object-like representation. This is due to the BSON assumption that the root of the document is an object. This is a limitation of the BSON specification. Nested arrays do not suffer from this limitation because they are represented identically, but preceded by an indicator byte to signify that the data represents an array. Therefore, when converting back to JSON text, the keys are not printed out.

Consider the following document:

{ "array" : [ "a" , "b" ] }

The nested array is treated properly as an array, so that if you convert this text to BSON and then back to text, you will get the exact same document as follows:

{ "array" : [ "a" , "b" ] }

Note that when searching for a value, the JSONPath query does not change whether the data is stored as an array or as an object. For example, consider the following documents stored as an array and as an object:

[ "a" , "b" ]
{ "0" : "a" , "1" : "b" }

When searching these documents for the value "a", the following query succeeds on both documents:

SELECT jsonCol[0] FROM jsonTable;

The reason is that JSONPath normalizes the request query. One of the normalizations involves removing the square brackets, so when evaluating the query, it is not known if the user wanted to search for a key called "0" or an element at index 0 of an array. JSONPath will do the following:

  1. Search for the key.
  2. If the search fails, and the value being searched for is a positive integer and we are in an array, then look for an element at the specified index.

Therefore, both cases will return "a" when searched.

In summary, the limitation only applies when reading the BSON data and converting it back to text. The resulting text document may look a little different, but it can be searched in the same way.