The DATASET data type includes a schema and data, which can both have a variable length. You can use the INLINE LENGTH option to specify an inline storage size. When the data is smaller than or equal to the inline storage size, it is stored inside the base row. Otherwise, the data is stored as a LOB (large object).
If the data is stored inline, it is treated as a non-LOB type. Because there is no LOB overhead, performance may be better, especially when the data type is used with UDFs.
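The placement rule described above can be sketched in a few lines of Python. This is only an illustration of the comparison, not database code; the function name `storage_location` and its parameters are hypothetical.

```python
def storage_location(data_length: int, inline_length: int) -> str:
    """Return where a DATASET value of data_length bytes would be placed.

    Hypothetical helper: data that fits within the inline length stays in
    the base row, anything larger is stored as a LOB.
    """
    if data_length < 0 or inline_length < 0:
        raise ValueError("lengths must be non-negative")
    # "Smaller than or equal to" the inline storage size means inline.
    return "inline" if data_length <= inline_length else "LOB"
```

For example, with an inline length of 10,000 bytes, a 10,000-byte value would remain inline, while a 10,001-byte value would be stored as a LOB.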
Each specification of the DATASET data type includes the following information:
- Maximum length
- Inline length
- Storage format
Specify the STORAGE FORMAT option in the data type specification syntax. Avro is the currently supported storage format. DATASET information, including both the schema and data, is stored subject to the following length limits:
| Storage Location | Maximum Length | Minimum Length | Default Length |
|------------------|----------------|----------------|----------------|
| LOB              | 2 GB           | 100 bytes      | 2 GB           |
| Inline           | 64 KB          | 100 bytes      | 10 KB          |
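The limits in the table can be expressed as a small validation helper. The sketch below is illustrative only: the names `LIMITS` and `resolve_length` are hypothetical, and the exact byte values assume binary units (1 KB = 1024 bytes), which the table does not state; the database's actual byte limits may differ.

```python
KB = 1024          # assumption: binary units; the source only says "KB"
GB = 1024 ** 3     # assumption: binary units; the source only says "GB"

# Limits transcribed from the table above (byte values are assumed).
LIMITS = {
    "LOB": {"max": 2 * GB, "min": 100, "default": 2 * GB},
    "Inline": {"max": 64 * KB, "min": 100, "default": 10 * KB},
}

def resolve_length(location, requested=None):
    """Return the effective length in bytes for a storage location.

    Falls back to the default when no length is requested, and rejects
    requests outside the minimum/maximum range.
    """
    limits = LIMITS[location]
    if requested is None:
        return limits["default"]
    if not limits["min"] <= requested <= limits["max"]:
        raise ValueError(
            f"{location} length {requested} is outside "
            f"{limits['min']}..{limits['max']} bytes"
        )
    return requested
```

Under these assumptions, omitting the inline length would yield the 10 KB default, and a request below 100 bytes would be rejected.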
The schema can be specified in any supported JSON format, but it is stored as UNICODE text and encoded in UTF-8. The schema is null-terminated. The maximum size of an Avro schema that is created at the instance level, not the table level, is 16 MB. The maximum size of a binary-encoded Avro value is also 16 MB.
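As a rough illustration of the stored form of a schema, the following Python sketch serializes a JSON schema to UTF-8, appends a null terminator, and checks the result against the 16 MB limit. The function name `encode_schema` is hypothetical, the 16 MB value assumes binary units, and counting the terminator toward the limit is an assumption that the text above does not confirm.

```python
import json

MAX_SCHEMA_BYTES = 16 * 1024 * 1024  # assumption: 16 MB in binary units

def encode_schema(schema: dict) -> bytes:
    """Serialize an Avro schema to null-terminated UTF-8 bytes."""
    # ensure_ascii=False keeps non-ASCII characters as real UTF-8
    # instead of \uXXXX escape sequences.
    text = json.dumps(schema, separators=(",", ":"), ensure_ascii=False)
    encoded = text.encode("utf-8") + b"\x00"
    # Assumption: the terminator is counted toward the size limit.
    if len(encoded) > MAX_SCHEMA_BYTES:
        raise ValueError("schema exceeds the assumed 16 MB limit")
    return encoded
```

Because the input is parsed JSON, differences in whitespace or formatting in the original schema text do not affect the encoded result in this sketch.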