JSON_PUBLISH Table Operator Examples | Teradata Vantage - Examples: JSON_PUBLISH - Advanced SQL Engine

JSON_PUBLISH Table Operator Examples | Teradata Vantage - Examples: JSON_PUBLISH - Advanced SQL Engine - Teradata Database

JSON Data Type

Product

Advanced SQL Engine

Teradata Database

Release Number

17.05

17.00

Published

September 2020

Language

English (United States)

Last Update

2021-01-23

dita:mapPath

gzn1554761068186.ditamap

dita:ditavalPath

lze1555437562152.ditaval

dita:id

B035-1150

lifecycle

Product Category

Teradata Vantage™

Setting Up the JSON_PUBLISH Examples

This simple table is used to illustrate the functionality of JSON_PUBLISH in the following examples.

CREATE TABLE employeeTable(
   empID INTEGER,
   empName VARCHAR(100),
   empDept VARCHAR(100));

INSERT INTO employeeTable(1,'George Smith','Accounting'); 
INSERT INTO employeeTable(2,'Pauline Kramer','HR');
INSERT INTO employeeTable(3,'Steven Mazzo','Engineering');

Examples Using JSON_PUBLISH

Publish the entire table, aggregating the results.

SELECT * FROM JSON_PUBLISH
(
   ON (SELECT * FROM employeeTable)	 
) AS jsonData;

Result:

data
----------------------------------------------------
[{
     "empID": 3,
     "empName": "Steven Mazzo",
     "empDept": "Engineering"
},
{
     "empID": 1,
     "empName": "George Smith",
     "empDept": "Accounting"
},
{
     "empID": 2,
     "empName": "Pauline Kramer",
     "empDept": "HR"
}]

Publish the entire table, without aggregating results.

SELECT * FROM JSON_PUBLISH
(
   ON (SELECT * FROM employeeTable)	 
   USING DO_AGGREGATE('N')
) AS jsonData ORDER BY data..empID;

Result:

data
----------------------------------------------------
[{
     "empID": 1,
     "empName": "George Smith",
     "empDept": "Accounting"
}]
----------------------------------------------------
[{
     "empID": 2,
     "empName": "Pauline Kramer",
     "empDept": "HR"
}]
----------------------------------------------------
[{
     "empID": 3,
     "empName": "Steven Mazzo",
     "empDept": "Engineering"
}]
----------------------------------------------------

Publish the entire table, renaming columns, without aggregating.

SELECT * FROM JSON_PUBLISH
(
   ON (SELECT empID as ID, empName as "Name", empDept as Dept 
       FROM employeeTable)
   USING DO_AGGREGATE('N')
) AS jsonData ORDER BY data.."ID";

Results:

data
----------------------------------------------------
[{
   "ID": 1,
   "Name": "George Smith",
   "Dept": "Accounting"
}] 
----------------------------------------------------
[{
   "ID": 2,
   "Name": "Pauline Kramer",
   "Dept": "HR"
}] 
----------------------------------------------------
[{
   "ID": 3,
   "Name": "Steven Mazzo",
   "Dept": "Engineering"
}] 
----------------------------------------------------

Publish the entire table, without aggregating, without building JSON array, using the UNICODE character set.

SELECT * FROM JSON_PUBLISH
(
   ON (SELECT * FROM employeeTable) 
   RETURNS (col1 JSON CHARACTER SET UNICODE)
   USING WRITE_ARRAY('N') DO_AGGREGATE('N') 
) AS jsonData ORDER BY col1..empID;

Results:

col1
---------------------------------------------------------------
{"empID":1,"empName":"George Smith","empDept":"Accounting"}
---------------------------------------------------------------
{"empID":2,"empName":"Pauline Kramer","empDept":"HR"}
---------------------------------------------------------------
{"empID":3,"empName":"Steven Mazzo","empDept":"Engineering"}
---------------------------------------------------------------

JSON_PUBLISH may be used to aggregate multiple values from multiple rows into one instance. However, this aggregation is done locally on each AMP within Vantage. Therefore, JSON_PUBLISH may be invoked twice within a statement to perform a single aggregation of all input values. If JSON_PUBLISH is invoked once without the PARTITION BY clause, each AMP produces one row of aggregated output.

The following INSERT statements add some input data to the example table to illustrate this point. Note that actual aggregation in this initial step depends on the architecture of the system on which it is run.

INSERT INTO employeeTable(4,'Jose Hernandez','Engineering'); 
INSERT INTO employeeTable(5,'Kyle Newman','Engineering');
INSERT INTO employeeTable(6,'Pamela Giles','Sales');

SELECT data FROM JSON_PUBLISH
(
   ON (SELECT * FROM employeeTable)	 
) AS jsonData;

Results:

data
----------------------------------------------------
[{
   "empID": 5,
   "empName": "Kyle Newman",
   "empDept": "Engineering"
},
{
   "empID": 3,
   "empName": "Steven Mazzo",
   "empDept": "Engineering"
},
{
   "empID": 1,
   "empName": "George Smith",
   "empDept": "Accounting"
},
{
   "empID": 2,
   "empName": "Pauline Kramer",
   "empDept": "HR"
}]
----------------------------------------------------
[{
   "empID": 4,
   "empName": "Jose Hernandez",
   "empDept": "Engineering"
}]
----------------------------------------------------
[{
   "empID": 6,
   "empName": "Pamela Giles",
   "empDept": "Sales"
}]
----------------------------------------------------

To do a final union from all AMPs, the local results from each AMP must be aggregated. This may be achieved by nesting a second call to JSON_PUBLISH. The inner call to JSON_PUBLISH performs the local aggregation on each AMP as described above, and the outer call will do a final aggregation of the local aggregations and produce a single result row that represents the union of all input values.

Use the PARTITION BY clause with a constant value (for example, a value of 1) to perform the final aggregation on a single AMP. By specifying a constant value, like the number 1, the locally aggregated rows from each AMP will be in the same partition and as a result will all be redistributed to the same AMP for the final aggregation. The outer query will partition by this constant shown as "1 as p" below. A single output row will be returned.

Additionally, the aggregated result may be referenced using dot notation by using the recursive descent operator '..' to reference the name of the result of the inner JSON_PUBLISH query followed by an array reference composed of the '*' wildcard, which will retrieve one array composed of one record per input row. The final query would look something like the following:

select data..record[*] FROM JSON_PUBLISH 
( 
   ON (SELECT data as record, 1 as p FROM JSON_PUBLISH
   (
      ON (SELECT * FROM employeeTable)
   )as L
) partition by p
)G;

Results:

data..record[*]
----------------------------------------------------
[{
   "empID": 5,
   "empName": "Kyle Newman",
   "empDept": "Engineering"
},
{
   "empID": 3,
   "empName": "Steven Mazzo",
   "empDept": "Engineering"
},
{
   "empID": 1,
   "empName": "George Smith",
   "empDept": "Accounting"
},
{
   "empID": 2,
   "empName": "Pauline Kramer",
   "empDept": "HR"
},
{
   "empID": 6,
   "empName": "Pamela Giles",
   "empDept": "Sales"
},
{
   "empID": 4,
   "empName": "Jose Hernandez",
   "empDept": "Engineering"
}]

Note that you could aggregate all values with a single call to JSON_PUBLISH by partitioning by a constant value, but that would redistribute all rows to a single AMP to perform the aggregation, which would not take advantage of the parallel processing capability of Vantage. By using two calls to JSON_PUBLISH, the AMPs will perform local aggregations in parallel and only the final aggregation will be performed on a single AMP.