Geospatial Join Predicate Examples - Advanced SQL Engine

Geospatial Join Predicate Examples - Advanced SQL Engine - Teradata Database

Geospatial Data Types

Product

Advanced SQL Engine

Teradata Database

Release Number

17.10

Published

July 2021

Language

English (United States)

Last Update

2021-07-27

dita:mapPath

ufo1590694251930.ditamap

dita:ditavalPath

ufo1590694251930.ditaval

dita:id

B035-1181

lifecycle

Product Category

Teradata Vantage™

Example: Using Geospatial Join Predicates

The following example shows the use of geospatial join predicates to determine which points in one table lie within polygons defined in a second table. For such a query the Teradata Optimizer would consider creating a nested join to speed data access.

CREATE TABLE DB_TEST.T1
(a INTEGER, b char(20000), GeoCol ST_Geometry)
INDEX (GeoCol);

CREATE TABLE DB_TEST.T2
(pkey INTEGER, buffer char(20000), Geom ST_Geometry)
INDEX (Geom);


INSERT INTO DB_TEST.T1 (1, 'Teradata01', 'POINT(10 20)');
INSERT INTO DB_TEST.T1 (1, 'Teradata02', 'POINT(20 30)');

INSERT INTO DB_TEST.T2 (1, 'Teradata01', 'POLYGON((3 3, 3 8, 8 8, 8 3, 3 3))');
INSERT INTO DB_TEST.T2 (1, 'Teradata02', 'POLYGON((15 15, 15 35, 35 35, 35 15, 15 15))');

SELECT a,b (format 'x(12)'), geocol
FROM T2
INNER JOIN T1 ON T1.GeoCol.ST_WITHIN(T2.Geom)= 1;

          a b            GeoCol
----------- ------------ -----------------
          1 Teradata02   POINT (20 30)

The number of rows inserted into the table and the nature of the geospatial data itself determines whether the Optimizer chooses to use a geospatial index. In this simple example, with only two rows in each table, the geospatial index would probably not be used. With significantly larger tables or more complex shapes, and depending on the nature and configuration of the database, the Optimizer could use the index to speed query processing. If it did use the index, the EXPLAIN would look similar to this, where step 5 shows the geospatial index use:

Explanation
---------------------------------------------------------------------------
  1) First, we lock DB_TEST.T1 for read on a reserved RowHash to
     prevent global deadlock.
  2) Next, we lock DB_TEST.T2 for read on a reserved RowHash to
     prevent global deadlock.
  3) We lock DB_TEST.T1 for read, and we lock DB_TEST.T2 for read.
  4) We do an all-AMPs RETRIEVE step from DB_TEST.T2 by way of an
     all-rows scan with a condition of ("DB_TEST.T2.b > 1") into
     Spool 2 (all_amps), which is duplicated on all AMPs. The size of
     Spool 2 is estimated with no confidence to be 116 rows (1,168,468
     bytes). The estimated time for this step is 0.67 seconds.
  5) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an
     all-rows scan, which is joined to DB_TEST.T1 by way of Spatial
     index # 4 "Retrieving from CDT indexed column DB_TEST.T1.sp via
     CDT key expression {LeftTable}.sp .ST_MBR ()"extracting row ids
     only. Spool 2 and DB_TEST.T1 are joined using a nested join,
     with a join condition of ("(1=1)"). The result goes into Spool 3
     (all_amps), which is built locally on the AMPs. Then we do a SORT
     to order Spool 3 by field Id 1. The size of Spool 3 is estimated
     with no confidence to be 1 row (10,083 bytes). The estimated time
     for this step is 0.01 seconds.
  6) We do an all-AMPs JOIN step from Spool 3 (Last Use) by way of an
     all-rows scan, which is joined to DB_TEST.T1 by way of an
     all-rows scan with a condition of ("DB_TEST.T1.b > 1"). Spool
     3 and DB_TEST.T1 are joined using a row id join, with a join
     condition of ("(sp .ST_WITHIN ({RightTable}.sp ))= 1"). The
     result goes into Spool 1 (group_amps), which is built locally on
     the AMPs. Then we do a SORT to order Spool 1 by the sort key in
     spool field1. The size of Spool 1 is estimated with no confidence
     to be 1 row (95 bytes). The estimated time for this step is 0.05 
     seconds.
  7) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
     -> The contents of Spool 1 are sent back to the user as the result of
     statement 1. The total estimated time is 0.73 seconds.

Using Both an ST_Geometry Column and a Column Expression in a Geospatial Join

The following example shows that one side of a geospatial join, either the owner expression or the argument, can be a column expression that contains a column and evaluates to an ST_Geometry value. The other side of the join must be a geospatial column that has a geospatial index. The example uses the ST_DISTANCE predicate. g2.geo.ST_Buffer(1) is an expression containing a geospatial column from g2 that generates an ST_Geometry object. g1.geo is an ST_Geometry column in g1 that has a geospatial index.

SELECT * FROM geotable g1, geotable2 g2
WHERE g1.geo.ST_Distance (g2.geo.ST_Buffer(1)) < 5
ORDER BY g1.a;