Deleting Duplicate Edges Table Rows

Deleting Duplicate Edges Table Rows - Aster Analytics

Teradata Aster® Analytics Foundation User GuideUpdate 2

Product

Aster Analytics

Release Number

7.00.02

Published

September 2017

Language

English (United States)

Last Update

2018-04-17

dita:mapPath

uce1497542673292.ditamap

dita:ditavalPath

AA-notempfilter_pdf_output.ditaval

dita:id

B700-1022

lifecycle

Product Category

Software

The edges table of an undirected graph can have duplicate rows, because each edge between vertices A and B is represented by two rows—one row has A in the source column and B in the target column, and the other row has B in the source column and A in the target column. Teradata recommends deleting duplicate rows from the edges table, using this code (where edges_table is the edges table name):

DROP TABLE IF EXISTS copy;

CREATE TABLE copy DISTRIBUTE BY HASH(source)
  AS SELECT *, ROW_NUMBER() OVER(ORDER BY source, target) rn
    FROM edges_table;

DROP TABLE IF EXISTS DuplicatesRemoved;

CREATE TABLE DuplicatesRemoved AS * SELECT FROM copy;

DELETE FROM DuplicatesRemoved WHERE rn IN (
  SELECT a.rn FROM DuplicatesRemoved a
  JOIN Copy b
  ON a.source=b.target AND a.target=b.source AND a.rn < b.rn);

DROP TABLE IF EXISTS Copy;

AllPairsShortestPath DuplicatesRemoved Table Schema
Column Name	Data Type	Description
source	VARCHAR	Source key.
target	VARCHAR	Target key.
rn	INTEGER	Row number in edges_table.