The Edges table of an undirected graph can have duplicate rows, because each edge between vertices A and B is represented by two rows: one with A in the source column and B in the target column, and the other with B in the source column and A in the target column. Teradata recommends deleting these duplicate rows from the Edges table with the following code, where edges_table is the name of the Edges table:
    -- Number every row of the Edges table.
    DROP TABLE copy;
    CREATE MULTISET TABLE copy AS (
      SELECT *, ROW_NUMBER() OVER (ORDER BY source, target) rn
      FROM edges_table
    ) WITH DATA;

    -- Working table from which the duplicates are deleted.
    DROP TABLE DuplicatesRemoved;
    CREATE MULTISET TABLE DuplicatesRemoved AS (
      SELECT * FROM copy
    ) WITH DATA;

    -- For each pair of mirrored rows, delete the one with the smaller rn.
    DELETE FROM DuplicatesRemoved
    WHERE rn IN (
      SELECT a.rn
      FROM DuplicatesRemoved a
      JOIN copy b
        ON a.source = b.target
       AND a.target = b.source
       AND a.rn < b.rn
    );

    DROP TABLE copy;
| Column | Data Type | Description |
|---|---|---|
| source | VARCHAR | Source key. |
| target | VARCHAR | Target key. |
| rn | INTEGER | Row number in edges_table. |
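
As a rough illustration of which rows the deduplication statements above keep, the sketch below loads a small sample into edges_table and then inspects DuplicatesRemoved. The sample vertices (A, B, C) and the VARCHAR(10) column lengths are assumptions made for this example, not values from the original text.

```sql
-- Hypothetical sample data: three undirected edges, two of which
-- (A-B and A-C) are stored in both directions.
CREATE MULTISET TABLE edges_table (
    source VARCHAR(10),
    target VARCHAR(10)
);

INSERT INTO edges_table VALUES ('A', 'B');
INSERT INTO edges_table VALUES ('B', 'A');
INSERT INTO edges_table VALUES ('A', 'C');
INSERT INTO edges_table VALUES ('C', 'A');
INSERT INTO edges_table VALUES ('B', 'C');

-- ROW_NUMBER() OVER (ORDER BY source, target) numbers the rows as:
--   1 (A,B)   2 (A,C)   3 (B,A)   4 (B,C)   5 (C,A)
-- The DELETE removes a row when its mirrored row has a larger rn,
-- so rows 1 and 2 are removed.

-- Run the deduplication statements against edges_table, then inspect:
SELECT source, target, rn
FROM DuplicatesRemoved
ORDER BY rn;
-- Expected rows: (B, A, 3), (B, C, 4), (C, A, 5)
```

In each mirrored pair, the row with the larger rn survives. An edge stored in one direction only, such as (B, C) here, has no mirrored row and is left untouched.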