MurmurHash Example: 32-Bit Hash Value (by Default) | Teradata Vantage - MurmurHash Example: 32-Bit Hash Value (by Default) - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
9.02
9.01
2.0
1.3
Published
February 2022
Language
English (United States)
Last Update
2022-02-10
dita:mapPath
rnn1580259159235.ditamap
dita:ditavalPath
ybt1582220416951.ditaval
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantageā„¢

Input

The input table, murmurhash_input, is a log of midnight temperatures (in degrees Fahrenheit) for five consecutive nights in five cities. The rows of all columns except id and city_varchar are converted to hash values. Because the hash value depends on data type, the input table has three columns each, with different data types, for the time period and temperature:

Columns Content Data Types
3-5 Time period TIMESTAMP, VARCHAR, DATE
6-8 Temperature DOUBLE PRECISION, INTEGER, VARCHAR
Input table: murmurhash_input
id city_varchar period_timestamp period_text period_date temp_f_real temp_f_integer temp_f_text
1 Asheville 2008-10-10 00:00:00 10-10-2008 00:00:00 2008-10-10 34.9 35 34.9
3 Brownsville 2008-10-12 00:00:00 10-12-2008 00:00:00 2008-10-12 34 34 34
5 Knoxville 2008-10-14 00:00:00 10-14-2008 00:00:00 2008-10-14 32 32 32
2 Greenville 2008-10-11 00:00:00 10-11-2008 00:00:00 2008-10-11 34.4 34 34.4
4 Nashville 2008-10-13 00:00:00 10-13-2008 00:00:00 2008-10-13 35.6 36 35.6

SQL Call

SELECT * FROM MurmurHash (
  ON murmurhash_input
  USING
  TargetColumns ('[2:7]')
  Accumulate ('id')
) AS dt ORDER BY 1;

For the TargetColumns syntax element, columns are numbered 0, 1, 2, and so on (not 1, 2, 3 and so on, as in the id column of the table murmurhash_input.

Output

The hash values for each time period are different for each data type. When the temperature value in the input table is an integer, the hash values are the same for INTEGER and VARCHAR, but different for REAL. When the temperature value in the input table is real, the hash values are the same for REAL and VARCHAR, but different for INTEGER.

 id period_timestamp_murmurhash period_text_murmurhash period_date_murmurhash temp_f_real_murmurhash temp_f_integer_murmurhash temp_f_text_murmurhash 
 -- --------------------------- ---------------------- ---------------------- ---------------------- ------------------------- ---------------------- 
  1                  1962681181             -463297848              557349452              491855154               -2026295078              491855154
  2                 -1741097193             1584740227            -1509759604             -499861358               -2026295078             -499861358
  3                  1530898040             1693067691             -122411470             -141326552               -2026295078            -2026295078
  4                 -1426854377            -2116413651             1981218164              138771797                -392240485              138771797
  5                 -1560585364             2011908706             1085005088             1047067045                -902858061             -902858061

Download a zip file of all examples and a SQL script file that creates their input tables.