1.1 - 8.10 - TextTagger Example: Text, Unicode Emoticons (Emojis) - Teradata Vantage

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)

Queries that contain emojis can be run only from the bteq prompt, not from Teradata Studio™.

Input

 id |      title_1      |                                                                                                                                                                                                        contents                                                                                                                                                                                                         | catalog  
----+-------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------
  1 | Chennai Floods    | Chennai floods have battered the capital city of Tamil Nadu and its adjoining areas 😩. Normal life came to a standstill when roads were submerged in water and all modes of transport were severely affected. In the past, Chennai has had tsunamis and earthquakes                                                                                                                                                       | Regional
  2 | Tennis Superstars | Roger Federer born on 8 August 1981, is a greatest 👍 tennis player, who has been continuously ranked inside the top 10 since October 2002 and has won Wimbledon, USOpen, Australian and FrenchOpen titles mutiple times                                                                                                                                                                                                   | sports
  3 | Sports Rivalry    | The Federer Nadal rivalry, known by many as Fedal, is between two professional tennis players, Roger Federer of Switzerland and Rafael Nadal of Spain. They are currently engaged in a storied rivalry, which many consider to be the greatest in tennis history. They have played 34 times, most recently in the 2015 Swiss Indoors final, and Nadal leads their eleven-year-old rivalry with an overall record of 231 | sports
  4 | Sports Rivalry    | The India Pakistan cricket rivalry is one of the most intense 😣 sports rivalries in the world. An India-Pakistan cricket match has been estimated to attract up to one billion viewers, according to TV ratings firms and various other reports. The 2011 World Cup semifinal between the two teams attracted around 988 million television viewers                                                                       | sports
  5 | Sports Rivalry    | An Ashes series is traditionally of five Tests, hosted in turn by England and Australia at least once every four years. As of August 2015, England hold the ashes, having won three of the five Tests in the 2015 Ashes series. Overall, Australia has won 32 series, England 32 and five series have been drawn.                                                                                                       | sports
(5 rows)

SQL Call

SELECT * FROM TextTagger(
    ON text_inputs_emojis
    USING
    TaggingRules (
        'contain(contents, "👍", 1,) AS Thumbs',
        'contain(contents, "👍", 1,) or
            contain(contents, "greatest", 1,) AS Fabulous',
        'contain(contents, "greatest", 1,) AS Wonderful',
        'contain(contents, "😩", 1,) AS Weary',
        'contain(title_1, "Tennis", 1,) and
            contain(contents, "Roger", 1,) AS Tennis-Greats',
        'contain(contents, "India", 1,) and
            contain(contents, "Pakistan", 1,) AS Cricket-Rivalry',
        'contain(contents, "Australia", 1,) and
            contain(contents, "England", 1,) AS The-Ashes'
    )
    OutputByTag ('true')
    Accumulate ('id')
) AS dt ORDER BY id;

Output

         id tag
----------- ---------------------------------------------------------------
          1 Weary
          2 Wonderful
          2 Fabulous
          2 Tennis-Greats
          2 Thumbs
          3 Wonderful
          3 Fabulous
          4 Cricket-Rivalry
          5 The-Ashes
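
To trace why each row receives its tags, the rules can be replayed outside the database. The following is a minimal plain-Python sketch, not Teradata code. It assumes that contain(column, "text", 1,) is true when the text occurs at least once in the column, and it approximates that with a substring count, which may differ from how TextTagger actually matches text. The names rows, rules, contain, and tag_rows are hypothetical, and each contents value is only an excerpt of the corresponding Input row.

```python
# Plain-Python sketch of the tagging rules above; this is not Teradata code.
# Assumption: contain(col, "s", 1,) holds when "s" occurs at least once in col.
# Substring counting is an approximation of TextTagger's matching.

# (id, title_1, excerpt of contents) taken from the Input table.
rows = [
    (1, "Chennai Floods",
     "Chennai floods have battered the capital city of Tamil Nadu "
     "and its adjoining areas 😩."),
    (2, "Tennis Superstars",
     "Roger Federer born on 8 August 1981, is a greatest 👍 tennis player"),
    (3, "Sports Rivalry",
     "which many consider to be the greatest in tennis history"),
    (4, "Sports Rivalry",
     "The India Pakistan cricket rivalry is one of the most intense 😣 "
     "sports rivalries in the world."),
    (5, "Sports Rivalry",
     "hosted in turn by England and Australia at least once every four years"),
]


def contain(text, needle, lower=1):
    """Return True when needle occurs at least `lower` times in text."""
    return text.count(needle) >= lower


# One (tag, predicate) pair per TaggingRules entry, in the same order.
rules = [
    ("Thumbs", lambda t, c: contain(c, "👍")),
    ("Fabulous", lambda t, c: contain(c, "👍") or contain(c, "greatest")),
    ("Wonderful", lambda t, c: contain(c, "greatest")),
    ("Weary", lambda t, c: contain(c, "😩")),
    ("Tennis-Greats", lambda t, c: contain(t, "Tennis") and contain(c, "Roger")),
    ("Cricket-Rivalry", lambda t, c: contain(c, "India") and contain(c, "Pakistan")),
    ("The-Ashes", lambda t, c: contain(c, "Australia") and contain(c, "England")),
]


def tag_rows(rows):
    """Emit one (id, tag) pair per matching rule, like OutputByTag ('true')."""
    out = []
    for row_id, title, contents in rows:
        for tag, rule in rules:
            if rule(title, contents):
                out.append((row_id, tag))
    return out


for row_id, tag in tag_rows(rows):
    print(row_id, tag)
```

Under these assumptions the sketch yields the same nine id and tag pairs as the Output table, although the order of tags within an id may differ. Row 4 is not tagged Weary because its emoji (😣) is a different character from the one in the Weary rule (😩), and row 3 is not tagged Tennis-Greats because its title_1 value does not contain "Tennis".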

Download a zip file of all examples and a SQL script file that creates their input tables from the attachment in the left sidebar.