NGramSplitter_MLE Example: Overlapping ('true'), OutputTotalGramCount ('true')

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
9.02
9.01
2.0
1.3
Published
February 2022
Language
English (United States)
Last Update
2022-02-10
dita:id
B700-4003
Product Category
Teradata Vantage™

Input

  • Input Table: paragraphs_input, which contains paragraphs about common analytics topics (regression, decision trees, and so on)
Input Table: paragraphs_input
 paraid paratopic         paratext
 ------ ----------------- --------
      1 Decision Trees    Decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. It is one of the predictive modeling approaches used in statistics, data mining and machine learning. Tree models where the target variable can take a finite set of values are called classification trees. In these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. Decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.
      2 Simple Regression In statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. In other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible
    ... ...               ...

SQL Call

SELECT * FROM NGramSplitter_MLE (
  ON paragraphs_input
  USING
  TextColumn ('paratext')
  Delimiter (' ')
  Grams ('4-6')
  OverLapping ('true')
  ConvertToLowerCase ('true')
  Reset ('[.,?!]')
  Punctuation ('[`~#^&*()-]')
  OutputTotalGramCount ('true')
  Accumulate ('paraid', 'paratopic')
) AS dt ORDER BY paraid, paratopic, ngram;
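
To make the effect of these arguments concrete, the following Python sketch imitates the call on a single string. It is not the function's implementation: the helper name split_ngrams and the processing order (convert to lowercase, split at Reset characters, strip Punctuation characters, split on the Delimiter, then slide overlapping windows of 4 to 6 words) are assumptions inferred from the example output. Accumulate and ORDER BY are not modeled.

```python
import re
from collections import Counter

# Assumed semantics, inferred from the example output rather than taken
# from the function's implementation.
RESET = re.compile(r"[.,?!]")             # Reset: grams never span these characters
PUNCTUATION = re.compile(r"[`~#^&*()-]")  # Punctuation: characters stripped from words


def split_ngrams(text, min_n=4, max_n=6):
    """Return ({(n, gram): frequency}, {n: total gram count}) for one text."""
    freq = Counter()
    # ConvertToLowerCase ('true'), then cut the text at every Reset character.
    for segment in RESET.split(text.lower()):
        # Strip Punctuation characters, split on the ' ' Delimiter, drop empties.
        words = [w for w in PUNCTUATION.sub("", segment).split(" ") if w]
        # Grams ('4-6') with Overlapping ('true'): every window of n words.
        for n in range(min_n, max_n + 1):
            for i in range(len(words) - n + 1):
                freq[(n, " ".join(words[i:i + n]))] += 1
    # OutputTotalGramCount ('true'): total number of grams of each length.
    totals = Counter()
    for (n, _gram), count in freq.items():
        totals[n] += count
    return freq, dict(totals)


# Excerpt from paragraph 1 of paragraphs_input.
freq, totals = split_ngrams("In these tree structures, leaves represent class labels.")
for (n, gram), count in sorted(freq.items()):
    print(n, gram, count, totals[n])
# prints:
# 4 in these tree structures 1 2
# 4 leaves represent class labels 1 2
```

Because the comma is a Reset character, no gram spans it, so the eight-word sentence yields two 4-grams and no 5-grams or 6-grams.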

Output

 paraid paratopic                 ngram                                                     n frequency totalcnt 
 ------ ------------------------- --------------------------------------------------------- - --------- -------- 
      1 decision trees            a decision tree as                                        4         1       73
      1 decision trees            a decision tree as a                                      5         1       66
      1 decision trees            a decision tree as a predictive                           6         1       60
      1 decision trees            a finite set of                                           4         1       73
      1 decision trees            a finite set of values                                    5         1       66
      1 decision trees            a finite set of values are                                6         1       60
      1 decision trees            a predictive model which                                  4         1       73
      1 decision trees            a predictive model which maps                             5         1       66
      1 decision trees            a predictive model which maps observations                6         1       60
      ...
      2 simple regression         a linear regression model                                 4         1       55
      2 simple regression         a linear regression model with                            5         1       52
      2 simple regression         a linear regression model with a                          6         1       49
      2 simple regression         a single explanatory variable                             4         1       55
      2 simple regression         a straight line through                                   4         1       55
      2 simple regression         a straight line through the                               5         1       52
      2 simple regression         a straight line through the set                           6         1       49
      2 simple regression         a way that makes                                          4         1       55
      2 simple regression         a way that makes the                                      5         1       52
      ...
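
The totalcnt column can be checked by hand. Assume (this is an inference from the example, not a statement from the reference) that each run of w_s words between Reset characters contributes max(0, w_s - n + 1) overlapping grams of length n. The seven runs of paragraph 1 have 24, 11, 5, 17, 4, 16, and 17 words, which gives:

```latex
\begin{align*}
\mathrm{totalcnt}(n) &= \sum_{s} \max(0,\; w_s - n + 1) \\
\mathrm{totalcnt}(4) &= 21 + 8 + 2 + 14 + 1 + 13 + 14 = 73 \\
\mathrm{totalcnt}(5) &= 20 + 7 + 1 + 13 + 0 + 12 + 13 = 66 \\
\mathrm{totalcnt}(6) &= 19 + 6 + 0 + 12 + 0 + 11 + 12 = 60
\end{align*}
```

These match the totalcnt values 73, 66, and 60 shown for paraid 1. The same count over the runs of paragraph 2 (2, 18, 3, 29, and 17 words) gives 55, 52, and 49, matching the rows for paraid 2.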
