POSTagger Example - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.10
1.1
Published
October 2019
Language
English (United States)
Last Update
2019-12-31
Product Category
Teradata Vantage™

Input

The query in this example runs POSTagger on the output of SentenceExtractor, which in turn takes the table paragraphs_input (columns paraid and paratext) as its input.

SQL Call

SELECT * FROM POSTagger (
  ON SentenceExtractor (
    ON paragraphs_input 
    USING
    TextColumn ('paratext')
    Accumulate ('paraid')
  ) 
  USING
  TextColumn ('sentence')
  Accumulate ('sentence','sentence_sn')
) AS dt ORDER BY sentence_sn, word_sn;
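
If the text has already been split into sentences and stored in a table, POSTagger can also be called on that table directly, without nesting SentenceExtractor. The following is a minimal sketch, assuming a hypothetical table sentences_input with columns sentence and sentence_sn; it uses only the TextColumn and Accumulate syntax elements shown in the call above.

SELECT * FROM POSTagger (
  ON sentences_input
  USING
  TextColumn ('sentence')
  Accumulate ('sentence_sn')
) AS dt ORDER BY sentence_sn, word_sn;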

Output

 sentence                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      sentence_sn word_sn word                pos_tag 
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ----------- ------- ------------------- ------- 
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1       1 in                  IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1       1 logistic            JJ     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1       1 association         NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1       1 decision            NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1       1 cluster             NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1       2 analysis            NN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1       2 regression          NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1       2 tree                NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1       2 rule                NN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1       2 statistics          NNS    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1       3 was                 VBD    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1       3 learning            NN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1       3 ,                   O      
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1       3 learning            NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1       3 or                  CC     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1       4 clustering          NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1       4 is                  VBZ    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1       4 developed           VBN    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1       4 simple              JJ     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1       4 uses                VBZ    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1       5 a                   DT     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1       5 linear              JJ     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1       5 is                  VBZ    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1       5 by                  IN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1       5 a                   DT     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1       6 decision            NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1       6 method              NN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1       6 regression          NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1       6 the                 DT     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1       6 statistician        JJ     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1       7 david               JJ     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1       7 task                NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1       7 tree                NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1       7 for                 IN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1       7 is                  VBZ    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1       8 the                 DT     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1       8 cox                 NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1       8 discovering         VBG    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1       8 as                  IN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1       8 of                  IN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1       9 a                   DT     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1       9 in                  IN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1       9 grouping            VBG    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1       9 least               JJS    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1       9 interesting         JJ     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      10 a                   DT     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      10 squares             VBZ    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      10 relations           NNS    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      10 1958[2][3](although JJ     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      10 predictive          JJ     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      11 between             IN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      11 model               NN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      11 much                JJ     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      11 set                 NN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      11 estimator           NN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      12 of                  IN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      12 of                  IN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      12 variables           NNS    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      12 which               WDT    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      12 work                NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      13 maps                VBZ    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      13 objects             NNS    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      13 in                  IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      13 was                 VBD    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      13 a                   DT     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      14 in                  IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      14 done                VBN    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      14 observations        NNS    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      14 large               JJ     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      14 linear              JJ     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      15 regression          NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      15 such                PDT    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      15 about               IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      15 in                  IN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      15 databases           NNS    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      16 the                 DT     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      16 .                   O      
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      16 a                   DT     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      16 model               NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      16 an                  DT     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      17 way                 NN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      17 with                IN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      17 item                NN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      17 single              JJ     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      17 it                  PRP    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      18 that                WDT    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      18 to                  TO     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      18 independent         JJ     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      18 is                  VBZ    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      18 a                   DT     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      19 variable            JJ     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      19 single              JJ     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      19 objects             VBZ    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      19 conclusions         NNS    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      19 intended            VBN    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      20 case                NN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      20 explanatory         NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      20 to                  TO     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      20 in                  IN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      20 about               IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      21 almost              RB     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      21 the                 DT     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      21 identify            VB     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      21 variable            JJ     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      21 the                 DT     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      22 .                   O      
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      22 strong              JJ     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      22 items               NNS    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      22 same                JJ     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      22 two                 CD     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      23 rules               NNS    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      23 in                  IN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      23 group               NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      23 target              NN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      23 decades             NNS    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      24 (                   O      
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      24 other               JJ     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      24 value               NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      24 discovered          VBN    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      24 earlier)            VBP    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      25 words               NNS    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      25 .                   O      
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      25 called              VBD    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      25 .                   O      
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      25 in                  IN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      26 ,                   O      
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      26 a                   DT     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      26 databases           NNS    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      26 it                  PRP    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      26 the                 DT     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      27 cluster             NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      27 is                  VBZ    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      27 using               VBG    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      27 simple              JJ     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      27 binary              JJ     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      28 linear              JJ     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      28 )                   O      
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      28 different           JJ     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      28 logistic            JJ     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      28 one                 CD     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      29 regression          NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      29 are                 VBP    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      29 of                  IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      29 model               NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      29 measures            NNS    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      30 more                RBR    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      30 the                 DT     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      30 fits                VBZ    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      30 of                  IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      30 is                  VBZ    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      31 similar             JJ     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      31 predictive          JJ     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      31 interestingness     NN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      31 a                   DT     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      31 used                VBN    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      32 straight            JJ     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      32 modelling           JJ     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      32 .                   O      
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      32 to                  TO     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      32 (                   O      
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      33 approaches          NNS    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      33 line                NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      33 based               VBN    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      33 in                  IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      33 estimate            VB     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      34 through             IN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      34 used                VBN    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      34 some                DT     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      34 on                  IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      34 the                 DT     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      35 the                 DT     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      35 in                  IN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      35 the                 DT     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      35 sense               NN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      35 probability         NN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      36 set                 NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      36 concept             NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      36 statistics          NNS    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      36 or                  CC     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      36 of                  IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      37 a                   DT     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      37 of                  IN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      37 ,                   O      
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      37 of                  IN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      37 another             DT     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      38 n                   JJ     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      38 strong              JJ     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      38 )                   O      
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      38 data                NN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      38 binary              JJ     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      39 points              NNS    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      39 mining              NN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      39 response            NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      39 rules               NNS    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      39 to                  TO     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      40 and                 CC     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      40 ,                   O      
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      40 each                DT     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      40 in                  IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      40 based               VBN    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      41 such                PDT    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      41 machine             NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      41 rakesh              JJ     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      41 other               JJ     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      41 on                  IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      42 one                 CD     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      42 a                   DT     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      42 learning            VBG    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      42 agrawal             JJ     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      42 than                IN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      43 way                 NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      43 .                   O      
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      43 to                  TO     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      43 et                  NN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      43 or                  CC     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      44 those               DT     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      44 that                WDT    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      44 tree                CD     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      44 al.[2               NN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      44 more                JJR    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      45 models              NNS    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      45 makes               VBZ    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      45 ]                   O      
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      45 in                  IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      45 predictor           NN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      46 (                   O      
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      46 the                 DT     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      46 other               JJ     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      46 introduced          JJ     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      46 where               WRB    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      47 sum                 NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      47 association         NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      47 groups              NNS    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      47 the                 DT     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      47 or                  CC     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      48 (                   O      
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      48 rules               NNS    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      48 of                  IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      48 independent         JJ     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      48 target              NN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      49 squared             JJ     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      49 clusters)           NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      49 variable            NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      49 for                 IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      49 )                   O      
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      50 discovering         VBG    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      50 residuals           NNS    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      50 .                   O      
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      50 can                 MD     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      50 variables           VBZ    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      51 regularities        NNS    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      51 of                  IN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      51 it                  PRP    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      51 take                VB     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      51 (                   O      
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      52 between             IN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      52 the                 DT     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      52 is                  VBZ    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      52 a                   DT     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      52 features)           NN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      53 model               NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      53 a                   DT     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      53 products            NNS    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      53 finite              JJ     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      53 .                   O      
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      54 main                JJ     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      54 (                   O      
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      54 in                  IN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      54 set                 NN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      54 as                  IN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      55 task                NN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      55 that                WDT    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      55 large-scale         JJ     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      55 of                  IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      55 such                JJ     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      56 is                  VBZ    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      56 of                  IN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      56 values              NNS    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      56 transaction         NN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      56 it                  PRP    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      57 is                  VBZ    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      57 data                NNS    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      57 exploratory         JJ     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      57 ,                   O      
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      57 are                 VBP    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      58 not                 RB     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      58 vertical            JJ     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      58 data                NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      58 called              VBN    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      58 recorded            VBN    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      59 a                   DT     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      59 mining              NN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      59 distances           NNS    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      59 by                  IN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      59 classification      NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      60 trees               NNS    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      60 between             IN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      60 ,                   O      
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      60 point-of-sale       NN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      60 classification      NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      61 and                 CC     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      61 the                 DT     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      61 (                   O      
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      61 .                   O      
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      61 method              NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      62 a                   DT     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      62 points              NNS    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      62 pos                 NNS    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      62 in                  IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      62 .                   O      
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      63 common              JJ     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      63 of                  IN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      63 )                   O      
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      63 these               DT     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      63 it                  PRP    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      64 the                 DT     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      64 tree                NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      64 systems             NNS    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      64 could               MD     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      64 technique           NN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      65 data                NNS    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      65 in                  IN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      65 structures          NNS    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      65 be                  VB     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      65 for                 IN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      66 set                 VBN    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      66 supermarkets        NNS    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      66 ,                   O      
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      66 called              VBN    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      66 statistical         JJ     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      67 and                 CC     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      67 .                   O      
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      67 leaves              VBZ    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      67 a                   DT     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      67 data                NNS    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      68 the                 DT     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      68 for                 IN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      68 represent           JJ     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      68 qualitative         JJ     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      68 analysis            NN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      69 fitted              JJ     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      69 example             NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      69 class               NN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      69 response/discrete   JJ     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      69 ,                   O      
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      70 line                NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      70 ,                   O      
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      70 labels              NNS    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      70 choice              NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      70 used                VBN    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      71 )                   O      
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      71 the                 DT     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      71 and                 CC     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      71 model               NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      71 in                  IN     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      72 as                  RB     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      72 rule                NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      72 branches            NNS    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      72 in                  IN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      72 many                JJ     
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      73 small               JJ     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      73 {                   O      
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      73 represent           VBP    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      73 the                 DT     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      73 fields              NNS    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      74 as                  IN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      74 onions              NNS    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      74 conjunctions        NNS    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      74 terminology         NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      74 ,                   O      
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      75 possible            JJ     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      75 ,                   O      
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      75 of                  IN     
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      75 of                  IN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      75 including           VBG    
 in statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. in other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible.                                                                                                                                                                                                                                                                                                                         1      76 .                   O      
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      76 potatoes}=>{burger  NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      76 features            NNS    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      76 economics           NNS    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      76 machine             NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      77 }                   O      
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      77 that                WDT    
 logistic regression was developed by statistician david cox in 1958[2][3](although much work was done in the single independent variable case almost two decades earlier). the binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables (features). as such it is not a classification method. it could be called a qualitative response/discrete choice model in the terminology of economics.                                                                                                                                                                                                                                                         1      77 .                   O      
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      77 learning            NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      78 found               VBN    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      78 lead                VBP    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      78 ,                   O      
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      79 in                  IN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      79 to                  TO     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      79 pattern             JJ     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      80 the                 DT     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      80 those               DT     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      80 recognition         NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      81 sales               NNS    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      81 class               NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      81 ,                   O      
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      82 data                NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      82 labels              NNS    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      82 image               NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      83 of                  IN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      83 .                   O      
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      83 analysis            NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      84 a                   DT     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      84 decision            NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      84 ,                   O      
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      85 supermarket         NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      85 trees               NNS    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      85 information         NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      86 would               MD     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      86 where               WRB    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      86 retrieval           NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      87 the                 DT     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      87 indicate            VB     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      87 ,                   O      
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      88 target              NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      88 that                IN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      88 and                 CC     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      89 bioinformatics      NNS    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      89 if                  IN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      89 variable            NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      90 .                   O      
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      90 a                   DT     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      90 can                 MD     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      91 customer            NN     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      91 take                VB     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      91 cluster             NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      92 buys                VBZ    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      92 continuous          JJ     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      92 analysis            NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      93 onions              NNS    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      93 values              NNS    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      93 itself              PRP    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      94 and                 CC     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      94 (                   O      
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      94 is                  VBZ    
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      95 potatoes            VBZ    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      95 typically           RB     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      95 not                 RB     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      96 together            RB     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      96 real                JJ     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      96 one                 CD     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      97 ,                   O      
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      97 numbers             NNS    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      97 specific            JJ     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      98 they                PRP    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      98 )                   O      
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      98 algorithm           NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1      99 are                 VBP    
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1      99 are                 VBP    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1      99 ,                   O      
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1     100 likely              JJ     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1     100 called              VBN    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     100 but                 CC     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1     101 to                  TO     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1     101 regression          NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     101 the                 DT     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1     102 also                RB     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1     102 trees               NNS    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     102 general             JJ     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1     103 buy                 VB     
 decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the items target value. it is one of the predictive modelling approaches used in statistics, data mining and machine learning. tree models where the target variable can take a finite set of values are called classification trees. in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. decision trees where the target variable can take continuous values (typically real numbers) are called regression trees.                                                                                                   1     103 .                   O      
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     103 task                NN     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1     104 hamburger           JJR    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     104 to                  TO     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1     105 meat                NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     105 be                  VB     
 association rule learning is a method for discovering interesting relations between variables in large databases. it is intended to identify strong rules discovered in databases using different measures of interestingness. based on the concept of strong rules, rakesh agrawal et al.[2] introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (pos) systems in supermarkets. for example, the rule {onions, potatoes}=>{burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat.                                                          1     106 .                   O      
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     106 solved              VBN    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     107 .                   O      
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     108 it                  PRP    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     109 can                 MD     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     110 be                  VB     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     111 achieved            VBN    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     112 by                  IN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     113 various             JJ     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     114 algorithms          NNS    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     115 that                WDT    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     116 differ              VBP    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     117 significantly       RB     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     118 in                  IN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     119 their               PRP$   
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     120 notion              NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     121 of                  IN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     122 what                WP     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     123 constitutes         VBZ    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     124 a                   DT     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     125 cluster             NN     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     126 and                 CC     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     127 how                 WRB    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     128 to                  TO     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     129 efficiently         RB     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     130 find                VB     
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     131 them                PRP    
 cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). it is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. cluster analysis itself is not one specific algorithm, but the general task to be solved. it can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them.           1     132 .                   O
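The output columns shown above (sentence, sentence_sn, word_sn, word, pos_tag) can be queried like any other result set. As a minimal sketch, assuming the output has first been saved to a permanent table named postagger_out (a hypothetical name, not created by this example), the distribution of tags across the corpus could be summarized as follows:

-- Sketch only: postagger_out is a hypothetical table holding the
-- POSTagger output shown above. Counts how often each part-of-speech
-- tag appears, most frequent first.
SELECT pos_tag, COUNT(*) AS tag_count
FROM postagger_out
GROUP BY pos_tag
ORDER BY tag_count DESC;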

A zip file of all examples, and a SQL script file that creates their input tables, are available for download from the attachment in the left sidebar.