Features - Teradata Vantage

ClearScape Analytics ModelOps User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Vantage
Release Number: 7.0
Published: April 2023
Language: English (United States)
Last Update: 2023-04-19
dita:id: B700-1175

Features are the basic building blocks of datasets. The quality of the features in your dataset has a major impact on the quality of the insights you will gain when you use that dataset for training and evaluating models.

ModelOps automatically detects each feature’s data type (categorical or continuous) and computes basic statistics (mean, median, standard deviation, and more) for each feature. It also generates a histogram for each feature.

Property     Description
Name         Specifies the name of the feature.
Type         Specifies the data type of the feature: Continuous or Categorical.
Importance   Specifies the feature importance, measured as the increase in the model's prediction error after the feature's values are permuted. Permuting the values breaks the relationship between the feature and the true outcome.
  • A feature is important if shuffling its values increases the model error, because in this case the model relied on the feature for the prediction.
  • A feature is unimportant if shuffling its values leaves the model error unchanged, because in this case the model ignored the feature for the prediction.
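
The permutation approach described above can be illustrated with a minimal Python sketch. This is not the ModelOps implementation: the function names, the use of mean squared error as the error measure, the number of repeats, and the toy model are all assumptions chosen for illustration.

```python
import random


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, y_true, n_features, n_repeats=10, seed=0):
    """Return, per feature, the mean increase in error after shuffling that feature.

    predict is a hypothetical callable that maps a list of rows to a list of predictions.
    """
    rng = random.Random(seed)
    baseline = mean_squared_error(y_true, predict(rows))
    importances = []
    for j in range(n_features):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j; all other columns keep their original values.
            column = [row[j] for row in rows]
            rng.shuffle(column)
            permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            increases.append(mean_squared_error(y_true, predict(permuted)) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(rows):
    """Toy model (assumption): depends only on feature 0 and ignores feature 1."""
    return [2.0 * row[0] for row in rows]


rows = [[float(i), float(i % 3)] for i in range(20)]
y = [2.0 * row[0] for row in rows]
importances = permutation_importance(toy_predict, rows, y, n_features=2)
print(importances)
```

In this toy setup, shuffling feature 0 raises the error, so its importance is positive, while shuffling feature 1 leaves the predictions unchanged, so its importance is zero.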

To view details of a feature, select a feature in the Features table. The right section of the page displays the Distribution histogram and Dataset statistics for the selected feature.

Distribution

The Distribution histogram displays the feature value on the x-axis and the count on the y-axis.
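
As a rough illustration of what such a histogram contains, the sketch below computes the x-axis labels and y-axis counts for one feature. It assumes equal-width bins for continuous features and one bar per distinct value for categorical features; the binning that ModelOps actually uses is not documented here, and the function name and `n_bins` parameter are hypothetical.

```python
from collections import Counter


def histogram_counts(values, n_bins=5):
    """Return (labels, counts) for a histogram-style view of one feature."""
    if all(isinstance(v, (int, float)) for v in values):
        # Continuous feature: equal-width bins between the minimum and maximum.
        lo, hi = min(values), max(values)
        width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 when all values are equal
        counts = [0] * n_bins
        for v in values:
            # The maximum value is folded into the last bin.
            index = min(int((v - lo) / width), n_bins - 1)
            counts[index] += 1
        labels = [f"[{lo + i * width:g}, {lo + (i + 1) * width:g})" for i in range(n_bins)]
        return labels, counts
    # Categorical feature: one bar per distinct value.
    tally = Counter(values)
    labels = sorted(tally)
    return labels, [tally[label] for label in labels]


continuous_labels, continuous_counts = histogram_counts([1, 2, 2, 3, 10], n_bins=3)
categorical_labels, categorical_counts = histogram_counts(["a", "b", "a"])
```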



Statistics

The dataset statistics display the following measures for the selected feature.
  • count (cnt)
  • minimum (min)
  • maximum (max)
  • mean
  • standard deviation (std)
  • skewness (skew)
  • kurtosis (kurt)
  • standard error (ste)
  • coefficient of variation (cv)
  • variance (var)
  • sum
  • uncorrected sum of squares (uss)
  • corrected sum of squares (css)
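
To make the measures concrete, the following Python sketch computes each of them for a list of numbers. The exact formulas ModelOps uses are not stated in this guide, so the choices here are assumptions: sample variance with an n - 1 denominator, moment-based skewness, excess kurtosis, and the coefficient of variation expressed as a ratio rather than a percentage. The function name is hypothetical.

```python
import math


def univariate_statistics(values):
    """Compute the listed measures; needs at least two non-constant values and a nonzero mean."""
    n = len(values)
    total = sum(values)
    mean = total / n
    uss = sum(v * v for v in values)             # uncorrected sum of squares
    css = sum((v - mean) ** 2 for v in values)   # corrected sum of squares
    var = css / (n - 1)                          # sample variance (assumption)
    std = math.sqrt(var)
    # Central moments used for skewness and kurtosis (population form, an assumption).
    m2 = css / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "cnt": n,
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "std": std,
        "skew": m3 / m2 ** 1.5,
        "kurt": m4 / m2 ** 2 - 3.0,   # excess kurtosis (assumption)
        "ste": std / math.sqrt(n),    # standard error of the mean
        "cv": std / mean,             # ratio, not percentage (assumption)
        "var": var,
        "sum": total,
        "uss": uss,
        "css": css,
    }


stats = univariate_statistics([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
for name, value in stats.items():
    print(name, value)
```

The two sums of squares are related by css = uss - cnt × mean², which gives a quick consistency check on the displayed values.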