Features are the basic building blocks of datasets. The quality of the features in your dataset has a major impact on the quality of the insights you will gain when you use that dataset for training and evaluating models.
ModelOps automatically detects each feature’s data type (categorical, continuous) and performs basic statistical analysis (mean, median, standard deviation, and more) on each feature. Additionally, ModelOps automatically generates a histogram for each feature.
Property | Description |
---|---|
Name | Specifies the name of the feature. |
Type | Specifies the data type of the feature: Continuous or Categorical. |
Importance | Specifies the feature importance, measured as the increase in the model's prediction error after the feature's values are permuted. Permuting the values breaks the relationship between the feature and the true outcome. |
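
The Importance property describes permutation feature importance. The following is a minimal sketch of that general technique, not ModelOps's actual implementation. The function name `permutation_importance`, the mean-squared-error metric, the synthetic data, and the least-squares model are all assumptions made for illustration.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Return the mean increase in MSE after permuting each feature column.

    `predict` is any callable mapping a 2-D array to predictions
    (hypothetical interface chosen for this sketch).
    """
    rng = np.random.default_rng(seed)
    baseline = np.mean((predict(X) - y) ** 2)  # error on the intact data
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling one column breaks its link to the true outcome.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            permuted_error = np.mean((predict(X_perm) - y) ** 2)
            increases.append(permuted_error - baseline)
        importances[j] = np.mean(increases)
    return importances

# Synthetic data (assumption): only the first feature drives the outcome.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 2))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=500)

# A simple least-squares linear model stands in for a trained model.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict(data):
    return data @ coef

importances = permutation_importance(predict, X, y)
```

In this sketch, the first feature should receive a much larger importance than the second, because permuting an irrelevant feature barely changes the prediction error.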
To view details of a feature, select a feature in the Features table. The right section of the page displays the Distribution histogram and Dataset statistics for the selected feature.
Distribution
The Distribution histogram displays the feature value on the x-axis and the count on the y-axis.
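
As a rough illustration of the data behind such a histogram, the sketch below bins a continuous feature with NumPy and counts the distinct values of a categorical feature. The sample values and the bin count are assumptions. ModelOps's own binning strategy is not described here.

```python
from collections import Counter

import numpy as np

# Continuous feature: bin edges go on the x-axis, counts on the y-axis.
continuous_values = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5, 10], dtype=float)
counts, bin_edges = np.histogram(continuous_values, bins=3)  # bins=3 is arbitrary

# Categorical feature: each distinct value is a bar, its frequency is the height.
categorical_values = ["red", "blue", "red", "green", "red", "blue"]
category_counts = Counter(categorical_values)
```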
Statistics
- count (cnt)
- minimum (min)
- maximum (max)
- mean
- standard deviation (std)
- skewness (skew)
- kurtosis (kurt)
- standard error (ste)
- coefficient of variation (cv)
- variance (var)
- sum
- uncorrected sum of squares (uss)
- corrected sum of squares (css)
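
The statistics listed above can be sketched with NumPy as follows. This is an illustrative calculation under stated assumptions, not the formulas ModelOps necessarily uses: variance and standard deviation use the sample (n - 1) denominator, skewness and kurtosis use the moment-based definitions with kurtosis reported as excess kurtosis, and the coefficient of variation is the plain ratio std / mean. The function name `feature_statistics` is hypothetical.

```python
import numpy as np

def feature_statistics(values):
    """Compute summary statistics for one continuous feature.

    Assumes at least two values, a nonzero mean, and nonzero variance.
    """
    x = np.asarray(values, dtype=float)
    n = x.size
    mean = x.mean()
    css = np.sum((x - mean) ** 2)   # corrected sum of squares
    uss = np.sum(x ** 2)            # uncorrected sum of squares
    var = css / (n - 1)             # sample variance (assumption)
    std = np.sqrt(var)
    m2 = css / n                    # central moments for skew and kurtosis
    m3 = np.mean((x - mean) ** 3)
    m4 = np.mean((x - mean) ** 4)
    return {
        "cnt": n,
        "min": x.min(),
        "max": x.max(),
        "mean": mean,
        "std": std,
        "skew": m3 / m2 ** 1.5,
        "kurt": m4 / m2 ** 2 - 3.0,  # excess kurtosis (assumption)
        "ste": std / np.sqrt(n),     # standard error of the mean
        "cv": std / mean,
        "var": var,
        "sum": x.sum(),
        "uss": uss,
        "css": css,
    }

stats = feature_statistics([2, 4, 4, 4, 5, 5, 7, 9])
```

Note that the corrected and uncorrected sums of squares are related by css = uss - cnt × mean², which is a quick consistency check on any reported values.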