Data Quality Reports | Factor Analysis | Vantage Analytics Library

Vantage Analytics Library User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, Lake, VMware
Product: Vantage Analytics Library
Release Number: 2.2.0
Published: March 2023
Language: English (United States)
Last Update: 2024-01-02
Product Category: Teradata Vantage

Variable Statistics

This report gives the mean value and standard deviation of each variable in the model. It is automatically generated and does not need to be requested.

Near Dependency

If requested with the neardependencyreport parameter, this report lists collinear variables, or near dependencies, in the data, based on the SSCP matrix provided as input. An entry appears in the Near Dependency report only when two conditions occur simultaneously. The first is a large condition index value associated with a specially constructed principal factor: a factor whose condition index is greater than the condition index threshold specified with neardependencyreport is a candidate for the report. The second is that two or more variables have a variance proportion greater than a threshold value for a factor with a high condition index. In other words, a ‘suspect’ factor accounts for a high proportion of the variance of two or more variables. The threshold that defines a high proportion of variance is also set with neardependencyreport and has a default value of 0.5.
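The two-condition rule can be illustrated with a short Python sketch. This is not the library's implementation: the function name, the input layout, and the sample numbers are hypothetical, and the condition index threshold of 30 used in the example is only an illustrative value, not a documented default. It assumes the condition indexes and variance proportions have already been computed.

```python
from typing import Dict, List, Tuple


def near_dependency_entries(
    condition_index: Dict[int, float],
    variance_proportion: Dict[int, Dict[str, float]],
    ci_threshold: float,
    vp_threshold: float = 0.5,
) -> List[Tuple[str, int, float, float]]:
    """Return (variable, factor, condition index, variance proportion) rows.

    Hypothetical helper: a factor contributes rows only if its condition
    index exceeds ci_threshold AND at least two variables have a variance
    proportion above vp_threshold on that factor.
    """
    rows = []
    # Visit factors from the largest condition index down.
    for factor, ci in sorted(condition_index.items(), key=lambda kv: -kv[1]):
        if ci <= ci_threshold:
            continue  # condition 1 fails: condition index is not large
        flagged = [
            (name, vp)
            for name, vp in variance_proportion[factor].items()
            if vp > vp_threshold
        ]
        if len(flagged) < 2:
            continue  # condition 2 fails: fewer than two variables involved
        rows.extend((name, factor, ci, vp) for name, vp in flagged)
    return rows


# Made-up inputs for illustration only.
condition_index = {1: 1.0, 2: 41.0, 3: 95.0}
variance_proportion = {
    1: {"x1": 0.02, "x2": 0.01, "x3": 0.00},
    2: {"x1": 0.90, "x2": 0.04, "x3": 0.03},  # high index, but only one variable
    3: {"x1": 0.08, "x2": 0.95, "x3": 0.97},  # high index and two variables
}

for row in near_dependency_entries(condition_index, variance_proportion, ci_threshold=30.0):
    print(row)
```

In this made-up data, only factor 3 produces entries. Factor 2 has a large condition index but a high variance proportion for a single variable, so it does not indicate a dependency between variables.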

The following is an example of a Near Dependency report.

Variable Name  Factor  Condition Index  Variance Proportion  Mean         Standard Deviation
CONSTANT       7       15001.8594       1                    *            *
cust_id        7       15001.8594       1                    1362987.891  293.5012
age            6       52.6169          .9963                33.744       22.3731
combo2         6       52.6169          .9935                25.733       23.4274
children       6       52.6169          .713                 .534         1.0029
income         5       35.3599          .9951                16978.026    21586.8442
combo1         5       35.3599          .995                 33654.602    43110.862
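One way to read the sample report is to group its rows by factor, since the variables that share a factor form one near dependency. The following Python sketch does that with the values copied from the report above. The variable and function names are hypothetical and are not part of the library.

```python
from collections import defaultdict

# Rows copied from the sample report:
# (variable name, factor, condition index, variance proportion)
report_rows = [
    ("CONSTANT", 7, 15001.8594, 1.0),
    ("cust_id", 7, 15001.8594, 1.0),
    ("age", 6, 52.6169, 0.9963),
    ("combo2", 6, 52.6169, 0.9935),
    ("children", 6, 52.6169, 0.713),
    ("income", 5, 35.3599, 0.9951),
    ("combo1", 5, 35.3599, 0.995),
]


def group_by_factor(rows):
    """Collect the variable names reported for each factor, in report order."""
    groups = defaultdict(list)
    for variable, factor, _condition_index, _variance_proportion in rows:
        groups[factor].append(variable)
    return dict(groups)


groups = group_by_factor(report_rows)
for factor, variables in groups.items():
    print(f"factor {factor}: {', '.join(variables)}")
# prints:
# factor 7: CONSTANT, cust_id
# factor 6: age, combo2, children
# factor 5: income, combo1
```

Each group contains at least two variables, and every variance proportion in the sample exceeds the default threshold of 0.5, which is consistent with the two conditions described for this report.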