TD_GetRowsWithoutMissingValues

Database Analytic Functions

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2024-10-04
Product Category
Teradata Vantage™

TD_GetRowsWithoutMissingValues displays the rows that have non-NULL values in specified input table columns.

A null value indicates the absence of information: the real value is unknown or does not exist, for example when no data is assigned to the column in that row. A null value is not a zero or a blank.
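The filtering behavior described above can be sketched in plain Python. This is only an illustrative analogue, not the Teradata function or its syntax: the helper `rows_without_missing_values`, the `passengers` sample data, and the use of `None` to stand in for NULL are all assumptions made for the example.

```python
def rows_without_missing_values(rows, target_columns):
    """Return the rows whose target columns are all non-null (not None)."""
    return [
        row for row in rows
        if all(row[col] is not None for col in target_columns)
    ]


# Hypothetical sample data; None plays the role of SQL NULL.
passengers = [
    {"id": 1, "age": 22, "cabin": "C85"},
    {"id": 2, "age": None, "cabin": "E46"},  # null age
    {"id": 3, "age": 0, "cabin": ""},        # zero and blank are values, not nulls
    {"id": 4, "age": 35, "cabin": None},     # null cabin
]

# Check both columns: rows 2 and 4 are dropped, row 3 is kept.
kept_ids = [r["id"] for r in rows_without_missing_values(passengers, ["age", "cabin"])]

# Check only "age": the null cabin in row 4 no longer matters.
kept_ids_age_only = [r["id"] for r in rows_without_missing_values(passengers, ["age"])]

print(kept_ids)
print(kept_ids_age_only)
```

Note that row 3 survives both checks because a zero and an empty string are real values, while only the columns named in `target_columns` are inspected for nulls.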

Null values can occur for reasons such as incomplete data entry, data corruption, or errors in data processing. They can also occur when data is not collected or is not applicable for certain records or fields.

Null values can pose a challenge for data analysis and modeling because they can affect statistical calculations. Removing null values from data has several advantages, for example:
  • Improved accuracy: Null values can skew analysis and modeling results, leading to inaccurate insights and decisions. By removing null values, you make sure your data is more accurate and reliable, which can lead to better business decisions.
  • Enhanced efficiency: Null values can slow down data processing and analysis, leading to inefficiencies and longer processing times. By removing null values, you can streamline your data processing workflows and make your operations more efficient.
  • Increased data quality: Null values can indicate missing or incomplete data, which can affect the overall quality of the data. By removing null values, you can improve the quality of your data, making it more useful and valuable for analysis and decision-making.
  • Better customer insights: Null values in customer data can limit the ability to understand customer behavior and preferences. By removing null values, you can gain a more complete picture of your customers, enabling you to provide better products and services that meet their needs.
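As a small illustration of how null values can skew a statistic, the following Python sketch compares two ways of averaging a column that contains nulls. The `order_totals` data is hypothetical, and `None` stands in for NULL; the point is only that the choice of how to treat nulls changes the result.

```python
from statistics import mean

# Hypothetical column with two missing entries (None stands in for NULL).
order_totals = [120.0, None, 80.0, None, 100.0]

# Treating nulls as zero drags the average down.
mean_nulls_as_zero = mean(0.0 if v is None else v for v in order_totals)

# Dropping nulls averages only the values that were actually observed.
observed = [v for v in order_totals if v is not None]
mean_observed = mean(observed)

print(mean_nulls_as_zero)  # 60.0
print(mean_observed)       # 100.0
```

Neither number is automatically the right one. Which treatment is appropriate depends on what a missing value means in the data.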
Therefore, it's important to identify and handle null values appropriately. Null values can be removed from a dataset using different techniques, depending on the nature and size of the dataset and the purpose of the analysis. Common techniques for removing null values include:
  • Deleting rows with null values: This technique involves removing all rows that contain null values. While this approach is straightforward, it can result in a loss of data, especially if the null values are spread across multiple columns.
  • Deleting columns with null values: This technique involves removing all columns that contain null values. While it retains the maximum number of rows, it can lead to a loss of information if the deleted columns are relevant to the analysis.
  • Imputing null values: This technique involves filling in the null values with an appropriate value, such as the mean or median of the column. Imputing null values can help retain the maximum amount of data and reduce the bias introduced by deleting rows or columns.
  • Using data modeling techniques: Advanced techniques such as regression, clustering, and decision trees can be used to predict the missing values based on other attributes in the dataset.
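The first three techniques in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how the database implements anything: the helper names (`drop_rows_with_nulls`, `drop_columns_with_nulls`, `impute_with_mean`) and the sample `rows` are hypothetical, `None` stands in for NULL, and model-based prediction of missing values is not covered.

```python
from statistics import mean

# Hypothetical sample data; None plays the role of SQL NULL.
rows = [
    {"id": 1, "height": 170.0, "weight": 65.0},
    {"id": 2, "height": None, "weight": 80.0},
    {"id": 3, "height": 180.0, "weight": 75.0},
]


def drop_rows_with_nulls(rows):
    """Keep only the rows in which every column is non-null."""
    return [r for r in rows if all(v is not None for v in r.values())]


def drop_columns_with_nulls(rows):
    """Remove every column that is null in at least one row."""
    null_cols = {c for r in rows for c, v in r.items() if v is None}
    return [{c: v for c, v in r.items() if c not in null_cols} for r in rows]


def impute_with_mean(rows, column):
    """Replace nulls in one column with the mean of its observed values.

    Raises statistics.StatisticsError if the column has no observed values.
    """
    fill = mean(r[column] for r in rows if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


fewer_rows = drop_rows_with_nulls(rows)        # loses row 2 entirely
fewer_columns = drop_columns_with_nulls(rows)  # loses the "height" column
imputed = impute_with_mean(rows, "height")     # row 2 gets height 175.0

print(fewer_rows)
print(fewer_columns)
print(imputed)
```

Each helper returns a new list and leaves `rows` unchanged, so the three results can be compared side by side to see what each technique keeps and what it discards.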

Overall, the best approach to removing null values depends on the specific needs and objectives of the analysis and the characteristics of the dataset itself.

Important: Carefully consider the trade-offs between retaining maximum data, preserving data integrity, and minimizing the potential for bias or distortion.

  • This function requires the UTF8 client character set for UNICODE data.
  • This function does not support Pass Through Characters (PTCs).

    For information about PTCs, see International Character Set Support, B035-1125.

  • This function does not support KanjiSJIS or Graphic data types.
  • SELECT TOP gives non-deterministic results. Therefore, identical queries that include this clause may produce different results.