TD_GetFutileColumns InputTable
passenger gender ticket cabin survived --------- ------ ---------------- ----- -------- 1 male A/5 21171 C 0 2 Female PC 17599 C 1 3 Female STON/O2. 3101282 C 1 4 male 113803 C 1 5 Female 373450 C 0
Creating Summary Table Using Columns
CREATE TABLE cateogrySummaryTable AS ( SELECT * FROM TD_CATEGORICALSUMMARY ( ON getFutileColumns_titanic AS InputTable USING TargetColumns('Cabin','gender','Ticket') ) AS dt) WITH data;
Output:
ColumnName DistinctValue DistinctValueCount ---------- ---------------- ------------------ cabin C 5 gender Female 3 gender male 2 ticket 373450 1 ticket A/5 21171 1 ticket PC 17599 1 ticket STON/O2. 3101282 1 ticket 113803 1
Creating Summary Table Using Indexes
Another way to run the query shown in the preceding input table is to use indexing. Instead of writing column names, you can also use indexes. Indexes start from 0 (first column is indexed by 0) and then 1, and so on.
Running the following query gives you the stats for the first, second, and third columns:
CREATE TABLE cateogrySummaryTable AS ( SELECT * FROM TD_CATEGORICALSUMMARY ( ON getFutileColumns_titanic AS InputTable USING TargetColumns('[1:3]') ) AS dt) WITH data;
Output:
ColumnName DistinctValue DistinctValueCount ---------- ---------------- ------------------ cabin C 5 gender Female 3 gender male 2 ticket 373450 1 ticket A/5 21171 1 ticket PC 17599 1 ticket STON/O2. 3101282 1 ticket 113803 1
To include all columns, you can also use '[:]'. This method only works on categorical columns, so the dataset must have all categorical columns.
TD_GetFutileColumns SQL Call
SELECT * FROM TD_getFutileColumns( ON getFutileColumns_titanic AS InputTable PARTITION BY ANY ON cateogrySummaryTable AS categorytable DIMENSION USING CategoricalSummaryColumn('ColumnName') ThresholdValue(0.7) )As dt;
TD_GetFutileColumns Output Table
ColumnName ---------- ticket cabin