Every complete example in this document is available in a zip file that you can download. The zip file includes a SQL script file that creates the input tables for the examples. If you are reading this document on https://docs.teradata.com/, you can download the zip file from the attachment in the left sidebar.
This example tests whether gender influences smoking habits affiliation. The null hypothesis is that gender and smoking habits affiliation are independent. TD_ChiSq compares the null hypothesis (expected frequencies) to the contingency table (observed frequencies).
Input: contingency1
The contingency table contains the frequencies of men and women affiliated with each smoking habit. category_1, gender, has labels "female" and "male". category_2, habits, has labels "smokers" and "non-smokers".
This example illustrates a two-way contingency table with two categories, category_1 and category_2 respectively.
Each row has a label, i which has a value between 1 to r, and each column has a label, j which has a value between 2 to c. The values of c and r are 3 and 2 respectively.
Here, category 1_label1 corresponds to females and category_1_label2 corresponds to males. Similarly, category2_label1 corresponds to smokers and category2_label2 corresponds to non-smokers.
Query to create the contingency table is as follows:
--Query to construct table contingency1 CREATE MULTISET TABLE contingency1 as ( select gender as gender , sum((case when smokinghabit = 'smoker' then 1 else 0 end)) as smoker_cnt , sum((case when smokinghabit= 'nonsmoker' then 1 else 0 end)) as nonsmoker_cnt from mytesttable group by gender ) with data; --Alternate Query to construct table contingency1 using PIVOT CREATE MULTISET TABLE contingency1 as ( SELECT * FROM (select gender as gender, smokinghabit, count(smokinghabit) as smokinghabit_count from mytesttable group by gender, smokinghabit) as mytesttable PIVOT ( SUM(smokinghabit_count) as smokinghabit_cnt FOR smokinghabit IN ('smoker' AS smoker, 'nonsmoker' AS nonsmoker))Tmp ) with data;
Habits gender smokers non-smokers ---------- --------- ------------- female 6 9 male 8 5
SQL Call
SELECT * from TD_CHISQ ( ON contingency1 AS CONTINGENCY OUT TABLE EXPCOUNTS (exptable1) USING Alpha (0.05) ) AS dt;
Output Table
chi_square cramers_v df alpha p_value criticalvalue conclusion ---------- ------------------- ---- -------------------- ------------------- -------------- ----------------------------- 1.29230769230769E 000 2.14834462211830E-001 1 5.00000000000000E-002 2.55623107546413E-001 3.84145882069412E 000 Fail to reject Null hypothesis
exptable1
gender smokers non-smokers ---------- ---------------------- ---------------------- female 7.50000000000000E 000 7.50000000000000E 000 male 6.50000000000000E 000 6.50000000000000E 000