This is a tutorial project demonstrating the idea of a “template” project for building an Analytic Data Set. One of the features of Teradata Warehouse Miner and its derivative products is that up to three projects can be automatically loaded into the project workspace at startup time, in conjunction with automatically connecting to a specified data source. Also, a feature of the ADS functions is that Join Paths, Variables, Dimensions and Transformations can be loaded from any project that has been loaded into the project workspace. Therefore, it may be helpful to have a “template” project that contains predefined Join Paths, Variables, Dimensions and/or Transformations that can be re-used as-is or after modification.
A template project will typically contain predefined Join Paths from a “subject” table, containing information about the “subject” of the ADS, to secondary tables that contain descriptive information about the subject. For the TWM demonstration tables, the primary subject table or “anchor table” is twm_customer, which contains information about customers, based on the “anchor column” cust_id. One of the tricks that this tutorial hopes to impart is that it can be helpful to first build a much smaller table to be used as the anchor table, based on the subject table.
The initial analysis, Build ADS1, builds such a table called twm_customer_10 containing only 10 rows. This table is referenced in Variable Creation1-5 as the anchor table, extracting only the anchor column cust_id from this table. All other descriptive fields are extracted from the “true” subject table, twm_customer. This procedure is followed both for performance advantages during the development of the ADS, and to prepare for substituting other anchor tables in future projects using a Refresh or Publish analysis or a stored procedure.
|Build twm_customer_10||This analysis builds the reduced anchor table twm_customer_10 by including an expert Where Clause “cust_id < 1362497”.|
|Variable Creation1||This analysis joins twm_customer_10 to tables with the same key structure, starting with the true subject table twm_customer. The variables are not of special interest; the purpose of the analysis is to provide the Join Paths to these tables.|
|Variable Creation2||This analysis joins the reduced anchor table twm_customer_10 to the combined accounts table twm_accounts and the combined transactions table twm_transactions. Some useful dimensions have been defined on the Dimensions tab, including CC, CK, SV and active. These were then combined using the Combine button to form active_CC, active_CK and active_SV. Note that the dimensions were renamed and the ELSE Condition was set to ELSE ZERO so that the dimensions can be used to create indicator variables.
The variables based on the accounts table demonstrate a somewhat curious technique. First, the variable account is defined as MAX(1). Then on the dimensions tab (not the one next to Variables, but the one just under OUTPUT), the dimensions active_CC, active_CK and active_SV are applied to the variable account. This leads to the creation of three “indicator” variables, active_CC_account, active_CK_account and active_SV_account, which assume a value of 1 if the account exists and is active for a given customer and 0 if not.
While this could have been achieved using SQL Case statements directly on the Variables tab without using the dimensions tab, these dimensions may be useful as templates. Also, MAX(1) is used rather than just the literal '1' because dimensioning would otherwise use the default aggregate of MIN, which would not give an indication of '1' if a row for the particular account was present.
The variables created based on the combined transactions table are dimensioned averages which require that the dimensions use an ELSE value of NULL. This can be seen by considering that if an ELSE value of 0 were to be used, the averages would be taken over all of the transactions for a customer rather than just those of the appropriate account type.
|Variable Creation3||This analysis introduces three dimension values based on the Target Date parameter, which has been set to 3/31/1995 on the analysis parameters tab. The demonstration tables include transactions that span the year 1995. These dimension values, prev_month, prev_2_months and prev_3_months, have been applied to the variable avg_amt_ck to produce three output columns, prev_month_avg_amt_ck, prev_2_months_avg_amt_ck and prev_3_months_avg_amt_ck. The same three dimension values have also been applied to the variables min_bal_ck and max_bal_ck, which contain the minimum and maximum checking account balances.|
|Variable Creation4||This analysis applies three similar dimension values with the same names as those in Variable Creation3, prev_month, prev_2_months and prev_3_months, to each of the variables avg_amt_cc, min_bal_cc and max_bal_cc derived from the credit transaction table.
These dimension values are similar to those in Variable Creation3 with the difference that the tran_date column is selected from the credit transaction table rather than the checking transaction table. One way to create these dimensions would be to drag and drop them from Variable Creation3, then select the credit transaction table and right-click on the tran_date column and select the option to Switch '<table name>' to Current Input. It would also be necessary to set the Target Date parameter appropriately on the literal parameters tab.
|Variable Creation5||This analysis applies three similar dimension values with the same names as those in Variable Creation5, prev_month, prev_2_months and prev_3_months, to each of the variables avg_amt_sv, min_bal_sv and max_bal_sv derived from the savings transaction table.
These dimension values are similar to those in Variable Creation3 with the difference that the tran_date column is selected from the savings transaction table rather than the checking transaction table. One way to create these dimensions would be to drag and drop them from Variable Creation3, then select the savings transaction table and right-click on the tran_date column and select the option to Switch '<table name>' to Current Input. It would also be necessary to set the Target Date parameter appropriately on the literal parameters tab.
|Build ADS1||Build ADS1 builds the final data set, joining together volatile output from Variable Creation1-5 by referencing them for input. The Join Path Wizard was used to specify that they all be joined together on cust_id, resulting in the creation of the output table “twm_supplemental_tutorial”.|