CALL td_analyze ( 'dataexplorer', 'required_parameter_list [ optional_parameter; [...] ]' );
- required_parameter_list
database = input_database_name [,...]; tablename = input_table_name [,...]; outputdatabase = output_database_name;
- optional_parameter
{ bins = number_of_bins | binstyle = { bins | quantiles } | columns = column_name [,...] | frequencyoutputtablename = freq_output_table | gensql = { true | false } | histogramoutputtablename = hist_output_table | maxuniquecharvalues = max_unique_char_values | maxuniquenumvalues = max_unique_num_values | overwrite = { true | false } | restrictedfreqproc = { true | false } | restrictedthreshold = threshold | statisticalmethod = { population | sample } | statisticsoutputtablename = stat_output_table | statsoptions = { all | stat_option [,...] } | uniques = { true | false } | valuesoutputtablename = values_output_table | where = expression }
Syntax Elements
- database
- The database or databases containing the table or tables to analyze.
- tablename
- The table or tables containing the columns to analyze.
- outputdatabase
- The database in which to store the output tables specified by frequencyoutputtablename, histogramoutputtablename, statisticsoutputtablename, and valuesoutputtablename.
- bins
- [Optional] The number of bins to create in the histogram.
- binstyle
- [Optional] The bin style, which is one of the following options:
Option Description bins=n[,…] Create n bins of equal width. If you specify multiple columns, you can specify the number of bins for each column.
If you specify fewer n values than columns, leftover columns have default number of bins, 10.
widths=n[,…] Create bins of width n. If you specify multiple columns, you must specify a width for each column.
If you specify fewer n values than columns, an error message displays.
quantiles=n[,…] Create approximately n equally populated bins. If you specify multiple columns, you can specify the approximate number of equally populated bins for each column.
If you specify fewer n values than columns, leftover columns have default number of equally populated bins, 10.
boundaries=n[,…] or
boundaries={n[,…]}...
Create bins with boundaries at specified numbers. Example: boundaries=0,50,100,150 creates 3 bins between 0 and 150 (0 to 50, 50 to 100, and 100 to 150).
If you specify multiple columns, you must specify boundaries for each column. Example: boundaries={0, 50000, 100000, 150000}, {0, 50, 100}
binwithminmax=n, min, max or
boundaries={n, min, max}...
Create n bins ranging from min to max. Example: binwithminmax=5,0,200 creates 5 bins ranging from 0 to 200.
If you specify multiple columns, you must specify n, min, and max for each column. Example: binwithminmax={10, 0, 200000}, {5, 0, 100}
- columns
- [Optional] The column or columns to analyze.
- frequencyoutputtablename
- [Optional] The name of the frequency output table. If you do not want to overwrite previous results, specify a unique name.
- gensql
- [Optional] True returns the SQL for the function as a result set after initiating, up to a limited number of characters.
- histogramoutputtablename
- [Optional] The name of the histogram output table. If you do not want to overwrite previous results, specify a unique name.
- maxuniquecharvalues
- [Optional] The maximum number of unique character values for unrestricted frequency analysis.
- maxuniquenumvalues
- [Optional] The maximum number of unique numeric or date values for frequency analysis.
- overwrite
- [Optional] Whether to drop the output tables before creating new ones.
- restrictedfreqproc
- [Optional] Whether to perform restricted Frequency analysis on character columns with more unique values than maxuniquecharvalues specifies. That is, whether to include their prominent values in the frequency output table.
- restrictedthreshold
- [Optional] The minimum percentage of rows in which a value must occur to be included in the results.
- statisticalmethod
- [Optional] Whether to calculate the statistic by population or sample.
- statisticsoutputtablename
- [Optional] The name of the statistics output table. If you do not want to overwrite previous results, specify a unique name.
- statsoptions
- [Optional] The statistic or statistics to calculate for the specified column or columns, either stat_option or its abbreviation.
- uniques
- [Optional] Whether to count unique values for each selected column.
- valuesoutputtablename
- [Optional] The name of the values output table. If you do not want to overwrite previous results, specify a unique name.
- where
- [Optional] The expression in the SQL WHERE clause to include in the generated SQL to filter rows selected for analysis.Examples:
- where = cust_id > 0 includes this WHERE clause in the generated SQL:
WHERE cust_id > 0
- where = gender = ''F'' includes this WHERE clause in the generated SQL:
WHERE gender='F'
- Use this form of the where parameter to request a WHERE clause for each input table. The WHERE clause must match positionally with the tablename and database parameters with each WHERE clause in braces {} as in the following example:
where= {cust_id < 1362490}, {cust_id < 1362490};
- where = cust_id > 0 includes this WHERE clause in the generated SQL: