CALL td_analyze ( 'histogram', 'required_parameter_list [ optional_parameter; [...] ]' );
- required_parameter_list
database = input_database_name; tablename = input_table_name; columns = { column_name [,...] | keyword };
- optional_parameter
{ binstyle = bin_style | columnstoexclude = column_name [,...] | gensqlonly = { true | false } | outputdatabase = output_database_name | outputtablename = output_table_name | overlaycolumns = column_name [,...] | overwrite = { true | false } | statisticscolumns = column_name [,...] | style = { basic | crosstab } | where = expression }
Syntax Elements
- database
- The database containing the table to analyze.
- tablename
- The table containing the columns to analyze.
- columns
- The columns to analyze.
- binstyle
- [Optional] The bin style, which is one of the following options:
Option Description bins=n[,…] Create n bins of equal width. If you specify multiple columns, you can specify the number of bins for each column.
If you specify fewer n values than columns, leftover columns have default number of bins, 10.
widths=n[,…] Create bins of width n. If you specify multiple columns, you must specify a width for each column.
If you specify fewer n values than columns, an error message displays.
quantiles=n[,…] Create approximately n equally populated bins. If you specify multiple columns, you can specify the approximate number of equally populated bins for each column.
If you specify fewer n values than columns, leftover columns have default number of equally populated bins, 10.
boundaries=n[,…] or
boundaries={n[,…]}...
Create bins with boundaries at specified numbers. Example: boundaries=0,50,100,150 creates 3 bins between 0 and 150 (0 to 50, 50 to 100, and 100 to 150).
If you specify multiple columns, you must specify boundaries for each column. Example: boundaries={0, 50000, 100000, 150000}, {0, 50, 100}
binwithminmax=n, min, max or
boundaries={n, min, max}...
Create n bins ranging from min to max. Example: binwithminmax=5,0,200 creates 5 bins ranging from 0 to 200.
If you specify multiple columns, you must specify n, min, and max for each column. Example: binwithminmax={10, 0, 200000}, {5, 0, 100}
- columnstoexclude
- [Optional] The columns to exclude when columns specifies a keyword.
- gensqlonly
- [Optional] True returns the SQL for the function as a result set but does not run it.
- outputdatabase
- [Optional] The name of the database to contain the analysis results table.
- outputtablename
- [Optional] The name of the table to store the analysis results.
- overlaycolumns
- [Optional] The column or columns for which to calculate frequencies within each bin for each value. Each specified column must be a categorical variable with few values, and statisticscolumns cannot also specify it.
- overwrite
- [Optional] Whether to drop the output tables before creating new ones.
- statisticscolumns
- [Optional] The column or columns for which to calculate simple statistics (minimum, maximum, mean and standard deviation) in each bin. Each specified column must be numeric (not DATE), and overlaycolumns cannot also specify it.
- style
- [Optional] The histogram style:
Histogram Style Description basic (default) Histogram for individual columns. If columns specifies multiple columns, function repeats SELECT statement for each column.
crosstab Multidimensional histogram (combines columns). If columns specifies multiple columns, function cross-tabulates all columns within single SELECT statement. If you also specify outputtablename, function creates output table only once, repeating INSERT SELECT statement.
- where
- [Optional] The expression in the SQL WHERE clause to include in the generated SQL to filter rows selected for analysis.