Column Specification
Some analytic functions in the Analytics Database provide arguments for selecting or performing operations on multiple columns. You can specify columns for such an argument in any of the following ways:
- Pass a single column name.
For example, column_arg = "col1"
- Pass multiple columns (specific columns only).
For example, column_arg = ["col1", "col3", "col8"]
- Pass multiple columns using DataFrame.columns and slice filtering.
For example, column_arg = list(set(df.columns[2:10]) - set(df.columns[5:7]))
- Pass multiple columns as a column range.
For example, column_arg = "column_range"
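The slice-filtering form above relies on a set difference, and Python sets do not guarantee element order. The following minimal sketch uses a plain list as a stand-in for DataFrame.columns (the column names are hypothetical) and shows an order-preserving alternative, in case column order matters for the argument.

```python
# Plain list standing in for DataFrame.columns (hypothetical column names).
columns = [f"col{i}" for i in range(12)]

# Set difference picks the right columns but does not guarantee their order.
unordered = list(set(columns[2:10]) - set(columns[5:7]))

# Order-preserving equivalent: filter the slice against the excluded names.
excluded = set(columns[5:7])
ordered = [c for c in columns[2:10] if c not in excluded]

# Both forms contain the same columns; only "ordered" keeps table order.
print(set(unordered) == set(ordered))
print(ordered)
```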
Specifying Column Range column_range
- Without column exclusion.
Syntax: "start_column:end_column"
Must be passed as a string.
- With column exclusion.
Syntax: ["start_column:end_column", "-exclude_column1", ...]
Must be passed as a list of strings.
The start_column and end_column can be any of the following:
- Column names.
For example, "column1:column2".
- Nonnegative integers that represent the indexes of columns in the table. The first column has index 0.
For example, "0:4" specifies the first five columns in the table.
- Empty. For example:
- ":4" or ":columnD" specifies all columns up to and including the column with index 4 or columnD.
- "4:" or "columnD:" specifies the column with index 4 (or columnD) and all columns after it.
- ":" specifies all columns in the table.
The exclude_column is a column in the specified range, represented by either its name or its index. Each exclusion is prefixed with a hyphen, and an index is enclosed in square brackets.
For example, ["0:99", "-[50]", "-column10"] specifies the columns with indexes from 0 to 99, except the column with index 50 and column10.
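To make the range and exclusion rules above concrete, here is a minimal sketch of how such a specification could be expanded into column names. The function resolve_column_range is a hypothetical helper written for illustration, not part of the library, and it assumes that column names contain no colons and are not purely numeric.

```python
def resolve_column_range(columns, spec):
    """Expand a column_range spec into column names (illustrative only)."""

    def position(token, default):
        # An empty endpoint falls back to the first or last column.
        if token == "":
            return default
        # Nonnegative integers are zero-based column indexes.
        if token.isdecimal():
            return int(token)
        # Anything else is treated as a column name.
        return columns.index(token)

    def expand(range_spec):
        if ":" not in range_spec:
            raise ValueError("expected 'start_column:end_column'")
        start, _, end = range_spec.partition(":")
        first = position(start, 0)
        last = position(end, len(columns) - 1)
        return columns[first:last + 1]  # the end column is included

    # Without exclusion: a single string.
    if isinstance(spec, str):
        return expand(spec)

    # With exclusion: a list whose first item is the range.
    range_spec, *exclusions = spec
    excluded = set()
    for item in exclusions:
        target = item[1:]  # drop the leading "-"
        if target.startswith("[") and target.endswith("]"):
            excluded.add(columns[int(target[1:-1])])  # exclusion by index
        else:
            excluded.add(target)  # exclusion by name
    return [c for c in expand(range_spec) if c not in excluded]


# Hypothetical table with six columns.
cols = ["columnA", "columnB", "columnC", "columnD", "columnE", "columnF"]
print(resolve_column_range(cols, "0:4"))
print(resolve_column_range(cols, ":columnD"))
print(resolve_column_range(cols, "4:"))
print(resolve_column_range(cols, ["0:5", "-[2]", "-columnE"]))
```

The sketch resolves exclusions against the whole table and then filters the range, so an exclusion outside the range is silently ignored; the library itself may validate this differently.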
- A column name can be enclosed in double quotes. For example, if a DataFrame has columns named :columnA, columnB:, and :column:C:, you can select them by enclosing each name in double quotes:
- "\":columnA\"" or '":columnA"'
- "\"columnB:\"" or '"columnB:"'
- "\":column:C:\"" or '":column:C:"'
- You can always use column range and exclusion by index instead of column names.
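Writing the escaped quotes by hand is error-prone, so the quoting rule above can be wrapped in small helpers. The following sketch is an assumption for illustration only: quote_column and make_range are hypothetical names, not library functions, and names that themselves contain double quotes are not handled.

```python
def quote_column(name):
    """Wrap a column name in double quotes if it contains a colon."""
    return f'"{name}"' if ":" in name else name


def make_range(start="", end=""):
    """Build a 'start_column:end_column' string; empty endpoints stay empty."""
    return f"{quote_column(start)}:{quote_column(end)}"


# Range between two column names that contain colons.
print(make_range(":columnA", "columnB:"))

# Open-ended range up to and including a column whose name contains colons.
print(make_range(end=":column:C:"))

# Range with an exclusion by quoted name.
spec = [make_range(), "-" + quote_column(":columnA")]
print(spec)
```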
Column Range Examples
Assume the "insect_sprays" DataFrame has columns groupA, :groupB, groupC:CC, groupD:, groupE:EE:EEE, and :groupF:. The following examples show how to select a range of columns for the group_columns argument of the ANOVA function.
ANOVA(data=insect_sprays, group_columns="2:5", alpha=0.025)
ANOVA(data=insect_sprays, group_columns=":5", alpha=0.025)
ANOVA(data=insect_sprays, group_columns='"groupC:CC":"groupE:EE:EEE"', alpha=0.05)
ANOVA(data=insect_sprays, group_columns='"groupE:EE:EEE":', alpha=0.05)
ANOVA(data=insect_sprays, group_columns=[':', "-\":groupB\""], alpha=0.05)  # or group_columns=[':', "-[1]"]