- BinomialTest(data, first_column=None, binomial_prob=0.5, exact_matches='negative', fallback=False, group_columns=None, allow_duplicates=False, second_column=None, single_tail=False, stats_database=None, style='binomial', probability_threshold=0.05, gen_sql_only=False)
- DESCRIPTION:
A binomial test assumes N independent trials, each with two possible outcomes
that are equally likely by default. You can choose to perform a binomial test,
in which the sign of the difference between a first and second column is
analyzed, or a sign test, in which the sign of a single column is analyzed. In a
binomial test, you can use a probability different from the default value,
whereas in a sign test, the binomial probability is fixed at 0.5.
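The difference between the two styles can be sketched in plain Python. This is
an illustration of the underlying statistic only, not the Vantage Analytic
Library implementation. The helper names upper_tail and sign_counts and the
sample values are hypothetical, and zero values are simply dropped here.

```python
from math import comb

def upper_tail(k, n, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def sign_counts(values):
    """Count strictly positive and strictly negative values (zeros dropped)."""
    values = list(values)
    pos = sum(1 for v in values if v > 0)
    neg = sum(1 for v in values if v < 0)
    return pos, neg

# Binomial style: analyze the sign of (first - second) for each row.
first = [5.0, 7.5, 3.2, 9.1, 4.4, 6.0]
second = [4.0, 7.0, 3.9, 8.0, 4.0, 5.5]
b_pos, b_neg = sign_counts(a - b for a, b in zip(first, second))
binomial_p = upper_tail(b_pos, b_pos + b_neg)

# Sign style: analyze the sign of a single column, with probability fixed at 0.5.
single = [1.2, -0.3, 2.5, 0.8]
s_pos, s_neg = sign_counts(single)
sign_p = upper_tail(s_pos, s_pos + s_neg)

print(b_pos, b_neg, binomial_p)
print(s_pos, s_neg, sign_p)
```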
PARAMETERS:
data:
Required Argument.
Specifies the input data to run statistical tests.
Types: teradataml DataFrame
first_column:
Required Argument.
Specifies the name of the column to analyze.
Types: str
binomial_prob:
Optional Argument.
Specifies the binomial probability to use for Binomial Test.
Note:
This is not allowed with sign test.
Default Value: 0.5
Types: float
exact_matches:
Optional Argument.
Specifies the category to place exact matches in.
Note:
This is not allowed with sign test.
Permitted Values:
* 'zero' - exact match is discarded.
* 'positive' - match is placed with values greater than or equal to zero.
* 'negative' - match is placed with values less than or equal to zero.
Default Value: 'negative'
Types: str
fallback:
Optional Argument.
Specifies whether FALLBACK is requested for the output result.
Default Value: False (Not requested)
Types: bool
group_columns:
Optional Argument.
Specifies the name(s) of the column(s) for grouping so that a separate result
is produced for each value or combination of values in the specified column or
columns.
Types: str OR list of Strings (str)
allow_duplicates:
Optional Argument.
Specifies whether duplicates are allowed in the output or not.
Default Value: False
Types: bool
second_column:
Required Argument for binomial test.
Specifies the name of the column representing the second variable to analyze.
Note:
This is not allowed with sign test.
Types: str
single_tail:
Optional Argument.
Specifies whether to request single-tailed test or not. When set to True, a
single-tailed test is requested. Otherwise, a two-tailed test is requested.
Note:
If the binomial probability is not 0.5, "single_tail" must be set to True.
Default Value: False
Types: bool
stats_database:
Optional Argument.
Specifies the database where the statistical test metadata tables are installed.
If not specified, the source database is searched for these metadata tables.
Types: str
style:
Optional Argument.
Specifies the test style.
Permitted Values: 'binomial', 'sign'
Default Value: 'binomial'
Types: str
probability_threshold:
Optional Argument.
Specifies the threshold probability, i.e., "alpha" probability, below which the
null hypothesis is rejected.
Default Value: 0.05
Types: float
gen_sql_only:
Optional Argument.
Specifies whether to generate only the SQL for the function.
When set to True, the function SQL is generated but not executed, and can be
accessed using the show_query() method. Otherwise, the SQL is executed but not
returned.
Default Value: False
Types: bool
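How "exact_matches", "single_tail" and "probability_threshold" interact can be
sketched as follows. This is a rough illustration under stated assumptions, not
the library's algorithm: the helpers classify and p_value are hypothetical, and
doubling the smaller tail is only one common two-tailed convention, which may
differ from what Vantage Analytic Library computes.

```python
from math import comb

def classify(diffs, exact_matches="negative"):
    """Return (n_positive, n_negative) under the chosen exact-match policy."""
    if exact_matches not in ("zero", "positive", "negative"):
        raise ValueError("exact_matches must be 'zero', 'positive' or 'negative'")
    pos = sum(1 for d in diffs if d > 0)
    neg = sum(1 for d in diffs if d < 0)
    ties = sum(1 for d in diffs if d == 0)
    if exact_matches == "positive":
        pos += ties          # ties counted with the positive differences
    elif exact_matches == "negative":
        neg += ties          # ties counted with the negative differences
    return pos, neg          # 'zero': ties are discarded

def p_value(pos, neg, binomial_prob=0.5, single_tail=False):
    """Tail probability of the positive count under Binomial(n, binomial_prob)."""
    n = pos + neg
    pmf = [comb(n, i) * binomial_prob**i * (1 - binomial_prob)**(n - i)
           for i in range(n + 1)]
    tail = min(sum(pmf[:pos + 1]), sum(pmf[pos:]))  # smaller of lower and upper tail
    if single_tail:
        return tail
    # Assumed two-tailed convention: double the smaller tail, capped at 1.
    return min(1.0, 2 * tail)

# Hypothetical differences (first_column - second_column) with two exact matches.
diffs = [1.5, 0.0, 2.0, 0.7, 0.0, 3.1, -0.4, 1.2, 0.9, 2.2]
probability_threshold = 0.05

# True means the null hypothesis is rejected at the given threshold.
decisions = {
    policy: p_value(*classify(diffs, policy)) < probability_threshold
    for policy in ("zero", "positive", "negative")
}
print(decisions)
```

In this sample, only the 'positive' policy pushes the two-tailed probability
below the threshold, which shows why the placement of exact matches matters.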
RETURNS:
An instance of BinomialTest.
Output teradataml DataFrames can be accessed using attribute references, such as
BinomialTestObj.<attribute_name>.
Output teradataml DataFrame attribute name is: result.
RAISES:
TeradataMlException, TypeError, ValueError
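The argument constraints noted under PARAMETERS can be sketched as a pre-check
that runs before calling the function. The helper check_args is hypothetical,
and whether the library reports these conditions with the same messages or
exception types is an assumption. None stands for "left at its default".

```python
def check_args(style="binomial", second_column=None, binomial_prob=None,
               exact_matches=None, single_tail=False):
    """Return a list of constraint violations; an empty list means none found."""
    problems = []
    if style not in ("binomial", "sign"):
        problems.append("style must be 'binomial' or 'sign'")
    elif style == "sign":
        # These arguments are documented as not allowed with sign test.
        for name, value in (("second_column", second_column),
                            ("binomial_prob", binomial_prob),
                            ("exact_matches", exact_matches)):
            if value is not None:
                problems.append(f"{name} is not allowed with sign test")
    else:
        if second_column is None:
            problems.append("second_column is required for binomial test")
        if binomial_prob is not None and binomial_prob != 0.5 and not single_tail:
            problems.append("single_tail must be True when binomial_prob is not 0.5")
    return problems

print(check_args(second_column="avg_ck_bal"))
print(check_args(style="sign", second_column="avg_ck_bal"))
print(check_args(second_column="avg_ck_bal", binomial_prob=0.3))
```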
EXAMPLES:
# Notes:
# 1. To execute Vantage Analytic Library functions,
# a. import "valib" object from teradataml.
# b. set 'configure.val_install_location' to the database name where Vantage
# Analytic Library functions are installed.
# 2. Datasets used in these examples can be loaded using Vantage Analytic Library
# installer.
# 3. The Statistical Test metadata tables must be loaded into the database where
# Vantage Analytic Library is installed.
# Import valib object from teradataml to execute this function.
from teradataml import valib
# Set the 'configure.val_install_location' variable.
from teradataml import configure
configure.val_install_location = "SYSLIB"
# Create required teradataml DataFrame.
from teradataml import DataFrame
custanly = DataFrame("customer_analysis")
print(custanly)
# Example 1: A binomial test without any grouping.
obj = valib.BinomialTest(data=custanly,
first_column="avg_sv_bal",
second_column="avg_ck_bal")
# Print the results.
print(obj.result)
# Example 2: A binomial test with grouping done by gender.
obj = valib.BinomialTest(data=custanly,
first_column="avg_sv_bal",
second_column="avg_ck_bal",
group_columns="gender")
# Print the results.
print(obj.result)
# Example 3: A sign test without any grouping.
obj = valib.BinomialTest(data=custanly,
first_column="avg_sv_bal",
style="sign")
# Print the results.
print(obj.result)
# Example 4: A sign test with grouping done by gender.
obj = valib.BinomialTest(data=custanly,
first_column="avg_sv_bal",
style="sign",
group_columns="gender")
# Print the results.
print(obj.result)
# Example 5: Generate only the SQL for the function, without executing it.
obj = valib.BinomialTest(data=custanly,
first_column="avg_sv_bal",
group_columns="gender",
second_column="avg_ck_bal",
stats_database="alice",
gen_sql_only=True)
# Print the generated SQL.
print(obj.show_query("sql"))
# Print both generated SQL and stored procedure call.
print(obj.show_query("both"))
# Print the stored procedure call.
print(obj.show_query())
print(obj.show_query("sp"))