Methods defined here:
- __init__(self, data=None, value_column=None, accumulate=None, segmentation_method='normal_distribution', window_size=10, threshold=10.0, output_option='CHANGEPOINT', data_sequence_column=None, data_partition_column=None, data_order_column=None)
- DESCRIPTION:
The ChangePointDetectionRT function detects change points in a
stochastic process or time series, using real-time change-point
detection, implemented with these algorithms:
• Search algorithm: sliding window
• Segmentation algorithm: normal distribution
Use this function when the input data cannot be stored in Teradata
Vantage memory, or when the application requires a real-time
response. If the input data can be stored in Teradata Vantage memory
and the application does not require a real-time response, use the
function ChangePointDetection.
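
The sliding-window search and normal-distribution segmentation described above can be sketched in plain Python. This is an illustration under stated assumptions, not the function's actual implementation: it assumes L0 is the likelihood of two adjacent windows under a single normal distribution and L1 is the likelihood when each window has its own normal distribution. The helper names gaussian_log_likelihood and detect_change_points are hypothetical.

```python
import math


def gaussian_log_likelihood(values):
    """Maximized log-likelihood of values under one normal distribution."""
    n = len(values)
    mean = sum(values) / n
    # Maximum-likelihood variance, floored to avoid log(0) on constant data.
    var = max(sum((v - mean) ** 2 for v in values) / n, 1e-12)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def detect_change_points(series, window_size=10, threshold=10.0):
    """Return indices t where ln(L1) - ln(L0) is a local peak above threshold.

    Assumption (not taken from the documentation): L0 models the two windows
    around t with one normal distribution, L1 models them separately.
    """
    n, w = len(series), window_size
    scores = {}
    for t in range(w, n - w + 1):
        left, right = series[t - w:t], series[t:t + w]
        log_l0 = gaussian_log_likelihood(left + right)
        log_l1 = gaussian_log_likelihood(left) + gaussian_log_likelihood(right)
        scores[t] = log_l1 - log_l0
    change_points = []
    for t, score in scores.items():
        # Keep only the strongest candidate within one window on either side.
        neighbours = [scores[u] for u in range(t - w, t + w + 1) if u in scores]
        if score > threshold and score == max(neighbours):
            change_points.append(t)
    return change_points


# A level shift after 30 observations.
series = [0.0, 1.0] * 15 + [50.0, 51.0] * 15
print(detect_change_points(series, window_size=10, threshold=10.0))
```

In this sketch, raising threshold suppresses weaker candidates and changing window_size changes how many observations support each comparison, which is why both values usually need tuning against the data.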
PARAMETERS:
data:
Required Argument.
Specifies the teradataml DataFrame defining the input time series
data.
data_partition_column:
Required Argument.
Specifies the Partition By columns for data. Values for this argument
can be provided as a list if multiple columns are used for
partitioning.
Types: str OR list of Strings (str)
data_order_column:
Required Argument.
Specifies the Order By columns for data. Values for this argument can
be provided as a list if multiple columns are used for ordering.
Types: str OR list of Strings (str)
value_column:
Required Argument.
Specifies the name of the input teradataml DataFrame column that
contains the time series data.
Types: str OR list of Strings (str)
accumulate:
Optional Argument.
Specifies the names of the input teradataml DataFrame columns to
copy to the output teradataml DataFrame.
Tip: To identify change points in the output teradataml DataFrame,
specify the columns that appear in data_partition_column and
data_order_column.
Note:
The 'accumulate' argument is required when teradataml is connected to
a Vantage version prior to 1.1.1.
Types: str OR list of Strings (str)
segmentation_method:
Optional Argument.
Specifies the segmentation method. The only permitted method is
normal_distribution, in which the data in each segment follows a
normal distribution.
Default Value: normal_distribution
Permitted Values: normal_distribution
Types: str
window_size:
Optional Argument.
Specifies the size of the sliding window. The ideal window size
depends heavily on the data. You might need to experiment with
this value.
Default Value: 10
Types: int
threshold:
Optional Argument.
Specifies the float value that the function compares to
ln(L1) - ln(L0), the difference between the natural logarithms of L1
and L0. L1 and L0 are defined in "Background".
Default Value: 10.0
Types: float
output_option:
Optional Argument.
Specifies the output teradataml DataFrame columns.
Default Value: CHANGEPOINT
Permitted Values: CHANGEPOINT, SEGMENT, VERBOSE
Types: str
data_sequence_column:
Optional Argument.
Specifies the column or list of columns that uniquely identify each
row of the input argument "data". The argument is used to ensure
deterministic results for functions whose results can vary from run
to run.
Types: str OR list of Strings (str)
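
The roles of data_partition_column and data_order_column can be pictured with a small, database-free sketch. It assumes, as is usual for Partition By and Order By clauses, that each partition is treated as an independent series whose rows are processed in the order given by the order column. The helper build_series and the sample rows are hypothetical and are not part of teradataml.

```python
from collections import defaultdict


def build_series(rows, partition_column, order_column, value_column):
    """Group rows by partition and return each partition's ordered values."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_column]].append(row)
    return {
        key: [r[value_column] for r in sorted(group, key=lambda r: r[order_column])]
        for key, group in groups.items()
    }


# Hypothetical rows shaped like the documented columns 'sid', 'id' and 'val'.
rows = [
    {"sid": 1, "id": 2, "val": 10.5},
    {"sid": 2, "id": 1, "val": 7.0},
    {"sid": 1, "id": 1, "val": 10.1},
    {"sid": 2, "id": 2, "val": 7.3},
]
print(build_series(rows, "sid", "id", "val"))
```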
RETURNS:
Instance of ChangePointDetectionRT.
Output teradataml DataFrames can be accessed using attribute
references, such as ChangePointDetectionRTObj.<attribute_name>.
Output teradataml DataFrame attribute name is:
result
RAISES:
TeradataMlException
EXAMPLES:
# Load example data.
load_example_data("changepointdetectionRT", "cpt")
# The provided example table 'cpt' contains time series data in
# column 'val'. Each value is identified by columns 'sid' and 'id'.
# Create teradataml DataFrame objects.
cpt = DataFrame.from_table('cpt')
# Example 1 : With default window_size, threshold, output_option
ChangePointDetectionRT_out = ChangePointDetectionRT(data = cpt,
                                                    value_column = "val",
                                                    data_partition_column = 'sid',
                                                    data_order_column = 'id',
                                                    accumulate = ["sid","id"]
                                                    )
# Print the results
print(ChangePointDetectionRT_out.result)
# Example 2 : With window_size 3, threshold 20, VERBOSE output
ChangePointDetectionRT_out = ChangePointDetectionRT(data = cpt,
                                                    value_column = "val",
                                                    data_partition_column = 'sid',
                                                    data_order_column = 'id',
                                                    accumulate = ["sid","id"],
                                                    window_size = 3,
                                                    threshold = 20.0,
                                                    output_option = "VERBOSE"
                                                    )
# Print the results
print(ChangePointDetectionRT_out.result)
- __repr__(self)
- Returns the string representation for a ChangePointDetectionRT class instance.
- get_build_time(self)
- Function to return the build time of the algorithm in seconds.
When the model object is created using retrieve_model(), the value returned is
the one saved in the Model Catalog.
- get_prediction_type(self)
- Function to return the prediction type of the algorithm.
When the model object is created using retrieve_model(), the value returned is
the one saved in the Model Catalog.
- get_target_column(self)
- Function to return the target column of the algorithm.
When the model object is created using retrieve_model(), the value returned is
the one saved in the Model Catalog.
- show_query(self)
- Function to return the underlying SQL query.
When the model object is created using retrieve_model(), None is returned.