| |
Methods defined here:
- __init__(self, data=None, input_columns=None, index_columns=None, range=None, wavelet=None, wavelet_filter=None, level=None, extension_mode='sym', compact_output=True, partition_columns=None, data_sequence_column=None, wavelet_filter_sequence_column=None)
- DESCRIPTION:
The DWT2D function implements the Mallat algorithm (an iterative
algorithm in the Discrete Wavelet Transform field) on 2-dimensional
matrices and applies the wavelet transform to multiple sequences
simultaneously.
The input is a set of sequences. Typically, each sequence is a matrix
whose rows each contain a position in 2-dimensional space (y and x
indexes or coordinates) and the corresponding values. You specify the
wavelet name or a wavelet filter teradataml DataFrame, the transform
level, and (optionally) the extension mode. The function returns the
transformed sequences in Hilbert space with the corresponding
component identifiers and indices. (The transformation is also called
the decomposition.)
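To make the decomposition concrete, the following is a minimal sketch of the Mallat scheme for the Haar ('db1') wavelet in plain Python. It is not the DWT2D implementation: the helper names (haar_step, transpose, haar_2d_level, haar_2d) are hypothetical, the input is assumed to be a dense list-of-lists matrix whose dimensions stay even at every level, and no boundary extension is applied because Haar does not need one. Sub-band naming conventions vary between libraries, so the labels below are only illustrative.

```python
import math

def haar_step(seq):
    """One Haar analysis step on an even-length sequence.

    Returns (approximation, detail), each half the input length.
    """
    s = math.sqrt(2.0)
    approx = [(seq[i] + seq[i + 1]) / s for i in range(0, len(seq), 2)]
    detail = [(seq[i] - seq[i + 1]) / s for i in range(0, len(seq), 2)]
    return approx, detail

def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]

def _filter_rows(matrix):
    """Apply haar_step to every row; return (low-pass, high-pass) matrices."""
    pairs = [haar_step(row) for row in matrix]
    return [a for a, _ in pairs], [d for _, d in pairs]

def haar_2d_level(matrix):
    """One decomposition level: filter the rows, then the columns."""
    low, high = _filter_rows(matrix)
    # Columns are filtered by transposing, filtering rows, transposing back.
    ll, lh = (transpose(m) for m in _filter_rows(transpose(low)))
    hl, hh = (transpose(m) for m in _filter_rows(transpose(high)))
    return {"approx": ll, "detail_lh": lh, "detail_hl": hl, "detail_hh": hh}

def haar_2d(matrix, level):
    """Mallat iteration: re-apply the one-level step to the approximation."""
    details = []
    approx = matrix
    for _ in range(level):
        bands = haar_2d_level(approx)
        approx = bands.pop("approx")
        details.append(bands)
    return approx, details
```

Each level halves both dimensions of the approximation sub-band, which is why the transform level is bounded by the size of the input matrix in this sketch.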
PARAMETERS:
data:
Required Argument.
Specifies the teradataml DataFrame that contains the sequences to be
transformed.
input_columns:
Required Argument.
Specifies the names of the columns in the input teradataml DataFrame
that contain the data to be transformed. These columns must contain
numeric values between -1e308 and 1e308. The function treats
NULL in columns as 0.
Types: str OR list of Strings (str)
index_columns:
Required Argument.
Specifies the columns that contain the indexes of the input
sequences. This argument should have exactly two column names.
One column contains the x coordinates and the other
contains y coordinates.
Types: str OR list of Strings (str)
range:
Optional Argument.
Specifies the start and end indexes of the input data, all of which
must be integers. The default values for each sequence are:
• starty: minimum y index
• startx: minimum x index
• endy: maximum y index
• endx: maximum x index.
The function treats any NULL value as 0.
The range can specify a maximum of 1,000,000 cells.
Types: str
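The cell limit can be checked before calling the function. The sketch below is only an illustration: the helper names are hypothetical, and it assumes that the start and end indexes are inclusive, which the description above does not state explicitly. It does not build the range string itself.

```python
MAX_RANGE_CELLS = 1_000_000  # limit stated in the "range" description

def range_cell_count(starty, startx, endy, endx):
    """Number of cells in the rectangle, assuming inclusive integer bounds."""
    if endy < starty or endx < startx:
        raise ValueError("end indexes must not be smaller than start indexes")
    return (endy - starty + 1) * (endx - startx + 1)

def range_is_within_limit(starty, startx, endy, endx):
    """True if the rectangle holds at most MAX_RANGE_CELLS cells."""
    return range_cell_count(starty, startx, endy, endx) <= MAX_RANGE_CELLS
```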
wavelet:
Optional Argument.
Specifies a wavelet filter name.
Wavelet Family          Supported Wavelet Names (wavelet values)
Daubechies              'db1' or 'haar', 'db2', ..., 'db10'
Coiflets                'coif1', ..., 'coif5'
Symlets                 'sym1', ..., 'sym10'
Discrete Meyer          'dmey'
Biorthogonal            'bior1.1', 'bior1.3', 'bior1.5',
                        'bior2.2', 'bior2.4', 'bior2.6', 'bior2.8',
                        'bior3.1', 'bior3.3', 'bior3.5', 'bior3.7', 'bior3.9',
                        'bior4.4', 'bior5.5'
Reverse Biorthogonal    'rbio1.1', 'rbio1.3', 'rbio1.5',
                        'rbio2.2', 'rbio2.4', 'rbio2.6', 'rbio2.8',
                        'rbio3.1', 'rbio3.3', 'rbio3.5', 'rbio3.7', 'rbio3.9',
                        'rbio4.4', 'rbio5.5'
Permitted values for wavelet are under column 'Supported Wavelet Names' above.
Types: str
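The permitted names listed above can be enumerated programmatically, for instance to validate user input on the client side before calling DWT2D. This is a sketch only: supported_wavelet_names is a hypothetical helper, not part of teradataml, and it simply expands the ranges from the list above.

```python
def supported_wavelet_names():
    """Return the set of wavelet names listed in the parameter description."""
    names = {"haar", "dmey"}
    names.update(f"db{i}" for i in range(1, 11))      # 'db1' .. 'db10'
    names.update(f"coif{i}" for i in range(1, 6))     # 'coif1' .. 'coif5'
    names.update(f"sym{i}" for i in range(1, 11))     # 'sym1' .. 'sym10'
    orders = ["1.1", "1.3", "1.5",
              "2.2", "2.4", "2.6", "2.8",
              "3.1", "3.3", "3.5", "3.7", "3.9",
              "4.4", "5.5"]
    for order in orders:
        names.add(f"bior{order}")                     # biorthogonal
        names.add(f"rbio{order}")                     # reverse biorthogonal
    return names
```

A check such as `"db2" in supported_wavelet_names()` then catches misspelled names early.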
wavelet_filter:
Optional Argument.
Specifies the teradataml DataFrame that contains the coefficients of
the wavelet filters.
level:
Required Argument.
Specifies the wavelet transform level. The value must be an integer
in the range [1, 1000].
Types: int
extension_mode:
Optional Argument.
Specifies the method for handling border distortion.
Supported Extension Modes (extension_mode values):
• "sym" : Symmetrically replicate boundary values, mirroring
the points near the boundaries.
For example: 4 4 3 2 1 | 1 2 3 4 | 4 3 2 1 1
• "zpd" : Pad beyond the boundaries with zeros.
For example: 0 0 0 0 0 | 1 2 3 4 | 0 0 0 0 0
• "ppd" : Periodic extension; fill boundary values as if the
input sequence were periodic.
For example: 4 1 2 3 4 | 1 2 3 4 | 1 2 3 4 1
Default Value: "sym"
Permitted Values: sym, zpd, ppd
Types: str
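The three extension modes can be reproduced on a 1-dimensional sequence with simple index arithmetic. The helper below is hypothetical and only illustrates the padding patterns shown in the examples above; how DWT2D applies the extension internally (for example, how many values it pads for a given filter length) is not covered here.

```python
def extend(seq, pad, mode="sym"):
    """Return seq with `pad` extra values on each side, per extension mode."""
    n = len(seq)
    out = []
    for i in range(-pad, n + pad):
        if mode == "zpd":
            # Zero padding: anything outside the sequence is 0.
            out.append(seq[i] if 0 <= i < n else 0)
        elif mode == "ppd":
            # Periodic extension: wrap the index around the sequence length.
            out.append(seq[i % n])
        elif mode == "sym":
            # Symmetric extension: mirror, repeating the boundary value.
            j = i % (2 * n)
            out.append(seq[j] if j < n else seq[2 * n - 1 - j])
        else:
            raise ValueError("mode must be 'sym', 'zpd' or 'ppd'")
    return out

print(extend([1, 2, 3, 4], 5, "sym"))  # [4, 4, 3, 2, 1, 1, 2, 3, 4, 4, 3, 2, 1, 1]
print(extend([1, 2, 3, 4], 5, "zpd"))  # [0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0, 0]
print(extend([1, 2, 3, 4], 5, "ppd"))  # [4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1]
```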
compact_output:
Optional Argument.
Specifies whether to ignore (not output) rows in which all
coefficient values are very small (having an absolute value less
than 1e-12). For a sparse input matrix, ignoring such rows
reduces the output teradataml DataFrame size.
Default Value: True
Types: bool
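The filtering rule can be sketched as follows. The helper name is hypothetical, and each row is assumed to be a tuple holding only the coefficient values (index and identifier columns left out); the function itself performs this filtering when compact_output is True.

```python
COMPACT_THRESHOLD = 1e-12  # threshold stated in the description

def drop_negligible_rows(rows, threshold=COMPACT_THRESHOLD):
    """Keep rows with at least one coefficient whose |value| >= threshold."""
    return [row for row in rows if any(abs(v) >= threshold for v in row)]

rows = [
    (0.0, 0.0),        # dropped: all values are below the threshold
    (1e-13, -5e-13),   # dropped: all values are below the threshold
    (0.0, 2.5),        # kept: one value is large enough
]
print(drop_negligible_rows(rows))  # [(0.0, 2.5)]
```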
partition_columns:
Optional Argument.
Specifies the names of the partition columns, which identify the
sequences. Rows with the same partition column values belong to
the same sequence. If you specify multiple partition columns,
then the function treats the first one as the distribute key of
the output and meta teradataml DataFrames. By default, all rows
belong to one sequence, and the function generates a distribute
key column named dwt_idrandom_name in both the output teradataml
DataFrame and the meta teradataml DataFrame. In both teradataml
DataFrames, every cell of dwt_idrandom_name has the value 1.
Types: str OR list of Strings (str)
data_sequence_column:
Optional Argument.
Specifies the list of column(s) that uniquely identifies each row of
the input argument "data". The argument is used to ensure
deterministic results for functions which produce results that vary
from run to run.
Types: str OR list of Strings (str)
wavelet_filter_sequence_column:
Optional Argument.
Specifies the list of column(s) that uniquely identifies each row of
the input argument "wavelet_filter". The argument is used to ensure
deterministic results for functions which produce results that vary
from run to run.
Types: str OR list of Strings (str)
RETURNS:
Instance of DWT2D.
Output teradataml DataFrames can be accessed using attribute
references, such as DWT2DObj.<attribute_name>.
Output teradataml DataFrame attribute names are:
1. coefficient
2. meta_table
3. output
RAISES:
TeradataMlException
EXAMPLES:
# This example uses climate data in many cities in the states of
# California (CA), Texas (TX), and Washington (WA). The cities are
# represented by two-dimensional coordinates (latitude and
# longitude). The data are temperature (in degrees Fahrenheit),
# pressure (in Mbars), and dew point (in degrees Fahrenheit). The
# function generates a coefficient model teradataml DataFrame and a
# meta_table teradataml DataFrame, which are used as input to the
# function IDWT2D.
# The table 'wft_testing' contains wavelet filter information
# needed to generate coefficient model teradataml DataFrame and a
# meta_table teradataml DataFrame.
# Load example data.
load_example_data("dwt2d", ["twod_climate_data", "wft_testing"])
# Create teradataml DataFrame objects.
twod_climate_data = DataFrame.from_table("twod_climate_data")
wft_testing = DataFrame.from_table("wft_testing")
# Example 1 : Using 'db2' wavelet to apply DWT2D function on columns,
# "temp_f", "pressure_mbar" and "dewpoint_f" (of DataFrame
# 'twod_climate_data') partitioned by the column "state".
DWT2D_out = DWT2D(data = twod_climate_data,
                  input_columns = ["temp_f", "pressure_mbar", "dewpoint_f"],
                  index_columns = ["latitude", "longitude"],
                  wavelet = "db2",
                  level = 2,
                  compact_output = True,
                  partition_columns = ["state"]
                  )
# Print the results
print(DWT2D_out.coefficient)  # Prints coefficient DataFrame which stores
                              # the coefficients generated by the wavelet
                              # transform.
print(DWT2D_out.meta_table)   # Prints meta_table DataFrame which stores
                              # the meta information for the wavelet
                              # transform.
print(DWT2D_out.output)       # Prints output teradataml DataFrame.
# Example 2 : Using wavelet_filter DataFrame to apply DWT2D function
# on columns, "temp_f", "pressure_mbar" and "dewpoint_f" (of
# DataFrame 'twod_climate_data') partitioned by the column
# "state".
DWT2D_out = DWT2D(data = twod_climate_data,
                  input_columns = ["temp_f", "pressure_mbar", "dewpoint_f"],
                  wavelet_filter = wft_testing,
                  index_columns = ["latitude", "longitude"],
                  level = 2,
                  partition_columns = "state",
                  wavelet_filter_sequence_column = "filtername"
                  )
# Print the results
print(DWT2D_out.coefficient)  # Prints coefficient DataFrame which stores
                              # the coefficients generated by the wavelet
                              # transform.
print(DWT2D_out.meta_table)   # Prints meta_table DataFrame which stores
                              # the meta information for the wavelet
                              # transform.
print(DWT2D_out.output)       # Prints output teradataml DataFrame.
- __repr__(self)
- Returns the string representation for a DWT2D class instance.
|