Methods defined here:
- __init__(self, vertices_data=None, edges_data=None, target_key=None, sample_rate=0.15, flyback_rate=0.15, seed=1000, accumulate=None, vertices_data_sequence_column=None, edges_data_sequence_column=None, vertices_data_partition_column=None, edges_data_partition_column=None)
- DESCRIPTION:
The RandomWalkSample function takes an input graph (which is typically
large) and outputs a sample graph.
PARAMETERS:
vertices_data:
Required Argument.
Specifies the teradataml DataFrame containing the vertex table.
vertices_data_partition_column:
Required Argument.
Specifies Partition By columns for vertices_data.
To partition on multiple columns, provide the column names
as a list.
Types: str OR list of Strings (str)
edges_data:
Required Argument.
Specifies the teradataml DataFrame containing the edge table.
edges_data_partition_column:
Required Argument.
Specifies Partition By columns for edges_data.
To partition on multiple columns, provide the column names
as a list.
Types: str OR list of Strings (str)
target_key:
Required Argument.
Specifies the names of the columns in the edges teradataml DataFrame that
identify the target vertex of an edge.
Types: str OR list of Strings (str)
sample_rate:
Optional Argument.
Specifies the sampling rate. This value must be in the range (0, 1.0).
Default Value: 0.15
Types: float
flyback_rate:
Optional Argument.
Specifies the chance, when visiting a vertex, of flying back to the starting
vertex. This value must be in the range (0, 1.0).
Default Value: 0.15
Types: float
seed:
Optional Argument.
Specifies the seed used to generate a series of random numbers
for sample_rate, flyback_rate, and any random number used
internally. Specifying this value guarantees that the function
result is repeatable on the same cluster.
Default Value: 1000
Types: int
accumulate:
Optional Argument.
Specifies the names of columns in the input vertex teradataml DataFrame
to copy to the output vertex teradataml DataFrame.
Types: str OR list of Strings (str)
vertices_data_sequence_column:
Optional Argument.
Specifies the column(s) that uniquely identify each row of
the input argument "vertices_data". Use this argument to get
deterministic results from functions whose output can otherwise
vary from run to run.
Types: str OR list of Strings (str)
edges_data_sequence_column:
Optional Argument.
Specifies the column(s) that uniquely identify each row of
the input argument "edges_data". Use this argument to get
deterministic results from functions whose output can otherwise
vary from run to run.
Types: str OR list of Strings (str)
RETURNS:
Instance of RandomWalkSample.
Output teradataml DataFrames can be accessed using attribute
references, such as RandomWalkSampleObj.<attribute_name>.
Output teradataml DataFrame attribute names are:
1. output_vertex_table
2. output_edge_table
3. output
RAISES:
TeradataMlException
EXAMPLES:
# Load example data.
load_example_data("randomwalksample", ["citvertices_2", "citedges_2"])
# Create teradataml DataFrame objects.
# The RandomWalkSample function has two required input tables:
# • Vertices, which defines the set of vertices in the input graph.
# • Edges, which defines the set of edges in the input graph.
citvertices_2 = DataFrame.from_table("citvertices_2")
citedges_2 = DataFrame.from_table("citedges_2")
# Example 1 - This function takes an input graph (which is typically
# large) and outputs a sample graph that preserves graph properties.
RandomWalkSample_out = RandomWalkSample(vertices_data=citvertices_2,
                                        vertices_data_partition_column=["id"],
                                        edges_data=citedges_2,
                                        edges_data_partition_column=["from_id"],
                                        target_key=["to_id"],
                                        sample_rate=0.15,
                                        flyback_rate=0.15,
                                        seed=1000)
# Print the result DataFrame
print(RandomWalkSample_out)
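# Illustration only - how sample_rate, flyback_rate, and seed interact.
The sketch below is a minimal, pure-Python approximation of random-walk sampling with fly-back. It is not the implementation used by RandomWalkSample, and it does not call teradataml. The function name `random_walk_sample`, the edge-list input format, the restart rule for stuck walks, and the rounding of the sample size are all assumptions made for this example.

```python
import random


def random_walk_sample(edges, sample_rate=0.15, flyback_rate=0.15, seed=1000):
    """Hypothetical helper: sample a subgraph by a random walk with fly-back.

    edges is a list of (source, target) pairs. Returns the sampled vertices
    (sorted) and the edges whose endpoints were both sampled.
    """
    # Both rates must lie in the open interval (0, 1.0), as documented above.
    if not 0 < sample_rate < 1 or not 0 < flyback_rate < 1:
        raise ValueError("sample_rate and flyback_rate must be in (0, 1.0)")

    rng = random.Random(seed)  # a fixed seed makes the walk repeatable

    # Build an adjacency list and collect every vertex that appears in an edge.
    adjacency = {}
    vertices = set()
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
        vertices.update((src, dst))
    ordered = sorted(vertices)

    # Assumption: the sample holds roughly sample_rate * |V| vertices.
    target_size = max(1, int(sample_rate * len(ordered)))

    start = rng.choice(ordered)
    current = start
    visited = {start}
    steps = 0
    max_steps = 100 * len(ordered)

    while len(visited) < target_size:
        steps += 1
        if steps > max_steps:
            # Assumption: if the walk is stuck, restart from an unvisited vertex.
            start = rng.choice([v for v in ordered if v not in visited])
            current = start
            steps = 0
        elif rng.random() < flyback_rate or not adjacency.get(current):
            # Fly back to the starting vertex (also used at dead ends).
            current = start
        else:
            # Otherwise follow a randomly chosen outgoing edge.
            current = rng.choice(adjacency[current])
        visited.add(current)

    sampled_edges = [(s, d) for s, d in edges if s in visited and d in visited]
    return sorted(visited), sampled_edges


# Usage on a small synthetic graph with 20 vertices.
demo_edges = [(i, (i + 1) % 20) for i in range(20)]
demo_edges += [(i, (i + 7) % 20) for i in range(20)]
sample_vertices, sample_edges = random_walk_sample(demo_edges, sample_rate=0.25)
print(sample_vertices)
print(sample_edges)
```

In this sketch, a higher flyback_rate keeps the walk closer to its starting vertex, and calling the helper twice with the same seed returns the same sample.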
- __repr__(self)
- Returns the string representation for a RandomWalkSample class instance.