teradataml.analytics.mle.Betweenness(vertices_data=None, edges_data=None, target_key=None, sources_data=None, targets_data=None, directed=True, edge_weight=None, max_distance=10, group_size=None, sample_rate=1.0, accumulate=None, vertices_data_sequence_column=None, edges_data_sequence_column=None, sources_data_sequence_column=None, targets_data_sequence_column=None, vertices_data_partition_column=None, edges_data_partition_column=None, sources_data_partition_column=None, targets_data_partition_column=None, vertices_data_order_column=None, edges_data_order_column=None, sources_data_order_column=None, targets_data_order_column=None)
Methods defined here:
- __init__(self, vertices_data=None, edges_data=None, target_key=None, sources_data=None, targets_data=None, directed=True, edge_weight=None, max_distance=10, group_size=None, sample_rate=1.0, accumulate=None, vertices_data_sequence_column=None, edges_data_sequence_column=None, sources_data_sequence_column=None, targets_data_sequence_column=None, vertices_data_partition_column=None, edges_data_partition_column=None, sources_data_partition_column=None, targets_data_partition_column=None, vertices_data_order_column=None, edges_data_order_column=None, sources_data_order_column=None, targets_data_order_column=None)
- DESCRIPTION:
The Betweenness function returns the betweenness score, a centrality
measurement, for every vertex (node) in the input graph.
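To make the score concrete, here is a minimal pure-Python sketch of the standard betweenness definition (for each vertex, the sum over source/target pairs of the fraction of shortest paths that pass through it), computed with Brandes' algorithm on an unweighted graph. It is an illustration only, not the implementation behind this function. The name betweenness_sketch is hypothetical, and whether Betweenness normalizes its scores is not stated here, so the sketch returns raw counts.

```python
from collections import defaultdict, deque


def betweenness_sketch(vertices, edges, directed=True):
    """Unnormalized betweenness centrality for an unweighted graph.

    Hypothetical illustration; every edge endpoint must appear in `vertices`.
    """
    adjacency = defaultdict(list)
    for source, target in edges:
        adjacency[source].append(target)
        if not directed:
            adjacency[target].append(source)

    score = {v: 0.0 for v in vertices}
    for src in vertices:
        # Breadth-first search from src: count shortest paths (sigma)
        # and remember each vertex's predecessors on those paths.
        dist = {src: 0}
        sigma = defaultdict(float)
        sigma[src] = 1.0
        preds = defaultdict(list)
        visit_order = []
        queue = deque([src])
        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest vertices, accumulating dependencies.
        delta = defaultdict(float)
        for w in reversed(visit_order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != src:
                score[w] += delta[w]

    if not directed:
        # Each unordered pair was counted once from each endpoint.
        for v in score:
            score[v] /= 2.0
    return score


# Path graph a -> b -> c: only b lies between another pair (a, c).
scores = betweenness_sketch(["a", "b", "c"], [("a", "b"), ("b", "c")])
print(scores)
```

In this three-vertex path, b is the only vertex that sits on a shortest path between two other vertices, so it is the only one with a nonzero score.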
PARAMETERS:
vertices_data:
Required Argument.
Specifies a teradataml DataFrame where each row represents a
vertex of the graph.
vertices_data_partition_column:
Required Argument.
Specifies Partition By columns for vertices_data.
Provide a list if multiple columns are used for partitioning.
Types: str OR list of Strings (str)
vertices_data_order_column:
Optional Argument.
Specifies Order By columns for vertices_data.
Provide a list if multiple columns are used for ordering.
Types: str OR list of Strings (str)
edges_data:
Required Argument.
Specifies a teradataml DataFrame where each row represents an
edge of the graph.
edges_data_partition_column:
Required Argument.
Specifies Partition By columns for edges_data.
Provide a list if multiple columns are used for partitioning.
Types: str OR list of Strings (str)
edges_data_order_column:
Optional Argument.
Specifies Order By columns for edges_data.
Provide a list if multiple columns are used for ordering.
Types: str OR list of Strings (str)
target_key:
Required Argument.
Specifies the target key (the names of the edges_data teradataml DataFrame
columns that identify the target vertex). If you specify
targets_data, then the function uses only the vertices in
targets_data as targets (which must be a subset of those that this
argument specifies).
Types: str OR list of Strings (str)
sources_data:
Optional Argument.
Specifies the teradataml DataFrame which contains the vertices
to use as sources.
sources_data_partition_column:
Required Argument when sources_data is used.
Specifies Partition By columns for sources_data.
Provide a list if multiple columns are used for partitioning.
Types: str OR list of Strings (str)
sources_data_order_column:
Optional Argument.
Specifies Order By columns for sources_data.
Provide a list if multiple columns are used for ordering.
Types: str OR list of Strings (str)
targets_data:
Optional Argument.
Specifies the teradataml DataFrame which contains the vertices
to use as targets.
targets_data_partition_column:
Required Argument when targets_data is used.
Specifies Partition By columns for targets_data.
Provide a list if multiple columns are used for partitioning.
Types: str OR list of Strings (str)
targets_data_order_column:
Optional Argument.
Specifies Order By columns for targets_data.
Provide a list if multiple columns are used for ordering.
Types: str OR list of Strings (str)
directed:
Optional Argument.
Specifies whether the graph is directed.
Default Value: True
Types: bool
edge_weight:
Optional Argument.
Specifies the name of the edges_data teradataml DataFrame column that
contains edge weights. The weights are positive values.
By default, the weight of each edge is 1 (that is, the graph is unweighted).
Types: str
max_distance:
Optional Argument.
Specifies the maximum distance between the source and
target vertices. A negative max_distance specifies an infinite
distance. If vertices are separated by more than max_distance, the
function does not output them.
Default Value: 10
Types: int
group_size:
Optional Argument.
Specifies the number of source vertices that execute a single-node shortest
path (SNSP) algorithm in parallel. If group_size exceeds the number of
source vertices in each partition, s, then s is the group size.
By default, the function calculates the optimal group size based on
various cluster and query characteristics.
Running a group of vertices on each vWorker, in parallel, uses less
memory than running all vertices on each vWorker.
Types: int
sample_rate:
Optional Argument.
Specifies the sample rate (the fraction of source vertices to
sample), a float value in the range (0.0, 1.0]. The number of source
vertices that the function uses to generate betweenness is
approximately sample_rate*n, where n is the number of vertices in
the graph.
Default Value: 1.0
Types: float
accumulate:
Optional Argument.
Specifies the names of the vertices_data teradataml DataFrame columns to
copy to the output teradataml DataFrame.
Types: str OR list of Strings (str)
vertices_data_sequence_column:
Optional Argument.
Specifies the column(s) that uniquely identify each row of
the input argument "vertices_data". The argument is used to ensure
deterministic results for functions which produce results that vary
from run to run.
Types: str OR list of Strings (str)
edges_data_sequence_column:
Optional Argument.
Specifies the column(s) that uniquely identify each row of
the input argument "edges_data". The argument is used to ensure
deterministic results for functions which produce results that vary
from run to run.
Types: str OR list of Strings (str)
sources_data_sequence_column:
Optional Argument.
Specifies the column(s) that uniquely identify each row of
the input argument "sources_data". The argument is used to ensure
deterministic results for functions which produce results that vary
from run to run.
Types: str OR list of Strings (str)
targets_data_sequence_column:
Optional Argument.
Specifies the column(s) that uniquely identify each row of
the input argument "targets_data". The argument is used to ensure
deterministic results for functions which produce results that vary
from run to run.
Types: str OR list of Strings (str)
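The numeric rules stated for sample_rate, group_size, and max_distance can be written out in plain Python. This is a sketch under assumptions, not teradataml code: the helper names are hypothetical, rounding sample_rate*n is an assumption (the text only says "approximately"), and the behavior for max_distance=0 is not documented, so the sketch simply compares distances.

```python
def approx_sampled_sources(sample_rate, n_vertices):
    """Approximate number of source vertices used: sample_rate * n."""
    if not 0.0 < sample_rate <= 1.0:
        raise ValueError("sample_rate must be in the range (0.0, 1.0]")
    # Rounding is an assumption; the documentation only says "approximately".
    return round(sample_rate * n_vertices)


def effective_group_size(group_size, sources_in_partition):
    """If group_size exceeds s, the sources in a partition, then s is used."""
    return min(group_size, sources_in_partition)


def within_max_distance(distance, max_distance=10):
    """A negative max_distance means no limit on the distance."""
    return max_distance < 0 or distance <= max_distance


print(approx_sampled_sources(0.25, 200))   # 0.25 * 200 sources
print(effective_group_size(64, 10))        # capped at the 10 available sources
print(within_max_distance(12))             # beyond the default limit of 10
print(within_max_distance(12, -1))         # negative limit: unlimited
```

The last two calls show why a negative max_distance matters on large graphs: with the default of 10, vertex pairs more than 10 apart are excluded from the output.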
RETURNS:
Instance of Betweenness.
Output teradataml DataFrames can be accessed using attribute
references, such as BetweennessObj.<attribute_name>.
Output teradataml DataFrame attribute name is:
result
RAISES:
TeradataMlException
EXAMPLES:
# Import the required objects. An active connection to Vantage
# (for example, one created with create_context) is assumed.
from teradataml import DataFrame, load_example_data
from teradataml.analytics.mle import Betweenness
# Load the data to run the example.
load_example_data("Betweenness", ["soc_nw_vertices", "soc_nw_edges"])
# Create teradataml DataFrame objects.
soc_nw_vertices = DataFrame.from_table("soc_nw_vertices")
soc_nw_edges = DataFrame.from_table("soc_nw_edges")
# Example - This example computes the betweenness score for each person in the social network.
# vertices_data - The vertices_data DataFrame has the names of people.
# edges_data - The edges_data DataFrame represents the connections between the people.
betweenness_out = Betweenness(vertices_data=soc_nw_vertices,
                              vertices_data_partition_column='vertexid',
                              edges_data=soc_nw_edges,
                              edges_data_partition_column='source',
                              target_key='target',
                              accumulate='vertexid'
                              )
# Print the output DataFrame.
print(betweenness_out.result)
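The example passes only the required arguments. When optional inputs such as sources_data are added, their partition columns become required as well. The following pre-flight check is a hypothetical helper, not part of teradataml. It encodes only the "Required Argument" notes and the sample_rate range documented in PARAMETERS, and it says nothing about which conditions actually raise TeradataMlException.

```python
def check_betweenness_args(**kwargs):
    """Return a list of problems found in Betweenness keyword arguments.

    Hypothetical helper for illustration; it does not call teradataml.
    """
    problems = []
    required = [
        "vertices_data", "vertices_data_partition_column",
        "edges_data", "edges_data_partition_column", "target_key",
    ]
    for name in required:
        if kwargs.get(name) is None:
            problems.append(f"{name} is required")
    # Partition columns become required once the optional input is supplied.
    for optional_input in ("sources_data", "targets_data"):
        partition_arg = f"{optional_input}_partition_column"
        if kwargs.get(optional_input) is not None and kwargs.get(partition_arg) is None:
            problems.append(f"{partition_arg} is required when {optional_input} is used")
    sample_rate = kwargs.get("sample_rate", 1.0)
    if not 0.0 < sample_rate <= 1.0:
        problems.append("sample_rate must be in the range (0.0, 1.0]")
    return problems


# Placeholder strings stand in for teradataml DataFrames in this sketch.
issues = check_betweenness_args(
    vertices_data="soc_nw_vertices",
    vertices_data_partition_column="vertexid",
    edges_data="soc_nw_edges",
    edges_data_partition_column="source",
    target_key="target",
    sources_data="some_sources",  # hypothetical optional input
    sample_rate=0.5,
)
print(issues)
```

Here the only reported problem is the missing sources_data_partition_column, because sources_data was supplied without it.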
- __repr__(self)
- Returns the string representation for a Betweenness class instance.
- get_build_time(self)
- Function to return the build time of the algorithm in seconds.
When the model object is created using retrieve_model(), the value returned is
the one saved in the Model Catalog.
- get_prediction_type(self)
- Function to return the Prediction type of the algorithm.
When the model object is created using retrieve_model(), the value returned is
the one saved in the Model Catalog.
- get_target_column(self)
- Function to return the Target Column of the algorithm.
When the model object is created using retrieve_model(), the value returned is
the one saved in the Model Catalog.
- show_query(self)
- Function to return the underlying SQL query.
When the model object is created using retrieve_model(), None is returned.