Teradata Package for Python Function Reference | 17.10 - Betweenness - Teradata Package for Python - Look here for syntax, methods and examples for the functions included in the Teradata Package for Python.

Teradata® Package for Python Function Reference

Product

Teradata Package for Python

Release Number

17.10

Published

April 2022

Language

English (United States)

Last Update

2022-08-19

Product Category

Teradata Vantage

teradataml.analytics.mle.Betweenness = class Betweenness(builtins.object)

teradataml.analytics.mle.Betweenness(vertices_data=None, edges_data=None, target_key=None, sources_data=None, targets_data=None, directed=True, edge_weight=None, max_distance=10, group_size=None, sample_rate=1.0, accumulate=None, vertices_data_sequence_column=None, edges_data_sequence_column=None, sources_data_sequence_column=None, targets_data_sequence_column=None, vertices_data_partition_column=None, edges_data_partition_column=None, sources_data_partition_column=None, targets_data_partition_column=None, vertices_data_order_column=None, edges_data_order_column=None, sources_data_order_column=None, targets_data_order_column=None)

Methods defined here:

__init__(self, vertices_data=None, edges_data=None, target_key=None, sources_data=None, targets_data=None, directed=True, edge_weight=None, max_distance=10, group_size=None, sample_rate=1.0, accumulate=None, vertices_data_sequence_column=None, edges_data_sequence_column=None, sources_data_sequence_column=None, targets_data_sequence_column=None, vertices_data_partition_column=None, edges_data_partition_column=None, sources_data_partition_column=None, targets_data_partition_column=None, vertices_data_order_column=None, edges_data_order_column=None, sources_data_order_column=None, targets_data_order_column=None)

    DESCRIPTION:
        The Betweenness function returns the betweenness score, a
        centrality measurement, for every vertex (node) in the input graph.

    PARAMETERS:
        vertices_data:
            Required Argument.
            Specifies a teradataml DataFrame where each row represents a
            vertex of the graph.

        vertices_data_partition_column:
            Required Argument.
            Specifies Partition By columns for vertices_data.
            Values can be provided as a list if multiple columns are used
            for partitioning.
            Types: str OR list of Strings (str)

        vertices_data_order_column:
            Optional Argument.
            Specifies Order By columns for vertices_data.
            Values can be provided as a list if multiple columns are used
            for ordering.
            Types: str OR list of Strings (str)

        edges_data:
            Required Argument.
            Specifies a teradataml DataFrame where each row represents an
            edge of the graph.

        edges_data_partition_column:
            Required Argument.
            Specifies Partition By columns for edges_data.
            Values can be provided as a list if multiple columns are used
            for partitioning.
            Types: str OR list of Strings (str)

        edges_data_order_column:
            Optional Argument.
            Specifies Order By columns for edges_data.
            Values can be provided as a list if multiple columns are used
            for ordering.
            Types: str OR list of Strings (str)

        target_key:
            Required Argument.
            Specifies the target key (the names of the edges_data
            teradataml DataFrame columns that identify the target vertex).
            If you specify targets_data, then the function uses only the
            vertices in targets_data as targets (which must be a subset of
            those that this argument specifies).
            Types: str OR list of Strings (str)

        sources_data:
            Optional Argument.
            Specifies the teradataml DataFrame that contains the vertices
            to use as sources.

        sources_data_partition_column:
            Required Argument when sources_data is used.
            Specifies Partition By columns for sources_data.
            Values can be provided as a list if multiple columns are used
            for partitioning.
            Types: str OR list of Strings (str)

        sources_data_order_column:
            Optional Argument.
            Specifies Order By columns for sources_data.
            Values can be provided as a list if multiple columns are used
            for ordering.
            Types: str OR list of Strings (str)

        targets_data:
            Optional Argument.
            Specifies the teradataml DataFrame that contains the vertices
            to use as targets.

        targets_data_partition_column:
            Required Argument when targets_data is used.
            Specifies Partition By columns for targets_data.
            Values can be provided as a list if multiple columns are used
            for partitioning.
            Types: str OR list of Strings (str)

        targets_data_order_column:
            Optional Argument.
            Specifies Order By columns for targets_data.
            Values can be provided as a list if multiple columns are used
            for ordering.
            Types: str OR list of Strings (str)

        directed:
            Optional Argument.
            Specifies whether the graph is directed.
            Default Value: True
            Types: bool

        edge_weight:
            Optional Argument.
            Specifies the name of the edges_data teradataml DataFrame
            column that contains edge weights. The weights must be
            positive values. By default, the weight of each edge is 1
            (that is, the graph is unweighted).
            Types: str

        max_distance:
            Optional Argument.
            Specifies the maximum distance between the source and target
            vertices. A negative max_distance specifies an infinite
            distance. If vertices are separated by more than max_distance,
            the function does not output them.
            Default Value: 10
            Types: int

        group_size:
            Optional Argument.
            Specifies the number of source vertices that execute a
            single-node shortest path (SNSP) algorithm in parallel.
            If group_size exceeds the number of source vertices in each
            partition, s, then s is the group size. By default, the
            function calculates the optimal group size based on various
            cluster and query characteristics. Running a group of vertices
            on each vWorker, in parallel, uses less memory than running
            all vertices on each vWorker.
            Types: int

        sample_rate:
            Optional Argument.
            Specifies the sample rate (the fraction of source vertices to
            sample), a float value in the range (0.0, 1.0]. The number of
            source vertices that the function uses to generate betweenness
            is approximately sample_rate*n, where n is the number of
            vertices in the graph.
            Default Value: 1.0
            Types: float

        accumulate:
            Optional Argument.
            Specifies the names of the vertices_data teradataml DataFrame
            columns to copy to the output teradataml DataFrame.
            Types: str OR list of Strings (str)

        vertices_data_sequence_column:
            Optional Argument.
            Specifies the list of column(s) that uniquely identifies each
            row of the input argument "vertices_data". The argument is
            used to ensure deterministic results for functions which
            produce results that vary from run to run.
            Types: str OR list of Strings (str)

        edges_data_sequence_column:
            Optional Argument.
            Specifies the list of column(s) that uniquely identifies each
            row of the input argument "edges_data". The argument is used
            to ensure deterministic results for functions which produce
            results that vary from run to run.
            Types: str OR list of Strings (str)

        sources_data_sequence_column:
            Optional Argument.
            Specifies the list of column(s) that uniquely identifies each
            row of the input argument "sources_data". The argument is used
            to ensure deterministic results for functions which produce
            results that vary from run to run.
            Types: str OR list of Strings (str)

        targets_data_sequence_column:
            Optional Argument.
            Specifies the list of column(s) that uniquely identifies each
            row of the input argument "targets_data". The argument is used
            to ensure deterministic results for functions which produce
            results that vary from run to run.
            Types: str OR list of Strings (str)

    RETURNS:
        Instance of Betweenness.
        Output teradataml DataFrames can be accessed using attribute
        references, such as BetweennessObj.<attribute_name>.
        Output teradataml DataFrame attribute name is:
            result

    RAISES:
        TeradataMlException

    EXAMPLES:
        # The example assumes an active Vantage connection
        # (for example, one created with create_context).
        from teradataml import DataFrame, load_example_data
        from teradataml.analytics.mle import Betweenness

        # Load the data to run the example.
        load_example_data("Betweenness", ["soc_nw_vertices", "soc_nw_edges"])

        # Create teradataml DataFrame objects.
        soc_nw_vertices = DataFrame.from_table("soc_nw_vertices")
        soc_nw_edges = DataFrame.from_table("soc_nw_edges")

        # Example - Compute the betweenness score for each person in the
        # social network.
        # vertices_data - The vertices_data DataFrame has the names of people.
        # edges_data - The edges_data DataFrame represents the connections
        #              between the people.
        betweenness_out = Betweenness(vertices_data=soc_nw_vertices,
                                      vertices_data_partition_column='vertexid',
                                      edges_data=soc_nw_edges,
                                      edges_data_partition_column='source',
                                      target_key='target',
                                      accumulate='vertexid'
                                      )

        # Print the output DataFrame.
        print(betweenness_out.result)
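To make the term "betweenness score" concrete, the following is a minimal pure-Python sketch of shortest-path betweenness on a small unweighted graph, using a Brandes-style breadth-first accumulation. It is not Teradata's implementation and does not use teradataml; the function name betweenness_scores and the toy graph are assumptions made for illustration. It ignores edge_weight, max_distance, sample_rate, and the sources/targets restrictions, it assumes no duplicate edges, and the scaling applied by the Vantage function may differ.

```python
from collections import deque


def betweenness_scores(vertices, edges, directed=True):
    """Return {vertex: score}, where score counts the (fractional) number of
    shortest source-to-target paths that pass through the vertex."""
    # Build an adjacency list; undirected edges are stored in both directions.
    adjacency = {v: [] for v in vertices}
    for source, target in edges:
        adjacency[source].append(target)
        if not directed:
            adjacency[target].append(source)

    scores = {v: 0.0 for v in vertices}
    for src in vertices:
        # Breadth-first search from src, counting shortest paths (sigma).
        distance = {src: 0}
        sigma = {v: 0 for v in vertices}
        sigma[src] = 1
        predecessors = {v: [] for v in vertices}
        visit_order = []
        queue = deque([src])
        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in adjacency[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)

        # Walk back from the farthest vertices, accumulating dependencies.
        dependency = {v: 0.0 for v in vertices}
        for w in reversed(visit_order):
            for v in predecessors[w]:
                dependency[v] += sigma[v] / sigma[w] * (1.0 + dependency[w])
            if w != src:
                scores[w] += dependency[w]

    if not directed:
        # Each undirected pair was counted once from each endpoint.
        scores = {v: s / 2.0 for v, s in scores.items()}
    return scores


# Toy graph (hypothetical data): a -> b -> c
toy_scores = betweenness_scores(["a", "b", "c"], [("a", "b"), ("b", "c")])
print(toy_scores)  # {'a': 0.0, 'b': 1.0, 'c': 0.0}
```

In the toy graph, the only shortest path with an interior vertex is a -> b -> c, so b receives a score of 1 and the endpoints receive 0.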

__repr__(self): Returns the string representation for a Betweenness class instance.

get_build_time(self): Returns the build time of the algorithm in seconds. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

get_prediction_type(self): Returns the prediction type of the algorithm. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

get_target_column(self): Returns the target column of the algorithm. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

show_query(self): Returns the underlying SQL query. When the model object is created using retrieve_model(), None is returned.
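The accessor methods listed above can be collected in one place when inspecting a fitted object. The following is a small sketch under stated assumptions: summarize_model is a hypothetical helper, not part of teradataml, and StubModel is a hypothetical stand-in with placeholder return values so the snippet runs without a Vantage connection. With a real Betweenness instance, pass that instance instead of the stub.

```python
def summarize_model(model):
    """Gather the values returned by the documented accessor methods."""
    return {
        "repr": repr(model),
        "build_time_seconds": model.get_build_time(),
        "prediction_type": model.get_prediction_type(),
        "target_column": model.get_target_column(),
        "sql": model.show_query(),
    }


class StubModel:
    """Hypothetical stand-in for a Betweenness instance.

    Every return value here is a placeholder, not real teradataml output.
    """

    def __repr__(self):
        return "<stub model>"

    def get_build_time(self):
        return 0

    def get_prediction_type(self):
        return None

    def get_target_column(self):
        return None

    def show_query(self):
        # The reference states that None is returned for objects
        # created with retrieve_model().
        return None


summary = summarize_model(StubModel())
for name, value in summary.items():
    print(name, "=", value)
```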