Teradata Package for Python Function Reference | 17.10 - merge - Teradata Package for Python - Look here for syntax, methods and examples for the functions included in the Teradata Package for Python.

Teradata® Package for Python Function Reference

Product

Teradata Package for Python

Release Number

17.10

Published

April 2022

Language

English (United States)

Last Update

2022-08-19

lifecycle

Product Category

Teradata Vantage

teradataml.geospatial.geodataframe.GeoDataFrame.merge = merge(self, right, on=None, how='inner', left_on=None, right_on=None, use_index=False, lsuffix=None, rsuffix=None): DESCRIPTION: Merges two teradataml GeoDataFrames together. Supported merge operations are: - inner: Returns only matching rows, non-matching rows are eliminated. - left: Returns all matching rows plus non-matching rows from the left teradataml GeoDataFrame. - right: Returns all matching rows plus non-matching rows from the right teradataml GeoDataFrame. - full: Returns all rows from both teradataml GeoDataFrames, including non matching rows. PARAMETERS: right: Required Argument. Specifies right teradataml DataFrame/GeoDataFrame on which merge is to be performed. Types: teradataml GeoDataFrame or teradataml DataFrame on: Optional Argument. Specifies list of conditions that indicate the columns used for the merge. When no arguments are provided for this condition, the merge is performed using the indexes of the teradataml GeoDataFrames. Both teradataml GeoDataFrames are required to have index labels to perform a merge operation when no arguments are provided for this condition. When either teradataml GeoDataFrame does not have a valid index label in the above case, an exception is thrown. • String comparisons, in the form of "col1 <= col2", where col1 is the column of left GeoDataFrame df1 and col2 is the column of right GeoDataFrame df2. Examples: 1. ["a","b"] indicates df1.a = df2.a and df1.b = df2.b. 2. ["a = b", "c = d"] indicates df1.a = df2.b and df1.c = df2.d 3. ["a <= b", "c > d"] indicates df1.a <= df2.b and df1.c > df2.d. 4. ["a < b", "c >= d"] indicates df1.a < df2.b and df1.c >= df2.d. 5. ["a <> b"] indicates df1.a != df2.b. Same is the case for ["a != b"]. • Column comparisons, in the form of df1.col1 <= df2.col2, where col1 is the column of left GeoDataFrame df1 and col2 is the column of right GeoDataFrame df2. Examples: 1. [df1.a == df2.a, df1.b == df2.b] indicates df1.a = df2.a and df1.b = df2.b. 2. [df1.a == df2.b, df1.c == df2.d] indicates df1.a = df2.b and df1.c = df2.d. 3. [df1.a <= df2.b and df1.c > df2.d] indicates df1.a <= df2.b and df1.c > df2.d. 4. [df1.a < df2.b and df1.c >= df2.d] indicates df1.a < df2.b and df1.c >= df2.d. 5. df1.a != df2.b indicates df1.a != df2.b. • The combination of both string comparisons and comparisons as column expressions. Examples: 1. ["a", df1.b == df2.b] indicates df1.a = df2.a and df1.b = df2.b. 2. [df1.a <= df2.b, "c > d"] indicates df1.a <= df2.b and df1.c > df2.d. Default Value: None Types: str or ColumnExpression or List of strings(str) or ColumnExpressions how: Optional Argument. Specifies the type of merge to perform. Supports inner, left, right, full and cross merge operations. When how is "cross", the arguments on, left_on, right_on and use_index are ignored. Default Value: "inner". Types: str left_on: Optional Argument. Specifies column to merge on, in the left teradataml GeoDataFrame. When both the 'on' and 'left_on' parameters are unspecified, the index columns of the teradataml GeoDataFrames are used to perform the merge operation. Default Value: None. Types: str or ColumnExpression or List of strings(str) or ColumnExpressions right_on: Optional Argument. Specifies column to merge on, in the right teradataml GeoDataFrame. When both the 'on' and 'right_on' parameters are unspecified, the index columns of the teradataml GeoDataFrames are used to perform the merge operation. Default Value: None. Types: str or ColumnExpression or List of strings(str) or ColumnExpressions use_index: Optional Argument. Specifies whether (or not) to use index from the teradataml GeoDataFrames as the merge key(s). When False, and 'on', 'left_on', and 'right_on' are all unspecified, the index columns of the teradataml GeoDataFrames are used to perform the merge operation. Default value: False. Types: bool lsuffix: Optional Argument. Specifies suffix to be added to the left table columns. Default Value: None. Types: str Note: A suffix is required if teradataml GeoDataFrames being merged have columns with the same name. rsuffix: Optional Argument. Specifies suffix to be added to the right table columns. Default Value: None. Types: str Note: A suffix is required if teradataml GeoDataFrames being merged have columns with the same name. RAISES: TeradataMlException RETURNS: teradataml GeoDataFrame EXAMPLES: # Load the data to run the example. >>> from teradataml import load_example_data, GeoDataFrame >>> load_example_data("geodataframe", ["sample_cities", "sample_streets"]) >>> >>> df1 = GeoDataFrame("sample_cities") >>> df2 = GeoDataFrame("sample_streets") >>> # Print GeoDataFrame. >>> df1 city_name city_shape skey 0 Oceanville POLYGON ((1 1,1 3,6 3,6 0,1 1)) 1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) # Print GeoDataFrame. >>> df2 street_name street_shape skey 1 Coast Blvd LINESTRING (12 12,18 17) 1 Main Street LINESTRING (2 2,3 2,4 1) >>> # Example 1: Specifying on condition as string for common column. >>> df1.merge(right = df2, on = ["skey"], how = "inner", lsuffix = "t1", rsuffix = "t2") t1_skey t2_skey city_name city_shape street_name street_shape 0 1 1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) Coast Blvd LINESTRING (12 12,18 17) 1 1 1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) Main Street LINESTRING (2 2,3 2,4 1) >>> # Example 2: Specifying on condition as ColumnExpression with left outer join. >>> df1.merge(df2, on = [df1.city_name == df2.street_name], how = "left", lsuffix = "t1", rsuffix = "t2") t1_skey t2_skey city_name city_shape street_name street_shape 0 0 None Oceanville POLYGON ((1 1,1 3,6 3,6 0,1 1)) None None 1 1 None Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) None None >>> # Example 3: Cross join "sample_cities" with "admissions_train", i.e., # joining teradataml GeoDataFrame with teradataml DataFrame. >>> load_example_data("dataframe", "admissions_train") >>> tdf = DataFrame("admissions_train") >>> df3 = df1.merge(right=tdf, how="cross", lsuffix="l", rsuffix="r") >>> df3 city_name city_shape id masters gpa stats programming admitted skey 0 Oceanville POLYGON ((1 1,1 3,6 3,6 0,1 1)) 13 no 4.00 Advanced Novice 1 0 Oceanville POLYGON ((1 1,1 3,6 3,6 0,1 1)) 11 no 3.13 Advanced Advanced 1 0 Oceanville POLYGON ((1 1,1 3,6 3,6 0,1 1)) 9 no 3.82 Advanced Advanced 1 1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) 17 no 3.83 Advanced Advanced 1 1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) 13 no 4.00 Advanced Novice 1 1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) 32 yes 3.46 Advanced Beginner 0 1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) 11 no 3.13 Advanced Advanced 1 1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) 9 no 3.82 Advanced Advanced 1 1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) 34 yes 3.85 Advanced Beginner 0 0 Oceanville POLYGON ((1 1,1 3,6 3,6 0,1 1)) 32 yes 3.46 Advanced Beginner 0 >>>