Teradata Package for Python Function Reference on VantageCloud Lake - merge - Teradata Package for Python - Look here for syntax, methods and examples for the functions included in the Teradata Package for Python.
Teradata® Package for Python Function Reference on VantageCloud Lake
- Deployment
- VantageCloud
- Edition
- Lake
- Product
- Teradata Package for Python
- Release Number
- 20.00.00.03
- Published
- December 2024
- ft:locale
- en-US
- ft:lastEdition
- 2024-12-19
- dita:id
- TeradataPython_FxRef_Lake_2000
- Product Category
- Teradata Vantage
- teradataml.geospatial.geodataframe.GeoDataFrame.merge = merge(self, right, on=None, how='inner', left_on=None, right_on=None, use_index=False, lsuffix=None, rsuffix=None)
- DESCRIPTION:
Merges two teradataml GeoDataFrames together.
Supported merge operations are:
- inner: Returns only matching rows, non-matching rows are eliminated.
- left: Returns all matching rows plus non-matching rows from the left teradataml GeoDataFrame.
- right: Returns all matching rows plus non-matching rows from the right teradataml GeoDataFrame.
- full: Returns all rows from both teradataml GeoDataFrames, including non matching rows.
PARAMETERS:
right:
Required Argument.
Specifies right teradataml DataFrame/GeoDataFrame on which merge is to be performed.
Types: teradataml GeoDataFrame or teradataml DataFrame
on:
Optional Argument.
Specifies list of conditions that indicate the columns used for the merge.
When no arguments are provided for this condition, the merge is performed using the indexes
of the teradataml GeoDataFrames. Both teradataml GeoDataFrames are required to have index labels to
perform a merge operation when no arguments are provided for this condition.
When either teradataml GeoDataFrame does not have a valid index label in the above case,
an exception is thrown.
• String comparisons, in the form of "col1 <= col2", where col1 is
the column of left GeoDataFrame df1 and col2 is the column of right
GeoDataFrame df2.
Examples:
1. ["a","b"] indicates df1.a = df2.a and df1.b = df2.b.
2. ["a = b", "c = d"] indicates df1.a = df2.b and df1.c = df2.d
3. ["a <= b", "c > d"] indicates df1.a <= df2.b and df1.c > df2.d.
4. ["a < b", "c >= d"] indicates df1.a < df2.b and df1.c >= df2.d.
5. ["a <> b"] indicates df1.a != df2.b. Same is the case for ["a != b"].
• Column comparisons, in the form of df1.col1 <= df2.col2, where col1
is the column of left GeoDataFrame df1 and col2 is the column of right
GeoDataFrame df2.
Examples:
1. [df1.a == df2.a, df1.b == df2.b] indicates df1.a = df2.a and df1.b = df2.b.
2. [df1.a == df2.b, df1.c == df2.d] indicates df1.a = df2.b and df1.c = df2.d.
3. [df1.a <= df2.b and df1.c > df2.d] indicates df1.a <= df2.b and df1.c > df2.d.
4. [df1.a < df2.b and df1.c >= df2.d] indicates df1.a < df2.b and df1.c >= df2.d.
5. df1.a != df2.b indicates df1.a != df2.b.
• The combination of both string comparisons and comparisons as column expressions.
Examples:
1. ["a", df1.b == df2.b] indicates df1.a = df2.a and df1.b = df2.b.
2. [df1.a <= df2.b, "c > d"] indicates df1.a <= df2.b and df1.c > df2.d.
Default Value: None
Types: str or ColumnExpression or List of strings(str) or ColumnExpressions
how:
Optional Argument.
Specifies the type of merge to perform. Supports inner, left, right, full and cross merge operations.
When how is "cross", the arguments on, left_on, right_on and use_index are ignored.
Default Value: "inner".
Types: str
left_on:
Optional Argument.
Specifies column to merge on, in the left teradataml GeoDataFrame.
When both the 'on' and 'left_on' parameters are unspecified, the index columns
of the teradataml GeoDataFrames are used to perform the merge operation.
Default Value: None.
Types: str or ColumnExpression or List of strings(str) or ColumnExpressions
right_on:
Optional Argument.
Specifies column to merge on, in the right teradataml GeoDataFrame.
When both the 'on' and 'right_on' parameters are unspecified, the index columns
of the teradataml GeoDataFrames are used to perform the merge operation.
Default Value: None.
Types: str or ColumnExpression or List of strings(str) or ColumnExpressions
use_index:
Optional Argument.
Specifies whether (or not) to use index from the teradataml GeoDataFrames as the merge key(s).
When False, and 'on', 'left_on', and 'right_on' are all unspecified, the index columns
of the teradataml GeoDataFrames are used to perform the merge operation.
Default value: False.
Types: bool
lsuffix:
Optional Argument.
Specifies suffix to be added to the left table columns.
Default Value: None.
Types: str
Note: A suffix is required if teradataml GeoDataFrames being merged have columns
with the same name.
rsuffix:
Optional Argument.
Specifies suffix to be added to the right table columns.
Default Value: None.
Types: str
Note: A suffix is required if teradataml GeoDataFrames being merged have columns
with the same name.
RAISES:
TeradataMlException
RETURNS:
teradataml GeoDataFrame
EXAMPLES:
# Load the data to run the example.
>>> from teradataml import load_example_data, GeoDataFrame
>>> load_example_data("geodataframe", ["sample_cities", "sample_streets"])
>>>
>>> df1 = GeoDataFrame("sample_cities")
>>> df2 = GeoDataFrame("sample_streets")
>>>
# Print GeoDataFrame.
>>> df1
city_name city_shape
skey
0 Oceanville POLYGON ((1 1,1 3,6 3,6 0,1 1))
1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10))
# Print GeoDataFrame.
>>> df2
street_name street_shape
skey
1 Coast Blvd LINESTRING (12 12,18 17)
1 Main Street LINESTRING (2 2,3 2,4 1)
>>>
# Example 1: Specifying on condition as string for common column.
>>> df1.merge(right = df2, on = ["skey"], how = "inner", lsuffix = "t1", rsuffix = "t2")
t1_skey t2_skey city_name city_shape street_name street_shape
0 1 1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) Coast Blvd LINESTRING (12 12,18 17)
1 1 1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) Main Street LINESTRING (2 2,3 2,4 1)
>>>
# Example 2: Specifying on condition as ColumnExpression with left outer join.
>>> df1.merge(df2, on = [df1.city_name == df2.street_name], how = "left", lsuffix = "t1", rsuffix = "t2")
t1_skey t2_skey city_name city_shape street_name street_shape
0 0 None Oceanville POLYGON ((1 1,1 3,6 3,6 0,1 1)) None None
1 1 None Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) None None
>>>
# Example 3: Cross join "sample_cities" with "admissions_train", i.e.,
# joining teradataml GeoDataFrame with teradataml DataFrame.
>>> load_example_data("dataframe", "admissions_train")
>>> tdf = DataFrame("admissions_train")
>>> df3 = df1.merge(right=tdf, how="cross", lsuffix="l", rsuffix="r")
>>> df3
city_name city_shape id masters gpa stats programming admitted
skey
0 Oceanville POLYGON ((1 1,1 3,6 3,6 0,1 1)) 13 no 4.00 Advanced Novice 1
0 Oceanville POLYGON ((1 1,1 3,6 3,6 0,1 1)) 11 no 3.13 Advanced Advanced 1
0 Oceanville POLYGON ((1 1,1 3,6 3,6 0,1 1)) 9 no 3.82 Advanced Advanced 1
1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) 17 no 3.83 Advanced Advanced 1
1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) 13 no 4.00 Advanced Novice 1
1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) 32 yes 3.46 Advanced Beginner 0
1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) 11 no 3.13 Advanced Advanced 1
1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) 9 no 3.82 Advanced Advanced 1
1 Seaside POLYGON ((10 10,10 20,20 20,20 15,10 10)) 34 yes 3.85 Advanced Beginner 0
0 Oceanville POLYGON ((1 1,1 3,6 3,6 0,1 1)) 32 yes 3.46 Advanced Beginner 0
>>>