Teradata Package for Python Function Reference | 17.10 - td_intersect

Product: Teradata Package for Python
Release Number: 17.10
Published: April 2022
Language: English (United States)
Last Update: 2022-08-19
Product Category: Teradata Vantage

teradataml.dataframe.setop.td_intersect = td_intersect(df_list, allow_duplicates=True)

DESCRIPTION:
    Intersects a list of teradataml DataFrames or GeoDataFrames along the
    index axis and returns a DataFrame containing the rows common to all
    input DataFrames.
    Note:
        This function should be applied to data frames of the same type:
        either all teradataml DataFrames, or all GeoDataFrames.

PARAMETERS:
    df_list:
        Required argument.
        Specifies the list of teradataml DataFrames or GeoDataFrames on
        which the intersection is to be performed.
        Types: list of teradataml DataFrames or GeoDataFrames

    allow_duplicates:
        Optional argument.
        Specifies whether the result of the intersection can contain
        duplicate rows.
        Default value: True
        Types: bool

RETURNS:
    teradataml DataFrame when the intersection is performed on teradataml
    DataFrames.
    teradataml GeoDataFrame when the intersection is performed on
    teradataml GeoDataFrames.

RAISES:
    TeradataMlException, TypeError

EXAMPLES:
    >>> from teradataml import DataFrame, GeoDataFrame, load_example_data
    >>> load_example_data("dataframe", "setop_test1")
    >>> load_example_data("dataframe", "setop_test2")
    >>> load_example_data("geodataframe", ["sample_shapes"])
    >>> from teradataml.dataframe.setop import td_intersect
    >>>
    >>> df1 = DataFrame('setop_test1')
    >>> df1
        masters   gpa     stats  programming  admitted
    id
    62       no  3.70  Advanced     Advanced         1
    53      yes  3.50  Beginner       Novice         1
    69       no  3.96  Advanced     Advanced         1
    61      yes  4.00  Advanced     Advanced         1
    58       no  3.13  Advanced     Advanced         1
    51      yes  3.76  Beginner     Beginner         0
    68       no  1.87  Advanced       Novice         1
    66       no  3.87    Novice     Beginner         1
    60       no  4.00  Advanced       Novice         1
    59       no  3.65    Novice       Novice         1
    >>> df2 = DataFrame('setop_test2')
    >>> df2
        masters   gpa     stats  programming  admitted
    id
    12       no  3.65    Novice       Novice         1
    15      yes  4.00  Advanced     Advanced         1
    14      yes  3.45  Advanced     Advanced         0
    20      yes  3.90  Advanced     Advanced         1
    18      yes  3.81  Advanced     Advanced         1
    17       no  3.83  Advanced     Advanced         1
    13       no  4.00  Advanced       Novice         1
    11       no  3.13  Advanced     Advanced         1
    60       no  4.00  Advanced       Novice         1
    19      yes  1.98  Advanced     Advanced         0
    >>> # Intersect two DataFrames, keeping duplicate rows (the default).
    >>> idf = td_intersect([df1, df2])
    >>> idf
        masters   gpa     stats  programming  admitted
    id
    64      yes  3.81  Advanced     Advanced         1
    60       no  4.00  Advanced       Novice         1
    58       no  3.13  Advanced     Advanced         1
    68       no  1.87  Advanced       Novice         1
    66       no  3.87    Novice     Beginner         1
    60       no  4.00  Advanced       Novice         1
    62       no  3.70  Advanced     Advanced         1
    >>>
    >>> # Intersect the same DataFrames without duplicate rows.
    >>> idf = td_intersect([df1, df2], allow_duplicates=False)
    >>> idf
        masters   gpa     stats  programming  admitted
    id
    64      yes  3.81  Advanced     Advanced         1
    60       no  4.00  Advanced       Novice         1
    58       no  3.13  Advanced     Advanced         1
    68       no  1.87  Advanced       Novice         1
    66       no  3.87    Novice     Beginner         1
    62       no  3.70  Advanced     Advanced         1
    >>> # Intersect more than two DataFrames.
    >>> df3 = df1[df1.gpa <= 3.5]
    >>> df3
        masters   gpa     stats  programming  admitted
    id
    58       no  3.13  Advanced     Advanced         1
    67      yes  3.46    Novice     Beginner         0
    54      yes  3.50  Beginner     Advanced         1
    68       no  1.87  Advanced       Novice         1
    53      yes  3.50  Beginner       Novice         1
    >>> idf = td_intersect([df1, df2, df3])
    >>> idf
        masters   gpa     stats  programming  admitted
    id
    58       no  3.13  Advanced     Advanced         1
    68       no  1.87  Advanced       Novice         1

    >>> # Perform intersection of two GeoDataFrames.
    >>> geo_dataframe = GeoDataFrame('sample_shapes')
    >>> geo_dataframe1 = geo_dataframe[geo_dataframe.skey == 1004].select(['skey','linestrings'])
    >>> geo_dataframe1
    skey  linestrings
    1004  LINESTRING (10 20 30,40 50 60,70 80 80)
    >>> geo_dataframe2 = geo_dataframe[geo_dataframe.skey < 1010].select(['skey','linestrings'])
    >>> geo_dataframe2
    skey  linestrings
    1009  MULTILINESTRING ((10 20 30,40 50 60),(70 80 80,90 100 110))
    1005  LINESTRING (1 3 6,3 0 6,6 0 1)
    1004  LINESTRING (10 20 30,40 50 60,70 80 80)
    1002  LINESTRING (1 3,3 0,0 1)
    1001  LINESTRING (1 1,2 2,3 3,4 4)
    1003  LINESTRING (1.35 3.6456,3.6756 0.23,0.345 1.756)
    1007  MULTILINESTRING ((1 1,1 3,6 3),(10 5,20 1))
    1006  LINESTRING (1.35 3.6456 4.5,3.6756 0.23 6.8,0.345 1.756 8.9)
    1008  MULTILINESTRING ((1 3,3 0,0 1),(1.35 3.6456,3.6756 0.23,0.345 1.756))
    >>> td_intersect([geo_dataframe1,geo_dataframe2])
    skey  linestrings
    1004  LINESTRING (10 20 30,40 50 60,70 80 80)
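
The effect of allow_duplicates can be explored without a Vantage connection using a small pure-Python model. The sketch below is not the teradataml implementation. It assumes that allow_duplicates=True behaves like SQL INTERSECT ALL (a row appears as many times as its smallest count across the inputs) and that allow_duplicates=False behaves like SQL INTERSECT (each common row appears once). The function name intersect_rows and the sample tuples are hypothetical, chosen only for illustration.

```python
from collections import Counter


def intersect_rows(tables, allow_duplicates=True):
    """Model row intersection over lists of tuples.

    Hypothetical helper for illustration only; it assumes INTERSECT ALL
    semantics when allow_duplicates is True and INTERSECT otherwise.
    """
    if not tables:
        raise ValueError("tables must contain at least one list of rows")

    # Count how often each row occurs in the first table.
    common = Counter(tables[0])

    # Counter & Counter keeps, for each row, the minimum of the two counts
    # and drops rows that are missing from either side.
    for rows in tables[1:]:
        common &= Counter(rows)

    if allow_duplicates:
        # Repeat each common row according to its minimum count.
        return sorted(common.elements())
    # Keep each common row exactly once.
    return sorted(common)


# Illustrative rows of the form (id, masters, gpa); not real table contents.
t1 = [(58, "no", 3.13), (60, "no", 4.00), (60, "no", 4.00), (53, "yes", 3.50)]
t2 = [(60, "no", 4.00), (11, "no", 3.13), (60, "no", 4.00), (58, "no", 3.13)]

with_duplicates = intersect_rows([t1, t2])
without_duplicates = intersect_rows([t1, t2], allow_duplicates=False)
```

Here the row with id 60 occurs twice in both inputs, so it appears twice in with_duplicates and once in without_duplicates. Rows are compared on every column, which is why the rows with ids 11 and 58 are treated as different rows even though their remaining values match.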