Use the td_except() function to return the rows that appear in the first teradataml DataFrame or GeoDataFrame, and not in other teradataml DataFrames or GeoDataFrames, along the index axis.
This function must be applied to data frames of the same type: either all teradataml DataFrames, or all GeoDataFrames.
Example Prerequisites
>>> from teradataml import load_example_data
>>> load_example_data("dataframe", "setop_test1") >>> load_example_data("dataframe", "setop_test2")
>>> from teradataml.dataframe import dataframe >>> from teradataml.dataframe.setop import td_except
Example 1: Run td_except() on rows from two DataFrames, using default signature
This example applies the except operation on rows from two teradataml DataFrames when using default signature of the function.
>>> df1 = DataFrame('setop_test1') >>> df1 masters gpa stats programming admitted id 62 no 3.70 Advanced Advanced 1 53 yes 3.50 Beginner Novice 1 69 no 3.96 Advanced Advanced 1 61 yes 4.00 Advanced Advanced 1 58 no 3.13 Advanced Advanced 1 51 yes 3.76 Beginner Beginner 0 68 no 1.87 Advanced Novice 1 66 no 3.87 Novice Beginner 1 60 no 4.00 Advanced Novice 1 59 no 3.65 Novice Novice 1
>>> df2 = DataFrame('setop_test2') >>> df2 masters gpa stats programming admitted id 12 no 3.65 Novice Novice 1 15 yes 4.00 Advanced Advanced 1 14 yes 3.45 Advanced Advanced 0 20 yes 3.90 Advanced Advanced 1 18 yes 3.81 Advanced Advanced 1 17 no 3.83 Advanced Advanced 1 13 no 4.00 Advanced Novice 1 11 no 3.13 Advanced Advanced 1 60 no 4.00 Advanced Novice 1 19 yes 1.98 Advanced Advanced 0
>>> idf = td_except([df1[df1.id<55] , df2]) >>> idf masters gpa stats programming admitted id 51 yes 3.76 Beginner Beginner 0 50 yes 3.95 Beginner Beginner 0 54 yes 3.50 Beginner Advanced 1 52 no 3.70 Novice Beginner 1 53 yes 3.50 Beginner Novice 1 53 yes 3.50 Beginner Novice 1
Example 2: Run td_except() on rows from two DataFrames, discarding duplicate rows
This examples applies the except operation on rows from the two teradataml DataFrames from previous example, discarding duplicate rows from the result by passing allow_duplicates = False.
>>> idf = td_except([df1[df1.id<55] , df2], allow_duplicates=False) >>> idf masters gpa stats programming admitted id 54 yes 3.50 Beginner Advanced 1 51 yes 3.76 Beginner Beginner 0 53 yes 3.50 Beginner Novice 1 50 yes 3.95 Beginner Beginner 0 52 no 3.70 Novice Beginner 1
Example 3: Run td_except() on more than two DataFrames
This example shows what happens when td_except is used on more than two teradataml DataFrames. In this example, you have three teradataml DataFrames as df1, df2 & df3, the operation is applied on df1 and df2 first, and then the operation is applied again on the result and df3.
>>> df3 = df1[df1.gpa <= 3.9]
>>> # Effective operation here would be, (df1-df2)-df3 >>> idf = td_except([df1, df2, df3]) >>> idf masters gpa stats programming admitted id 61 yes 4.00 Advanced Advanced 1 50 yes 3.95 Beginner Beginner 0 69 no 3.96 Advanced Advanced 1
Example 4: Run td_except on two GeoDataFrames
- Create GeoDataFrames
>>> geo_dataframe = GeoDataFrame('sample_shapes')
>>> geo_dataframe1 = geo_dataframe[geo_dataframe.skey == 1004].select(['skey','linestrings']) >>> geo_dataframe1 skey linestrings 1004 LINESTRING (10 20 30,40 50 60,70 80 80)
>>> geo_dataframe2 = geo_dataframe[geo_dataframe.skey < 1010].select(['skey','linestrings']) >>> geo_dataframe2 skey linestrings 1009 MULTILINESTRING ((10 20 30,40 50 60),(70 80 80,90 100 110)) 1005 LINESTRING (1 3 6,3 0 6,6 0 1) 1004 LINESTRING (10 20 30,40 50 60,70 80 80) 1002 LINESTRING (1 3,3 0,0 1) 1001 LINESTRING (1 1,2 2,3 3,4 4) 1003 LINESTRING (1.35 3.6456,3.6756 0.23,0.345 1.756) 1007 MULTILINESTRING ((1 1,1 3,6 3),(10 5,20 1)) 1006 LINESTRING (1.35 3.6456 4.5,3.6756 0.23 6.8,0.345 1.756 8.9) 1008 MULTILINESTRING ((1 3,3 0,0 1),(1.35 3.6456,3.6756 0.23,0.345 1.756))
- Run td_except
>>> td_except([geo_dataframe2,geo_dataframe1]) skey linestrings 1008 MULTILINESTRING ((1 3,3 0,0 1),(1.35 3.6456,3.6756 0.23,0.345 1.756)) 1003 LINESTRING (1.35 3.6456,3.6756 0.23,0.345 1.756) 1005 LINESTRING (1 3 6,3 0 6,6 0 1) 1006 LINESTRING (1.35 3.6456 4.5,3.6756 0.23 6.8,0.345 1.756 8.9) 1009 MULTILINESTRING ((10 20 30,40 50 60),(70 80 80,90 100 110)) 1001 LINESTRING (1 1,2 2,3 3,4 4) 1007 MULTILINESTRING ((1 1,1 3,6 3),(10 5,20 1)) 1002 LINESTRING (1 3,3 0,0 1)