Use the concat() function to concatenate a list of teradataml DataFrames, GeoDataFrames, or both along the index axis. The concatenation is carried out as a database-style UNION or UNION ALL operation.
If the list contains both teradataml DataFrames and GeoDataFrames (that is, it includes geometry data), the function returns a GeoDataFrame. See Example 6.
Example Prerequisites
>>> df = DataFrame("admissions_train")
>>> df
    masters   gpa     stats  programming  admitted
id
22      yes  3.46    Novice     Beginner         0
36       no  3.00  Advanced       Novice         0
15      yes  4.00  Advanced     Advanced         1
38      yes  2.65  Advanced     Beginner         1
5        no  3.44    Novice       Novice         0
17       no  3.83  Advanced     Advanced         1
34      yes  3.85  Advanced     Beginner         0
13       no  4.00  Advanced       Novice         1
26      yes  3.57  Advanced     Advanced         1
19      yes  1.98  Advanced     Advanced         0
>>> df1 = df[df.gpa == 4].select(['id', 'stats', 'masters', 'gpa'])
>>> df1
       stats  masters  gpa
id
13  Advanced       no  4.0
29    Novice      yes  4.0
15  Advanced      yes  4.0
>>> df2 = df[df.gpa < 2].select(['id', 'stats', 'programming', 'admitted'])
>>> df2
       stats  programming  admitted
id
24  Advanced       Novice         1
19  Advanced     Advanced         0
Example 1: Run concat() with default values for optional arguments
>>> cdf = concat([df1,df2])
>>> cdf
       stats  masters  gpa  programming  admitted
id
19  Advanced     None  NaN     Advanced         0
24  Advanced     None  NaN       Novice         1
13  Advanced       no  4.0         None      None
29    Novice      yes  4.0         None      None
15  Advanced      yes  4.0         None      None
Example 2: Run concat() with optional argument "join"
Set join='inner'
>>> cdf = concat([df1,df2], join='inner')
>>> cdf
       stats
id
19  Advanced
24  Advanced
13  Advanced
29    Novice
15  Advanced
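The effect of "join" on the result columns can be modeled in plain Python, without a database connection. The following is a minimal sketch, not the teradataml implementation; the helper name result_columns is hypothetical and only mirrors the behavior visible in Examples 1 and 2.

```python
# Illustrative model only: not the teradataml implementation.
def result_columns(frames, join="outer"):
    """Return the column names of the concatenated result.

    frames: list of column-name lists, one per input DataFrame.
    join:   'outer' keeps every column seen in any input;
            'inner' keeps only columns present in every input.
    """
    if join not in ("outer", "inner"):
        raise ValueError("join must be 'outer' or 'inner'")
    columns = list(frames[0])
    for cols in frames[1:]:
        if join == "outer":
            # Append columns not seen yet, preserving first-seen order.
            columns += [c for c in cols if c not in columns]
        else:
            # Keep only the columns shared with this input.
            columns = [c for c in columns if c in cols]
    return columns


# Column lists of df1 and df2 from the prerequisites.
df1_cols = ["id", "stats", "masters", "gpa"]
df2_cols = ["id", "stats", "programming", "admitted"]

print(result_columns([df1_cols, df2_cols], join="outer"))
print(result_columns([df1_cols, df2_cols], join="inner"))
```

With join='outer', rows from an input that lacks a column show None or NaN in that column, as in Example 1. With join='inner', only id and stats survive because they are the only columns common to df1 and df2.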
Example 3: Run concat() with optional argument "allow_duplicates"
- Set allow_duplicates = True (default)
>>> cdf = concat([df1,df2])
>>> cdf
       stats  masters  gpa  programming  admitted
id
19  Advanced     None  NaN     Advanced         0
24  Advanced     None  NaN       Novice         1
13  Advanced       no  4.0         None      None
29    Novice      yes  4.0         None      None
15  Advanced      yes  4.0         None      None
>>> cdf = concat([cdf,df2])
>>> cdf
       stats  masters  gpa  programming  admitted
id
19  Advanced     None  NaN     Advanced         0
13  Advanced       no  4.0         None      None
24  Advanced     None  NaN       Novice         1
24  Advanced     None  NaN       Novice         1
19  Advanced     None  NaN     Advanced         0
29    Novice      yes  4.0         None      None
15  Advanced      yes  4.0         None      None
- Set allow_duplicates = False
>>> cdf = concat([cdf,df2], allow_duplicates=False)
>>> cdf
       stats  masters  gpa  programming  admitted
id
19  Advanced     None  NaN     Advanced         0
29    Novice      yes  4.0         None      None
24  Advanced     None  NaN       Novice         1
15  Advanced      yes  4.0         None      None
13  Advanced       no  4.0         None      None
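The difference between the two settings can be sketched in plain Python. This is a rough model only, assuming that allow_duplicates=True behaves like UNION ALL and allow_duplicates=False behaves like UNION, which is what the outputs above suggest; the helper name union_rows is hypothetical and not part of teradataml.

```python
# Illustrative model only: not the teradataml implementation.
def union_rows(frames, allow_duplicates=True):
    """Stack the rows of every input (like UNION ALL) and, when
    allow_duplicates is False, drop exact duplicate rows (like UNION)."""
    stacked = [row for rows in frames for row in rows]
    if allow_duplicates:
        return stacked
    seen = set()
    unique = []
    for row in stacked:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


# Rows of df2 as (id, stats, programming, admitted) tuples.
df2_rows = [(24, "Advanced", "Novice", 1), (19, "Advanced", "Advanced", 0)]

print(len(union_rows([df2_rows, df2_rows])))
print(len(union_rows([df2_rows, df2_rows], allow_duplicates=False)))
```

Concatenating df2 with itself yields four rows when duplicates are allowed and two rows when they are not. A row counts as a duplicate only if every column value matches.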
Example 4: Run concat() with optional argument "sort"
Set sort=True
>>> cdf = concat([df1,df2], sort=True)
>>> cdf
    admitted  gpa  masters  programming     stats
id
19         0  NaN     None     Advanced  Advanced
24         1  NaN     None       Novice  Advanced
13      None  4.0       no         None  Advanced
29      None  4.0      yes         None    Novice
15      None  4.0      yes         None  Advanced
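Note that "sort" reorders the columns, not the rows. The following plain-Python sketch models that behavior under the assumption that the index column (id here) is left out of the sort; the helper name order_columns is hypothetical and not part of teradataml.

```python
# Illustrative model only: not the teradataml implementation.
def order_columns(columns, index="id", sort=False):
    """Return the display order of the non-index result columns.

    With sort=True the columns are ordered alphabetically;
    otherwise they keep the order in which they were first seen.
    """
    data_columns = [c for c in columns if c != index]
    if sort:
        data_columns = sorted(data_columns)
    return data_columns


# Columns produced by the default outer concatenation of df1 and df2.
outer_columns = ["id", "stats", "masters", "gpa", "programming", "admitted"]

print(order_columns(outer_columns))
print(order_columns(outer_columns, sort=True))
```

The first call reproduces the column order of Example 1, and the second reproduces the alphabetical order shown in Example 4.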
Example 5: Perform concatenation of two GeoDataFrames
- Create GeoDataFrames
>>> geo_dataframe = GeoDataFrame('sample_shapes')
>>> geo_dataframe1 = geo_dataframe[geo_dataframe.skey == 1004].select(['skey','linestrings'])
>>> geo_dataframe1
skey  linestrings
1004  LINESTRING (10 20 30,40 50 60,70 80 80)
>>> geo_dataframe2 = geo_dataframe[geo_dataframe.skey < 1010].select(['skey','polygons'])
>>> geo_dataframe2
skey  polygons
1009  MULTIPOLYGON (((0 0 0,0 20 20,20 20 20,20 0 20,0 0 0)),((50 50 50,50 90 90,90 90 90,90 50 90,50 50 50)))
1005  POLYGON ((0 0 0,0 0 20.435,0.0 20.435 0,0.0 20.435 20.435,20.435 0.0 0,20.435 0.0 20.435,20.435 20.435 0,20.435 20.435 20.435,0 0 0))
1004  POLYGON ((0 0 0,0 10 20,20 20 30,20 10 0,0 0 0),(5 5 5,5 10 10,10 10 10,10 10 5,5 5 5))
1002  POLYGON ((0 0,0 20,20 20,20 0,0 0),(5 5,5 10,10 10,10 5,5 5))
1001  POLYGON ((0 0,0 20,20 20,20 0,0 0))
1003  POLYGON ((0.6 0.8,0.6 20.8,20.6 20.8,20.6 0.8,0.6 0.8))
1007  MULTIPOLYGON (((1 1,1 3,6 3,6 0,1 1)),((10 5,10 10,20 10,20 5,10 5)))
1006  POLYGON ((0 0 0,0 0 20,0 20 0,0 20 20,20 0 0,20 0 20,20 20 0,20 20 20,0 0 0))
1008  MULTIPOLYGON (((0 0,0 20,20 20,20 0,0 0)),((0.6 0.8,0.6 20.8,20.6 20.8,20.6 0.8,0.6 0.8)))
- Perform concatenation
>>> concat([geo_dataframe1,geo_dataframe2])
skey  linestrings                              polygons
1009  None                                     MULTIPOLYGON (((0 0 0,0 20 20,20 20 20,20 0 20,0 0 0)),((50 50 50,50 90 90,90 90 90,90 50 90,50 50 50)))
1005  None                                     POLYGON ((0 0 0,0 0 20.435,0.0 20.435 0,0.0 20.435 20.435,20.435 0.0 0,20.435 0.0 20.435,20.435 20.435 0,20.435 20.435 20.435,0 0 0))
1004  LINESTRING (10 20 30,40 50 60,70 80 80)  None
1004  None                                     POLYGON ((0 0 0,0 10 20,20 20 30,20 10 0,0 0 0),(5 5 5,5 10 10,10 10 10,10 10 5,5 5 5))
1003  None                                     POLYGON ((0.6 0.8,0.6 20.8,20.6 20.8,20.6 0.8,0.6 0.8))
1001  None                                     POLYGON ((0 0,0 20,20 20,20 0,0 0))
1002  None                                     POLYGON ((0 0,0 20,20 20,20 0,0 0),(5 5,5 10,10 10,10 5,5 5))
1007  None                                     MULTIPOLYGON (((1 1,1 3,6 3,6 0,1 1)),((10 5,10 10,20 10,20 5,10 5)))
1006  None                                     POLYGON ((0 0 0,0 0 20,0 20 0,0 20 20,20 0 0,20 0 20,20 20 0,20 20 20,0 0 0))
1008  None                                     MULTIPOLYGON (((0 0,0 20,20 20,20 0,0 0)),((0.6 0.8,0.6 20.8,20.6 20.8,20.6 0.8,0.6 0.8)))
Example 6: Perform concatenation of a DataFrame and GeoDataFrame
>>> normal_df = df.select(['id','stats'])
>>> normal_df
       stats
id
34  Advanced
32  Advanced
11  Advanced
40    Novice
38  Advanced
36  Advanced
7     Novice
26  Advanced
19  Advanced
13  Advanced
>>> geo_df = geo_dataframe[geo_dataframe.skey < 1010].select(['skey', 'polygons'])
>>> geo_df
skey  polygons
1003  POLYGON ((0.6 0.8,0.6 20.8,20.6 20.8,20.6 0.8,0.6 0.8))
1008  MULTIPOLYGON (((0 0,0 20,20 20,20 0,0 0)),((0.6 0.8,0.6 20.8,20.6 20.8,20.6 0.8,0.6 0.8)))
1006  POLYGON ((0 0 0,0 0 20,0 20 0,0 20 20,20 0 0,20 0 20,20 20 0,20 20 20,0 0 0))
1009  MULTIPOLYGON (((0 0 0,0 20 20,20 20 20,20 0 20,0 0 0)),((50 50 50,50 90 90,90 90 90,90 50 90,50 50 50)))
1005  POLYGON ((0 0 0,0 0 20.435,0.0 20.435 0,0.0 20.435 20.435,20.435 0.0 0,20.435 0.0 20.435,20.435 20.435 0,20.435 20.435 20.435,0 0 0))
1007  MULTIPOLYGON (((1 1,1 3,6 3,6 0,1 1)),((10 5,10 10,20 10,20 5,10 5)))
1001  POLYGON ((0 0,0 20,20 20,20 0,0 0))
1002  POLYGON ((0 0,0 20,20 20,20 0,0 0),(5 5,5 10,10 10,10 5,5 5))
1004  POLYGON ((0 0 0,0 10 20,20 20 30,20 10 0,0 0 0),(5 5 5,5 10 10,10 10 10,10 10 5,5 5 5))
>>> idf = concat([normal_df,geo_df])
>>> idf
       stats  skey  polygons
id
38  Advanced  None      None
7     Novice  None      None
26  Advanced  None      None
17  Advanced  None      None
34  Advanced  None      None
13  Advanced  None      None
32  Advanced  None      None
11  Advanced  None      None
15  Advanced  None      None
36  Advanced  None      None