Use the concat() API to concatenate two or more teradataml DataFrame objects along the index axis. The concatenation is performed as a database-style UNION or UNION ALL operation.
Example Prerequisites
>>> df = DataFrame("admissions_train")
>>> df
   masters   gpa     stats programming  admitted
id
22     yes  3.46    Novice    Beginner         0
36      no  3.00  Advanced      Novice         0
15     yes  4.00  Advanced    Advanced         1
38     yes  2.65  Advanced    Beginner         1
5       no  3.44    Novice      Novice         0
17      no  3.83  Advanced    Advanced         1
34     yes  3.85  Advanced    Beginner         0
13      no  4.00  Advanced      Novice         1
26     yes  3.57  Advanced    Advanced         1
19     yes  1.98  Advanced    Advanced         0
>>> df1 = df[df.gpa == 4].select(['id', 'stats', 'masters', 'gpa'])
>>> df1
       stats masters  gpa
id
13  Advanced      no  4.0
29    Novice     yes  4.0
15  Advanced     yes  4.0
>>> df2 = df[df.gpa < 2].select(['id', 'stats', 'programming', 'admitted'])
>>> df2
       stats programming  admitted
id
24  Advanced      Novice         1
19  Advanced    Advanced         0
Example: Run concat() with default values for optional arguments
>>> cdf = concat([df1,df2])
>>> cdf
       stats masters  gpa programming admitted
id
19  Advanced    None  NaN    Advanced        0
24  Advanced    None  NaN      Novice        1
13  Advanced      no  4.0        None     None
29    Novice     yes  4.0        None     None
15  Advanced     yes  4.0        None     None
Example: Run concat() with optional argument "join"
Set join='inner'
>>> cdf = concat([df1,df2], join='inner')
>>> cdf
       stats
id
19  Advanced
24  Advanced
13  Advanced
29    Novice
15  Advanced
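The difference between the default result and the join='inner' result comes down to which columns survive. The sketch below is a pure-Python model of that column selection only, assuming the default behaves like an outer join on column names; it is not teradataml code, and the helper name concat_columns is hypothetical.

```python
# Pure-Python model of the column handling only; this is not teradataml code.
# concat_columns is a hypothetical helper written for illustration.
def concat_columns(frames, join="outer"):
    """Return the result column names for a list of column-name lists."""
    if join == "inner":
        # Keep a column only if every input has it, in first-input order.
        return [c for c in frames[0] if all(c in f for f in frames[1:])]
    # "outer": keep every column, in order of first appearance.
    seen = []
    for frame in frames:
        for c in frame:
            if c not in seen:
                seen.append(c)
    return seen

# Column names of df1 and df2 from the prerequisites ('id' is the index).
df1_cols = ["id", "stats", "masters", "gpa"]
df2_cols = ["id", "stats", "programming", "admitted"]

outer_cols = concat_columns([df1_cols, df2_cols])
inner_cols = concat_columns([df1_cols, df2_cols], join="inner")
print(outer_cols)  # ['id', 'stats', 'masters', 'gpa', 'programming', 'admitted']
print(inner_cols)  # ['id', 'stats']
```

With the outer behavior, rows from a DataFrame that lacks a column show None or NaN in that column, as in the default example.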
Example: Run concat() with optional argument "allow_duplicates"
- Set allow_duplicates = True (default)
>>> cdf = concat([df1,df2])
>>> cdf
       stats masters  gpa programming admitted
id
19  Advanced    None  NaN    Advanced        0
24  Advanced    None  NaN      Novice        1
13  Advanced      no  4.0        None     None
29    Novice     yes  4.0        None     None
15  Advanced     yes  4.0        None     None
>>> cdf = concat([cdf,df2])
>>> cdf
       stats masters  gpa programming admitted
id
19  Advanced    None  NaN    Advanced        0
13  Advanced      no  4.0        None     None
24  Advanced    None  NaN      Novice        1
24  Advanced    None  NaN      Novice        1
19  Advanced    None  NaN    Advanced        0
29    Novice     yes  4.0        None     None
15  Advanced     yes  4.0        None     None
- Set allow_duplicates = False
>>> cdf = concat([cdf,df2], allow_duplicates=False)
>>> cdf
       stats masters  gpa programming admitted
id
19  Advanced    None  NaN    Advanced        0
29    Novice     yes  4.0        None     None
24  Advanced    None  NaN      Novice        1
15  Advanced     yes  4.0        None     None
13  Advanced      no  4.0        None     None
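The two allow_duplicates settings correspond to the UNION ALL and UNION behaviors. The following sketch models only the row handling in plain Python, assuming rows are already padded to the same columns; it is not teradataml code, and concat_rows is a hypothetical helper.

```python
# Pure-Python model of the row handling only; this is not teradataml code.
# concat_rows is a hypothetical helper written for illustration.
def concat_rows(frames, allow_duplicates=True):
    """Stack the rows of each frame; optionally drop repeated rows."""
    rows = [row for frame in frames for row in frame]
    if allow_duplicates:
        return rows                   # like UNION ALL: keep every row
    return list(dict.fromkeys(rows))  # like UNION: keep first occurrence only

# Rows are (id, stats, masters, gpa, programming, admitted).
# None marks a missing value; df2 rows are padded to the same columns.
cdf_rows = [
    (19, "Advanced", None, None, "Advanced", 0),
    (24, "Advanced", None, None, "Novice", 1),
    (13, "Advanced", "no", 4.0, None, None),
    (29, "Novice", "yes", 4.0, None, None),
    (15, "Advanced", "yes", 4.0, None, None),
]
df2_rows = [
    (24, "Advanced", None, None, "Novice", 1),
    (19, "Advanced", None, None, "Advanced", 0),
]

with_dups = concat_rows([cdf_rows, df2_rows])
without_dups = concat_rows([cdf_rows, df2_rows], allow_duplicates=False)
print(len(with_dups))     # 7
print(len(without_dups))  # 5
```

The model preserves input order, whereas the database does not guarantee row order, which is why the rows appear in a different order in each example output.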
Example: Run concat() with optional argument "sort"
Set sort=True
>>> cdf = concat([df1,df2], sort=True)
>>> cdf
   admitted  gpa masters programming     stats
id
19        0  NaN    None    Advanced  Advanced
24        1  NaN    None      Novice  Advanced
13     None  4.0      no        None  Advanced
29     None  4.0     yes        None    Novice
15     None  4.0     yes        None  Advanced
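In the output above, the non-index columns appear in alphabetical order by name. A minimal pure-Python sketch of that ordering, assuming sort simply orders the column names and leaves the index column alone (order_columns is a hypothetical helper, not part of teradataml):

```python
# Pure-Python model of the column ordering only; this is not teradataml code.
# order_columns is a hypothetical helper written for illustration.
def order_columns(columns, sort=False):
    """Return column names sorted by name if sort is True, else unchanged."""
    return sorted(columns) if sort else list(columns)

# Non-index columns of the default concat([df1, df2]) result.
cols = ["stats", "masters", "gpa", "programming", "admitted"]

unsorted_cols = order_columns(cols)
sorted_cols = order_columns(cols, sort=True)
print(sorted_cols)  # ['admitted', 'gpa', 'masters', 'programming', 'stats']
```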