Use the concat() method to concatenate two teradataml DataFrame objects along the index axis, that is, row-wise. The operation is carried out as a database-style UNION or UNION ALL.
Examples Prerequisite
Assume the table "admissions_train" exists and a DataFrame "df" is created based on this table using the command:
>>> df = DataFrame("admissions_train")
>>> df
   masters   gpa     stats programming admitted
id
22     yes  3.46    Novice    Beginner        0
36      no  3.00  Advanced      Novice        0
15     yes  4.00  Advanced    Advanced        1
38     yes  2.65  Advanced    Beginner        1
5       no  3.44    Novice      Novice        0
17      no  3.83  Advanced    Advanced        1
34     yes  3.85  Advanced    Beginner        0
13      no  4.00  Advanced      Novice        1
26     yes  3.57  Advanced    Advanced        1
19     yes  1.98  Advanced    Advanced        0
>>> df1 = df[df.gpa == 4].select(['id', 'stats', 'masters', 'gpa'])
>>> df1
stats masters gpa
id
13 Advanced no 4.0
29 Novice yes 4.0
15 Advanced yes 4.0
>>> df2 = df[df.gpa < 2].select(['id', 'stats', 'programming', 'admitted'])
>>> df2
stats programming admitted
id
24 Advanced Novice 1
19 Advanced Advanced 0
Example 1: Default behavior with default values for the optional arguments.
>>> # Default options
>>> cdf = df1.concat(df2)
>>> cdf
stats masters gpa programming admitted
id
19 Advanced None NaN Advanced 0
24 Advanced None NaN Novice 1
13 Advanced no 4.0 None None
29 Novice yes 4.0 None None
15 Advanced yes 4.0 None None
Example 2: concat() operation with 'join = inner'.
>>> cdf = df1.concat(df2, join='inner')
>>> cdf
stats
id
19 Advanced
24 Advanced
13 Advanced
29 Novice
15 Advanced
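The difference between the default (outer) and inner join modes in Examples 1 and 2 can be reproduced locally with pandas, which needs no database connection. This is only an analogy under the assumption that pandas.concat handles columns the same way for this case; it is not the teradataml implementation, and the frames "left" and "right" are hypothetical stand-ins for df1 and df2.

```python
import pandas as pd

# Hypothetical local stand-ins for df1 and df2 (pandas, not teradataml).
left = pd.DataFrame(
    {"stats": ["Advanced", "Novice", "Advanced"],
     "masters": ["no", "yes", "yes"],
     "gpa": [4.0, 4.0, 4.0]},
    index=pd.Index([13, 29, 15], name="id"),
)
right = pd.DataFrame(
    {"stats": ["Advanced", "Advanced"],
     "programming": ["Novice", "Advanced"],
     "admitted": [1, 0]},
    index=pd.Index([24, 19], name="id"),
)

# Outer join keeps every column from both inputs; missing cells become NaN.
outer = pd.concat([left, right], join="outer")
# Inner join keeps only the columns present in both inputs.
inner = pd.concat([left, right], join="inner")

outer_columns = list(outer.columns)
inner_columns = list(inner.columns)
print(outer_columns)
print(inner_columns)
```

In both modes all five rows are kept; only the set of columns changes.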
Example 3: concat() operation with 'allow_duplicates=False'.
>>> # allow_duplicates = True (default)
>>> cdf = df1.concat(df2)
>>> cdf
stats masters gpa programming admitted
id
19 Advanced None NaN Advanced 0
24 Advanced None NaN Novice 1
13 Advanced no 4.0 None None
29 Novice yes 4.0 None None
15 Advanced yes 4.0 None None
>>> cdf = cdf.concat(df2)
>>> cdf
stats masters gpa programming admitted
id
19 Advanced None NaN Advanced 0
13 Advanced no 4.0 None None
24 Advanced None NaN Novice 1
24 Advanced None NaN Novice 1
19 Advanced None NaN Advanced 0
29 Novice yes 4.0 None None
15 Advanced yes 4.0 None None
>>> # allow_duplicates = False
>>> cdf = cdf.concat(df2, allow_duplicates=False)
>>> cdf
stats masters gpa programming admitted
id
19 Advanced None NaN Advanced 0
29 Novice yes 4.0 None None
24 Advanced None NaN Novice 1
15 Advanced yes 4.0 None None
13 Advanced no 4.0 None None
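The effect of allow_duplicates corresponds to the difference between UNION ALL (duplicates kept) and UNION (duplicates removed). The following is a minimal local sketch in pandas, not teradataml; the frame "rows" is a hypothetical stand-in for df2 with "id" kept as an ordinary column so that it takes part in the row comparison.

```python
import pandas as pd

# Hypothetical stand-in for df2 (pandas, not teradataml).
rows = pd.DataFrame(
    {"id": [24, 19],
     "stats": ["Advanced", "Advanced"],
     "admitted": [1, 0]}
)

# UNION ALL analogue: stacking the same rows twice keeps both copies.
union_all = pd.concat([rows, rows], ignore_index=True)

# UNION analogue: identical rows collapse to a single copy.
union = union_all.drop_duplicates().reset_index(drop=True)

print(len(union_all), len(union))
```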
Example 4: concat() operation with 'sort=True'.
>>> cdf = df1.concat(df2, sort=True)
>>> cdf
   admitted  gpa masters programming     stats
id
19        0  NaN    None    Advanced  Advanced
24        1  NaN    None      Novice  Advanced
13     None  4.0      no        None  Advanced
29     None  4.0     yes        None    Novice
15     None  4.0     yes        None  Advanced
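As the output shows, sort=True orders the columns alphabetically, while the rows are unaffected. The same column ordering can be observed locally with pandas. This is a sketch by analogy, assuming pandas.concat's sort argument behaves comparably here; "left" and "right" are hypothetical one-row stand-ins for df1 and df2.

```python
import pandas as pd

# Hypothetical one-row stand-ins for df1 and df2 (pandas, not teradataml).
left = pd.DataFrame(
    {"stats": ["Advanced"], "masters": ["no"], "gpa": [4.0]},
    index=pd.Index([13], name="id"),
)
right = pd.DataFrame(
    {"stats": ["Advanced"], "programming": ["Novice"], "admitted": [1]},
    index=pd.Index([24], name="id"),
)

# Without sorting, columns appear in order of first occurrence.
unsorted_cols = list(pd.concat([left, right], sort=False).columns)
# With sorting, the non-matching column sets are merged alphabetically.
sorted_cols = list(pd.concat([left, right], sort=True).columns)

print(unsorted_cols)
print(sorted_cols)
```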