pivot (teradatamlspk)

Teradata® pyspark2teradataml User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Release Number: 20.00
Published: March 2024
Language: English (United States)
Last Update: 2024-04-11
Product Category: Teradata Vantage

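When using pivot, the output column names differ between PySpark and teradatamlspk. PySpark names each output column after the pivot value, whereas teradatamlspk combines the aggregate function, the aggregated column, and the pivot value in lowercase (for example, sum_earnings_dotnet).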

PySpark

>>> df.groupBy("year").pivot("course", ["dotNET", "Java"]).sum("earnings").show()
+----+------+-----+
|year|dotNET| Java|
+----+------+-----+
|2012| 15000|20000|
|2013| 48000|30000|
+----+------+-----+

teradatamlspk

>>> df.groupBy("year").pivot("course", ["dotNET", "Java"]).sum("earnings").show()
+----+-------------------+-----------------+
|year|sum_earnings_dotnet|sum_earnings_java|
+----+-------------------+-----------------+
|2012|              15000|            20000|
|2013|              48000|            30000|
+----+-------------------+-----------------+
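If downstream code expects the PySpark column names, the difference above can be bridged with a small name mapping. The following is a minimal sketch, not part of either library: the helper pyspark_style_names is hypothetical, and the pattern <aggregate>_<column>_<pivot value> in lowercase is an assumption inferred only from the example output shown here.

```python
def pyspark_style_names(agg, column, pivot_values):
    """Map teradatamlspk-style pivot column names to PySpark-style names.

    Assumes the lowercase '<agg>_<column>_<pivot value>' pattern seen in
    the example output; this helper is illustrative, not a library API.
    """
    return {f"{agg}_{column}_{value}".lower(): value for value in pivot_values}


mapping = pyspark_style_names("sum", "earnings", ["dotNET", "Java"])
print(mapping)

# Hypothetical use on the pivoted result, assuming withColumnRenamed is
# available on the DataFrame in your environment:
# for old_name, new_name in mapping.items():
#     result = result.withColumnRenamed(old_name, new_name)
```

Building the mapping from the same pivot values passed to pivot keeps the original casing (dotNET, Java), which cannot be recovered from the lowercase teradatamlspk names alone.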

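When using pivot in teradatamlspk, the grouping columns are not returned. In the following example, which aggregates the grouping column itself, PySpark returns one row per year, whereas teradatamlspk omits the year column and returns a single row containing the totals across all years.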

PySpark

>>> df1.groupBy("year").pivot("course", ["dotNET", "Java"]).sum("year").show()
+----+------+----+
|year|dotNET|Java|
+----+------+----+
|2012|  4024|2012|
|2013|  2013|2013|
+----+------+----+

teradatamlspk

>>> df1.groupBy("year").pivot("course", ["dotNET", "Java"]).sum("year").show()
+---------------+-------------+
|sum_year_dotnet|sum_year_java|
+---------------+-------------+
|           6037|         4025|
+---------------+-------------+