unpivot() | Teradata Package for Python - unpivot() - Teradata Vantage

Teradata® VantageCloud Lake

Deployment
VantageCloud
Edition
Lake
Product
Teradata Vantage
Published
January 2023
Language
English (United States)
Last Update
2024-02-17
dita:mapPath
phg1621910019905.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
phg1621910019905

Use the unpivot() function to rotate data from columns into rows to create easy-to-read DataFrames.

Required Arguments:
  • columns: Specifies the dictionary of columns with distinct column values used for unpivoting the data.

    This argument is required when keyword arguments are not specified. Otherwise, it is optional.

    • Key is a column expression or tuple of column expressions.
    • Value is a literal to be taken by column when transposed into transpose_column. If not specified, value is generated by the function based on column names.
    • Number of columns specified in each key should be equal.
    For example:
    columns={(df.Q101Sales, df.Q101Cogs): "2001_Q1",
             (df.Q201Sales, df.Q201Cogs): None,
             (df.Q301Sales, df.Q301Cogs): "2001_Q3"},
    transpose_column="yr_qtr",
    This example does the following:
    • Transposes columns 'Q101Sales' and 'Q101Cogs' into a row where 'yr_qtr' column value is '2001_Q1';
    • Transposes columns 'Q301Sales' and 'Q301Cogs' into a row with '2001_Q3' value in 'yr_qtr' column;
    • For columns 'Q201Sales' and 'Q201Cogs', value of 'yr_qtr' column is function generated.
  • transpose_column: Specifies the name of the column in the output DataFrame, which holds the data for columns specified in the keys of columns argument.
  • measure_columns: Specifies the names of output columns to unpivot the data in the columns specified in the columns argument.
    • The number of columns specified in measure_columns should be equal to the number of columns specified in each key of columns.
    • One exception for this requirement is when all the columns is to be unpivoted into a single column. In such case, measure_columns should be specified as a string or list of string with one value.
Optional Arguments:
  • exclude_nulls: Specifies whether to exclude NULL (None) values while unpivoting the data. When set to True, excludes NULL (None), otherwise includes NULL.

    Default value is True.

  • returns: Specifies the custom column names to be returned in the output DataFrame.
    • The number of column names should be equal to the value in the measure_column argument plus one.
    • All columns including columns in DataFrame which do not participate in unpivot must be specified if all_columns is set to True.
  • all_columns: Specifies whether returns argument should include the names of only unpivot columns or all columns.
    • When set to True, all columns including columns in DataFrame which do not participate in unpivot must be specified.
    • When set to False, which is the default value, only output columns excluding columns in DataFrame which do not participate in unpivot must be specified.
  • kwargs: Specifies the keyword argument to accept column name and column values as named arguments.
    For example:
    col1=df.year, col1_value=2001, col2=df.qtr, col2_value='Q1'
    Or
    col1=(df.Q101Sales, df.Q101Cogs), col1_value="2001_Q1",
    col2=(df.Q201Sales, df.Q201Cogs), col2_value=None,
    col3=(df.Q301Sales, df.Q301Cogs), col3_value="2001_Q3"
    • Either use columns argument or keyword arguments.
    • Format for column name arguments should be col1, col2,.. colN with values of type ColumnExpression.
    • Format for column value argument should be col1_value, col2_value,.. colN_value.

Example Setup

# Create a teradataml DataFrame.            
>>> load_example_data("teradataml", "star1")
>>> df = DataFrame("star1")
>>> df = df.pivot(columns={df.qtr: ["Q1", "Q2", "Q3"], df.yr: ["2001"]},
                  aggfuncs=[df.sales.sum(), df.cogs.sum()],
                  returns=["Q101Sales", "Q201Sales", "Q301Sales",
                           "Q101Cogs", "Q201Cogs", "Q301Cogs"])
>>> print(df)
  country state  Q101Sales  Q201Sales  Q301Sales  Q101Cogs  Q201Cogs  Q301Cogs
0  CANADA    BC        NaN        NaN        NaN       NaN      10.0       0.0
1  CANADA    ON        NaN        NaN       10.0       0.0       NaN       NaN
2     USA    NY       45.0       25.0        NaN       NaN       NaN       NaN
3     USA    CA        NaN        NaN        NaN       NaN       NaN       NaN

Example 1: Unpivot quarterly sales data to 'sales' column and quarterly cogs data to 'cogs' columns

This example unpivots quarterly sales data to 'sales' column and quarterly cogs data to 'cogs' columns using columns argument.

>>> unpivot_df = df.unpivot(columns={(df.Q101Sales, df.Q101Cogs): "2001_Q1",
                                     (df.Q201Sales, df.Q201Cogs): None,
                                     (df.Q301Sales, df.Q301Cogs): "2001_Q3"},
                            transpose_column="yr_qtr",
                            measure_columns=["sales", "cogs"])
>>> print(unpivot_df)
  country state              yr_qtr  sales  cogs
0  CANADA    BC  Q201Sales_Q201Cogs    NaN  10.0
1  CANADA    ON             2001_Q1    NaN   0.0
2  CANADA    ON             2001_Q3   10.0   NaN
3  CANADA    BC             2001_Q3    NaN   0.0
4     USA    NY  Q201Sales_Q201Cogs   25.0   NaN
5     USA    NY             2001_Q1   45.0   NaN

Example 2: Unpivot 'sales' and 'cogs' into a single column 'sales_cogs'

>>> unpivot_df = df.unpivot(columns={(df.Q101Sales, df.Q101Cogs,
                                     df.Q201Sales, df.Q201Cogs,
                                     df.Q301Sales, df.Q301Cogs): None},
                            transpose_column="yr_qtr",
                            measure_columns="sales_cogs")
>>> print(unpivot_df)
  country state     yr_qtr  sales_cogs
0  CANADA    BC   Q201Cogs        10.0
1  CANADA    ON   Q101Cogs         0.0
2  CANADA    ON  Q301Sales        10.0
3  CANADA    BC   Q301Cogs         0.0
4     USA    NY  Q201Sales        25.0
5     USA    NY  Q101Sales        45.0

Example 3: Unpivot quarterly sales data to 'sales' column and quarterly cogs data to 'cogs' columns

This example unpivots quarterly sales data to 'sales' column and quarterly cogs data to 'cogs' columns using keyword arguments.

>>> unpivot_df = df.unpivot(transpose_column="yr_qtr",
                            measure_columns=["sales", "cogs"],
                            col1 = (df.Q101Sales, df.Q101Cogs),
                            col1_value = "Q101",
                            col2 = (df.Q201Sales, df.Q201Cogs),
                            col2_value = None,
                            col3 = (df.Q301Sales, df.Q301Cogs),
                            col3_value = "Q103")
>>> print(unpivot_df)
  country state              yr_qtr  sales  cogs
0  CANADA    BC  Q201Sales_Q201Cogs    NaN  10.0
1  CANADA    ON                Q101    NaN   0.0
2  CANADA    ON                Q103   10.0   NaN
3  CANADA    BC                Q103    NaN   0.0
4     USA    NY  Q201Sales_Q201Cogs   25.0   NaN
5     USA    NY                Q101   45.0   NaN

Example 4: Unpivot quarterly sales and quarterly cogs data and rename column of the output using one argument

This example unpivots quarterly sales data to 'sales' column and quarterly cogs data to 'cogs' columns, and rename column of the output using the returns argument.

>>> unpivot_df = df.unpivot(columns={(df.Q101Sales, df.Q101Cogs): "Q101",
                                     (df.Q201Sales, df.Q201Cogs): None,
                                     (df.Q301Sales, df.Q301Cogs): "Q301"},
                            transpose_column="yr_qtr",
                            measure_columns=["sales", "cogs"],
                            returns=["year_quarter", "sales_data", "cogs_data"])
>>> print(unpivot_df)
  country state        year_quarter  sales_data  cogs_data
0  CANADA    BC  Q201Sales_Q201Cogs         NaN       10.0
1  CANADA    ON                Q101         NaN        0.0
2  CANADA    ON                Q301        10.0        NaN
3  CANADA    BC                Q301         NaN        0.0
4     USA    NY  Q201Sales_Q201Cogs        25.0        NaN
5     USA    NY                Q101        45.0        NaN

Example 5: Unpivot quarterly sales and quarterly cogs data and rename each column of the output using two arguments

This example unpivots quarterly sales data to 'sales' column and quarterly cogs data to 'cogs' columns, and rename each column of the output using returns and all_columns arguments.

>>> unpivot_df = df.unpivot(columns={(df.Q101Sales, df.Q101Cogs): "Q101",
                                     (df.Q201Sales, df.Q201Cogs): None,
                                     (df.Q301Sales, df.Q301Cogs): "Q301"},
                            transpose_column="yr_qtr",
                            measure_columns=["sales", "cogs"],
                            returns=["con", "st", "year_quarter",
                                     "sales_data", "cogs_data"],
                            all_columns=True)
>>> print(unpivot_df)
      con  st        year_quarter  sales_data  cogs_data
0  CANADA  BC  Q201Sales_Q201Cogs         NaN       10.0
1  CANADA  ON                Q101         NaN        0.0
2  CANADA  ON                Q301        10.0        NaN
3  CANADA  BC                Q301         NaN        0.0
4     USA  NY  Q201Sales_Q201Cogs        25.0        NaN
5     USA  NY                Q101        45.0        NaN

Example 6: Unpivot quarterly sales data to 'sales' column and quarterly cogs data to 'cogs' columns, including NULLs

>>> unpivot_df = df.unpivot(columns={(df.Q101Sales, df.Q101Cogs): "2001_Q1",
                                     (df.Q201Sales, df.Q201Cogs): None,
                                     (df.Q301Sales, df.Q301Cogs): "2001_Q3"},
                            transpose_column="yr_qtr",
                            measure_columns=["sales", "cogs"],
                            exclude_nulls=False)
>>> print(unpivot_df)
  country state              yr_qtr  sales  cogs
0     USA    CA             2001_Q3    NaN   NaN
1     USA    NY  Q201Sales_Q201Cogs   25.0   NaN
2     USA    NY             2001_Q3    NaN   NaN
3  CANADA    BC             2001_Q1    NaN   NaN
4  CANADA    BC             2001_Q3    NaN   0.0
5  CANADA    ON             2001_Q1    NaN   0.0
6  CANADA    BC  Q201Sales_Q201Cogs    NaN  10.0
7     USA    NY             2001_Q1   45.0   NaN
8     USA    CA  Q201Sales_Q201Cogs    NaN   NaN
9     USA    CA             2001_Q1    NaN   NaN