Use the unpivot() function to rotate data from columns into rows to create easy-to-read DataFrames.
- columns: Specifies the dictionary of columns with distinct column values used for unpivoting the data.
This argument is required when keyword arguments are not specified. Otherwise, it is optional.
- Key is a column expression or tuple of column expressions.
- Value is a literal to be taken by column when transposed into transpose_column. If not specified, value is generated by the function based on column names.
- Number of columns specified in each key must be equal.
For example:columns={(df.Q101Sales, df.Q101Cogs): "2001_Q1", (df.Q201Sales, df.Q201Cogs): None, (df.Q301Sales, df.Q301Cogs): "2001_Q3"}, transpose_column="yr_qtr",
This example does the following:- Transposes columns 'Q101Sales' and 'Q101Cogs' into a row where 'yr_qtr' column value is '2001_Q1';
- Transposes columns 'Q301Sales' and 'Q301Cogs' into a row with '2001_Q3' value in 'yr_qtr' column;
- For columns 'Q201Sales' and 'Q201Cogs', value of 'yr_qtr' column is function generated.
- transpose_column: Specifies the name of the column in the output DataFrame, which holds the data for columns specified in the keys of columns argument.
- measure_columns: Specifies the names of output columns to unpivot the data in the columns specified in the columns argument.
- The number of columns specified in measure_columns must be equal to the number of columns specified in each key of columns.
- One exception for this requirement is when all the columns is to be unpivoted into a single column. In such case, measure_columns must be specified as a string or list of string with one value.
- exclude_nulls: Specifies whether to exclude NULL (None) values while unpivoting the data. When set to True, excludes NULL (None), otherwise includes NULL.
Default value is True.
- returns: Specifies the custom column names to be returned in the output DataFrame.
- The number of column names must be equal to the value in the measure_column argument plus one.
- All columns including columns in DataFrame which do not participate in unpivot must be specified if all_columns is set to True.
- all_columns: Specifies whether returns argument must include the names of only unpivot columns or all columns.
- When set to True, all columns including columns in DataFrame which do not participate in unpivot must be specified.
- When set to False, which is the default value, only output columns excluding columns in DataFrame which do not participate in unpivot must be specified.
- kwargs: Specifies the keyword argument to accept column name and column values as named arguments.For example:
col1=df.year, col1_value=2001, col2=df.qtr, col2_value='Q1'
Orcol1=(df.Q101Sales, df.Q101Cogs), col1_value="2001_Q1", col2=(df.Q201Sales, df.Q201Cogs), col2_value=None, col3=(df.Q301Sales, df.Q301Cogs), col3_value="2001_Q3"
- Either use columns argument or keyword arguments.
- Format for column name arguments must be col1, col2,.. colN with values of type ColumnExpression.
- Format for column value argument must be col1_value, col2_value,.. colN_value.
Example Setup
# Create a teradataml DataFrame. >>> load_example_data("teradataml", "star1") >>> df = DataFrame("star1")
>>> df = df.pivot(columns={df.qtr: ["Q1", "Q2", "Q3"], df.yr: ["2001"]}, aggfuncs=[df.sales.sum(), df.cogs.sum()], returns=["Q101Sales", "Q201Sales", "Q301Sales", "Q101Cogs", "Q201Cogs", "Q301Cogs"])
>>> print(df) country state Q101Sales Q201Sales Q301Sales Q101Cogs Q201Cogs Q301Cogs 0 CANADA BC NaN NaN NaN NaN 10.0 0.0 1 CANADA ON NaN NaN 10.0 0.0 NaN NaN 2 USA NY 45.0 25.0 NaN NaN NaN NaN 3 USA CA NaN NaN NaN NaN NaN NaN
Example 1: Unpivot quarterly sales data to 'sales' column and quarterly cogs data to 'cogs' columns
This example unpivots quarterly sales data to 'sales' column and quarterly cogs data to 'cogs' columns using columns argument.
>>> unpivot_df = df.unpivot(columns={(df.Q101Sales, df.Q101Cogs): "2001_Q1", (df.Q201Sales, df.Q201Cogs): None, (df.Q301Sales, df.Q301Cogs): "2001_Q3"}, transpose_column="yr_qtr", measure_columns=["sales", "cogs"])
>>> print(unpivot_df) country state yr_qtr sales cogs 0 CANADA BC Q201Sales_Q201Cogs NaN 10.0 1 CANADA ON 2001_Q1 NaN 0.0 2 CANADA ON 2001_Q3 10.0 NaN 3 CANADA BC 2001_Q3 NaN 0.0 4 USA NY Q201Sales_Q201Cogs 25.0 NaN 5 USA NY 2001_Q1 45.0 NaN
Example 2: Unpivot 'sales' and 'cogs' into a single column 'sales_cogs'
>>> unpivot_df = df.unpivot(columns={(df.Q101Sales, df.Q101Cogs, df.Q201Sales, df.Q201Cogs, df.Q301Sales, df.Q301Cogs): None}, transpose_column="yr_qtr", measure_columns="sales_cogs")
>>> print(unpivot_df) country state yr_qtr sales_cogs 0 CANADA BC Q201Cogs 10.0 1 CANADA ON Q101Cogs 0.0 2 CANADA ON Q301Sales 10.0 3 CANADA BC Q301Cogs 0.0 4 USA NY Q201Sales 25.0 5 USA NY Q101Sales 45.0
Example 3: Unpivot quarterly sales data to 'sales' column and quarterly cogs data to 'cogs' columns
This example unpivots quarterly sales data to 'sales' column and quarterly cogs data to 'cogs' columns using keyword arguments.
>>> unpivot_df = df.unpivot(transpose_column="yr_qtr", measure_columns=["sales", "cogs"], col1 = (df.Q101Sales, df.Q101Cogs), col1_value = "Q101", col2 = (df.Q201Sales, df.Q201Cogs), col2_value = None, col3 = (df.Q301Sales, df.Q301Cogs), col3_value = "Q103")
>>> print(unpivot_df) country state yr_qtr sales cogs 0 CANADA BC Q201Sales_Q201Cogs NaN 10.0 1 CANADA ON Q101 NaN 0.0 2 CANADA ON Q103 10.0 NaN 3 CANADA BC Q103 NaN 0.0 4 USA NY Q201Sales_Q201Cogs 25.0 NaN 5 USA NY Q101 45.0 NaN
Example 4: Unpivot quarterly sales and quarterly cogs data and rename column of the output using one argument
This example unpivots quarterly sales data to 'sales' column and quarterly cogs data to 'cogs' columns, and rename column of the output using the returns argument.
>>> unpivot_df = df.unpivot(columns={(df.Q101Sales, df.Q101Cogs): "Q101", (df.Q201Sales, df.Q201Cogs): None, (df.Q301Sales, df.Q301Cogs): "Q301"}, transpose_column="yr_qtr", measure_columns=["sales", "cogs"], returns=["year_quarter", "sales_data", "cogs_data"])
>>> print(unpivot_df) country state year_quarter sales_data cogs_data 0 CANADA BC Q201Sales_Q201Cogs NaN 10.0 1 CANADA ON Q101 NaN 0.0 2 CANADA ON Q301 10.0 NaN 3 CANADA BC Q301 NaN 0.0 4 USA NY Q201Sales_Q201Cogs 25.0 NaN 5 USA NY Q101 45.0 NaN
Example 5: Unpivot quarterly sales and quarterly cogs data and rename each column of the output using two arguments
This example unpivots quarterly sales data to 'sales' column and quarterly cogs data to 'cogs' columns, and rename each column of the output using returns and all_columns arguments.
>>> unpivot_df = df.unpivot(columns={(df.Q101Sales, df.Q101Cogs): "Q101", (df.Q201Sales, df.Q201Cogs): None, (df.Q301Sales, df.Q301Cogs): "Q301"}, transpose_column="yr_qtr", measure_columns=["sales", "cogs"], returns=["con", "st", "year_quarter", "sales_data", "cogs_data"], all_columns=True)
>>> print(unpivot_df) con st year_quarter sales_data cogs_data 0 CANADA BC Q201Sales_Q201Cogs NaN 10.0 1 CANADA ON Q101 NaN 0.0 2 CANADA ON Q301 10.0 NaN 3 CANADA BC Q301 NaN 0.0 4 USA NY Q201Sales_Q201Cogs 25.0 NaN 5 USA NY Q101 45.0 NaN
Example 6: Unpivot quarterly sales data to 'sales' column and quarterly cogs data to 'cogs' columns, including NULLs
>>> unpivot_df = df.unpivot(columns={(df.Q101Sales, df.Q101Cogs): "2001_Q1", (df.Q201Sales, df.Q201Cogs): None, (df.Q301Sales, df.Q301Cogs): "2001_Q3"}, transpose_column="yr_qtr", measure_columns=["sales", "cogs"], exclude_nulls=False)
>>> print(unpivot_df) country state yr_qtr sales cogs 0 USA CA 2001_Q3 NaN NaN 1 USA NY Q201Sales_Q201Cogs 25.0 NaN 2 USA NY 2001_Q3 NaN NaN 3 CANADA BC 2001_Q1 NaN NaN 4 CANADA BC 2001_Q3 NaN 0.0 5 CANADA ON 2001_Q1 NaN 0.0 6 CANADA BC Q201Sales_Q201Cogs NaN 10.0 7 USA NY 2001_Q1 45.0 NaN 8 USA CA Q201Sales_Q201Cogs NaN NaN 9 USA CA 2001_Q1 NaN NaN