Teradata Package for Python Function Reference | 20.00 - Retain - Teradata Package for Python - Look here for syntax, methods and examples for the functions included in the Teradata Package for Python.
- Release Number: 20.00.00.03
- Published: December 2024
- teradataml.analytics.Transformations.Retain.__init__ = __init__(self, columns, out_columns=None, datatype=None)
DESCRIPTION:
The Retain option allows you to copy one or more columns into the final
analytic data set. By default, the result column name is the same as
the input column name, but this can be changed. If a data type is
specified, the retained column is cast to that type.
The Retain transformation is supported for all valid data types.
Note:
Output of this function is passed to "retain" argument of "Transform"
function from Vantage Analytic Library.
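The copy, rename, and cast behavior described above can be mimicked locally
in plain Python. The sketch below only illustrates the semantics; it is not
how Vantage Analytic Library implements the transformation. The helper name
retain_rows and the use of a Python callable as a stand-in for a SQL cast are
assumptions made for this example.

```python
def retain_rows(rows, columns, out_columns=None, cast=None):
    """Copy selected columns from each row, optionally renaming and casting.

    Hypothetical local stand-in for the Retain semantics; it does not call
    teradataml or Vantage.
    """
    # By default the output name is the same as the input name.
    out_columns = out_columns or columns
    result = []
    for row in rows:
        new_row = {}
        for src, dst in zip(columns, out_columns):
            value = row[src]
            # Leave missing values (None) untouched and cast everything else.
            if cast is not None and value is not None:
                value = cast(value)
            new_row[dst] = value
        result.append(new_row)
    return result


rows = [
    {"accounts": "Alpha Co", "Mar": 215.0, "Apr": 250.0},
    {"accounts": "Red Inc", "Mar": 140.0, "Apr": None},
]
retained = retain_rows(rows, ["Mar", "Apr"], ["march", "april"], cast=int)
print(retained)
```

Calling retain_rows(rows, ["accounts"]) without the optional arguments copies
the column under its original name and leaves the values unchanged.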
PARAMETERS:
columns:
Required Argument.
Specifies the names of the columns to retain.
Types: str or list of str
out_columns:
Optional Argument.
Specifies the names of the output columns.
Note:
The number of elements in "columns" and "out_columns" must be the same.
Types: str or list of str
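The rule that "columns" and "out_columns" must have the same number of
elements can be checked before a Retain object is built. The helper below is
a hypothetical pre-check, not part of teradataml, and raising ValueError on a
mismatch is an assumption of this sketch; this reference does not state which
of the listed exceptions the library raises in that case.

```python
def pair_columns(columns, out_columns=None):
    """Return (input, output) name pairs, checking that the counts match.

    Hypothetical helper for illustration; not part of teradataml.
    """
    # Both arguments accept a single string or a list of strings.
    if isinstance(columns, str):
        columns = [columns]
    if out_columns is None:
        # No output names given: keep the input names.
        out_columns = list(columns)
    elif isinstance(out_columns, str):
        out_columns = [out_columns]
    if len(columns) != len(out_columns):
        raise ValueError(
            f"got {len(columns)} column(s) but {len(out_columns)} output name(s)"
        )
    return list(zip(columns, out_columns))


print(pair_columns("Jan", "january"))
print(pair_columns(["Mar", "Apr"], ["march", "april"]))
```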
datatype:
Optional Argument.
Specifies the name of the intended datatype of the output column.
Intended data types for the output column can be specified using either the
teradatasqlalchemy types or the permitted strings mentioned below:
------------------------------------------------------------------
| If intended SQL Data Type is | Permitted Value to be passed is |
|----------------------------------------------------------------|
| bigint                       | bigint                          |
| byteint                      | byteint                         |
| char(n)                      | char,n                          |
| date                         | date                            |
| decimal(m,n)                 | decimal,m,n                     |
| float                        | float                           |
| integer                      | integer                         |
| number(*)                    | number                          |
| number(n)                    | number,n                        |
| number(*,n)                  | number,*,n                      |
| number(n,n)                  | number,n,n                      |
| smallint                     | smallint                        |
| time(p)                      | time,p                          |
| timestamp(p)                 | timestamp,p                     |
| varchar(n)                   | varchar,n                       |
------------------------------------------------------------------
Notes:
1. This argument is ignored if the "columns" argument is not used.
2. char without a size is not supported.
3. number(*) does not include the * in its datatype format.
Examples:
1. If the intended datatype for the output column is "bigint", then
pass the string "bigint" to the argument as shown below:
datatype="bigint"
2. If the intended datatype for the output column is "decimal(5,3)", then
pass the string "decimal,5,3" to the argument as shown below:
datatype="decimal,5,3"
Types: str, BIGINT, BYTEINT, CHAR, DATE, DECIMAL, FLOAT, INTEGER, NUMBER, SMALLINT, TIME,
TIMESTAMP, VARCHAR.
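The mapping in the table above is mechanical: the parentheses are dropped and
the type name and its parameters are joined with commas. A minimal sketch of
that conversion follows. The helper name to_permitted_value is hypothetical
and not part of teradataml, and the helper does not check whether the type
name is one of those listed in the table.

```python
import re


def to_permitted_value(sql_type):
    """Convert a SQL type such as 'decimal(5,3)' to 'decimal,5,3'.

    Hypothetical helper based only on the table above; not part of teradataml.
    """
    match = re.fullmatch(r"\s*(\w+)\s*(?:\(([^)]*)\))?\s*", sql_type)
    if match is None:
        raise ValueError(f"unrecognized type: {sql_type!r}")
    name = match.group(1).lower()
    args = [a.strip() for a in (match.group(2) or "").split(",") if a.strip()]
    # number(*) is written without the asterisk; number(*,n) keeps it.
    if name == "number" and args == ["*"]:
        args = []
    # char without a size is not supported.
    if name == "char" and not args:
        raise ValueError("char requires a size, for example 'char(10)'")
    return ",".join([name] + args)


for sql_type in ["bigint", "decimal(5,3)", "number(*)", "number(*,2)", "varchar(20)"]:
    print(sql_type, "->", to_permitted_value(sql_type))
```

The returned string can then be passed as the "datatype" argument in the same
way as the literal strings in the examples above.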
RETURNS:
An instance of Retain class.
RAISES:
TeradataMlException, TypeError, ValueError
EXAMPLE:
# Note:
# To run any transformation, use the Transform() function from
# Vantage Analytic Library.
# To do so, import valib first and set "val_install_location".
>>> from teradataml import configure, DataFrame, load_example_data, valib, Retain
>>> configure.val_install_location = "SYSLIB"
>>>
# Load example data.
>>> load_example_data("dataframe", "sales")
>>>
# Create the required DataFrames.
>>> sales = DataFrame("sales")
>>> sales
              Feb    Jan    Mar    Apr    datetime
accounts
Alpha Co    210.0  200.0  215.0  250.0  04/01/2017
Blue Inc     90.0   50.0   95.0  101.0  04/01/2017
Yellow Inc   90.0    NaN    NaN    NaN  04/01/2017
Jones LLC   200.0  150.0  140.0  180.0  04/01/2017
Red Inc     200.0  150.0  140.0    NaN  04/01/2017
Orange Inc  210.0    NaN    NaN  250.0  04/01/2017
>>>
# Example: Retain some columns unchanged and others with a changed name or
# datatype.
# Retain columns "accounts" and "Feb" as is.
>>> rt_1 = Retain(columns=["accounts", "Feb"])
>>>
# Retain column "Jan" under the new name "january".
>>> rt_2 = Retain(columns="Jan", out_columns="january")
>>>
# Retain columns "Mar" and "Apr" as "march" and "april", with the
# datatype changed to 'bigint'.
>>> rt_3 = Retain(columns=["Mar", "Apr"], out_columns=["march", "april"],
... datatype="bigint")
>>>
# Execute Transform() function.
>>> obj = valib.Transform(data=sales, retain=[rt_1, rt_2, rt_3])
>>> obj.result
     accounts   accounts1    Feb  january  march  april
0    Alpha Co    Alpha Co  210.0    200.0  215.0  250.0
1    Blue Inc    Blue Inc   90.0     50.0   95.0  101.0
2  Yellow Inc  Yellow Inc   90.0      NaN    NaN    NaN
3   Jones LLC   Jones LLC  200.0    150.0  140.0  180.0
4     Red Inc     Red Inc  200.0    150.0  140.0    NaN
5  Orange Inc  Orange Inc  210.0      NaN    NaN  250.0
>>>