August 2024 | 20.00.00.01

- teradatamlspk DataFrame (see the examples after this list)
  - write() - Supports writing the DataFrame to the local file system, to Vantage, or to cloud storage.
  - writeTo() - Supports writing the DataFrame to a Vantage table.
  - rdd - Returns the same DataFrame.
- teradatamlspk DataFrameColumn (ColumnExpression) (see the examples after this list)
  - desc_nulls_first - Returns a sort expression based on the descending order of the given column, with null values appearing before non-null values.
  - desc_nulls_last - Returns a sort expression based on the descending order of the given column, with null values appearing after non-null values.
  - asc_nulls_first - Returns a sort expression based on the ascending order of the given column, with null values appearing before non-null values.
  - asc_nulls_last - Returns a sort expression based on the ascending order of the given column, with null values appearing after non-null values.
- Updates (see the examples after this list)
  - DataFrame.fillna() and DataFrame.na.fill() now accept input arguments whose data types are the same or compatible.
  - DataFrame.agg() and GroupedData.agg() now support Column input and '*' for 'count'.
  - DataFrameColumn.cast() and DataFrameColumn.alias() now accept case-insensitive string literals.
  - Optimized performance for DataFrame.show().
  - Classification Summary and TrainingSummary objects, and MulticlassClassificationEvaluator, now support the weightedTruePositiveRate and weightedFalsePositiveRate metrics.
  - Arithmetic operations can be performed on window aggregates.
  - Added the new function time_difference, which returns the difference between two timestamps in seconds.
- Bug fixes (see the examples after this list):
  - DataFrame.head() returns a list when n is 1.
  - DataFrame.union() and DataFrame.unionAll() now perform the union of rows based on column position.
  - DataFrame.groupBy() and DataFrame.groupby() now also accept columns as positional arguments, for example df.groupBy("col1", "col2").
  - The MLlib function attributes numClasses and intercept now return values.
  - An appropriate error is raised if an invalid file is passed to pyspark2teradataml.
  - The when function now accepts a Column, in addition to a literal, for the value argument.
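
A minimal sketch of the new write paths, assuming a teradatamlspk DataFrame `df` already exists (session setup omitted); the format, file path, and table names shown are placeholders, and the exact writer options supported may differ from PySpark:

```python
# Assumes `df` is a teradatamlspk DataFrame created from an existing session;
# the path and table names below are placeholders.

# write(): persist to the local file system using the PySpark-style writer.
df.write.format("csv").mode("overwrite").save("/tmp/sales_csv")

# write(): persist as a Vantage table.
df.write.saveAsTable("sales_copy")

# writeTo(): target a Vantage table with the DataFrameWriterV2-style API.
df.writeTo("sales_copy_v2").create()

# rdd is a compatibility property and simply returns the same DataFrame.
same_df = df.rdd
```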
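
A short sketch of the new null-ordering sort expressions, using a placeholder column named `amount` on the same DataFrame `df`:

```python
# Descending sort with NULLs placed before non-NULL values.
df.orderBy(df.amount.desc_nulls_first()).show()

# Descending sort with NULLs placed after non-NULL values.
df.orderBy(df.amount.desc_nulls_last()).show()

# Ascending equivalents.
df.orderBy(df.amount.asc_nulls_first()).show()
df.orderBy(df.amount.asc_nulls_last()).show()
```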
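
A hedged sketch of the updates listed above, assuming the module layout mirrors PySpark (`teradatamlspk.sql.functions`, `teradatamlspk.sql.window`) and using placeholder columns `amount`, `region`, `start_ts`, and `end_ts`:

```python
from teradatamlspk.sql import functions as F    # assumed PySpark-style module path
from teradatamlspk.sql.window import Window     # assumed PySpark-style module path

# fillna()/na.fill(): the replacement value's type must be the same as, or
# compatible with, the target column's type.
df_filled = df.fillna(0, subset=["amount"])

# agg() with Column input and '*' for count.
df.agg(F.count("*"), F.max(df.amount)).show()

# cast() accepts a case-insensitive type string; alias() likewise takes a string literal.
df.select(df.amount.cast("INTEGER").alias("amount_int")).show()

# Arithmetic on window aggregates.
w = Window.partitionBy("region")
df.select((F.sum("amount").over(w) / F.count("*").over(w)).alias("avg_amount")).show()

# time_difference(): difference between two timestamps in seconds
# (the argument order shown is an assumption).
df.select(F.time_difference(df.end_ts, df.start_ts).alias("elapsed_seconds")).show()
```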
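
A brief sketch of the corrected groupBy() and when() behavior, again assuming the PySpark-style functions module and placeholder columns:

```python
from teradatamlspk.sql import functions as F  # assumed PySpark-style module path

# groupBy() with columns passed as positional arguments.
df.groupBy("region", "product").count().show()

# when() now accepts a Column expression, not only a literal, for the value argument.
df.select(
    F.when(df.amount > 100, df.amount * 0.9).otherwise(df.amount).alias("discounted")
).show()
```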

March 2024 | 20.00.00.00

Initial release.

- A pyspark2teradataml utility function that automatically converts PySpark scripts to teradataml format (see the example after this list).
- Supports the following:
  - 85 DataFrame APIs with syntax similar to the PySpark DataFrame APIs.
  - 22 DataFrameColumn APIs with syntax similar to the PySpark DataFrameColumn APIs.
  - 200 functions with syntax similar to the PySpark functions.
  - 69 machine learning functions with syntax similar to the PySpark machine learning functions.
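
A sketch of running the conversion utility; the import path and signature shown here are assumptions, so check the package documentation for the exact usage:

```python
# Hypothetical invocation: converts the given PySpark script to teradatamlspk syntax.
from teradatamlspk import pyspark2teradataml   # import path assumed

pyspark2teradataml("my_pyspark_job.py")        # script path is a placeholder
```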