The LinearRegrWithSGDTrain class defines a wrapper function that uses the Aster Spark API and implements the training phase of the Spark MLlib LinearRegressionWithSGD algorithm. The function generates a model that is typically used by the LinearRegressionWithSGDRun function.
Run Method Signature
run(input: RDD[DataRow], sparkFunctParams: String): RDD[DataRow]
Parameters
sparkFunctParams
String containing the function-specific parameters. The string has this syntax:
'--option_value_pair [,...]'
option_value_pair is one of the following:
- initialWeights initial_weights
  Array of initial weights, one for each feature in the data.
- miniBatchFraction mini_batch_fraction
  Fraction of the data to use in each iteration.
- modelLocation model_location
  Required. HDFS path where the function saves the model.
- numIterations iterations
  Number of gradient descent iterations to run.
- stepSize step_size
  Step size for each iteration of gradient descent.
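To make the roles of initialWeights, miniBatchFraction, numIterations, and stepSize concrete, here is a minimal plain-Scala sketch of mini-batch stochastic gradient descent for least-squares regression. It is an illustration only, not the MLlib or Aster implementation: the names MiniBatchSgdSketch, Example, train, meanSquaredError, and the seed parameter are hypothetical, and the step-size decay (stepSize divided by the square root of the iteration number) is an assumption modeled on the schedule described in the MLlib optimization guide.

```scala
import scala.util.Random

object MiniBatchSgdSketch {
  // One training example: a label and its feature values (hypothetical type).
  final case class Example(label: Double, features: Array[Double])

  private def dot(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(i => a(i) * b(i)).sum

  // Mean squared error of the linear model `weights` on `data`.
  def meanSquaredError(data: Seq[Example], weights: Array[Double]): Double =
    data.map { e =>
      val err = dot(weights, e.features) - e.label
      err * err
    }.sum / data.size

  def train(
      data: Seq[Example],
      initialWeights: Array[Double], // one starting weight per feature
      numIterations: Int,            // number of gradient descent iterations
      stepSize: Double,              // base step size
      miniBatchFraction: Double,     // share of the data sampled per iteration
      seed: Long = 42L): Array[Double] = {
    val rng = new Random(seed)
    var weights = initialWeights.clone()
    for (iter <- 1 to numIterations) {
      // Keep each example with probability miniBatchFraction.
      val batch = data.filter(_ => rng.nextDouble() < miniBatchFraction)
      if (batch.nonEmpty) {
        // Average least-squares gradient over the batch: (w.x - y) * x.
        val gradient = new Array[Double](weights.length)
        for (e <- batch) {
          val err = dot(weights, e.features) - e.label
          for (j <- gradient.indices) gradient(j) += err * e.features(j) / batch.size
        }
        // Assumed schedule: the step shrinks with the square root of the iteration.
        val step = stepSize / math.sqrt(iter.toDouble)
        weights = weights.indices.map(j => weights(j) - step * gradient(j)).toArray
      }
    }
    weights
  }
}
```

With miniBatchFraction set to 1.0 every iteration uses the full data set, so the result does not depend on the seed. Smaller fractions trade gradient accuracy for less work per iteration.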
Returns
An RDD containing a single value: the mean squared error.
Side Effects
The function saves the model in model_location.
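The training, error calculation, and model save described above can be sketched directly against the Spark 1.x MLlib API. This is a rough outline under stated assumptions, not the wrapper's actual source: LinearRegrTrainSketch and trainAndSave are hypothetical names, the input is assumed to be already converted from DataRow to LabeledPoint, and the error is assumed to be measured on the training data. The LinearRegressionWithSGD.train call used here was deprecated in Spark 2.0.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.mllib.regression.{LabeledPoint, LinearRegressionWithSGD}
import org.apache.spark.rdd.RDD

object LinearRegrTrainSketch {
  // Hypothetical helper, not part of the Aster Spark API.
  def trainAndSave(
      sc: SparkContext,
      training: RDD[LabeledPoint],
      initialWeights: Vector,
      numIterations: Int,
      stepSize: Double,
      miniBatchFraction: Double,
      modelLocation: String): Double = {
    // Train with the same tuning options that the wrapper accepts.
    val model = LinearRegressionWithSGD.train(
      training, numIterations, stepSize, miniBatchFraction, initialWeights)

    // Mean squared error on the training data (an assumption about what is reported).
    val mse = training.map { p =>
      val err = model.predict(p.features) - p.label
      err * err
    }.mean()

    // Persist the model so that a later scoring step can load it.
    model.save(sc, modelLocation)
    mse
  }
}
```

A scoring step could then reload the model with LinearRegressionModel.load(sc, modelLocation), which is available in the same Spark versions as save.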
Version
Spark 1.3 and later.