Example setup
>>> # This example uses the 'admissions_train' dataset.
>>> # Load the example data.
>>> load_example_data("dataframe", "admissions_train")
>>> df = DataFrame('admissions_train')
>>> print(df)
   masters   gpa     stats programming  admitted
id
5       no  3.44    Novice      Novice         0
34     yes  3.85  Advanced    Beginner         0
13      no  4.00  Advanced      Novice         1
40     yes  3.95    Novice    Beginner         0
22     yes  3.46    Novice    Beginner         0
19     yes  1.98  Advanced    Advanced         0
36      no  3.00  Advanced      Novice         0
15     yes  4.00  Advanced    Advanced         1
7      yes  2.33    Novice      Novice         1
17      no  3.83  Advanced    Advanced         1
Example 1: Create a user defined function to increase the 'gpa' by the percentage provided
Both the input to and the output from the function are pandas Series objects.
>>> def increase_gpa(row, p=20):
...     row['gpa'] = row['gpa'] + row['gpa'] * p/100
...     return row
>>> # Apply the user defined function to the DataFrame.
>>> # Note that since the user defined function returns the same columns
>>> # with the same types as the input, the 'returns' argument can be omitted.
>>> increase_gpa_20 = df.map_row(increase_gpa)
>>> # Print the result.
>>> print(increase_gpa_20)
   masters    gpa     stats programming  admitted
id
13      no  4.800  Advanced      Novice         1
36      no  3.600  Advanced      Novice         0
15     yes  4.800  Advanced    Advanced         1
40     yes  4.740    Novice    Beginner         0
22     yes  4.152    Novice    Beginner         0
38     yes  3.180  Advanced    Beginner         1
26     yes  4.284  Advanced    Advanced         1
5       no  4.128    Novice      Novice         0
7      yes  2.796    Novice      Novice         1
19     yes  2.376  Advanced    Advanced         0
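The arithmetic in Example 1 can be checked locally, without a database connection. The following is a minimal sketch, assuming (as stated above) that each row reaches the function as a pandas Series keyed by column name. The sample row is copied from the setup output; calling the function directly on a hand-built Series is only an illustration, not a description of how map_row executes it.

```python
import pandas as pd

def increase_gpa(row, p=20):
    # Increase 'gpa' by p percent and return the modified row.
    row['gpa'] = row['gpa'] + row['gpa'] * p / 100
    return row

# Hand-built stand-in for one row (id 5 from the setup output).
sample_row = pd.Series({'id': 5, 'masters': 'no', 'gpa': 3.44,
                        'stats': 'Novice', 'programming': 'Novice',
                        'admitted': 0})

# Work on a copy so the original sample row stays unchanged.
result = increase_gpa(sample_row.copy())

# Only 'gpa' changes; the labels (column names) are preserved.
print(round(result['gpa'], 3))
print(list(result.index) == list(sample_row.index))
```

Because the function returns the same labels it received, the output has the same columns as the input, which matches the situation in which Example 1 omits the 'returns' argument.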
Example 2: Use the same user defined function with a lambda notation to pass the percentage 'p = 40'
>>> increase_gpa_40 = df.map_row(lambda row: increase_gpa(row, p = 40))
>>> print(increase_gpa_40)
   masters    gpa     stats programming  admitted
id
5       no  4.816    Novice      Novice         0
34     yes  5.390  Advanced    Beginner         0
13      no  5.600  Advanced      Novice         1
40     yes  5.530    Novice    Beginner         0
22     yes  4.844    Novice    Beginner         0
19     yes  2.772  Advanced    Advanced         0
36      no  4.200  Advanced      Novice         0
15     yes  5.600  Advanced    Advanced         1
7      yes  3.262    Novice      Novice         1
17      no  5.362  Advanced    Advanced         1
Example 3: Use the same user defined function with functools.partial to pass the percentage 'p = 50'
>>> from functools import partial
>>> increase_gpa_50 = df.map_row(partial(increase_gpa, p = 50))
>>> print(increase_gpa_50)
   masters    gpa     stats programming  admitted
id
5       no  5.160    Novice      Novice         0
34     yes  5.775  Advanced    Beginner         0
13      no  6.000  Advanced      Novice         1
40     yes  5.925    Novice    Beginner         0
22     yes  5.190    Novice    Beginner         0
19     yes  2.970  Advanced    Advanced         0
36      no  4.500  Advanced      Novice         0
15     yes  6.000  Advanced    Advanced         1
7      yes  3.495    Novice      Novice         1
17      no  5.745  Advanced    Advanced         1
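Examples 2 and 3 bind the extra argument 'p' in two different ways that produce equivalent single-argument callables. The sketch below compares them in plain Python; the dict used as a row and the helper make_row are hypothetical stand-ins for illustration and are not part of the library.

```python
from functools import partial

def increase_gpa(row, p=20):
    # Increase 'gpa' by p percent and return the modified row.
    row['gpa'] = row['gpa'] + row['gpa'] * p / 100
    return row

def make_row():
    # Hypothetical stand-in for a row (id 13 from the setup output).
    return {'id': 13, 'masters': 'no', 'gpa': 4.00}

# Two ways to fix p=50 while leaving 'row' as the only open argument.
via_lambda = lambda row: increase_gpa(row, p=50)
via_partial = partial(increase_gpa, p=50)

lambda_result = via_lambda(make_row())
partial_result = via_partial(make_row())

print(lambda_result['gpa'], partial_result['gpa'])  # 6.0 6.0
print(via_partial.keywords)                         # {'p': 50}
```

One practical difference: a partial object exposes the wrapped function and the bound arguments through its 'func' and 'keywords' attributes, which can help when inspecting what was passed, whereas a lambda hides them in its body.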
Example 4: Use a lambda function to increase the 'gpa' by 100 percent, and return a numpy ndarray
>>> from numpy import asarray
>>> increase_gpa_lambda = lambda row, p=20: asarray([row['id'],
...                                                  row['masters'],
...                                                  row['gpa'] + row['gpa'] * p/100,
...                                                  row['stats'],
...                                                  row['programming'],
...                                                  row['admitted']])
>>> increase_gpa_100 = df.map_row(lambda row: increase_gpa_lambda(row, p=100))
>>> print(increase_gpa_100)
   masters   gpa     stats programming  admitted
id
5       no  6.88    Novice      Novice         0
34     yes  7.70  Advanced    Beginner         0
13      no  8.00  Advanced      Novice         1
40     yes  7.90    Novice    Beginner         0
22     yes  6.92    Novice    Beginner         0
19     yes  3.96  Advanced    Advanced         0
36      no  6.00  Advanced      Novice         0
15     yes  8.00  Advanced    Advanced         1
7      yes  4.66    Novice      Novice         1
17      no  7.66  Advanced    Advanced         1
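One detail of Example 4 worth knowing: when an ndarray is built from values of mixed types, NumPy coerces all of them to a single dtype. The sketch below shows this with a plain dict as a hypothetical stand-in for the row. How map_row converts the returned values back to column types is not shown in this section and is not assumed here.

```python
from numpy import asarray

# Hypothetical stand-in for one row (id 5 from the setup output).
row = {'id': 5, 'masters': 'no', 'gpa': 3.44,
       'stats': 'Novice', 'programming': 'Novice', 'admitted': 0}

increase_gpa_lambda = lambda row, p=20: asarray([row['id'],
                                                 row['masters'],
                                                 row['gpa'] + row['gpa'] * p/100,
                                                 row['stats'],
                                                 row['programming'],
                                                 row['admitted']])

out = increase_gpa_lambda(row, p=100)

print(out.dtype.kind)  # 'U': integers and floats were converted to strings
print(out.shape)       # (6,)
print(out)
```

The array carries no column labels, so the values are identified only by position. The lambda in Example 4 lists them in the same order as the columns of the printed result, with 'id' first.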