The mode() function returns the column-wise mode of the values in each group. If two or more values in a column tie for the mode, one row is returned per tied value. It is a single-threaded function.
- This function is valid only on columns of numeric types.
- Nulls are not included in the result computation.
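The tie and null handling described above can be illustrated with a minimal plain-Python sketch. It does not use teradataml and is not the library's implementation; the helper name `modes` and the sample readings are hypothetical and exist only to show the semantics.

```python
from collections import Counter

def modes(values):
    """Return every most-frequent non-null value, sorted ascending."""
    # Nulls (None) are excluded before counting.
    counts = Counter(v for v in values if v is not None)
    if not counts:
        return []
    top = max(counts.values())
    # All values that share the highest count are kept, one per tie.
    return sorted(v for v, c in counts.items() if c == top)

# Hypothetical temperature readings for a single group.
readings = [10, 99, None, 10, 99, 55]
print(modes(readings))  # prints [10, 99]
```

Here 10 and 99 each occur twice, so both are reported. This corresponds to the multiple rows per group that appear in the outputs of the examples on this page.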
Examples Prerequisite
See Example Setup to set up the environment for the following examples.
Example: Run mode() on DataFrame created on non-sequenced PTI table
>>> ocean_buoys_grpby1 = ocean_buoys.groupby_time(timebucket_duration="10m", value_expression="buoyid", fill="NULLS")
>>> ocean_buoys_grpby1.mode().sort(["TIMECODE_RANGE", "buoyid"])
                                      TIMECODE_RANGE  GROUP BY TIME(MINUTES(10))  buoyid  mode_temperature  mode_salinity
0  ('2014-01-06 08:00:00.000000+00:00', '2014-01-...                      106033       0                99             55
1  ('2014-01-06 08:00:00.000000+00:00', '2014-01-...                      106033       0                10             55
2  ('2014-01-06 08:10:00.000000+00:00', '2014-01-...                      106034       0               100             55
3  ('2014-01-06 08:10:00.000000+00:00', '2014-01-...                      106034       0                10             55
4  ('2014-01-06 09:00:00.000000+00:00', '2014-01-...                      106039       1                79             55
5  ('2014-01-06 09:00:00.000000+00:00', '2014-01-...                      106039       1                70             55
6  ('2014-01-06 09:00:00.000000+00:00', '2014-01-...                      106039       1                72             55
7  ('2014-01-06 09:00:00.000000+00:00', '2014-01-...                      106039       1                78             55
8  ('2014-01-06 09:00:00.000000+00:00', '2014-01-...                      106039       1                71             55
9  ('2014-01-06 09:00:00.000000+00:00', '2014-01-...                      106039       1                77             55
Example: Run mode() on DataFrame created on sequenced PTI table
The table has two columns that are incompatible with the mode() operation, 'dates' and 'TD_TIMECODE'. When mode() is executed, these incompatible columns are ignored.
>>> ocean_buoys_seq_grpby1 = ocean_buoys_seq.groupby_time(timebucket_duration="MINUTES(10)", value_expression="buoyid", fill="NULLS")
>>> ocean_buoys_seq_grpby1.mode().sort(["TIMECODE_RANGE", "buoyid"])
                                      TIMECODE_RANGE  GROUP BY TIME(MINUTES(10))  buoyid  mode_TD_SEQNO  mode_salinity  mode_temperature
0  ('2014-01-06 08:00:00.000000+00:00', '2014-01-...                      106033       0             17             55                10
1  ('2014-01-06 08:10:00.000000+00:00', '2014-01-...                      106034       0             19             55                10
2  ('2014-01-06 09:00:00.000000+00:00', '2014-01-...                      106039       1             11             55                70
3  ('2014-01-06 10:00:00.000000+00:00', '2014-01-...                      106045      44              7             55                43
4  ('2014-01-06 10:00:00.000000+00:00', '2014-01-...                      106045      44             21             55                43
5  ('2014-01-06 10:00:00.000000+00:00', '2014-01-...                      106045      44             20             55                43
6  ('2014-01-06 10:00:00.000000+00:00', '2014-01-...                      106045      44              4             55                43
7  ('2014-01-06 10:00:00.000000+00:00', '2014-01-...                      106045      44              9             55                43
8  ('2014-01-06 10:00:00.000000+00:00', '2014-01-...                      106045      44              8             55                43
9  ('2014-01-06 10:00:00.000000+00:00', '2014-01-...                      106045      44              5             55                43
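The idea that incompatible columns are skipped rather than raising an error can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the rows, the helpers `is_numeric` and `column_modes`, and the value-based numeric check are all hypothetical, whereas teradataml decides compatibility from the column types of the table.

```python
from collections import Counter

# Hypothetical rows; 'dates' holds strings, so it counts as non-numeric here.
rows = [
    {"dates": "2014-01-06", "salinity": 55, "temperature": 10},
    {"dates": "2014-01-06", "salinity": 55, "temperature": 10},
    {"dates": "2014-01-07", "salinity": 55, "temperature": 43},
]

def is_numeric(value):
    # bool is a subclass of int, so it is excluded explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def column_modes(rows):
    """Compute the mode(s) of each numeric column, skipping all others."""
    result = {}
    for col in rows[0]:
        values = [r[col] for r in rows if r[col] is not None]
        if not values or not all(is_numeric(v) for v in values):
            continue  # incompatible column: ignore it instead of failing
        counts = Counter(values)
        top = max(counts.values())
        result["mode_" + col] = sorted(v for v, c in counts.items() if c == top)
    return result

print(column_modes(rows))
```

The result contains entries only for salinity and temperature; the 'dates' column is silently dropped, mirroring how the output above has no mode_dates or mode_TD_TIMECODE column.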
Example: Run mode() on DataFrame created on non-PTI table
For a non-PTI table, the timecode_column used for grouping must be passed explicitly to groupby_time().
>>> ocean_buoys_nonpti_grpby1 = ocean_buoys_nonpti.groupby_time(timebucket_duration="10minutes", value_expression="buoyid", timecode_column="timecode", fill="NULLS")
>>> ocean_buoys_nonpti_grpby1.mode().sort(["TIMECODE_RANGE", "buoyid"])
                                      TIMECODE_RANGE  GROUP BY TIME(MINUTES(10))  buoyid  mode_temperature  mode_salinity
0  ('2014-01-06 08:00:00.000000+00:00', '2014-01-...                     2314993       0                99             55
1  ('2014-01-06 08:00:00.000000+00:00', '2014-01-...                     2314993       0                10             55
2  ('2014-01-06 08:10:00.000000+00:00', '2014-01-...                     2314994       0                10             55
3  ('2014-01-06 08:10:00.000000+00:00', '2014-01-...                     2314994       0               100             55
4  ('2014-01-06 09:00:00.000000+00:00', '2014-01-...                     2314999       1                70             55
5  ('2014-01-06 09:00:00.000000+00:00', '2014-01-...                     2314999       1                71             55
6  ('2014-01-06 09:00:00.000000+00:00', '2014-01-...                     2314999       1                72             55
7  ('2014-01-06 09:00:00.000000+00:00', '2014-01-...                     2314999       1                77             55
8  ('2014-01-06 09:00:00.000000+00:00', '2014-01-...                     2314999       1                78             55
9  ('2014-01-06 09:00:00.000000+00:00', '2014-01-...                     2314999       1                79             55