The aggregate function ts.var() returns the population variance of values of the column grouped by time.
The variance of a population is a measure of dispersion from the mean of that population.
- When there are fewer than two non-NULL data points in the population used for the computation, ts.varp() returns NULL/NA.
- Nulls are not included in the result computation.
- Division by zero results in NULL/NA value rather than an error.
- ts.varp() can only be used if data represents entire population. Otherwise, Teradata recommends to use ts.var() to calculate sample variance.
Arguments:
- value.expression: Specify the column for which population variance is to be computed.
Use ts.varp(distinct(column_name)) to exclude duplicate rows while calculating population variance.
Example 1: Calculate the population variance of values in the 'temperature' column of sequenced PTI table
- Calculate the population variance.
> df_seq_varp <- df_seq_grp %>% summarise(varp_temp = ts.varp(temperature))
- Print the results.
> df_seq_varp %>% arrange(TIMECODE_RANGE, buoyid, varp_temp) # Source: lazy query [?? x 4] # Database: [Teradata 16.20.50.01] [Teradata Native Driver 17.0.0.2] # [TDAPUSER@<hostname>/TDAPUSERDB] # Ordered by: TIMECODE_RANGE, buoyid, varp_temp TIMECODE_RANGE `GROUP BY TIME(MINUTES(~ buoyid varp_temp <chr> <int64> <int> <dbl> 1 2014-01-06 08:00:00.000000+00:00,2014-01-06 08:3~ 35345 0 2003. 2 2014-01-06 09:00:00.000000+00:00,2014-01-06 09:3~ 35347 1 12.9 3 2014-01-06 10:00:00.000000+00:00,2014-01-06 10:3~ 35349 44 30.7 4 2014-01-06 10:30:00.000000+00:00,2014-01-06 11:0~ 35350 22 0 5 2014-01-06 10:30:00.000000+00:00,2014-01-06 11:0~ 35350 44 0 6 2014-01-06 21:00:00.000000+00:00,2014-01-06 21:3~ 35371 2 0.667
Example 2: Calculate the population variance of values in the 'temperature' column of non-PTI table
- Calculate the population variance.
> df_nonpti_varp <- df_nonpti %>% group_by_time(timebucket.duration = "10m", timecode.column = "TIMECODE") %>% summarise(varp_temp = ts.varp(temperature))
- Print the results.
> df_nonpti_varp %>% arrange(TIMECODE_RANGE, varp_temp) # Source: lazy query [?? x 3] # Database: [Teradata 16.20.50.01] [Teradata Native Driver 17.0.0.2] # [TDAPUSER@<hostname>/TDAPUSERDB] # Ordered by: TIMECODE_RANGE, varp_temp TIMECODE_RANGE `GROUP BY TIME(MINUTES(1~ varp_temp <chr> <int64> <dbl> 1 2014-01-06 08:00:00.000000+00:00,2014-01-06 08:10:00.0~ 2314993 1980. 2 2014-01-06 08:10:00.000000+00:00,2014-01-06 08:20:00.0~ 2314994 2025 3 2014-01-06 09:00:00.000000+00:00,2014-01-06 09:10:00.0~ 2314999 12.9 4 2014-01-06 10:00:00.000000+00:00,2014-01-06 10:10:00.0~ 2315005 29.8 5 2014-01-06 10:10:00.000000+00:00,2014-01-06 10:20:00.0~ 2315006 0 6 2014-01-06 10:30:00.000000+00:00,2014-01-06 10:40:00.0~ 2315008 0 7 2014-01-06 10:50:00.000000+00:00,2014-01-06 11:00:00.0~ 2315010 0 8 2014-01-06 21:00:00.000000+00:00,2014-01-06 21:10:00.0~ 2315071 0.667