The dbplot package is an open source R package that computes the data for a plot inside the database, so only the aggregated results, not the raw data, are brought to the client. The dbplot functions can be called from R on tdplyr tables, which adds this capability to the tdplyr library.
The following examples use the "flights" dataset from the "nycflights13" package and the use cases from dbplot to show how to use the dbplot package with tdplyr.
Example Prerequisites
Install the "ggplot2" and "dbplot" packages and the "nycflights13" data package on your R client, then prepare the data for plotting.
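library(dplyr)  # provides %>%, filter(), copy_to() and tbl()
library(ggplot2)
install.packages("dbplot", quiet = TRUE)
library(dbplot)
install.packages("nycflights13", quiet = TRUE)
library(nycflights13)
# Keep only the flights with a departure time before 600 to get a small example table.
flights_tibble <- as.data.frame(nycflights13::flights) %>% filter(dep_time < 600)
# con is assumed to be an existing connection to the database.
copy_to(con, flights_tibble, "flights")
flights <- tbl(con, "flights")
flights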
    year month   day dep_time sched_dep_time dep_delay arr_time sched_arr_time
   <int> <int> <int>    <int>          <int>     <dbl>    <int>          <int>
 1  2013     1     1      542            540         2      923            850
 2  2013     1     1      554            600        -6      812            837
 3  2013     1     1      554            558        -4      740            728
 4  2013     1     1      555            600        -5      913            854
 5  2013     1     1      557            600        -3      838            846
 6  2013     1     1      558            600        -2      753            745
 7  2013     1     1      557            600        -3      709            723
 8  2013     1     1      544            545        -1     1004           1022
 9  2013     1     1      533            529         4      850            830
10  2013     1     1      517            515         2      830            819
# ... with more rows, and 11 more variables: arr_delay <dbl>, carrier <chr>,
#   flight <int>, tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>,
#   distance <dbl>, hour <dbl>, minute <dbl>, time_hour <dttm>
Use dbplot_histogram to plot a histogram
flights %>% filter(!is.na(arr_time)) %>% dbplot_histogram(arr_time)
flights %>% filter(!is.na(arr_time)) %>% dbplot_histogram(arr_time, binwidth = 200)
flights %>% filter(!is.na(arr_time)) %>% dbplot_histogram(arr_time) + labs(title = "Flights - Arrival Time") + theme_light()  # also changes the theme
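To inspect the aggregated bins behind a histogram instead of the plot itself, dbplot also exports db_compute_bins. The following is a minimal sketch, not part of the original examples. It assumes a local data frame (the hypothetical name flights_local) in place of the remote table so that it runs without a database connection; with tdplyr, the flights table created above would be passed instead.

```r
library(dplyr)
library(dbplot)

# Assumption: a local data frame stands in for the remote "flights" table.
flights_local <- nycflights13::flights %>%
  filter(dep_time < 600, !is.na(arr_time))

# Returns one row per bin (bin start value and row count) instead of a plot.
arr_time_bins <- flights_local %>%
  db_compute_bins(arr_time, binwidth = 200)

arr_time_bins
```

Only the bin summary reaches the client, which is why the same call stays cheap on large remote tables.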
Use dbplot_raster to plot raster graphics
To visualize two continuous variables with millions or billions of dots representing the intersections of the two variables, you can use a raster plot. It concentrates the intersections into squares that are easy to parse visually.
flights %>% filter(!is.na(arr_delay)) %>% dbplot_raster(arr_delay, dep_delay)
flights %>% filter(!is.na(arr_delay)) %>% dbplot_raster(arr_delay, dep_delay, fill = mean(distance, na.rm = TRUE))
flights %>% filter(!is.na(arr_delay)) %>% dbplot_raster(arr_delay, dep_delay, mean(distance, na.rm = TRUE), resolution = 500)
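The idea of concentrating intersections into squares can be illustrated with a small base R sketch. It is not dbplot's implementation: the toy data, the to_square helper, and the resolution value are all assumptions made for illustration.

```r
# Hypothetical toy data: two correlated continuous variables.
set.seed(42)
x <- rnorm(10000)
y <- x + rnorm(10000)

resolution <- 10  # assumed number of squares along each axis

# Map each value to the index (0 .. resolution - 1) of the square it falls in.
to_square <- function(v, resolution) {
  width <- (max(v) - min(v)) / resolution
  idx <- floor((v - min(v)) / width)
  pmin(idx, resolution - 1)  # the maximum value belongs to the last square
}

# Count the points per (x square, y square) pair and drop empty squares.
squares <- as.data.frame(
  table(x_sq = to_square(x, resolution), y_sq = to_square(y, resolution)),
  responseName = "n"
)
squares <- squares[squares$n > 0, ]

head(squares)
```

However many points go in, at most resolution * resolution rows come out, so a plot built from the squares stays small.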
Use dbplot_bar to plot a bar graph
flights %>% dbplot_bar(origin)
flights %>% dbplot_bar(origin, mean(dep_delay, na.rm = TRUE))
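The per-category values behind a bar graph can be retrieved as a table with db_compute_count, which dbplot exports alongside the plotting functions. This is a minimal sketch under the assumption that a local data frame (the hypothetical name flights_local) replaces the remote table; with tdplyr, the flights table created above would be used instead.

```r
library(dplyr)
library(dbplot)

# Assumption: a local data frame stands in for the remote "flights" table.
flights_local <- nycflights13::flights %>%
  filter(dep_time < 600)

# One row per origin airport with the number of flights.
origin_counts <- flights_local %>%
  db_compute_count(origin)

origin_counts
```

The result can be passed to ggplot2 directly when a layout other than the default one from dbplot_bar is needed.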
Use dbplot_line to plot a line graph
flights %>% dbplot_line(month)
flights %>% dbplot_line(month, mean(dep_delay, na.rm = TRUE))