This example uses the td_cfilter_mle() function from the tdplyr package to find items that are often bought together in a dataset of grocery store transactions. It also shows how R graphics functions can be used with the output of tdplyr analytic functions.
The input data is shown in the following table "shopping_tbl".
trans_id | date | store_id | region | item | sku | category |
---|---|---|---|---|---|---|
1 | 20100715 | 1 | west | milk | 1 | dairy |
1 | 20100715 | 1 | west | butter | 2 | dairy |
1 | 20100715 | 1 | west | eggs | 3 | dairy |
1 | 19990715 | 1 | west | flour | 4 | baking |
2 | 20100715 | 1 | west | milk | 1 | dairy |
2 | 20100715 | 1 | west | butter | 2 | dairy |
2 | 20100715 | 1 | west | eggs | 3 | dairy |
3 | 20100715 | 1 | west | milk | 1 | dairy |
3 | 20100715 | 1 | west | eggs | 3 | dairy |
3 | 19990715 | 1 | west | flour | 4 | baking |
4 | 20100715 | 1 | west | milk | 1 | dairy |
4 | 20100715 | 1 | west | butter | 2 | dairy |
5 | 20100715 | 2 | west | butter | 2 | dairy |
5 | 20100715 | 2 | west | eggs | 3 | dairy |
5 | 19990715 | 2 | west | flour | 4 | baking |
6 | 20100715 | 2 | west | milk | 1 | dairy |
6 | 20100715 | 2 | west | eggs | 3 | dairy |
7 | 20100715 | 2 | west | eggs | 3 | dairy |
7 | 19990715 | 2 | west | flour | 4 | baking |
8 | 20100715 | 3 | west | butter | 2 | dairy |
8 | 20100715 | 3 | west | eggs | 3 | dairy |
8 | 19990715 | 3 | west | flour | 4 | baking |
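To make "often bought together" concrete, the following base R sketch counts how many transactions contain each pair of items. It uses a local copy of the trans_id and item columns from the table above. It is only an illustration of the co-occurrence idea, not the computation that td_cfilter_mle() performs in the database, and the names "shopping", "pairs", "pair_counts", and "cnt_both" are chosen here for illustration.

```r
# Local copy of the trans_id and item columns of "shopping_tbl".
shopping <- data.frame(
  trans_id = rep(1:8, times = c(4, 3, 3, 2, 3, 2, 2, 3)),
  item = c("milk", "butter", "eggs", "flour",   # transaction 1
           "milk", "butter", "eggs",            # transaction 2
           "milk", "eggs", "flour",             # transaction 3
           "milk", "butter",                    # transaction 4
           "butter", "eggs", "flour",           # transaction 5
           "milk", "eggs",                      # transaction 6
           "eggs", "flour",                     # transaction 7
           "butter", "eggs", "flour"),          # transaction 8
  stringsAsFactors = FALSE
)

# Self-join on trans_id so that every item is paired with every
# other item bought in the same transaction.
pairs <- merge(shopping, shopping, by = "trans_id", suffixes = c("1", "2"))
pairs <- pairs[pairs$item1 != pairs$item2, ]

# Count the rows for each ordered pair. Each item appears at most once
# per transaction, so this is the number of transactions with both items.
pair_counts <- aggregate(trans_id ~ item1 + item2, data = pairs, FUN = length)
names(pair_counts)[3] <- "cnt_both"

# Show the most frequent pairs first.
pair_counts <- pair_counts[order(-pair_counts$cnt_both), ]
print(pair_counts)
```

Each pair appears twice, once in each order. In this data, eggs and flour appear together most often, in transactions 1, 3, 5, 7, and 8.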
- Create a tibble "tddf_shopping_tbl" from the table "shopping_tbl" in the database.
tddf_shopping_tbl <- tbl(con, "shopping_tbl")
- Call the td_cfilter_mle() function.
td_cfilter_out <- td_cfilter_mle(
  data = tddf_shopping_tbl,
  input.columns = c("item"),
  join.columns = c("trans_id"),
  add.columns = c("region")
)
- Inspect the results.
print(td_cfilter_out$output.table)
- Take the results of interest from the function output, and use the R library "circlize" to display these results graphically in a chord diagram.
- Install the R library "circlize" on your R client if it is not already installed, and then load the library.
install.packages('circlize')
library(circlize)
- Create the graph.
output_table <- as.data.frame(td_cfilter_out$output.table)
chordDiagramFromDataFrame(output_table[, c("col1_item1", "col1_item2", "score")])
The resulting diagram is shown here.
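The chord diagram step can also be tried without a database connection. The sketch below assumes only that chordDiagramFromDataFrame() reads its first two columns as the link endpoints and the third as the link width. The data frame "links" and its "cnt_both" column are illustrative: the values are co-occurrence counts tallied by hand from "shopping_tbl", not the score values that td_cfilter_mle() returns.

```r
# Hand-tallied number of transactions that contain both items
# (illustrative stand-in for the "score" column of the function output).
links <- data.frame(
  item1 = c("milk", "milk", "milk", "butter", "butter", "eggs"),
  item2 = c("butter", "eggs", "flour", "eggs", "flour", "flour"),
  cnt_both = c(3, 4, 2, 4, 3, 5),
  stringsAsFactors = FALSE
)

# Draw the diagram only if circlize is installed on this R client.
if (requireNamespace("circlize", quietly = TRUE)) {
  circlize::chordDiagramFromDataFrame(links)
  circlize::circos.clear()  # reset the circular layout for later plots
}
```

Wider links mark item pairs that appear together in more transactions.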