This example uses nPath and FrequentPaths to select sequences.
Input
- InputTable: nPath_output, which the example creates by inputting the following table, sequence_table, to the nPath function
id | datestamp | item |
---|---|---|
1 | 2004-03-17 16:35:00 | A |
1 | 2004-03-17 16:38:00 | B |
1 | 2004-03-17 16:42:00 | C |
2 | 2004-03-18 01:16:00 | B |
2 | 2004-03-18 01:18:00 | C |
2 | 2004-03-18 01:20:00 | D |
3 | 2004-03-19 08:33:00 | A |
3 | 2004-03-19 08:36:00 | D |
3 | 2004-03-19 08:38:00 | C |
The following statement creates nPath_output, populating it with sequences that start with "A" and end with "C", and using the Accumulate argument to output the full sequence.
CREATE VIEW nPath_output AS ( SELECT * FROM nPath ( ON sequence_table PARTITION BY id ORDER BY datestamp USING Pattern ('itemA.itemAny*.itemC') Symbols (item='A' AS itemA, item='C' AS itemC, TRUE AS itemAny) "Result" (FIRST(id OF itemA) AS id, Accumulate (item OF ANY(itemA, itemAny, itemC)) AS path) "Mode" (NONOVERLAPPING) ) AS dt );
This query returns the following table:
SELECT * FROM nPath_output ORDER BY id;
id | path |
---|---|
1 | [A, B, C] |
3 | [A, D, C] |
SQL Call
The FrequentPaths function outputs the sequences that start start with 'A' and end with 'C'.
SELECT * FROM FrequentPaths ( ON nPath_output AS InputTable OUT TABLE OutputTable (output7) USING PartitionColumns ('id') PathColumn ('path') MinSupport (2) ) AS dt;
Output
message |
---|
Finished. Totally 3 patterns were found. |
This query returns the following table:
SELECT * FROM output7 ORDER BY length, pattern;
pattern | support | length |
---|---|---|
A | 2 | 1 |
C | 2 | 1 |
A;C | 2 | 2 |