The nPath function uses greedy pattern matching, finding the longest available match despite any nongreedy operators in the pattern.
For example, consider the input table link2:
userid | job_title | startdate | enddate |
---|---|---|---|
21 | Chief Exec Officer | 1994-10-01 | 2005-02-28 |
21 | Software Engineer | 1996-10-01 | 2001-06-30 |
21 | Software Engineer | 1998-10-01 | 2001-06-30 |
21 | Chief Exec Officer | 2005-03-01 | 2007-03-31 |
21 | Chief Exec Officer | 2007-06-01 | ? |
The following query returns the following table:
SELECT job_transition_path, count(*) AS path_count FROM nPath (
ON link2 PARTITION BY userid ORDER BY startdate
USING
Mode (NONOVERLAPPING)
Pattern ('CEO.ENGR.OTHER*')
Symbols (
job_title like '%Software Eng%' AS ENGR,
TRUE AS OTHER,
job_title like 'Chief Exec Officer' AS CEO
)
Result (accumulate(job_title OF ANY(ENGR,OTHER,CEO)) AS job_transition_path)
) AS dt GROUP BY 1 ORDER BY 2 DESC;
job_transition_path | path_count |
---|---|
[Chief Exec Officer, Software Engineer, Software Engineer, Chief Exec Officer, Chief Exec Officer] | 1 |
In the pattern, CEO matches the first row, ENGR matches the second row, and OTHER* matches the remaining rows:
The following query returns the following table:
SELECT job_transition_path , count(*) AS path_count FROM nPath (
ON link2 PARTITION BY userid ORDER BY startdate
USING
Mode (NONOVERLAPPING)
Pattern ('CEO.ENGR.OTHER*.CEO')
Symbols (
job_title like '%Software Eng%' AS ENGR,
TRUE AS OTHER,
job_title like 'Chief Exec Officer' AS CEO
)
Result (accumulate(job_title OF ANY(ENGR,OTHER,CEO)) AS job_transition_path)
) AS dt GROUP BY 1 ORDER BY 2 DESC;
job_transition_path | path_count |
---|---|
[Chief Exec Officer, Software Engineer, Software Engineer, Chief Exec Officer, Chief Exec Officer] | 1 |
In the pattern, CEO matches the first row, ENGR matches the second row, OTHER* matches the next two rows, and CEO matches the last row: