For positional n-gram matching, both the pattern and its position must match when measuring similarity. The position value specifies how far apart the matching patterns may be in the two strings, as follows:
position Value | Where Match Must Be |
---|---|
0 | At same position in the 2 strings |
x | Within x positions in the 2 strings. For example, if x = 2, the match must be within 2 positions in the 2 strings. |
As an example, for the string 'abc', the 1-grams (length = 1) are 'a', 'b', and 'c'. The 2-grams (length = 2) are 'ab' and 'bc'. The only 3-gram (length = 3) is 'abc'. Because the string has only three characters, there are no n-grams of length 4 or greater.
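The enumeration above can be reproduced with a few lines of Python. This is only an illustrative sketch: the helper name `ngrams` is hypothetical and is not part of the function documented here.

```python
def ngrams(text: str, length: int) -> list[str]:
    """Return every contiguous substring of `text` that has `length` characters."""
    # No n-grams exist for a non-positive length or one longer than the string.
    if length <= 0 or length > len(text):
        return []
    return [text[i:i + length] for i in range(len(text) - length + 1)]


print(ngrams("abc", 1))  # ['a', 'b', 'c']
print(ngrams("abc", 2))  # ['ab', 'bc']
print(ngrams("abc", 3))  # ['abc']
print(ngrams("abc", 4))  # []
```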
The function returns zero in the following cases:
- If the length argument is greater than the length of either string1 or string2.
- If the length argument is <= 0 or if either string1 or string2 is an empty string.
Patterns longer than 255 characters are ignored.
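The position rule and the zero-return cases described above can be sketched as follows. The name `positional_ngram_matches` and the scoring are assumptions made for illustration: this sketch counts each n-gram of string1 at most once when an identical n-gram starts within `position` characters of the same offset in string2. The real function may score matches differently, and the 255-character limit is not modeled here.

```python
def positional_ngram_matches(string1: str, string2: str,
                             length: int, position: int = 0) -> int:
    """Count n-grams of string1 found in string2 within `position` places.

    Hypothetical sketch; not the documented function's implementation.
    """
    # Zero-return cases: non-positive length or an empty input string.
    if length <= 0 or not string1 or not string2:
        return 0
    # Zero-return case: length exceeds the length of either string.
    if length > len(string1) or length > len(string2):
        return 0

    last_start = len(string2) - length  # last valid n-gram start in string2
    matches = 0
    for i in range(len(string1) - length + 1):
        gram = string1[i:i + length]
        # Only offsets within `position` of i are allowed to match.
        lo = max(0, i - position)
        hi = min(last_start, i + position)
        if any(string2[j:j + length] == gram for j in range(lo, hi + 1)):
            matches += 1
    return matches


# 'ab' and 'bc' are shifted by one place in the second string.
print(positional_ngram_matches("abc", "xabc", 2, 0))  # 0
print(positional_ngram_matches("abc", "xabc", 2, 1))  # 2
print(positional_ngram_matches("abc", "ab", 3))       # 0
```

With a position value of 0 the shifted patterns do not count, while a value of 1 tolerates the one-place offset.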