'jaro' |
Jaro distance. |
'jaro_winkler' |
Jaro-Winkler distance: 1 for an exact match, 0 for no similarity, and intermediate values for partial matches. If you specify this comparison type, you can specify the value of the prefix scaling factor p with constant. 0 ≤ p ≤ 0.25. Default: p = 0.1 |
'n_gram' |
N-gram similarity. If you specify this comparison type, you can specify the value of N with constant. Default: N = 2 |
'LD' |
Levenshtein distance: Number of edits needed to transform one string into the other. Edits are insertions, deletions, or substitutions of individual characters. |
'LDWS' |
Levenshtein distance without substitution: Number of edits needed to transform one string into the other using only insertions or deletions of individual characters. |
'OSA' |
Optimal string alignment distance: Number of edits needed to transform one string into the other. Edits are insertions, deletions, substitutions, or transpositions of characters. A substring can be edited only once. |
'DL' |
Damerau-Levenshtein distance: Like 'OSA' except that a substring can be edited any number of times. |
'hamming' |
Hamming distance: For strings of equal length, number of positions where corresponding characters differ (that is, minimum number of substitutions needed to transform one string into the other). For strings of unequal length, -1. |
'LCS' |
Longest common substring: Length of longest substring common to both strings. |
'jaccard' |
Jaccard index-based comparison. |
'cosine' |
Cosine similarity. |
'soundexcode' |
Only for English strings: -1 if either string has a non-English character; otherwise, 1 if their Soundex codes are the same and 0 otherwise. |
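
The edit-distance comparison types above ('LD', 'LDWS', 'OSA', 'hamming') differ only in which edits they allow. The following Python sketch illustrates those definitions. It is not the function's own implementation, and the names `levenshtein`, `osa`, and `hamming` are hypothetical helpers introduced here for illustration.

```python
def levenshtein(a: str, b: str, allow_substitution: bool = True) -> int:
    """'LD' when allow_substitution is True; 'LDWS' when it is False."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            best = min(prev[j] + 1, curr[j - 1] + 1)  # deletion, insertion
            if ca == cb:
                best = min(best, prev[j - 1])          # match costs nothing
            elif allow_substitution:
                best = min(best, prev[j - 1] + 1)      # substitution
            curr.append(best)
        prev = curr
    return prev[-1]


def osa(a: str, b: str) -> int:
    """'OSA': Levenshtein edits plus transposition of adjacent characters,
    with no substring edited more than once."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # match or substitution
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def hamming(a: str, b: str) -> int:
    """'hamming': number of differing positions, or -1 for unequal lengths."""
    if len(a) != len(b):
        return -1
    return sum(x != y for x, y in zip(a, b))


print(levenshtein("kitten", "sitting"))                            # LD
print(levenshtein("kitten", "sitting", allow_substitution=False))  # LDWS
print(osa("ab", "ba"), levenshtein("ab", "ba"))                    # OSA vs. LD
print(hamming("karolin", "kathrin"), hamming("abc", "ab"))
```

Under these definitions, a swap of two adjacent characters counts as one edit for 'OSA' but two for 'LD'. 'LDWS' is never smaller than 'LD', because each substitution has to be replaced by a deletion plus an insertion.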