The supervised model is created like the unsupervised model, except that the syntax element TagColumn ('match_tag') specifies the column containing the tags on which to train the model. Supervised learning does not use initialization parameters.
Input
- InputTable: fstrainer_input, as in FellegiSunter Example: Unsupervised Learning
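Because supervised training relies on the tag column, it can help to look at how the tags are distributed before building the model. The query below is a minimal sketch, not part of the original example. It assumes only that fstrainer_input contains the match_tag column named in TagColumn, and it makes no assumption about which tag values appear.

```sql
-- Count the rows for each tag value in the training input.
-- match_tag is the column passed to TagColumn in the SQL call.
SELECT match_tag, COUNT(*) AS row_cnt
FROM fstrainer_input
GROUP BY match_tag
ORDER BY match_tag;
```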
SQL Call
CREATE MULTISET TABLE "fg_supervised_model" AS (
  SELECT * FROM FellegiSunter (
    ON fstrainer_input AS InputTable
    USING
    ComparisonFields ('jaro1_sim: 0.8', 'ld1_sim:0.8', 'ngram1_sim:0.5', 'jw1_sim:0.8')
    TagColumn ('match_tag')
  ) AS dt
) WITH DATA;
Output
This query returns the following table:
SELECT * FROM fg_supervised_model ORDER BY 1;
_key                          _value
----------------------------  ------------------
comparison_filed_cnt          4
comparison_filed_name_0       jaro1_sim
comparison_filed_name_1       ld1_sim
comparison_filed_name_2       ngram1_sim
comparison_filed_name_3       jw1_sim
comparison_filed_threshold_0  0.8
comparison_filed_threshold_1  0.8
comparison_filed_threshold_2  0.5
comparison_filed_threshold_3  0.8
is_supervised                 true
lambda                        0.9
lower_bound                   -0.795859283219775
mu                            0.9
m_0                           0.9999999
m_1                           0.2
m_2                           0.6
m_3                           0.9999999
time_used                     16.103000 seconds
upper_bound                   -0.795859283219775
u_0                           0.666666666666667
u_1                           1.0E-7
u_2                           1.0E-7
u_3                           0.833333333333333
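The model table stores every setting as a key/value pair, so individual entries can be pulled out with an ordinary filter. As a hedged sketch, the query below retrieves only the per-field m_i and u_i entries and the two bounds. The column names _key and _value are taken from the output above and are quoted here as a precaution because they begin with an underscore.

```sql
-- Select the per-field m/u entries and the decision bounds
-- from the supervised model table created above.
SELECT "_key", "_value"
FROM fg_supervised_model
WHERE "_key" IN ('m_0', 'm_1', 'm_2', 'm_3',
                 'u_0', 'u_1', 'u_2', 'u_3',
                 'lower_bound', 'upper_bound')
ORDER BY 1;
```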