Input
- input: Created by applying the TextTokenizer function to the training table complaints, a log of vehicle complaints.
In complaints, the category column indicates whether the car has been in a crash.
doc_id | text_data | category |
---|---|---|
1 | consumer was driving approximately 45 mph hit a deer with the front bumper and then ran into an embankment head-on passenger's side air bag did deploy hit windshield and deployed outward. driver's side airbag cover opened but did not inflate it was still folded causing injuries. | crash |
2 | when vehicle was involved in a crash totalling vehicle driver's side/ passenger's side air bags did not deploy. vehicle was making a left turn and was hit by a ford f350 traveling about 35 mph on the front passenger's side. driver hit his head-on the steering wheel. hurt his knee and received neck and back injuries. | crash |
3 | consumer has experienced following problems; 1.) both lower ball joints wear out excessively; 2.) head gasket leaks; and 3.) cruise control would shut itself off while driving without foot pressing on brake pedal. | no_crash |
... | ... | ... |
SQL Call
This call creates the model table, complaints_tokens_model, by calling NaiveBayesTextClassifierTrainer. The trainer's input table is built inline: TextTokenizer is applied to the table complaints, and each resulting token is lowercased.
CREATE MULTISET TABLE complaints_tokens_model AS (
  SELECT * FROM NaiveBayesTextClassifierTrainer (
    ON NaiveBayesTextClassifierInternal (
      ON (
        SELECT doc_id, lower(token) AS token, category
        FROM TextTokenizer (
          ON complaints PARTITION BY ANY
          USING
          TextColumn ('text_data')
          OutputByWord ('true')
          Accumulate ('doc_id', 'category')
        ) AS dt1
      ) AS InputTable PARTITION BY category
      USING
      TokenColumn ('token')
      ModelType ('Bernoulli')
      DocIDColumns ('doc_id')
      DocCategoryColumn ('category')
    ) AS dt2 PARTITION BY 1
  ) AS dt
) WITH DATA;
Output
This query returns the following table:
SELECT * FROM complaints_tokens_model ORDER BY prob desc;
token | category | prob |
---|---|---|
NAIVE_BAYES_TEXT_MODEL_TYPE | BERNOULLI | 1.0 |
. | no_crash | 0.9411764705882353 |
. | crash | 0.8571428571428571 |
NAIVE_BAYES_PRIOR_PROBABILITY | no_crash | 0.75 |
a | crash | 0.7142857142857143 |
and | crash | 0.7142857142857143 |
the | no_crash | 0.7058823529411765 |
vehicle | no_crash | 0.5882352941176471 |
driver's | crash | 0.5714285714285714 |
side | crash | 0.5714285714285714 |
did | crash | 0.5714285714285714 |
deploy | crash | 0.5714285714285714 |
hit | crash | 0.5714285714285714 |
head-on | crash | 0.5714285714285714 |
not | crash | 0.5714285714285714 |
approximately | crash | 0.5714285714285714 |
mph | crash | 0.5714285714285714 |
air | crash | 0.5714285714285714 |
vehicle | crash | 0.5714285714285714 |
passenger's | crash | 0.5714285714285714 |
and | no_crash | 0.5294117647058824 |
to | no_crash | 0.47058823529411764 |
airbags | crash | 0.42857142857142855 |
at | crash | 0.42857142857142855 |
another | crash | 0.42857142857142855 |
injuries | crash | 0.42857142857142855 |
on | crash | 0.42857142857142855 |
deployed | crash | 0.42857142857142855 |
into | crash | 0.42857142857142855 |
consumer | crash | 0.42857142857142855 |
to | crash | 0.42857142857142855 |
was | crash | 0.42857142857142855 |
bags | crash | 0.42857142857142855 |
the | crash | 0.42857142857142855 |
driver | crash | 0.42857142857142855 |
front | crash | 0.42857142857142855 |
in | crash | 0.42857142857142855 |
was | no_crash | 0.4117647058823529 |
on | no_crash | 0.35294117647058826 |
manufacturer | no_crash | 0.35294117647058826 |
consumer | no_crash | 0.35294117647058826 |
when | no_crash | 0.35294117647058826 |
has | no_crash | 0.35294117647058826 |
is | no_crash | 0.35294117647058826 |
a | no_crash | 0.29411764705882354 |
in | no_crash | 0.29411764705882354 |
not | no_crash | 0.29411764705882354 |
dealer | no_crash | 0.29411764705882354 |
outward | crash | 0.2857142857142857 |
70mph | crash | 0.2857142857142857 |
impact | crash | 0.2857142857142857 |
rear | crash | 0.2857142857142857 |
neck | crash | 0.2857142857142857 |
wheel | crash | 0.2857142857142857 |
knee | crash | 0.2857142857142857 |
truck | crash | 0.2857142857142857 |
airbag | crash | 0.2857142857142857 |
by | crash | 0.2857142857142857 |
about | crash | 0.2857142857142857 |
turn | crash | 0.2857142857142857 |
left | crash | 0.2857142857142857 |
forward | crash | 0.2857142857142857 |
it | crash | 0.2857142857142857 |
injuries.dealer | crash | 0.2857142857142857 |
involved | crash | 0.2857142857142857 |
week | crash | 0.2857142857142857 |
traveling | crash | 0.2857142857142857 |
idle | crash | 0.2857142857142857 |
side/ | crash | 0.2857142857142857 |
car | crash | 0.2857142857142857 |
but | crash | 0.2857142857142857 |
rearended | crash | 0.2857142857142857 |
incident | crash | 0.2857142857142857 |
folded | crash | 0.2857142857142857 |
occasions | crash | 0.2857142857142857 |
with | crash | 0.2857142857142857 |
making | crash | 0.2857142857142857 |
bag | crash | 0.2857142857142857 |
raced | crash | 0.2857142857142857 |
engine | crash | 0.2857142857142857 |
back | crash | 0.2857142857142857 |
50 | crash | 0.2857142857142857 |
80 | crash | 0.2857142857142857 |
one | crash | 0.2857142857142857 |
cover | crash | 0.2857142857142857 |
slowing | crash | 0.2857142857142857 |
35 | crash | 0.2857142857142857 |
received | crash | 0.2857142857142857 |
driving | crash | 0.2857142857142857 |
park | crash | 0.2857142857142857 |
upon | crash | 0.2857142857142857 |
totalling | crash | 0.2857142857142857 |
fence | crash | 0.2857142857142857 |
hurt | crash | 0.2857142857142857 |
when | crash | 0.2857142857142857 |
mphand | crash | 0.2857142857142857 |
f350 | crash | 0.2857142857142857 |
then | crash | 0.2857142857142857 |
his | crash | 0.2857142857142857 |
ran | crash | 0.2857142857142857 |
sustained | crash | 0.2857142857142857 |
has | crash | 0.2857142857142857 |
condition | crash | 0.2857142857142857 |
65 | crash | 0.2857142857142857 |
dual | crash | 0.2857142857142857 |
neither | crash | 0.2857142857142857 |
an | crash | 0.2857142857142857 |
or | crash | 0.2857142857142857 |
deer | crash | 0.2857142857142857 |
for | crash | 0.2857142857142857 |
45 | crash | 0.2857142857142857 |
embankment | crash | 0.2857142857142857 |
still | crash | 0.2857142857142857 |
why | crash | 0.2857142857142857 |
lurched | crash | 0.2857142857142857 |
prior | crash | 0.2857142857142857 |
two | crash | 0.2857142857142857 |
building | crash | 0.2857142857142857 |
windshield | crash | 0.2857142857142857 |
ended | crash | 0.2857142857142857 |
bumper | crash | 0.2857142857142857 |
steering | crash | 0.2857142857142857 |
had | crash | 0.2857142857142857 |
ford | crash | 0.2857142857142857 |
determine | crash | 0.2857142857142857 |
shop | crash | 0.2857142857142857 |
high | crash | 0.2857142857142857 |
opened | crash | 0.2857142857142857 |
crash | crash | 0.2857142857142857 |
inflate | crash | 0.2857142857142857 |
causing | crash | 0.2857142857142857 |
dealer | crash | 0.2857142857142857 |
while | crash | 0.2857142857142857 |
been | crash | 0.2857142857142857 |
crashed | crash | 0.2857142857142857 |
NAIVE_BAYES_PRIOR_PROBABILITY | crash | 0.25 |
would | no_crash | 0.23529411764705882 |
by | no_crash | 0.23529411764705882 |
at | no_crash | 0.23529411764705882 |
also | no_crash | 0.23529411764705882 |
of | no_crash | 0.23529411764705882 |
work | no_crash | 0.23529411764705882 |
replaced | no_crash | 0.23529411764705882 |
recall | no_crash | 0.23529411764705882 |
problem | no_crash | 0.23529411764705882 |
been | no_crash | 0.23529411764705882 |
had | no_crash | 0.23529411764705882 |
causing | no_crash | 0.23529411764705882 |
will | no_crash | 0.23529411764705882 |
this | no_crash | 0.17647058823529413 |
defect | no_crash | 0.17647058823529413 |
out | no_crash | 0.17647058823529413 |
wheel | no_crash | 0.17647058823529413 |
left | no_crash | 0.17647058823529413 |
repaired | no_crash | 0.17647058823529413 |
engine | no_crash | 0.17647058823529413 |
wipers | no_crash | 0.17647058823529413 |
broke | no_crash | 0.17647058823529413 |
& | no_crash | 0.17647058823529413 |
times | no_crash | 0.17647058823529413 |
be | no_crash | 0.17647058823529413 |
shut | no_crash | 0.17647058823529413 |
driving | no_crash | 0.17647058823529413 |
off | no_crash | 0.17647058823529413 |
under | no_crash | 0.17647058823529413 |
have | no_crash | 0.17647058823529413 |
miles | no_crash | 0.17647058823529413 |
switch | no_crash | 0.17647058823529413 |
after | no_crash | 0.17647058823529413 |
from | no_crash | 0.17647058823529413 |
that | no_crash | 0.17647058823529413 |
an | no_crash | 0.17647058823529413 |
determine | no_crash | 0.17647058823529413 |
3 | no_crash | 0.17647058823529413 |
informed | no_crash | 0.17647058823529413 |
notified | no_crash | 0.17647058823529413 |
control | no_crash | 0.17647058823529413 |
owner | no_crash | 0.17647058823529413 |
which | no_crash | 0.17647058823529413 |
ignition | no_crash | 0.17647058823529413 |
windshield | no_crash | 0.17647058823529413 |
down | no_crash | 0.17647058823529413 |
transmission | no_crash | 0.17647058823529413 |
while | no_crash | 0.17647058823529413 |
up | no_crash | 0.17647058823529413 |
front | no_crash | 0.17647058823529413 |
still | no_crash | 0.17647058823529413 |
NAIVE_BAYES_MISSING_TOKEN_PROBABILITY | crash | 0.14285714285714285 |
four | no_crash | 0.11764705882352941 |
4 | no_crash | 0.11764705882352941 |
it | no_crash | 0.11764705882352941 |
cruise | no_crash | 0.11764705882352941 |
brake's | no_crash | 0.11764705882352941 |
increasedit | no_crash | 0.11764705882352941 |
jiggle | no_crash | 0.11764705882352941 |
around | no_crash | 0.11764705882352941 |
2 | no_crash | 0.11764705882352941 |
own | no_crash | 0.11764705882352941 |
stuck | no_crash | 0.11764705882352941 |
recker | no_crash | 0.11764705882352941 |
sunroof | no_crash | 0.11764705882352941 |
rear | no_crash | 0.11764705882352941 |
turned | no_crash | 0.11764705882352941 |
dealership | no_crash | 0.11764705882352941 |
transfer | no_crash | 0.11764705882352941 |
hitting | no_crash | 0.11764705882352941 |
notfied | no_crash | 0.11764705882352941 |
resulted | no_crash | 0.11764705882352941 |
hill | no_crash | 0.11764705882352941 |
owners | no_crash | 0.11764705882352941 |
speeds | no_crash | 0.11764705882352941 |
10mph | no_crash | 0.11764705882352941 |
incline | no_crash | 0.11764705882352941 |
speed | no_crash | 0.11764705882352941 |
change | no_crash | 0.11764705882352941 |
speedometer | no_crash | 0.11764705882352941 |
stalled | no_crash | 0.11764705882352941 |
turn | no_crash | 0.11764705882352941 |
repairs | no_crash | 0.11764705882352941 |
airbag | no_crash | 0.11764705882352941 |
housing | no_crash | 0.11764705882352941 |
provide | no_crash | 0.11764705882352941 |
thousand | no_crash | 0.11764705882352941 |
belts/speed | no_crash | 0.11764705882352941 |
problems | no_crash | 0.11764705882352941 |
referenced | no_crash | 0.11764705882352941 |
coming | no_crash | 0.11764705882352941 |
almost | no_crash | 0.11764705882352941 |
experienced | no_crash | 0.11764705882352941 |
module | no_crash | 0.11764705882352941 |
about | no_crash | 0.11764705882352941 |
intermittently | no_crash | 0.11764705882352941 |
happened | no_crash | 0.11764705882352941 |
inoperative | no_crash | 0.11764705882352941 |
truck | no_crash | 0.11764705882352941 |
electrical | no_crash | 0.11764705882352941 |
case | no_crash | 0.11764705882352941 |
traveling | no_crash | 0.11764705882352941 |
wants | no_crash | 0.11764705882352941 |
1998 | no_crash | 0.11764705882352941 |
what | no_crash | 0.11764705882352941 |
hour | no_crash | 0.11764705882352941 |
please | no_crash | 0.11764705882352941 |
but | no_crash | 0.11764705882352941 |
expense | no_crash | 0.11764705882352941 |
occurring | no_crash | 0.11764705882352941 |
slowing | no_crash | 0.11764705882352941 |
total | no_crash | 0.11764705882352941 |
over | no_crash | 0.11764705882352941 |
slip | no_crash | 0.11764705882352941 |
unexpectedly | no_crash | 0.11764705882352941 |
completed | no_crash | 0.11764705882352941 |
saw | no_crash | 0.11764705882352941 |
does | no_crash | 0.11764705882352941 |
alternator/ | no_crash | 0.11764705882352941 |
storm | no_crash | 0.11764705882352941 |
made | no_crash | 0.11764705882352941 |
without | no_crash | 0.11764705882352941 |
rpms | no_crash | 0.11764705882352941 |
started | no_crash | 0.11764705882352941 |
information | no_crash | 0.11764705882352941 |
side | no_crash | 0.11764705882352941 |
if | no_crash | 0.11764705882352941 |
controlcable | no_crash | 0.11764705882352941 |
head | no_crash | 0.11764705882352941 |
heard | no_crash | 0.11764705882352941 |
pull | no_crash | 0.11764705882352941 |
) | no_crash | 0.11764705882352941 |
68000 | no_crash | 0.11764705882352941 |
stopped | no_crash | 0.11764705882352941 |
pedal | no_crash | 0.11764705882352941 |
99v029000 | no_crash | 0.11764705882352941 |
pump | no_crash | 0.11764705882352941 |
burned | no_crash | 0.11764705882352941 |
joints | no_crash | 0.11764705882352941 |
corrected | no_crash | 0.11764705882352941 |
walnut | no_crash | 0.11764705882352941 |
lower | no_crash | 0.11764705882352941 |
r&r | no_crash | 0.11764705882352941 |
accurate | no_crash | 0.11764705882352941 |
ea02-025 | no_crash | 0.11764705882352941 |
back | no_crash | 0.11764705882352941 |
themselves | no_crash | 0.11764705882352941 |
ball | no_crash | 0.11764705882352941 |
gear | no_crash | 0.11764705882352941 |
yh | no_crash | 0.11764705882352941 |
become | no_crash | 0.11764705882352941 |
properly | no_crash | 0.11764705882352941 |
can't | no_crash | 0.11764705882352941 |
defective | no_crash | 0.11764705882352941 |
do | no_crash | 0.11764705882352941 |
factory | no_crash | 0.11764705882352941 |
referred | no_crash | 0.11764705882352941 |
then | no_crash | 0.11764705882352941 |
following | no_crash | 0.11764705882352941 |
parked | no_crash | 0.11764705882352941 |
pressing | no_crash | 0.11764705882352941 |
malfunctioned | no_crash | 0.11764705882352941 |
rain | no_crash | 0.11764705882352941 |
shortening | no_crash | 0.11764705882352941 |
blew | no_crash | 0.11764705882352941 |
stall | no_crash | 0.11764705882352941 |
further | no_crash | 0.11764705882352941 |
took | no_crash | 0.11764705882352941 |
it's | no_crash | 0.11764705882352941 |
drive | no_crash | 0.11764705882352941 |
noise | no_crash | 0.11764705882352941 |
light | no_crash | 0.11764705882352941 |
brake | no_crash | 0.11764705882352941 |
manufactured | no_crash | 0.11764705882352941 |
smoke | no_crash | 0.11764705882352941 |
battery | no_crash | 0.11764705882352941 |
reimbursement | no_crash | 0.11764705882352941 |
; | no_crash | 0.11764705882352941 |
motor | no_crash | 0.11764705882352941 |
were | no_crash | 0.11764705882352941 |
performed | no_crash | 0.11764705882352941 |
both | no_crash | 0.11764705882352941 |
into | no_crash | 0.11764705882352941 |
keep | no_crash | 0.11764705882352941 |
sitting | no_crash | 0.11764705882352941 |
move | no_crash | 0.11764705882352941 |
leaking | no_crash | 0.11764705882352941 |
totally | no_crash | 0.11764705882352941 |
rolled | no_crash | 0.11764705882352941 |
tune | no_crash | 0.11764705882352941 |
stayed | no_crash | 0.11764705882352941 |
could | no_crash | 0.11764705882352941 |
leaks | no_crash | 0.11764705882352941 |
power | no_crash | 0.11764705882352941 |
coil | no_crash | 0.11764705882352941 |
bearing | no_crash | 0.11764705882352941 |
97v017000 | no_crash | 0.11764705882352941 |
caused | no_crash | 0.11764705882352941 |
driveshaft | no_crash | 0.11764705882352941 |
itself | no_crash | 0.11764705882352941 |
its | no_crash | 0.11764705882352941 |
off/on | no_crash | 0.11764705882352941 |
separated | no_crash | 0.11764705882352941 |
for | no_crash | 0.11764705882352941 |
gasket | no_crash | 0.11764705882352941 |
just | no_crash | 0.11764705882352941 |
drivers | no_crash | 0.11764705882352941 |
excessively | no_crash | 0.11764705882352941 |
due | no_crash | 0.11764705882352941 |
cable | no_crash | 0.11764705882352941 |
owner's | no_crash | 0.11764705882352941 |
first | no_crash | 0.11764705882352941 |
*ml | no_crash | 0.11764705882352941 |
mechanic | no_crash | 0.11764705882352941 |
frame | no_crash | 0.11764705882352941 |
starter | no_crash | 0.11764705882352941 |
start | no_crash | 0.11764705882352941 |
fire | no_crash | 0.11764705882352941 |
periodcally | no_crash | 0.11764705882352941 |
reinspected | no_crash | 0.11764705882352941 |
cannot | no_crash | 0.11764705882352941 |
compartment | no_crash | 0.11764705882352941 |
shift | no_crash | 0.11764705882352941 |
owned | no_crash | 0.11764705882352941 |
crash | no_crash | 0.11764705882352941 |
smelled | no_crash | 0.11764705882352941 |
wear | no_crash | 0.11764705882352941 |
checked | no_crash | 0.11764705882352941 |
foot | no_crash | 0.11764705882352941 |
son | no_crash | 0.11764705882352941 |
fail | no_crash | 0.11764705882352941 |
aware | no_crash | 0.11764705882352941 |
loss | no_crash | 0.11764705882352941 |
1 | no_crash | 0.11764705882352941 |
steering | no_crash | 0.11764705882352941 |
66900 | no_crash | 0.11764705882352941 |
NAIVE_BAYES_MISSING_TOKEN_PROBABILITY | no_crash | 0.058823529411764705 |
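The prob values in this output can be sanity-checked by hand. They appear consistent with a Bernoulli model that uses add-one (Laplace) smoothing over 5 crash and 15 no_crash training documents: prob = (documents in the category containing the token + 1) / (documents in the category + 2). This is an inference from the numbers shown, not documented behavior of NaiveBayesTextClassifierTrainer. The document counts and the function name in the Python sketch below are assumptions made for illustration.

```python
from fractions import Fraction


def bernoulli_token_prob(docs_with_token: int, docs_in_category: int) -> Fraction:
    """Add-one smoothed estimate of P(token present | category).

    Assumed formula, inferred from the model output; not taken from
    the function's documentation.
    """
    return Fraction(docs_with_token + 1, docs_in_category + 2)


# Assumed training-set sizes: the priors (0.25 and 0.75) fix the 1:3 ratio,
# and the missing-token probabilities (1/7 and 1/17) fix the absolute counts.
docs_per_category = {"crash": 5, "no_crash": 15}
total_docs = sum(docs_per_category.values())

# Prior probability of each category: its share of the training documents.
priors = {c: Fraction(n, total_docs) for c, n in docs_per_category.items()}

# Probability assigned to a token that never occurs in a category.
missing = {c: bernoulli_token_prob(0, n) for c, n in docs_per_category.items()}

for category in docs_per_category:
    print(category, float(priors[category]), float(missing[category]))

# A token seen in all 5 crash documents, such as ".", gets 6/7.
print(float(bernoulli_token_prob(5, docs_per_category["crash"])))
```

Under these assumptions, a token with prob 0.2857142857142857 in the crash category (2/7) occurs in exactly one crash document, and a token with prob 0.11764705882352941 in the no_crash category (2/17) occurs in exactly one no_crash document.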