It is assumed that the training and test datasets are already in the database.
The following table "complaints" contains the training dataset.
|doc_id||text_data||category|
|1||consumer was driving approximately 45 mph hit a deer with the front bumper and then ran into an embankment head-on passenger's side air bag did deploy hit windshield and deployed outward. driver's side airbag cover opened but did not inflate it was still folded causing injuries.||crash|
|2||when vehicle was involved in a crash totalling vehicle driver's side/ passenger's side air bags did not deploy. vehicle was making a left turn and was hit by a ford f350 traveling about 35 mph on the front passenger's side. driver hit his head-on the steering wheel. hurt his knee and received neck and back injuries.||crash|
|3||consumer has experienced following problems; 1.) both lower ball joints wear out excessively; 2.) head gasket leaks; and 3.) cruise control would shut itself off while driving without foot pressing on brake pedal.||no_crash|
|4||transfer case was repaired under recall. after the work was completed noise was heard intermittently. consumer took vehicle back to dealer. the dealer re-inspected vehicle and informed the owner that the driveshaft was hitting the transfer case. the manufacturer has been notified.||no_crash|
|5||transmission would start to slip when traveling just 10mph. the rpms would be over 3 thousand. had vehicle checked at dealership & was informed transmission was stuck & that it's a factory defect almost blew up. also speedometer does not keep accurate speeds. if speed is increased, it would fail to work. this was referred to mechanic by manufacturer.||no_crash|
|6||due to the defective ignition cable which burned the coil the vehicle stalled unexpectedly which could have resulted in a crash. also dealer replaced the r&r drive belts/speed control cable and performed vehicle tune up. please provide further information.||no_crash|
|7||when switch is turned on windshield wipers would not work properly. would have to jiggle switch & then wipers would move. wipers do turn off/on by themselves. recall 97v017000.||no_crash|
|8||consumer was driving in a rain storm when the windshield wipers stopped this happened periodically.||no_crash|
|9||at 66900 miles transmission has malfunctioned and will not shift into first gear. repairs were made at owner's expense wants reimbursement. *ml||no_crash|
|10||when truck was sitting on an incline it rolled on its own. manufacturer was aware of the problem. problem has not been corrected. the truck is owned by walnut hill recker manufactured in 1998.||no_crash|
|11||car engine raced while slowing to park. car lurched forward and crashed into a fence and a building. car had been in shop approximately one week prior to incident for high idle condition.||crash|
|12||rear ended another vehicle at 65 to 70mph and neither driver's side or passenger's side airbags deployed. dealer has vehicle.||crash|
|13||while vehicle was parked for an hour a fire started on the left side of the engine compartment. owners son smelled smoke owner saw fire coming from around drivers side front wheel. referenced in ea02-025||no_crash|
|14||after vehicle was repaired under recall 99v029000 ignition switch the airbag light stayed on . the dealer and the manufacturer has been notified.||no_crash|
|15||electrical control module is shortening out causing the vehicle to stall. engine will become totally inoperative. consumer had to change alternator/ battery and starter and module replaced 4 times but defect still occurring cannot determine what is causing the problem.||no_crash|
|16||at 68000 miles power steering broke off the housing pump causing total loss of power steering which also caused the vehicle to shut down.||no_crash|
|17||on two occasions dual airbags did not deploy. consumer rear-ended another vehicle at approximately 50 mph and at 80 mph hit a truck head-on upon impact air bags did not deploy. driver sustained injuries. dealer did not determine why air bags did not deploy.||crash|
|18||sunroof is leaking.||no_crash|
|19||motor and the frame separated from vehicle. manufacturer will be notified.||no_crash|
|20||rear front wheel bearing broke causing vehicle to pull to the left when slowing down. consumer had brake's replaced about four times and still dealer can't determine the problem.||no_crash|
The following table "nbayes_test" contains the test dataset.
|doc_id||text_data|
|1||ELECTRICAL CONTROL MODULE IS SHORTENING OUT, CAUSING THE VEHICLE TO STALL. ENGINE WILL BECOME TOTALLY INOPERATIVE. CONSUMER HAD TO CHANGE ALTERNATOR/ BATTERY AND STARTER, AND MODULE REPLACED 4 TIMES, BUT DEFECT STILL OCCURRING CANNOT DETERMINE WHAT IS CAUSING THE PROBLEM.|
|2||ABS BRAKES FAIL TO OPERATE PROPERLY, AND AIR BAGS FAILED TO DEPLOY DURING A CRASH AT APPROX. 28 MPH IMPACT. MANUFACTURER NOTIFIED.|
|3||WHILE DRIVING AT 60 MPH GAS PEDAL GOT STUCK DUE TO THE RUBBER THAT IS AROUND THE GAS PEDAL.|
|4||THERE IS A KNOCKING NOISE COMING FROM THE CATALYTIC CONVERTER, AND THE VEHICLE IS STALLING. ALSO, HAS PROBLEM WITH THE STEERING.|
|5||CONSUMER WAS MAKING A TURN, DRIVING AT APPROX 5-10 MPH WHEN CONSUMER HIT ANOTHER VEHICLE. UPON IMPACT, DUAL AIRBAGS DID NOT DEPLOY. ALL DAMAGE WAS DONE FROM ENGINE TO TRANSMISSION, TO THE FRONT OF VEHICLE, AND THE VEHICLE CONSIDERED A TOTAL LOSS.|
|6||WHEEL BEARING AND HUBS CRACKED, CAUSING THE METAL TO GRIND WHEN MAKING A RIGHT TURN. ALSO WHEN APPLYING THE BRAKES, PEDAL GOES TO THE FLOOR, CAUSE UNKNOWN. WAS ADVISED BY MIDAS NOT TO DRIVE VEHICLE- WHEELE COULD COME OFF.|
|7||DRIVING ABOUT 5-10 MPH, THE VEHICLE HAD A LOW FRONTAL IMPACT IN WHICH THE OTHER VEHICLE HAD NO DAMAGES. UPON IMPACT, DRIVER'S AND THE PASSENGER'S AIR BAGS DID NOT DEPLOY, RESULTING IN INJURIES. PLEASE PROVIDE FURTHER INFORMATION AND VIN#.|
|8||THE AIR BAG WARNING LIGHT HAS COME ON, INDICATING AIRBAGS ARE INOPERATIVE. THEY WERE FIXED ONE AT THE TIME, BUT PROBLEM HAS REOCCURRED.|
|9||CONSUMER WAS DRIVING WEST WHEN THE OTHER CAR WAS GOING EAST. THE OTHER CAR TURNED IN FRONT OF CONSUMER'S VEHICLE, CONSUMER HIT OTHER VEHICLE AND STARTED TO SPIN AROUND, COULDN'T STOP, RESULTING IN A CRASH. UPON IMPACT, AIRBAGS DIDN'T DEPLOY.|
|10||WHILE DRIVING ABOUT 65 MPH AND THE TRANSMISSION MADE A STRANGE NOISE, AND THE LEFT FRONT AXLE LOCKED UP. THE DEALER HAS REPAIRED THE VEHICLE.|
This example shows the steps to build a Naïve Bayes Text Classifier model and then apply the model to the test data.
Create a tibble "tddf_complaints" from the table "complaints" in the database.
tddf_complaints <- tbl(con, "complaints")
Create a tibble "tddf_nbayes_tokens" consisting of the tokens from the training dataset.
Use the td_text_tokenizer_mle() function from the tdplyr package to perform token analysis on the training dataset.
td_text_tokenizer_out <- td_text_tokenizer_mle(
  data = tddf_complaints,
  text.column = "text_data",
  language = "en",
  output.byword = TRUE,
  user.dictionary = NULL,
  accumulate = c("doc_id", "category")
)
Save the tokenizer task output into the database table "nbayes_tokens".
copy_to(con, td_text_tokenizer_out$result, name="nbayes_tokens")
Create a tibble "tddf_nbayes_tokens" from the table "nbayes_tokens".
tddf_nbayes_tokens <- tbl(con, "nbayes_tokens")
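To picture the shape of the token table that the training step consumes, here is a minimal base-R sketch that runs without a database. It assumes that by-word output means one row per token with the accumulated columns carried along. The helper `tokenize_by_word()` and the data frame `complaints_local` are hypothetical names introduced for illustration, and the splitting rule (lower-casing, then splitting on non-alphanumeric characters) is a simplification, not the tokenizer's actual behavior.

```r
# Two rows copied from the "complaints" table, held in a local data frame.
complaints_local <- data.frame(
  doc_id = c(18L, 8L),
  text_data = c(
    "sunroof is leaking.",
    "consumer was driving in a rain storm when the windshield wipers stopped this happened periodically."
  ),
  category = c("no_crash", "no_crash"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: lower-case the text, split on runs of
# non-alphanumeric characters, and emit one row per token.
tokenize_by_word <- function(df) {
  pieces <- strsplit(tolower(df$text_data), "[^a-z0-9]+")
  pieces <- lapply(pieces, function(p) p[nzchar(p)])  # drop empty strings
  n <- lengths(pieces)
  data.frame(
    doc_id = rep(df$doc_id, n),
    category = rep(df$category, n),
    token = unlist(pieces),
    stringsAsFactors = FALSE
  )
}

tokens_local <- tokenize_by_word(complaints_local)
print(head(tokens_local))
```

The column name `token` matches the `token.column` argument used in the training call; any other columns the real tokenizer emits are not modeled here.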
Train a new Naïve Bayes Text Classifier based on the tibble from the training dataset, using the td_naivebayes_textclassifier_mle() function from the tdplyr package.
nb_textclassifier_model <- td_naivebayes_textclassifier_mle(
  data = tddf_nbayes_tokens,
  data.partition.column = "category",
  token.column = "token",
  doc.category.column = "category"
)
Next, repeat steps 1 and 2 to prepare the test data, and then apply the model as follows.
Create a tibble "tddf_nbayes_test" from the table "nbayes_test" in the database.
tddf_nbayes_test <- tbl(con, "nbayes_test")
Create a tibble "tddf_nbayes_tokens_test" consisting of the tokens from the test dataset.
Use the td_text_tokenizer_mle() function from the tdplyr package to perform token analysis on the test dataset.
td_text_tokenizer_test_out <- td_text_tokenizer_mle(
  data = tddf_nbayes_test,
  text.column = "text_data",
  language = "en",
  output.byword = TRUE,
  user.dictionary = NULL,
  accumulate = c("doc_id")
)
Save the tokenizer task output into the database table "nbayes_tokens_test".
copy_to(con, td_text_tokenizer_test_out$result, name="nbayes_tokens_test")
Create a tibble from the table "nbayes_tokens_test".
tddf_nbayes_tokens_test <- tbl(con, "nbayes_tokens_test")
Predict the categories ('crash' or 'no_crash') by applying the Naïve Bayes Text Classifier model to the tibble from the test dataset, using the td_naivebayes_textclassifier_predict_sqle() function.
nb_textclassifier_pred <- td_naivebayes_textclassifier_predict_sqle(
  newdata = tddf_nbayes_tokens_test,
  object = nb_textclassifier_model,
  newdata.partition.column = "doc_id",
  input.token.column = "token",
  doc.id.columns = c("doc_id")
)
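For intuition about what training and prediction on a token table involve, the following base-R sketch implements a generic textbook multinomial Naïve Bayes classifier with add-one (Laplace) smoothing. It is an illustration only: it is not the algorithm or output format of the tdplyr functions, and the names `train_nb()`, `predict_nb()`, and the toy token tables are hypothetical.

```r
# Toy training tokens: one row per (doc_id, category, token).
train_tokens <- data.frame(
  doc_id = c(1, 1, 1, 2, 2, 2),
  category = c("crash", "crash", "crash", "no_crash", "no_crash", "no_crash"),
  token = c("airbag", "crash", "injuries", "transmission", "noise", "dealer"),
  stringsAsFactors = FALSE
)

# Hypothetical trainer: per-category token log-likelihoods and log-priors.
train_nb <- function(tokens) {
  vocab <- sort(unique(tokens$token))
  cats <- sort(unique(tokens$category))
  # Token counts as a plain matrix (rows: categories, columns: tokens).
  counts <- unclass(table(factor(tokens$category, levels = cats),
                          factor(tokens$token, levels = vocab)))
  # Add-one smoothing so unseen (category, token) pairs keep nonzero mass.
  log_lik <- log((counts + 1) / (rowSums(counts) + length(vocab)))
  # Priors from the number of distinct documents in each category.
  docs <- unique(tokens[, c("doc_id", "category")])
  prior <- as.vector(table(factor(docs$category, levels = cats)))
  list(log_lik = log_lik, log_prior = log(prior / sum(prior)),
       vocab = vocab, cats = cats)
}

# Hypothetical predictor: sum log-likelihoods per document, pick the best category.
predict_nb <- function(model, tokens) {
  sapply(split(tokens$token, tokens$doc_id), function(tok) {
    tok <- tok[tok %in% model$vocab]  # ignore tokens never seen in training
    scores <- model$log_prior + rowSums(model$log_lik[, tok, drop = FALSE])
    model$cats[which.max(scores)]
  })
}

test_tokens <- data.frame(
  doc_id = c(101, 101, 101, 102, 102),
  token = c("airbag", "crash", "deploy", "transmission", "noise"),
  stringsAsFactors = FALSE
)

model <- train_nb(train_tokens)
pred <- predict_nb(model, test_tokens)
print(pred)
```

The sketch partitions test tokens by `doc_id`, mirroring the `newdata.partition.column = "doc_id"` argument in the prediction call, so that each document is scored as a unit.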
Inspect the results.
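One way to look at the predictions is to pull the output table into local memory and print the first rows. The sketch below assumes that the prediction object exposes its output as `$result`, in the same way the tokenizer output is accessed earlier in this example; the helper `show_result()` is a hypothetical name, and the output column names are not assumed.

```r
library(dplyr)

# Hypothetical helper: collect an output table locally and show its first rows.
show_result <- function(out, n = 10) {
  local_df <- collect(out$result)  # assumes the output object exposes `$result`
  print(head(local_df, n))
  invisible(local_df)
}

# With the objects from this example (requires a live database connection):
# show_result(nb_textclassifier_pred)
```

Because `collect()` also accepts an ordinary data frame, the helper can be tried on a local list such as `list(result = data.frame(doc_id = 1:3))` before running it against the database.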