
Saturday, March 9, 2019

Predictive Modeling Decision Tree

Predict kicks or bad purchases using the Carvana Cleaned and Sampled .jmp file. Create a validation data set with 50% of the data. Use Decision Tree, Regression, and Neural Net approaches for building predictive models. Perform a comparative analysis of the three competing models on the validation data set. Write down your final conclusions on which model performs the best, what is the best cut-off to use, and what is the value added from conducting predictive modeling? Upload the saved file with the assignment.

I created 6 models for this project: DT1, DT2, Reg1, Reg2, Reg3, and NN. After testing, the parameters I used to predict IsBadBuy in all my models are PurchDate, Auction, VehicleAge, Transmission, WheelType, VehOdo, all the MMR variables, VehBCost, IsOnlineSale, and WarrantyCost. Together, those parameters help me get better models (i.e., ROC area > 0.7).

I used a cut-off of 0.6 because, after trying out other cut-offs such as 0.5 and 0.7, the results showed I was either eliminating too many Good Buys or accepting too many Bad Buys. As we know, both situations will hurt the business (i.e., if we want stronger confidence from the model, we will have too many 0s in the result, which means we may accept more Bad Buys by accident). Finally, I decided to use 0.6 as my cut-off to balance the situation.

The best model I chose is Reg2 (the forward regression model), for two reasons. First, Reg2 has the largest ROC area in the Logistic Fit comparison (saved as Logistic16), which is 0.478. Second, it has a relatively low (the second smallest) count in the FalseNegative cell of the Contingency Table among all models. For my second reason, I didn't use overall accuracy because I think a FalseNegative damages the business more than a FalsePositive does: accidentally taking on a Bad Buy costs the company all the repair and fix work.
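To illustrate the cut-off trade-off described above, here is a minimal sketch in plain Python. The labels and predicted probabilities are made-up stand-ins, not the actual Carvana scores or JMP output; they only show how raising the cut-off trades false positives (rejected Good Buys) for false negatives (accepted Bad Buys).

```python
# Sketch: how the classification cut-off shifts the confusion-matrix counts.
# y_true/y_prob below are HYPOTHETICAL examples, not the real model scores.

def confusion_counts(y_true, y_prob, cutoff):
    """Count (TP, FP, TN, FN) when predicting 1 iff probability >= cutoff."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= cutoff else 0
        if pred == 1 and y == 1:
            tp += 1          # correctly flagged Bad Buy
        elif pred == 1 and y == 0:
            fp += 1          # Good Buy wrongly rejected
        elif pred == 0 and y == 0:
            tn += 1          # Good Buy correctly accepted
        else:
            fn += 1          # Bad Buy wrongly accepted
    return tp, fp, tn, fn

# Hypothetical labels (1 = Bad Buy) and predicted P(IsBadBuy = 1).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.85, 0.65, 0.55, 0.70, 0.40, 0.20, 0.30, 0.62]

for cutoff in (0.5, 0.6, 0.7):
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, cutoff)
    print(f"cutoff={cutoff}: TP={tp} FP={fp} TN={tn} FN={fn}")
```

On this toy data, moving the cut-off from 0.5 to 0.7 steadily lowers the false positives while raising the false negatives, which is exactly the balance the 0.6 cut-off tries to strike.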
For the value-added calculation, as we can see in the contingency tables (saved as Contingency 16), the baseline accuracy is 49.89%. The accuracy of Reg2 is 82.49%, so Reg2 provides a lift of 82.49/49.89 = 1.653.
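The lift figure above is just the ratio of the model's accuracy to the baseline accuracy, both taken from the contingency tables. A one-liner check:

```python
# Lift = model accuracy / baseline accuracy, using the two figures
# reported from the contingency tables (in percent).
baseline_accuracy = 49.89  # accuracy of always guessing the majority class
model_accuracy = 82.49     # validation accuracy of Reg2

lift = model_accuracy / baseline_accuracy
print(round(lift, 3))  # → 1.653
```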
