The objective of this study is to help insurance companies make informed decisions about the probability of a person making a claim, based on certain parameters recorded and/or processed when the policy is issued.
Algorithm and Input Data:
The model uses the Microsoft Azure “Two-Class Boosted Decision Tree” algorithm, a powerful ensemble method in which each subsequent tree corrects the errors of the trees before it.
Random data was generated for the following parameters, which we assumed may affect a person's health and could help predict the likelihood of that person making a claim.
Parameters:
1. Age (between 20 and 50)
2. Gender (M/F)
3. Smoking (Yes/No)
4. Drinking (Yes/No)
5. Drugs (Yes/No)
6. Rash Driving Cases (Yes/No)
7. Disease Type-I (Life Threatening) (Yes/No)
8. Disease Type-II (Yes/No)
9. Life Threatening Job (Yes/No)
All parameters apart from Age were generated as Boolean values (0/1). A sample row of the data set follows:
Age | Gender | Smoking | Drinking | Rash driving cases | Disease Type 1 | Disease Type 2 | Life-threatening job | Claims made
34 | 0 (Female) | 0 (No) | 0 (No) | 0 (No) | 0 (No) | 0 (No) | 0 (No) | 0 (No)
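The generation step described above can be sketched in Python using only the standard library. This is a minimal illustration, not the procedure used for the study: the column names are hypothetical (they follow the sample table), the mapping 1 = Male is assumed from the table's 0 = Female, and the post does not say how the "Claims made" label was produced, so the labeling rule and its probabilities below are invented purely for demonstration.

```python
import random

# Hypothetical column names, following the columns of the sample table.
BOOLEAN_COLUMNS = [
    "gender", "smoking", "drinking", "rash_driving_cases",
    "disease_type_1", "disease_type_2", "life_threatening_job",
]

def generate_record(rng):
    """Return one synthetic applicant as a dict of integers."""
    record = {"age": rng.randint(20, 50)}  # randint is inclusive on both ends
    for column in BOOLEAN_COLUMNS:
        record[column] = rng.randint(0, 1)  # 0 = No/Female, 1 = Yes/Male (assumed)
    # Assumed labeling rule (not from the post): each of three risk
    # factors raises the chance that a claim is made.
    risk = (record["drinking"] + record["rash_driving_cases"]
            + record["life_threatening_job"])
    claim_probability = 0.05 + 0.3 * risk  # illustrative numbers only
    record["claims_made"] = 1 if rng.random() < claim_probability else 0
    return record

def generate_dataset(n_rows, seed=0):
    """Generate n_rows records; a fixed seed makes the data reproducible."""
    rng = random.Random(seed)
    return [generate_record(rng) for _ in range(n_rows)]

dataset = generate_dataset(1000)
print(dataset[0])
```

Seeding the generator keeps the data set identical between runs, which makes it easier to compare model results after changing the row count.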
Outcome:
After completing the data modeling, training and model evaluation steps, random data was entered to check the predictions of the algorithm. We found that a person with a drinking habit, rash driving cases and a life-threatening job has a high probability of making a claim to the insurance company.
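To illustrate the boosting idea, where each new learner focuses on the errors of the earlier ones, here is a small pure-Python sketch. It is not Azure's implementation: a simpler technique, AdaBoost with one-level decision stumps, is swapped in for illustration. The toy rule "a claim is made when at least two of the three risk factors are present" and all function names are assumptions made for the example.

```python
import math
from itertools import product

def stump_predict(stump, row):
    """A stump is (feature_index, threshold, polarity); it votes +1 or -1."""
    feature, threshold, polarity = stump
    return polarity if row[feature] > threshold else -polarity

def best_stump(rows, labels, weights):
    """Pick the stump with the lowest weighted error under current weights."""
    best, best_error = None, float("inf")
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                error = sum(w for row, y, w in zip(rows, labels, weights)
                            if stump_predict(stump, row) != y)
                if error < best_error:
                    best, best_error = stump, error
    return best, best_error

def train_boosted_stumps(rows, labels, n_rounds=3):
    """AdaBoost: after each round, misclassified rows get a larger weight."""
    weights = [1.0 / len(rows)] * len(rows)
    ensemble = []
    for _ in range(n_rounds):
        stump, error = best_stump(rows, labels, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - error) / error)  # vote strength
        ensemble.append((alpha, stump))
        weights = [w * math.exp(-alpha * y * stump_predict(stump, row))
                   for row, y, w in zip(rows, labels, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, row):
    """Weighted vote of all stumps: +1 means 'claim', -1 means 'no claim'."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score > 0 else -1

# Toy features: (drinking, rash_driving_cases, life_threatening_job).
rows = list(product((0, 1), repeat=3))
# Assumed rule for the toy labels: claim when two or more factors are present.
labels = [1 if sum(row) >= 2 else -1 for row in rows]

ensemble = train_boosted_stumps(rows, labels, n_rounds=3)
print(predict(ensemble, (1, 1, 1)))  # applicant with all three risk factors
```

No single stump can represent the "two of three" rule, but the weighted vote of three stumps can, which is the point of combining weak learners.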
Suggested Changes:
After the demo, the following changes were suggested:
1. Change some column types from numerical to categorical to simulate real-world scenarios.
2. Generate graphs for data presentation and display.
3. Increase the number of input records and check whether performance improves.
4. Build a sample single-page application.
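The first suggestion, converting numerical columns to categorical ones, could look roughly like the sketch below. The column names, the age bands and the mapping 1 = Male are assumptions for illustration; only 0 = Female and 0 = No come from the sample row.

```python
YES_NO = {0: "No", 1: "Yes"}
GENDER = {0: "Female", 1: "Male"}  # 0 = Female per the sample row; 1 = Male assumed

def age_band(age):
    """Bucket an age between 20 and 50 into a hypothetical band."""
    if age < 30:
        return "20-29"
    if age < 40:
        return "30-39"
    return "40-50"

def to_categorical(record):
    """Return a copy of the record with labels instead of numeric codes."""
    converted = {}
    for column, value in record.items():
        if column == "age":
            converted[column] = age_band(value)
        elif column == "gender":
            converted[column] = GENDER[value]
        else:
            converted[column] = YES_NO[value]
    return converted

# The sample row from the data set, with hypothetical column names.
sample = {
    "age": 34, "gender": 0, "smoking": 0, "drinking": 0,
    "rash_driving_cases": 0, "disease_type_1": 0, "disease_type_2": 0,
    "life_threatening_job": 0, "claims_made": 0,
}
print(to_categorical(sample))
```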