
    Insurance Claim Analysis Using Extreme Gradient Boosting Trees-A Machine Learning Approach

    Publication Date
    2024
    Author
    KOLLONGEI, Naomi
    Abstract/Overview
    The emergence of big data has changed how insurance companies handle the data they receive in the course of business. Big data involves huge volumes of data of different varieties, so the methods currently used for analysis in the insurance sector, such as statistical methods and actuarial formulas, are becoming inadequate for the problems and opportunities that arise from advances in technology. Moreover, the data may be prone to missing values. The Extreme Gradient Boosting algorithm (XGBoost) is an ensemble learning method with the capacity to address both of these data characteristics effectively. This research used XGBoost to process insurance claim data in order to model claim frequency and claim severity for claim prediction. XGBoost builds tree-based models by iteratively fitting decision trees to the residuals of the previous predictions, reducing the error in each iteration. Using the algorithm, we aimed to enhance prediction accuracy and so obtain better estimates for risk assessment and pricing of insurance products. The XGBoost models were evaluated using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and R-squared (RSQ). The XGBoost claim frequency model had an RMSE of 0.949, an MAE of 0.7741 and an RSQ of 0.781; the claim severity model had an RMSE of 899.12, an MAE of 736.77 and an RSQ of 0.9625. We also compared the performance of the XGBoost models with a zero-inflated Poisson model, multiple linear regression and a generalized Pareto model. The XGBoost models had the best RMSE, MAE and RSQ values, and we therefore concluded that the Extreme Gradient Boosting model was the optimal model.
    Key words: Big data, Frequency, Severity, machine learning, gradient boost, XGBoost
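The residual-fitting idea and the three evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the thesis code and it is not XGBoost itself: it is plain gradient boosting with squared-error loss and one-split trees (stumps), without the regularisation, second-order approximation or missing-value handling that XGBoost adds. The toy data, the function names and the settings `n_rounds` and `learning_rate` are all assumptions made for illustration.

```python
import math

def rmse(y, p):
    """Root Mean Squared Error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, p)) / len(y))

def mae(y, p):
    """Mean Absolute Error."""
    return sum(abs(a - b) for a, b in zip(y, p)) / len(y)

def r_squared(y, p):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, p))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def fit_stump(x, residuals):
    """Find the single split on x that minimises squared error on the residuals."""
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def boost(x, y, n_rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        # Shrink each stump's correction by the learning rate before adding it.
        pred = [pi + learning_rate * (left_mean if xi <= threshold else right_mean)
                for xi, pi in zip(x, pred)]
    return base, stumps, pred

# Hypothetical toy data (not from the thesis): one feature, one target.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1.0, 1.2, 0.9, 2.1, 2.0, 2.2, 3.9, 4.1, 4.0, 4.2]

base, stumps, pred = boost(x, y)
baseline = [base] * len(y)
print("baseline RMSE:", rmse(y, baseline))
print("boosted  RMSE:", rmse(y, pred))
print("boosted  MAE: ", mae(y, pred))
print("boosted  RSQ: ", r_squared(y, pred))
```

Each round lowers the training error because the stump is fitted to what the current ensemble still gets wrong. The metrics here are computed on the training data only, so they show the mechanics rather than a fair estimate of predictive accuracy.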
    Permalink
    https://repository.maseno.ac.ke/handle/123456789/6305
    Collections
    • Statistics and Actuarial Science [31]

    Maseno University. All rights reserved | Copyright © 2022 
    Contact Us | Send Feedback

     

     
