Article: Imbalanced learning for insurance using modified loss functions in tree-based models

Title: Imbalanced learning for insurance using modified loss functions in tree-based models
Authors: Hu, Changyue; Quan, Zhiyu; Chong, Wing Fung
Keywords: Canberra distance; Custom loss; Imbalanced learning; Predictive model of insurance claims; Regression tree; Tree-based algorithms
Issue Date: 2022
Citation: Insurance Mathematics and Economics, 2022, v. 106, p. 13-32
Abstract: Tree-based models have gained momentum in insurance claim loss modeling; however, the point mass at zero and the heavy tail of insurance loss distribution pose the challenge to apply conventional methods directly to claim loss modeling. With a simple illustrative dataset, we first demonstrate how the traditional tree-based algorithm's splitting function fails to cope with a large proportion of data with zero responses. To address the imbalance issue presented in such loss modeling, this paper aims to modify the traditional splitting function of Classification and Regression Tree (CART). In particular, we propose two novel modified loss functions, namely, the weighted sum of squared error and the sum of squared Canberra error. These modified loss functions impose a significant penalty on grouping observations of non-zero response with those of zero response at the splitting procedure, and thus significantly enhance their separation. Finally, we examine and compare the predictive performance of such modified tree-based models to the traditional model on synthetic datasets that imitate insurance loss. The results show that such modification leads to substantially different tree structures and improved prediction performance.
Persistent Identifier: http://hdl.handle.net/10722/363454
ISSN: 0167-6687
2023 Impact Factor: 1.9
2023 SCImago Journal Rankings: 1.113

 

DC Field | Value | Language
dc.contributor.author | Hu, Changyue | -
dc.contributor.author | Quan, Zhiyu | -
dc.contributor.author | Chong, Wing Fung | -
dc.date.accessioned | 2025-10-10T07:46:58Z | -
dc.date.available | 2025-10-10T07:46:58Z | -
dc.date.issued | 2022 | -
dc.identifier.citation | Insurance Mathematics and Economics, 2022, v. 106, p. 13-32 | -
dc.identifier.issn | 0167-6687 | -
dc.identifier.uri | http://hdl.handle.net/10722/363454 | -
dc.description.abstract | Tree-based models have gained momentum in insurance claim loss modeling; however, the point mass at zero and the heavy tail of insurance loss distribution pose the challenge to apply conventional methods directly to claim loss modeling. With a simple illustrative dataset, we first demonstrate how the traditional tree-based algorithm's splitting function fails to cope with a large proportion of data with zero responses. To address the imbalance issue presented in such loss modeling, this paper aims to modify the traditional splitting function of Classification and Regression Tree (CART). In particular, we propose two novel modified loss functions, namely, the weighted sum of squared error and the sum of squared Canberra error. These modified loss functions impose a significant penalty on grouping observations of non-zero response with those of zero response at the splitting procedure, and thus significantly enhance their separation. Finally, we examine and compare the predictive performance of such modified tree-based models to the traditional model on synthetic datasets that imitate insurance loss. The results show that such modification leads to substantially different tree structures and improved prediction performance. | -
dc.language | eng | -
dc.relation.ispartof | Insurance Mathematics and Economics | -
dc.subject | Canberra distance | -
dc.subject | Custom loss | -
dc.subject | Imbalanced learning | -
dc.subject | Predictive model of insurance claims | -
dc.subject | Regression tree | -
dc.subject | Tree-based algorithms | -
dc.title | Imbalanced learning for insurance using modified loss functions in tree-based models | -
dc.type | Article | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1016/j.insmatheco.2022.04.010 | -
dc.identifier.scopus | eid_2-s2.0-85130154536 | -
dc.identifier.volume | 106 | -
dc.identifier.spage | 13 | -
dc.identifier.epage | 32 | -
