Article: Choosing the number of factors in factor analysis with incomplete data via a novel hierarchical Bayesian information criterion

Title: Choosing the number of factors in factor analysis with incomplete data via a novel hierarchical Bayesian information criterion
Authors: Zhao, Jianhua; Shang, Changchun; Li, Shulan; Xin, Ling; Yu, Philip L H
Keywords: 62D10; 62F07; 62F99; 62H25; BIC; Factor analysis; Incomplete data; Maximum likelihood; Model selection; Variational Bayesian
Issue Date: 7-Mar-2024
Publisher: Springer
Citation: Advances in Data Analysis and Classification, 2024
Abstract

The Bayesian information criterion (BIC), defined as the observed-data log-likelihood minus a penalty term based on the sample size N, is a popular model selection criterion for factor analysis with complete data. This definition has also been suggested for incomplete data. However, the penalty term based on the ‘complete’ sample size N is the same regardless of whether the data are complete or incomplete. For incomplete data, there are often only N_i < N observations for variable i, which means that using the ‘complete’ sample size N implausibly ignores the amount of missing information inherent in incomplete data. Motivated by this observation, a novel hierarchical BIC (HBIC) criterion is proposed for factor analysis with incomplete data, denoted by HBIC_inc. The novelty is that HBIC_inc uses only the actual amounts of observed information, namely the N_i's, in the penalty term. Theoretically, it is shown that HBIC_inc is a large-sample approximation of the variational Bayesian (VB) lower bound, and BIC is a further approximation of HBIC_inc, which means that HBIC_inc shares the theoretical consistency of BIC. Experiments on synthetic and real data sets are conducted to assess the finite-sample performance of HBIC_inc, BIC, and related criteria under various missing rates. The results show that HBIC_inc and BIC perform similarly when the missing rate is small, but HBIC_inc is more accurate when the missing rate is not small.
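To illustrate the contrast the abstract describes, the sketch below compares a standard BIC-style penalty, which charges every free parameter against the full sample size N, with an HBIC_inc-style penalty that charges each variable's parameters against only its observed count N_i. This is a hypothetical illustration of the idea only, not the article's exact formula; the function names, the per-variable parameter counts `d_per_var`, and the observed counts `N_obs` are assumptions introduced here for demonstration.

```python
import numpy as np

def bic_penalty(d, N):
    """BIC-style penalty: all d free parameters share the full sample size N."""
    return 0.5 * d * np.log(N)

def hbic_inc_penalty(d_per_var, N_obs):
    """Sketch of an HBIC_inc-style penalty for incomplete data (illustrative only):
    variable i's d_i free parameters are penalized with the number of actually
    observed cases N_i for that variable, rather than the 'complete' N."""
    return 0.5 * sum(d_i * np.log(N_i) for d_i, N_i in zip(d_per_var, N_obs))

# Hypothetical example: 3 variables with 2 free parameters each, N = 100 rows,
# but the variables are observed on only 100, 80, and 60 rows respectively.
d_per_var = [2, 2, 2]
N_obs = [100, 80, 60]
print(bic_penalty(sum(d_per_var), 100))    # penalty using the 'complete' N
print(hbic_inc_penalty(d_per_var, N_obs))  # smaller: reflects the missing data
```

With no missing data (every N_i equal to N) the two penalties coincide, which matches the abstract's observation that HBIC_inc and BIC behave similarly at small missing rates and diverge as the missing rate grows.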


Persistent Identifier: http://hdl.handle.net/10722/351235
ISSN: 1862-5347
2023 Impact Factor: 1.4
2023 SCImago Journal Rankings: 0.594

 

DC Field: Value
dc.contributor.author: Zhao, Jianhua
dc.contributor.author: Shang, Changchun
dc.contributor.author: Li, Shulan
dc.contributor.author: Xin, Ling
dc.contributor.author: Yu, Philip L H
dc.date.accessioned: 2024-11-15T00:39:53Z
dc.date.available: 2024-11-15T00:39:53Z
dc.date.issued: 2024-03-07
dc.identifier.citation: Advances in Data Analysis and Classification, 2024
dc.identifier.issn: 1862-5347
dc.identifier.uri: http://hdl.handle.net/10722/351235
dc.description.abstract: <p>The Bayesian information criterion (BIC), defined as the observed-data log-likelihood minus a penalty term based on the sample size <em>N</em>, is a popular model selection criterion for factor analysis with complete data. This definition has also been suggested for incomplete data. However, the penalty term based on the ‘complete’ sample size <em>N</em> is the same regardless of whether the data are complete or incomplete. For incomplete data, there are often only <em>N<sub>i</sub></em> &lt; <em>N</em> observations for variable <em>i</em>, which means that using the ‘complete’ sample size <em>N</em> implausibly ignores the amount of missing information inherent in incomplete data. Motivated by this observation, a novel hierarchical BIC (HBIC) criterion is proposed for factor analysis with incomplete data, denoted by HBIC<sub>inc</sub>. The novelty is that HBIC<sub>inc</sub> uses only the actual amounts of observed information, namely the <em>N<sub>i</sub></em>’s, in the penalty term. Theoretically, it is shown that HBIC<sub>inc</sub> is a large-sample approximation of the variational Bayesian (VB) lower bound, and BIC is a further approximation of HBIC<sub>inc</sub>, which means that HBIC<sub>inc</sub> shares the theoretical consistency of BIC. Experiments on synthetic and real data sets are conducted to assess the finite-sample performance of HBIC<sub>inc</sub>, BIC, and related criteria under various missing rates. The results show that HBIC<sub>inc</sub> and BIC perform similarly when the missing rate is small, but HBIC<sub>inc</sub> is more accurate when the missing rate is not small.</p>
dc.language: eng
dc.publisher: Springer
dc.relation.ispartof: Advances in Data Analysis and Classification
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject: 62D10
dc.subject: 62F07
dc.subject: 62F99
dc.subject: 62H25
dc.subject: BIC
dc.subject: Factor analysis
dc.subject: Incomplete data
dc.subject: Maximum likelihood
dc.subject: Model selection
dc.subject: Variational Bayesian
dc.title: Choosing the number of factors in factor analysis with incomplete data via a novel hierarchical Bayesian information criterion
dc.type: Article
dc.identifier.doi: 10.1007/s11634-024-00582-w
dc.identifier.scopus: eid_2-s2.0-85186885798
dc.identifier.eissn: 1862-5355
dc.identifier.issnl: 1862-5355
