Coupling mobile phone data with machine learning: How misclassification errors in ambient PM2.5 exposure estimates are produced?

Guo, H; Zhan, Q; Ho, HC; Yao, F; Zhou, X; Wu, J; Li, W

Links for fulltext

Publisher Website: 10.1016/j.scitotenv.2020.141034
Scopus: eid_2-s2.0-85088741111
PMID: 32758750
WOS: WOS:000579365600071

Appears in Collections:
- Urban Planning & Design: Journal/Magazine Articles

Title	Coupling mobile phone data with machine learning: How misclassification errors in ambient PM2.5 exposure estimates are produced?
Authors	Guo, H; Zhan, Q; Ho, HC; Yao, F; Zhou, X; Wu, J; Li, W
Keywords	PM2.5 exposure estimate; Misclassification errors; Mobile phone location data; Machine learning
Issue Date	2020
Publisher	Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/scitotenv
Citation	Science of the Total Environment, 2020, v. 745, p. article no. 141034. DOI: http://dx.doi.org/10.1016/j.scitotenv.2020.141034
Abstract	Background: Most studies that rely on time-activity diaries or traditional air pollution modelling approaches cannot adequately show how ignoring individual mobility and air pollution variations affects misclassification errors in exposure estimates. Moreover, very few studies have examined whether such impacts differ across socioeconomic groups. Objectives: We aim to examine how ignoring individual mobility and PM2.5 variations produces misclassification errors in ambient PM2.5 exposure estimates. Methods: We developed a geo-informed backward propagation neural network model to estimate hourly PM2.5 concentrations from remote sensing and geospatial big data. Combining the estimated PM2.5 concentrations with individual trajectories derived from 755,468 mobile phone users on a weekday in Shenzhen, China, we estimated four types of individual total PM2.5 exposure during weekdays at multiple temporal scales. Each estimate ignoring individual mobility, PM2.5 variations, or both was compared with the hypothetical error-free estimate using a paired-sample t-test. We then quantified the exposure misclassification error using Pearson correlation analysis. Moreover, we examined whether the misclassification error differs across socioeconomic groups. Taking the findings for ignoring individual mobility as an example, we further investigated whether they are robust to different selections of time. Results: We found that the estimates ignoring PM2.5 variations, individual mobility, or both were statistically different from the hypothetical error-free estimate. Ignoring both factors produced the largest exposure misclassification error. The misclassification error was larger in the estimate ignoring PM2.5 variations than in the estimate ignoring individual mobility. People with high economic status were subject to a larger exposure misclassification error. The findings were robust to different selections of time.
Conclusions: Ignoring individual mobility, PM2.5 variations, or both leads to misclassification errors in ambient PM2.5 exposure estimates. A larger misclassification error occurs in the estimate neglecting PM2.5 variations than in the estimate ignoring individual mobility, a finding that has seldom been reported before.
Persistent Identifier	http://hdl.handle.net/10722/286522
ISSN	0048-9697 (2021 Impact Factor: 10.753; 2020 SCImago Journal Rankings: 1.795)
ISI Accession Number ID	WOS:000579365600071
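
The comparison described in the abstract (four exposure estimates, each simplified estimate tested against the error-free one with a paired-sample t-test and a Pearson correlation) might be sketched as follows. This is only an illustrative sketch on synthetic data, not the authors' implementation: the abstract does not define the four estimates precisely, so the definitions below (home location standing in for "no mobility", a daily mean concentration standing in for "no PM2.5 variation"), the 08:00-20:00 exposure window, and all names such as `simulate`, `exposures`, `paired_t` and `pearson_r` are assumptions introduced for illustration.

```python
import math
import random
import statistics

HOURS = 24
WINDOW = range(8, 20)  # hypothetical exposure window (08:00-20:00), an assumption


def simulate(n_people=500, n_cells=40, seed=0):
    """Synthetic stand-in for the real inputs (not the study's data)."""
    rng = random.Random(seed)
    # conc[cell][hour]: made-up hourly PM2.5 concentration in ug/m3
    conc = [[rng.uniform(10.0, 60.0) for _ in range(HOURS)] for _ in range(n_cells)]
    traj = []
    for _ in range(n_people):
        home, work = rng.randrange(n_cells), rng.randrange(n_cells)
        # traj[person][hour]: cell occupied each hour (work 09:00-18:00, else home)
        traj.append([work if 9 <= h < 18 else home for h in range(HOURS)])
    return conc, traj


def exposures(conc, traj, window=WINDOW):
    """Return four cumulative exposure estimates per person over `window`."""
    daily_mean = [statistics.fmean(row) for row in conc]
    est = {"reference": [], "no_mobility": [], "no_variation": [], "neither": []}
    for path in traj:
        home = path[0]  # assumption: the 00:00 location is the residence
        # Reference: hourly concentration at the hourly location.
        est["reference"].append(sum(conc[path[h]][h] for h in window))
        # Ignoring mobility: hourly concentration, person fixed at home.
        est["no_mobility"].append(sum(conc[home][h] for h in window))
        # Ignoring PM2.5 variation: daily mean concentration along the path.
        est["no_variation"].append(sum(daily_mean[path[h]] for h in window))
        # Ignoring both: daily mean concentration at home.
        est["neither"].append(len(window) * daily_mean[home])
    return est


def paired_t(x, y):
    """Paired-sample t statistic: mean difference over its standard error."""
    d = [a - b for a, b in zip(x, y)]
    return statistics.fmean(d) / (statistics.stdev(d) / math.sqrt(len(d)))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    conc, traj = simulate()
    est = exposures(conc, traj)
    for name in ("no_mobility", "no_variation", "neither"):
        t = paired_t(est[name], est["reference"])
        r = pearson_r(est[name], est["reference"])
        print(f"{name}: t = {t:.2f}, r = {r:.3f}")
```

In this sketch a lower correlation with the reference estimate is read as a larger misclassification error; the numbers it prints come from random synthetic inputs and say nothing about the study's actual results.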

DC Field	Value	Language
dc.contributor.author	Guo, H	-
dc.contributor.author	Zhan, Q	-
dc.contributor.author	Ho, HC	-
dc.contributor.author	Yao, F	-
dc.contributor.author	Zhou, X	-
dc.contributor.author	Wu, J	-
dc.contributor.author	Li, W	-
dc.date.accessioned	2020-08-31T07:05:01Z	-
dc.date.available	2020-08-31T07:05:01Z	-
dc.date.issued	2020	-
dc.identifier.citation	Science of the Total Environment, 2020, v. 745, p. article no. 141034	-
dc.identifier.issn	0048-9697	-
dc.identifier.uri	http://hdl.handle.net/10722/286522	-
dc.language	eng	-
dc.publisher	Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/scitotenv	-
dc.relation.ispartof	Science of the Total Environment	-
dc.subject	PM2.5 exposure estimate	-
dc.subject	Misclassification errors	-
dc.subject	Mobile phone location data	-
dc.subject	Machine learning	-
dc.title	Coupling mobile phone data with machine learning: How misclassification errors in ambient PM2.5 exposure estimates are produced?	-
dc.type	Article	-
dc.identifier.email	Ho, HC: hcho21@hku.hk	-
dc.identifier.email	Li, W: wfli@hku.hk	-
dc.identifier.authority	Ho, HC=rp02482	-
dc.identifier.authority	Li, W=rp01507	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1016/j.scitotenv.2020.141034	-
dc.identifier.pmid	32758750	-
dc.identifier.scopus	eid_2-s2.0-85088741111	-
dc.identifier.hkuros	313416	-
dc.identifier.volume	745	-
dc.identifier.spage	article no. 141034	-
dc.identifier.epage	article no. 141034	-
dc.identifier.isi	WOS:000579365600071	-
dc.publisher.place	Netherlands	-
dc.identifier.issnl	0048-9697	-
