Article: Exploration of the Hidden Influential Factors on Crime Activities: A Big Data Approach

Title: Exploration of the Hidden Influential Factors on Crime Activities: A Big Data Approach
Authors: Zhou, Jianming; Li, Zheng; Ma, Jack J.; Jiang, Feifeng
Keywords: felony assault; recursive feature elimination; feature analysis; Big data techniques; machine learning; gradient boost decision tree
Issue Date: 2020
Citation: IEEE Access, 2020, v. 8, p. 141033-141045
Abstract: Crime has long been a major concern in every country. Analyzing crime data is a key part of discovering crime patterns and reducing crime, yet it remains a considerable challenge. In recent years, with the development of data collection and data mining techniques, many big data studies have analyzed crime data. Identifying the numerical influential factors is an important yet challenging problem, especially for indirect features. Although a number of studies have analyzed the influential factors of crime, most have limitations in the era of 'big data'. Some adopted linear statistical methods, whose basic assumption runs contrary to the non-linear real world. Some limited the factors studied to one or two aspects. Some overlooked the importance of ranking the influence of factors. To fill these research gaps, this paper proposes a big data approach to analyzing the influential factors of crime and tests it on New York City. More than 1515 different factors spanning demographics, housing, education, economy, social conditions, and city planning were considered and analyzed. The proposed framework combines non-linear machine learning algorithms with a geographical information system (GIS) to study the spatial determinants of crime. Recursive feature elimination (RFE) is used to select the optimum feature set. The performance of gradient boost decision tree (GBDT), logistic regression (LR), support vector machine (SVM), artificial neural network (ANN), and random forest (RF) models is compared to identify the optimum model. Important impact factors are then investigated using GBDT and GIS. The experimental results demonstrate that the combined GBDT and GIS model can identify the most important factors of crime rate with high efficiency and accuracy.
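The three steps named in the abstract (RFE feature selection, model comparison, GBDT importance ranking) could be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, uses synthetic data in place of the New York City factors, and treats crime level as a hypothetical binary label. All sizes and hyperparameters are arbitrary choices for the example.

```python
# Illustrative sketch only: synthetic data stands in for the paper's NYC factors.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical data: 300 areas, 12 candidate factors, binary high/low crime label.
X, y = make_classification(n_samples=300, n_features=12, n_informative=4,
                           n_redundant=2, random_state=0)

# Step 1: RFE refits the GBDT and drops the least important feature
# one at a time until 5 features remain.
selector = RFE(GradientBoostingClassifier(n_estimators=30, random_state=0),
               n_features_to_select=5, step=1)
selector.fit(X, y)
X_sel = X[:, selector.support_]

# Step 2: compare the five model families by 3-fold cross-validated accuracy.
# Note: selecting features on all the data before cross-validating leaks
# information; a real study would nest the selector inside each fold.
models = {
    "GBDT": GradientBoostingClassifier(n_estimators=30, random_state=0),
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "ANN": make_pipeline(StandardScaler(),
                         MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                                       random_state=0)),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
}
scores = {name: cross_val_score(model, X_sel, y, cv=3).mean()
          for name, model in models.items()}

# Step 3: rank the retained features by GBDT impurity-based importance.
gbdt = GradientBoostingClassifier(n_estimators=30, random_state=0).fit(X_sel, y)
kept = [i for i, keep in enumerate(selector.support_) if keep]
ranking = sorted(zip(kept, gbdt.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)

for name, score in scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
for index, importance in ranking:
    print(f"feature {index}: importance = {importance:.3f}")
```

The GIS component described in the abstract is not covered here; mapping the ranked factors spatially would need a separate geospatial toolchain.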
Persistent Identifier: http://hdl.handle.net/10722/286815
ISI Accession Number ID: WOS:000556692000001

 

DC Field | Value | Language
dc.contributor.author | Zhou, Jianming | -
dc.contributor.author | Li, Zheng | -
dc.contributor.author | Ma, Jack J. | -
dc.contributor.author | Jiang, Feifeng | -
dc.date.accessioned | 2020-09-07T11:45:44Z | -
dc.date.available | 2020-09-07T11:45:44Z | -
dc.date.issued | 2020 | -
dc.identifier.citation | IEEE Access, 2020, v. 8, p. 141033-141045 | -
dc.identifier.uri | http://hdl.handle.net/10722/286815 | -
dc.description.abstract | Crime has long been a major concern in every country. Analyzing crime data is a key part of discovering crime patterns and reducing crime, yet it remains a considerable challenge. In recent years, with the development of data collection and data mining techniques, many big data studies have analyzed crime data. Identifying the numerical influential factors is an important yet challenging problem, especially for indirect features. Although a number of studies have analyzed the influential factors of crime, most have limitations in the era of 'big data'. Some adopted linear statistical methods, whose basic assumption runs contrary to the non-linear real world. Some limited the factors studied to one or two aspects. Some overlooked the importance of ranking the influence of factors. To fill these research gaps, this paper proposes a big data approach to analyzing the influential factors of crime and tests it on New York City. More than 1515 different factors spanning demographics, housing, education, economy, social conditions, and city planning were considered and analyzed. The proposed framework combines non-linear machine learning algorithms with a geographical information system (GIS) to study the spatial determinants of crime. Recursive feature elimination (RFE) is used to select the optimum feature set. The performance of gradient boost decision tree (GBDT), logistic regression (LR), support vector machine (SVM), artificial neural network (ANN), and random forest (RF) models is compared to identify the optimum model. Important impact factors are then investigated using GBDT and GIS. The experimental results demonstrate that the combined GBDT and GIS model can identify the most important factors of crime rate with high efficiency and accuracy. | -
dc.language | eng | -
dc.relation.ispartof | IEEE Access | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject | felony assault | -
dc.subject | recursive feature elimination | -
dc.subject | feature analysis | -
dc.subject | Big data techniques | -
dc.subject | machine learning | -
dc.subject | gradient boost decision tree | -
dc.title | Exploration of the Hidden Influential Factors on Crime Activities: A Big Data Approach | -
dc.type | Article | -
dc.description.nature | published_or_final_version | -
dc.identifier.doi | 10.1109/ACCESS.2020.3009969 | -
dc.identifier.scopus | eid_2-s2.0-85089551399 | -
dc.identifier.volume | 8 | -
dc.identifier.spage | 141033 | -
dc.identifier.epage | 141045 | -
dc.identifier.eissn | 2169-3536 | -
dc.identifier.isi | WOS:000556692000001 | -
dc.identifier.issnl | 2169-3536 | -
