High-resolution cross-scale transformer: A deep learning model for bolt loosening detection based on monocular vision measurement

Wu, Tianyi; Shang, Ke; Dai, Wei; Wang, Min; Liu, Rui; Zhou, Junxian; Liu, Jun

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1016/j.engappai.2024.108574
Scopus: eid_2-s2.0-85192973687

Citations:
- Scopus: 0
Appears in Collections:
- Industrial & Manufacturing Systems Engineering: Journal/Magazine Articles

Article: High-resolution cross-scale transformer: A deep learning model for bolt loosening detection based on monocular vision measurement

Title	High-resolution cross-scale transformer: A deep learning model for bolt loosening detection based on monocular vision measurement
Authors	Wu, Tianyi; Shang, Ke; Dai, Wei; Wang, Min; Liu, Rui; Zhou, Junxian; Liu, Jun
Keywords	Connection loosening detection; High-resolution architecture; Monocular vision measurement; Vision transformer
Issue Date	2024
Citation	Engineering Applications of Artificial Intelligence, 2024, v. 133, article no. 108574. DOI: http://dx.doi.org/10.1016/j.engappai.2024.108574
Abstract	The reliability of bolt connections significantly impacts the operational state and lifespan of industrial equipment. Vision-based noncontact methods exhibit high efficiency in bolt loosening detection. However, limited image features hinder measurement accuracy. To improve bolt loosening detection performance, this paper proposes a novel deep learning backbone, the high-resolution cross-scale transformer, to extract high-precision keypoints for constructing a three-dimensional bolt model. Simultaneously, a monocular vision measurement model is established to obtain the exposed length of the bolt and evaluate the loosening state of the connection. The proposed backbone combines the advantages of the high-resolution architecture and the transformer, achieving both global information aggregation and fine-grained image detail. A simplified module, dual-scale multi-head self-attention, is designed to reduce the computational redundancy introduced by the high-resolution multi-branch architecture. In the experiments, the high-resolution cross-scale transformer outperforms other keypoint detection baselines, achieving the best performance with 91.6 average precision and 84.9 average recall. The monocular vision measurement model achieves a 0.053 mm error with a 0.028 mm standard deviation, satisfying the requirements for industrial implementation. Additionally, the model is tested in different industrial situations and on an additional external dataset, indicating its robustness and adaptability to real environments.
Persistent Identifier	http://hdl.handle.net/10722/350072
ISSN	0952-1976; 2023 Impact Factor: 7.5; 2023 SCImago Journal Rankings: 1.749

DC Field	Value	Language
dc.contributor.author	Wu, Tianyi	-
dc.contributor.author	Shang, Ke	-
dc.contributor.author	Dai, Wei	-
dc.contributor.author	Wang, Min	-
dc.contributor.author	Liu, Rui	-
dc.contributor.author	Zhou, Junxian	-
dc.contributor.author	Liu, Jun	-
dc.date.accessioned	2024-10-17T07:02:53Z	-
dc.date.available	2024-10-17T07:02:53Z	-
dc.date.issued	2024	-
dc.identifier.citation	Engineering Applications of Artificial Intelligence, 2024, v. 133, article no. 108574	-
dc.identifier.issn	0952-1976	-
dc.identifier.uri	http://hdl.handle.net/10722/350072	-
dc.description.abstract	The reliability of bolt connections significantly impacts the operational state and lifespan of industrial equipment. Vision-based noncontact methods exhibit high efficiency in bolt loosening detection. However, limited image features hinder measurement accuracy. To improve bolt loosening detection performance, this paper proposes a novel deep learning backbone, the high-resolution cross-scale transformer, to extract high-precision keypoints for constructing a three-dimensional bolt model. Simultaneously, a monocular vision measurement model is established to obtain the exposed length of the bolt and evaluate the loosening state of the connection. The proposed backbone combines the advantages of the high-resolution architecture and the transformer, achieving both global information aggregation and fine-grained image detail. A simplified module, dual-scale multi-head self-attention, is designed to reduce the computational redundancy introduced by the high-resolution multi-branch architecture. In the experiments, the high-resolution cross-scale transformer outperforms other keypoint detection baselines, achieving the best performance with 91.6 average precision and 84.9 average recall. The monocular vision measurement model achieves a 0.053 mm error with a 0.028 mm standard deviation, satisfying the requirements for industrial implementation. Additionally, the model is tested in different industrial situations and on an additional external dataset, indicating its robustness and adaptability to real environments.	-
dc.language	eng	-
dc.relation.ispartof	Engineering Applications of Artificial Intelligence	-
dc.subject	Connection loosening detection	-
dc.subject	High-resolution architecture	-
dc.subject	Monocular vision measurement	-
dc.subject	Vision transformer	-
dc.title	High-resolution cross-scale transformer: A deep learning model for bolt loosening detection based on monocular vision measurement	-
dc.type	Article	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1016/j.engappai.2024.108574	-
dc.identifier.scopus	eid_2-s2.0-85192973687	-
dc.identifier.volume	133	-
dc.identifier.spage	article no. 108574	-
dc.identifier.epage	article no. 108574	-
