Unsupervised Light Field Depth Estimation via Multi-view Feature Matching with Occlusion Prediction

Zhang, Shansi; Meng, Nan; Lam, Edmund Y.


Publisher Website: 10.1109/TCSVT.2023.3305978
Scopus: eid_2-s2.0-85168738597

Supplementary

Citations:
- Scopus: 0
Appears in Collections:
- Orthopaedics & Traumatology: Journal/Magazine Articles

Article: Unsupervised Light Field Depth Estimation via Multi-view Feature Matching with Occlusion Prediction

Title	Unsupervised Light Field Depth Estimation via Multi-view Feature Matching with Occlusion Prediction
Authors	Zhang, Shansi; Meng, Nan; Lam, Edmund Y.
Keywords	Convolutional neural networks; Costs; Estimation; Feature extraction; feature matching; Image edge detection; Light field; occlusion prediction; Training; Training data; unsupervised depth estimation
Issue Date	2023
Citation	IEEE Transactions on Circuits and Systems for Video Technology, 2023. DOI: http://dx.doi.org/10.1109/TCSVT.2023.3305978
Abstract	Depth estimation from light field (LF) images is a fundamental step for numerous applications. Recently, learning-based methods have achieved higher accuracy and efficiency than the traditional methods. However, it is costly to obtain sufficient depth labels for supervised training. In this paper, we propose an unsupervised framework to estimate depth from LF images. First, we design a disparity estimation network (DispNet) with a coarse-to-fine structure to predict disparity maps from different view combinations. It explicitly performs multi-view feature matching to learn the correspondences effectively. As occlusions may cause the violation of photo-consistency, we introduce an occlusion prediction network (OccNet) to predict the occlusion maps, which are used as the element-wise weights of photometric loss to solve the occlusion issue and assist the disparity learning. With the disparity maps estimated by multiple input combinations, we then propose a disparity fusion strategy based on the estimated errors with effective occlusion handling to obtain the final disparity map with higher accuracy. Experimental results demonstrate that our method achieves superior performance on both the dense and sparse LF images, and also shows better robustness and generalization on the real-world LF images compared to the other methods.
Persistent Identifier	http://hdl.handle.net/10722/330491
ISSN	1051-8215
2023 Impact Factor	8.3
2023 SCImago Journal Rankings	2.299
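
The abstract describes two ideas that a small numerical sketch may help make concrete: occlusion maps used as element-wise weights of the photometric loss, and fusion of several disparity maps based on their estimated errors. The NumPy code below is only an illustration under stated assumptions, not the authors' implementation. The function names, the array shapes, the L1 error, the normalization by the weight sum, and the per-pixel minimum-error fusion rule are all hypothetical choices made for this example; the record does not specify any of them.

```python
import numpy as np


def occlusion_weighted_photometric_loss(center, warped_views, occlusion_weights, eps=1e-8):
    """Weighted mean absolute error between a reference view and warped views.

    Assumed (hypothetical) shapes:
      center:            (H, W)    reference view
      warped_views:      (N, H, W) other views already warped to the reference
      occlusion_weights: (N, H, W) values in [0, 1], low where a pixel is occluded
    """
    center = np.asarray(center, dtype=np.float64)
    warped = np.asarray(warped_views, dtype=np.float64)
    weights = np.asarray(occlusion_weights, dtype=np.float64)
    # Per-pixel absolute photometric error for every warped view.
    abs_err = np.abs(warped - center[None, :, :])
    # Occluded pixels get small weights, so they contribute little to the loss.
    return float((weights * abs_err).sum() / (weights.sum() + eps))


def fuse_disparities(disparities, estimated_errors):
    """Per pixel, keep the disparity candidate with the smallest estimated error.

    Assumed (hypothetical) shapes: both arrays are (K, H, W) for K candidates.
    This simple selection rule is an assumption made for illustration only.
    """
    disparities = np.asarray(disparities, dtype=np.float64)
    errors = np.asarray(estimated_errors, dtype=np.float64)
    best = np.argmin(errors, axis=0)  # (H, W) index of the best candidate
    return np.take_along_axis(disparities, best[None, :, :], axis=0)[0]


if __name__ == "__main__":
    center = np.zeros((2, 2))
    warped = np.stack([np.zeros((2, 2)), np.ones((2, 2))])
    # The second view is treated as fully occluded, so its error is ignored.
    weights = np.stack([np.ones((2, 2)), np.zeros((2, 2))])
    print(occlusion_weighted_photometric_loss(center, warped, weights))

    candidates = np.stack([np.full((2, 2), 1.0), np.full((2, 2), 2.0)])
    errors = np.stack([
        np.array([[0.1, 0.9], [0.1, 0.9]]),
        np.array([[0.5, 0.2], [0.5, 0.2]]),
    ])
    print(fuse_disparities(candidates, errors))
```

In this toy setup, down-weighting the second view removes its mismatch from the loss, which mirrors the motivation given in the abstract: occluded pixels violate photo-consistency and should not drive disparity learning.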

DC Field	Value	Language
dc.contributor.author	Zhang, Shansi	-
dc.contributor.author	Meng, Nan	-
dc.contributor.author	Lam, Edmund Y.	-
dc.date.accessioned	2023-09-05T12:11:10Z	-
dc.date.available	2023-09-05T12:11:10Z	-
dc.date.issued	2023	-
dc.identifier.citation	IEEE Transactions on Circuits and Systems for Video Technology, 2023	-
dc.identifier.issn	1051-8215	-
dc.identifier.uri	http://hdl.handle.net/10722/330491	-
dc.description.abstract	Depth estimation from light field (LF) images is a fundamental step for numerous applications. Recently, learning-based methods have achieved higher accuracy and efficiency than the traditional methods. However, it is costly to obtain sufficient depth labels for supervised training. In this paper, we propose an unsupervised framework to estimate depth from LF images. First, we design a disparity estimation network (DispNet) with a coarse-to-fine structure to predict disparity maps from different view combinations. It explicitly performs multi-view feature matching to learn the correspondences effectively. As occlusions may cause the violation of photo-consistency, we introduce an occlusion prediction network (OccNet) to predict the occlusion maps, which are used as the element-wise weights of photometric loss to solve the occlusion issue and assist the disparity learning. With the disparity maps estimated by multiple input combinations, we then propose a disparity fusion strategy based on the estimated errors with effective occlusion handling to obtain the final disparity map with higher accuracy. Experimental results demonstrate that our method achieves superior performance on both the dense and sparse LF images, and also shows better robustness and generalization on the real-world LF images compared to the other methods.	-
dc.language	eng	-
dc.relation.ispartof	IEEE Transactions on Circuits and Systems for Video Technology	-
dc.subject	Convolutional neural networks	-
dc.subject	Costs	-
dc.subject	Estimation	-
dc.subject	Feature extraction	-
dc.subject	feature matching	-
dc.subject	Image edge detection	-
dc.subject	Light field	-
dc.subject	occlusion prediction	-
dc.subject	Training	-
dc.subject	Training data	-
dc.subject	unsupervised depth estimation	-
dc.title	Unsupervised Light Field Depth Estimation via Multi-view Feature Matching with Occlusion Prediction	-
dc.type	Article	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/TCSVT.2023.3305978	-
dc.identifier.scopus	eid_2-s2.0-85168738597	-
dc.identifier.eissn	1558-2205	-
