Propagating Over Phrase Relations for One-Stage Visual Grounding

YANG, S; LI, G; Yu, Y

Appears in Collections:
- Computer Science: Conference papers

Conference Paper: Propagating Over Phrase Relations for One-Stage Visual Grounding

Title	Propagating Over Phrase Relations for One-Stage Visual Grounding
Authors	YANG, S; LI, G; Yu, Y
Keywords	One-Stage Phrase Grounding; Linguistic Graph; Relational Propagation; Visual Grounding
Issue Date	2020
Citation	The 16th European Conference on Computer Vision (ECCV), Online, 23-28 August 2020
Abstract	Phrase-level visual grounding aims to locate in an image the corresponding visual regions referred to by multiple noun phrases in a given sentence. Its challenge comes not only from large variations in visual contents and unrestricted phrase descriptions but also from unambiguous referrals derived from phrase relational reasoning. In this paper, we propose a linguistic structure guided propagation network for one-stage phrase grounding. It explicitly explores the linguistic structure of the sentence and performs relational propagation among noun phrases under the guidance of the linguistic relations between them. Specifically, we first construct a linguistic graph parsed from the sentence and then capture multimodal feature maps for all the phrasal nodes independently. The node features are then propagated over the edges with a tailor-designed relational propagation module and ultimately integrated for final prediction. Experiments on the Flickr30K Entities dataset show that our model outperforms state-of-the-art methods and demonstrate the effectiveness of propagating among phrases with linguistic relations.
Persistent Identifier	http://hdl.handle.net/10722/286647
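
The propagation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' relational propagation module: the paper works on multimodal feature maps with a tailor-designed module, whereas the sketch below assumes plain feature vectors, an undirected phrase graph, and simple mean aggregation as a stand-in. The function name `propagate`, the `weight` parameter, and the example phrases are all hypothetical.

```python
from typing import Dict, List, Tuple

Features = Dict[str, List[float]]


def propagate(features: Features,
              edges: List[Tuple[str, str]],
              weight: float = 0.5) -> Features:
    """One round of message passing over a phrase graph.

    Assumption: each node mixes its own feature vector with the mean of
    its neighbours' vectors. This is a generic stand-in, not the module
    from the paper.
    """
    # Build an undirected adjacency list from the edge list.
    neighbours: Dict[str, List[str]] = {name: [] for name in features}
    for src, dst in edges:
        neighbours[src].append(dst)
        neighbours[dst].append(src)

    updated: Features = {}
    for name, feat in features.items():
        nbrs = neighbours[name]
        if not nbrs:
            # Isolated phrases keep their original features.
            updated[name] = list(feat)
            continue
        # Mean of the neighbours' features, computed per dimension.
        mean = [sum(features[n][i] for n in nbrs) / len(nbrs)
                for i in range(len(feat))]
        updated[name] = [(1 - weight) * f + weight * m
                         for f, m in zip(feat, mean)]
    return updated


# Hypothetical phrases from a sentence such as "a man wearing a hat walks a dog",
# with only the "wearing" relation kept as an edge for brevity.
features = {"a man": [1.0, 0.0], "a hat": [0.0, 1.0], "a dog": [2.0, 2.0]}
edges = [("a man", "a hat")]
result = propagate(features, edges)
```

In this toy setting the two related phrases exchange information while the unconnected phrase is left unchanged, which is the intuition behind propagating only along linguistic relations.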

DC Field	Value	Language
dc.contributor.author	YANG, S	-
dc.contributor.author	LI, G	-
dc.contributor.author	Yu, Y	-
dc.date.accessioned	2020-09-04T13:28:31Z	-
dc.date.available	2020-09-04T13:28:31Z	-
dc.date.issued	2020	-
dc.identifier.citation	The 16th European Conference on Computer Vision (ECCV), Online, 23-28 August 2020	-
dc.identifier.uri	http://hdl.handle.net/10722/286647	-
dc.description.abstract	Phrase-level visual grounding aims to locate in an image the corresponding visual regions referred to by multiple noun phrases in a given sentence. Its challenge comes not only from large variations in visual contents and unrestricted phrase descriptions but also from unambiguous referrals derived from phrase relational reasoning. In this paper, we propose a linguistic structure guided propagation network for one-stage phrase grounding. It explicitly explores the linguistic structure of the sentence and performs relational propagation among noun phrases under the guidance of the linguistic relations between them. Specifically, we first construct a linguistic graph parsed from the sentence and then capture multimodal feature maps for all the phrasal nodes independently. The node features are then propagated over the edges with a tailor-designed relational propagation module and ultimately integrated for final prediction. Experiments on the Flickr30K Entities dataset show that our model outperforms state-of-the-art methods and demonstrate the effectiveness of propagating among phrases with linguistic relations.	-
dc.language	eng	-
dc.relation.ispartof	European Conference on Computer Vision (ECCV)	-
dc.subject	One-Stage Phrase Grounding	-
dc.subject	Linguistic Graph	-
dc.subject	Relational Propagation	-
dc.subject	Visual Grounding	-
dc.title	Propagating Over Phrase Relations for One-Stage Visual Grounding	-
dc.type	Conference_Paper	-
dc.identifier.email	Yu, Y: yzyu@cs.hku.hk	-
dc.identifier.authority	Yu, Y=rp01415	-
dc.identifier.hkuros	313949	-
