
postgraduate thesis: Visual understanding from data-driven feature learning to causality modeling

Title: Visual understanding from data-driven feature learning to causality modeling
Authors: Lin Xiangru [林相如]
Advisor(s): Yu, Y
Issue Date: 2021
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Lin Xiangru, [林相如]. (2021). Visual understanding from data-driven feature learning to causality modeling. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Human visual systems are complex. Teaching machines to see and understand like humans has always been a dream for us. One of the major challenges is visual understanding, a fundamental computer vision task that aims to grasp the visual perception ability of the human visual system. Although the performance of existing research in visual understanding has improved significantly in the wave of deep neural networks, methods based on data-driven feature learning still face many unresolved challenges. For example, models learned under the data-driven feature learning paradigm tend to capture correlation instead of causation from data. Therefore, building models that depend merely on data-driven feature learning could be problematic. In this thesis, we aim to build better visual systems that can understand, interact with, and reason about the visual world. To conduct a comprehensive analysis of the problems of existing visual understanding models, I present four research works, each of which focuses on an important sub-task of visual understanding, ranging from 2D image-level and 3D scene-level to video-level visual understanding. Concretely, I identify three key problems of visual understanding models based on data-driven feature learning. To bridge these research gaps, I propose to exploit data-driven feature learning as a building block, together with tools such as causal inference and pre-training, to construct a more sensible and causality-aware human-level cognitive system. In general, this thesis integrates approaches to feature representation learning, high-level perception, and causal and counterfactual reasoning in data-driven feature learning, forming a unified framework for building better machine-intelligent visual systems. First, we introduce a complete feature description, called the Complementary Parts Model, that is superior to existing data-driven feature descriptions. Second, we propose a Scene-Intuitive Agent to demonstrate that high-level understanding should be built upon data-driven feature learning. Then, we leverage causal inference to address the data correlation problem in existing visual understanding models based on data-driven feature learning.
Degree: Doctor of Philosophy
Subject: Computer vision
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/308660

 

DC Field | Value | Language
dc.contributor.advisor | Yu, Y | -
dc.contributor.author | Lin Xiangru | -
dc.contributor.author | 林相如 | -
dc.date.accessioned | 2021-12-06T01:04:08Z | -
dc.date.available | 2021-12-06T01:04:08Z | -
dc.date.issued | 2021 | -
dc.identifier.citation | Lin Xiangru, [林相如]. (2021). Visual understanding from data-driven feature learning to causality modeling. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/308660 | -
dc.description.abstract | Human visual systems are complex. Teaching machines to see and understand like humans has always been a dream for us. One of the major challenges is visual understanding, a fundamental computer vision task that aims to grasp the visual perception ability of the human visual system. Although the performance of existing research in visual understanding has improved significantly in the wave of deep neural networks, methods based on data-driven feature learning still face many unresolved challenges. For example, models learned under the data-driven feature learning paradigm tend to capture correlation instead of causation from data. Therefore, building models that depend merely on data-driven feature learning could be problematic. In this thesis, we aim to build better visual systems that can understand, interact with, and reason about the visual world. To conduct a comprehensive analysis of the problems of existing visual understanding models, I present four research works, each of which focuses on an important sub-task of visual understanding, ranging from 2D image-level and 3D scene-level to video-level visual understanding. Concretely, I identify three key problems of visual understanding models based on data-driven feature learning. To bridge these research gaps, I propose to exploit data-driven feature learning as a building block, together with tools such as causal inference and pre-training, to construct a more sensible and causality-aware human-level cognitive system. In general, this thesis integrates approaches to feature representation learning, high-level perception, and causal and counterfactual reasoning in data-driven feature learning, forming a unified framework for building better machine-intelligent visual systems. First, we introduce a complete feature description, called the Complementary Parts Model, that is superior to existing data-driven feature descriptions. Second, we propose a Scene-Intuitive Agent to demonstrate that high-level understanding should be built upon data-driven feature learning. Then, we leverage causal inference to address the data correlation problem in existing visual understanding models based on data-driven feature learning. | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Computer vision | -
dc.title | Visual understanding from data-driven feature learning to causality modeling | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Computer Science | -
dc.description.nature | published_or_final_version | -
dc.date.hkucongregation | 2021 | -
dc.identifier.mmsid | 991044448908403414 | -
