Field | Value |
---|---|
Title | Reconstructing 3D indoor scene from RGB equirectangular panorama images with convolutional neural network (CNN) system |
Authors | Chan, Cheuk Pong (陳卓梆) |
Advisors | Lau, HYK; Or, KL |
Issue Date | 2022 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Chan, C. P. [陳卓梆]. (2022). Reconstructing 3D indoor scene from RGB equirectangular panorama images with convolutional neural network (CNN) system. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Constructing 3D virtual environments is an indispensable part of Virtual Reality (VR) application development: they define everything users can see in a VR world, and they are what immerses players and induces a feeling of presence. However, creating virtual environments consumes a great deal of time and effort, even for professional 3D modelers, so it is important to explore methods that automate the generation of 3D scenes for VR. Inspired by the rapid development of deep neural networks, specifically convolutional neural networks (CNNs), we propose to accelerate the development cycle of virtual environment generation with a deep-neural-network-based 3D reconstruction system that takes 360° indoor RGB equirectangular panorama images as input and outputs the generated 3D scene. Each 3D scene is enclosed by the room's 3D layout and populated with the 3D objects that appear in the input image. To simplify the virtual environment generation problem, we divide the system into four main subtasks: Room Layout Estimation, Object Detection, Object Pose Estimation, and Artistic Postprocessing. In each chapter of this thesis, we introduce the background of a subtask, review related work, and describe our proposed solution. We also evaluate each approach through quantitative or qualitative experiments. According to the results, most of the submodules operate at an error level that is competitive with, or better than, existing approaches. Throughout the chapters, we also present our secondary contributions. We propose a visual representation enhancement algorithm for a room layout estimation network; evaluation shows that it improves the objective visual realism of generated layouts and optimizes frame rate and memory usage when the environment is rendered in Unity. Furthermore, we introduce a new object pose estimation dataset, created by drawing on previous related datasets and combining their unique advantages, and we show statistically that it surpasses existing datasets in image source diversity and richness of annotation features. Finally, we cover the Unity-based annotation tool created to accompany the dataset; it reads and edits annotations and lets users annotate their own custom datasets with 3-DOF poses. |
Degree | Master of Philosophy |
Subject | Virtual reality; Neural networks (Computer science); Computer vision |
Dept/Program | Industrial and Manufacturing Systems Engineering |
Persistent Identifier | http://hdl.handle.net/10722/313706 |
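The abstract above outlines a four-stage pipeline (Room Layout Estimation, Object Detection, Object Pose Estimation, Artistic Postprocessing) that maps a 360° indoor RGB equirectangular panorama to a populated 3D scene. The following is a minimal Python sketch of that data flow only; every identifier, data structure, and function body in it is an illustrative assumption and is not taken from the thesis itself.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# NOTE: all names and types below are hypothetical placeholders that
# illustrate the chaining of the four subtasks named in the abstract,
# not the thesis's actual code, models, or data formats.

@dataclass
class DetectedObject:
    category: str                                             # e.g. "chair"
    pose_3dof: Tuple[float, float, float] = (0.0, 0.0, 0.0)   # 3-DOF pose

@dataclass
class ReconstructedScene:
    layout_corners: List[Tuple[float, float, float]] = field(default_factory=list)
    objects: List[DetectedObject] = field(default_factory=list)

def estimate_room_layout(panorama) -> List[Tuple[float, float, float]]:
    """Subtask 1: room layout estimation from the equirectangular panorama."""
    return []  # placeholder for CNN-predicted 3D layout corners

def detect_objects(panorama) -> List[DetectedObject]:
    """Subtask 2: object detection on the panorama."""
    return []  # placeholder for CNN-detected object instances

def estimate_poses(panorama, objects: List[DetectedObject]) -> List[DetectedObject]:
    """Subtask 3: object pose estimation for each detected object."""
    return objects  # placeholder: 3-DOF poses would be filled in here

def artistic_postprocess(scene: ReconstructedScene) -> ReconstructedScene:
    """Subtask 4: artistic postprocessing of the assembled scene."""
    return scene

def reconstruct(panorama) -> ReconstructedScene:
    """Chain the four subtasks: panorama in, populated 3D scene out."""
    scene = ReconstructedScene()
    scene.layout_corners = estimate_room_layout(panorama)
    scene.objects = estimate_poses(panorama, detect_objects(panorama))
    return artistic_postprocess(scene)
```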
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Lau, HYK | - |
dc.contributor.advisor | Or, KL | - |
dc.contributor.author | Chan, Cheuk Pong | - |
dc.contributor.author | 陳卓梆 | - |
dc.date.accessioned | 2022-06-26T09:32:36Z | - |
dc.date.available | 2022-06-26T09:32:36Z | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | Chan, C. P. [陳卓梆]. (2022). Reconstructing 3D indoor scene from RGB equirectangular panorama images with convolutional neural network (CNN) system. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/313706 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Virtual reality | - |
dc.subject.lcsh | Neural networks (Computer science) | - |
dc.subject.lcsh | Computer vision | - |
dc.title | Reconstructing 3D indoor scene from RGB equirectangular panorama images with convolutional neural network (CNN) system | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Master of Philosophy | - |
dc.description.thesislevel | Master | - |
dc.description.thesisdiscipline | Industrial and Manufacturing Systems Engineering | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2022 | - |
dc.identifier.mmsid | 991044545291403414 | - |