postgraduate thesis: On incomplete multinomial data modeling and interactive neural and statistical computing with GPU
Title | On incomplete multinomial data modeling and interactive neural and statistical computing with GPU |
---|---|
Authors | Dong, Fanghu [東方虎] |
Advisors | Yin, G; Tian, G |
Issue Date | 2018 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Dong, F. [東方虎]. (2018). On incomplete multinomial data modeling and interactive neural and statistical computing with GPU. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | This thesis consists of five chapters describing three independent works. Chapter 1 introduces the thesis and provides the necessary background. Chapter 2 describes the first work, on an incomplete multinomial model for count data sampled on a random partition. It solves the estimation problem with an iterative algorithm, called the weaver algorithm. The incomplete multinomial likelihood is parameterized by the complete-cell probabilities from the most refined partition. Its sufficient statistics include the variable-cell formation, observed as an indicator matrix, and all cell counts. With externally imposed structures on the cell formation process, the model reduces to special models. The weaver algorithm enjoys the ascent property and has a linear rate of convergence. Its steps are short and amenable to parallel implementation. It is significantly faster than the state-of-the-art EM/MM algorithm when fitting the Plackett-Luce model to a benchmark data set. The chapter also develops an analytic theory to investigate the conditions surrounding the global maximization of the likelihood. Simulation experiments are designed to show the performance of the model and the algorithm in recovering very weak signals. Asymptotic properties of the estimator are derived and validated with simulations. The next two chapters both design and implement software that combines the spreadsheet's highly interactive user interface with the Graphics Processing Unit's high computing performance. Chapter 3 presents a general design of an interactive neural network trainer. Its main features include the ability to specify different transfer functions, loss functions, and learning algorithms; facilities for stepping through the learning course and tracking user-defined variables; and a mechanism to specify constraints on the weights. It also includes a forward selection algorithm for optimizing the network architecture. Chapter 4 implements a dynamic-link library of GPU-executed matrix functions that can be called from the spreadsheet. It then demonstrates the implementation of interactive software for multivariate statistical analysis that uses the GPU matrix library. Chapter 5 makes concluding remarks and lists some potential directions for future work. |
Degree | Doctor of Philosophy |
Subject | Parameter estimation; Estimation theory; Sampling (Statistics); Iterative methods (Mathematics); Mathematical optimization; Graphics processing units |
Dept/Program | Statistics and Actuarial Science |
Persistent Identifier | http://hdl.handle.net/10722/261493 |
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Yin, G | - |
dc.contributor.advisor | Tian, G | - |
dc.contributor.author | Dong, Fanghu | - |
dc.contributor.author | 東方虎 | - |
dc.date.accessioned | 2018-09-20T06:43:56Z | - |
dc.date.available | 2018-09-20T06:43:56Z | - |
dc.date.issued | 2018 | - |
dc.identifier.citation | Dong, F. [東方虎]. (2018). On incomplete multinomial data modeling and interactive neural and statistical computing with GPU. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/261493 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Parameter estimation | - |
dc.subject.lcsh | Estimation theory | - |
dc.subject.lcsh | Sampling (Statistics) | - |
dc.subject.lcsh | Iterative methods (Mathematics) | - |
dc.subject.lcsh | Mathematical optimization | - |
dc.subject.lcsh | Graphics processing units | - |
dc.title | On incomplete multinomial data modeling and interactive neural and statistical computing with GPU | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Statistics and Actuarial Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_991044040572603414 | - |
dc.date.hkucongregation | 2018 | - |
dc.identifier.mmsid | 991044040572603414 | - |