Large vocabulary automatic chord estimation from audio using deep learning approaches

Deng, Junqi; 邓俊祺

DOI: 10.5353/th_991043976387303414

Appears in Collections:
- HKU Theses Online
- Electrical & Electronic Engineering: Theses

postgraduate thesis: Large vocabulary automatic chord estimation from audio using deep learning approaches

Title	Large vocabulary automatic chord estimation from audio using deep learning approaches
Authors	Deng, Junqi 邓俊祺
Advisors	Kwok, YK
Issue Date	2016
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Deng, J. [邓俊祺]. (2016). Large vocabulary automatic chord estimation from audio using deep learning approaches. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract	Acknowledging the subjectivity of chord annotation, this thesis argues that a large chord vocabulary is necessary, drawing jointly on machine musicianship and the Turing test. On this premise, it proposes two deep learning based system frameworks that point toward practical solutions for large vocabulary automatic chord estimation. The first framework separates chord segmentation and chord classification into two tasks, unlike all previous approaches, which combine them in a single pass. Several deep learning models are implemented and tested. Under large vocabulary evaluation, the recurrent neural network model shows great potential for balanced performance across different chords. This framework showed its advantages under large vocabulary evaluation in the automatic chord estimation task of the Music Information Retrieval Evaluation eXchange (MIREX) 2016. The second framework incorporates an approach that is sensitive to the skewed class distribution. It employs an "even chance" scheme to boost the exposure of uncommon chords when training a recurrent neural network sequence decoder. The main drawback of this approach is its low segmentation quality. Nevertheless, it demonstrates that the even chance training scheme is effective for large vocabulary automatic chord estimation. Finally, a preliminary study of automatic jazz chord estimation is conducted. Building on this study, a chord-scale estimation system is built, and some semi-automatic and fully automatic jazz improvisation demos are created.
Degree	Doctor of Philosophy
Subject	Music - Data processing; Machine learning
Dept/Program	Electrical and Electronic Engineering
Persistent Identifier	http://hdl.handle.net/10722/249913
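
Illustrative note on the abstract: the record does not define the "even chance" training scheme, so the following is only a minimal sketch of one plausible reading, in which each chord class is given the same probability of being drawn as a training example regardless of how often it occurs in the data. The function name `even_chance_sampler`, the toy chord labels, and the class counts are all hypothetical and are not taken from the thesis.

```python
import random
from collections import Counter, defaultdict


def even_chance_sampler(examples, rng):
    """Yield (features, label) pairs with every label equally likely per draw.

    Assumption: "even chance" is read here as uniform sampling over chord
    classes; the thesis itself may define the scheme differently.
    """
    by_label = defaultdict(list)
    for features, label in examples:
        by_label[label].append((features, label))
    labels = sorted(by_label)
    while True:
        label = rng.choice(labels)         # pick a chord class uniformly
        yield rng.choice(by_label[label])  # then pick one example of that class


# Hypothetical, heavily skewed toy data set: 90 / 9 / 1 examples per class.
examples = (
    [(i, "C:maj") for i in range(90)]
    + [(i, "C:min7") for i in range(9)]
    + [(0, "C:dim7")]
)

rng = random.Random(0)
sampler = even_chance_sampler(examples, rng)
draws = [next(sampler)[1] for _ in range(3000)]
counts = Counter(draws)
```

In this sketch the rare class `C:dim7` is drawn roughly as often as the common class `C:maj`, at the cost of repeating its single example many times; a real training setup would have to weigh that repetition against the risk of overfitting.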

DC Field	Value	Language
dc.contributor.advisor	Kwok, YK	-
dc.contributor.author	Deng, Junqi	-
dc.contributor.author	邓俊祺	-
dc.date.accessioned	2017-12-19T09:27:44Z	-
dc.date.available	2017-12-19T09:27:44Z	-
dc.date.issued	2016	-
dc.identifier.citation	Deng, J. [邓俊祺]. (2016). Large vocabulary automatic chord estimation from audio using deep learning approaches. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.	-
dc.identifier.uri	http://hdl.handle.net/10722/249913	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights, (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject.lcsh	Music - Data processing	-
dc.subject.lcsh	Machine learning	-
dc.title	Large vocabulary automatic chord estimation from audio using deep learning approaches	-
dc.type	PG_Thesis	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Electrical and Electronic Engineering	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.5353/th_991043976387303414	-
dc.date.hkucongregation	2017	-
dc.identifier.mmsid	991043976387303414	-
