Optimizing Cepstral Features for Audio Classification


Speaker: Zhouyu Fu

Affiliation: University of Western Sydney

Time: Monday 18/11/2013 from 14:00 to 15:00

Venue: Access Grid UWS. Presented from Penrith (Y239), accessible from Parramatta (EB.1.32) and Campbelltown (26.1.50).

Abstract: Cepstral features have been widely used in audio applications. Domain knowledge has played an important role in designing the different types of cepstral features proposed in the literature. In this paper, we present a novel approach for learning optimized cepstral features directly from audio data to better discriminate between different categories of signals in classification tasks. We employ multi-layer feedforward neural networks to model the cepstral feature extraction process. The network weights are initialized to replicate a reference cepstral feature such as the mel-frequency cepstral coefficients (MFCCs). We then propose an embedded approach that integrates feature learning with the training of a support vector machine (SVM) classifier. A single optimization problem is formulated in which the feature and classifier variables are optimized simultaneously, so as to refine the initial features and minimize the classification risk. Experimental results demonstrate the effectiveness of the proposed feature learning approach, which outperforms competing methods by a large margin on benchmark data.
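As a rough illustration of the initialization idea in the abstract, the sketch below writes MFCC-style extraction as a two-layer feedforward pass: a linear layer holding a mel filterbank, a log nonlinearity, and a linear layer holding a DCT basis. It is a minimal sketch under stated assumptions, not the speaker's implementation. The function names, the filter and coefficient counts, and the mel-scale formula (a common convention) are all assumptions, and the joint optimization with the SVM is not shown.

```python
import numpy as np

def hz_to_mel(f):
    # A commonly used mel-scale formula (assumed; the talk does not specify one).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular mel filters, shape (n_filters, n_fft // 2 + 1)."""
    n_bins = n_fft // 2 + 1
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    fb = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def dct_matrix(n_ceps, n_filters):
    """Unnormalized DCT-II basis, shape (n_ceps, n_filters)."""
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_filters)[None, :]
    return np.cos(np.pi * k * (n + 0.5) / n_filters)

def init_weights(n_ceps=13, n_filters=26, n_fft=512, sample_rate=16000):
    """Hypothetical initializer: layer weights that reproduce MFCC-style features."""
    W1 = mel_filterbank(n_filters, n_fft, sample_rate)  # first layer: filterbank
    W2 = dct_matrix(n_ceps, n_filters)                  # second layer: DCT
    return W1, W2

def cepstral_forward(power_spectrum, W1, W2, eps=1e-10):
    """Forward pass: linear -> log -> linear.

    power_spectrum has shape (n_frames, n_fft // 2 + 1).
    """
    hidden = np.log(power_spectrum @ W1.T + eps)  # log mel energies
    return hidden @ W2.T                          # cepstral coefficients
```

With the weights at these initial values, `cepstral_forward` yields MFCC-style coefficients. A learning procedure of the kind the abstract describes would then treat `W1` and `W2` as trainable variables and update them together with the classifier parameters.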

Biography: Zhouyu Fu is a lecturer at the School of Computing, Engineering and Mathematics of the University of Western Sydney. Prior to joining UWS, he worked as a research fellow at the Gippsland School of Information Technology of Monash University from April 2009 to December 2011. He did his PhD at the Australian National University and obtained his doctoral degree in Information Engineering in October 2009. He was also affiliated with and sponsored by National ICT Australia during his PhD studies at ANU. He obtained his bachelor's and master's degrees in China, from Zhejiang University and the Institute of Automation, Chinese Academy of Sciences, respectively.