Publication Type : Conference Paper
Publisher : Proc. IEEE Tech Symposium, IIT Kharagpur, 2016
Campus : Coimbatore
School : School of Engineering
Center : Computational Engineering and Networking
Year : 2016
Abstract : The present work discusses the issues of epoch extraction from expressive speech signals. Epochs are the glottal closure instants in voiced speech, which in turn mark the instants of maximum excitation of the vocal tract. Although many existing epoch extraction methods provide near-perfect epoch estimation from clean or neutral speech, their performance drops significantly on expressive speech signals. The uncontrolled and rapid pitch variations that occur in expressive speech cause this degradation. The objective of the present work is to improve epoch extraction performance for speech signals carrying various expressions that are perceptually distinct from neutral speech, using the zero frequency filtering (ZFF) approach. In order to capture the rapid and uncontrolled variations in expressive speech utterances, trend removal is performed on short segments (25 ms) of the output obtained from a cascade of three zero frequency resonators (ZFRs). The epoch estimation performance of the proposed method is compared with the conventional ZFF method, an existing refined ZFF method proposed for expressive speech, and the recently proposed zero band filtering (ZBF) approach. The effectiveness of the approach is confirmed by an improved epoch identification rate and reduced miss and false alarm rates compared with those of the existing methods.
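The processing chain outlined in the abstract (a cascade of three ZFRs followed by trend removal over 25 ms) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `remove_trend` and `zff_epochs`, the initial differencing step, the reading of "short segments" as a 25 ms sliding local-mean window, the three trend-removal passes, and the use of positive-going zero crossings as epoch candidates are all assumptions made for illustration.

```python
import numpy as np


def remove_trend(y, half_window):
    """Subtract the local mean taken over 2*half_window + 1 samples."""
    kernel = np.ones(2 * half_window + 1)
    local_sum = np.convolve(y, kernel, mode="same")
    # Divide by the number of samples actually covered, so the mean stays
    # meaningful near the edges where the window is truncated.
    counts = np.convolve(np.ones_like(y), kernel, mode="same")
    return y - local_sum / counts


def zff_epochs(signal, fs, n_resonators=3, window_ms=25.0, n_passes=3):
    """Return (epoch_candidates, filtered_signal) for a 1-D signal.

    Assumed defaults: three resonators and a 25 ms window (from the
    abstract); three trend-removal passes (an illustrative choice).
    """
    s = np.asarray(signal, dtype=np.float64)
    half_window = int(round(window_ms * 1e-3 * fs)) // 2
    if half_window < 1 or s.size <= 2 * half_window + 1:
        raise ValueError("signal must be longer than the trend-removal window")

    # 1) Difference the signal to remove any constant offset (assumed step).
    x = np.diff(s, prepend=s[0])

    # 2) Each zero frequency resonator, y[n] = x[n] + 2*y[n-1] - y[n-2],
    #    is equivalent to two cumulative sums with zero initial state.
    #    The output grows polynomially, so long inputs can lose precision.
    y = x
    for _ in range(2 * n_resonators):
        y = np.cumsum(y)

    # 3) Remove the polynomial trend by repeated local-mean subtraction.
    for _ in range(n_passes):
        y = remove_trend(y, half_window)

    # 4) Take positive-going zero crossings as epoch candidates. Which
    #    crossing direction marks the epochs depends on signal polarity
    #    and on the number of resonators; this choice is an assumption.
    epochs = np.flatnonzero((y[:-1] < 0) & (y[1:] >= 0)) + 1
    return epochs, y
```

Calling `zff_epochs(speech, fs)` on a mono signal returns candidate epoch indices in samples together with the filtered signal. Candidates within a few window lengths of either end of the signal are unreliable, because the trend is only partially removed there.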