


9th SSW 2016: Sunnyvale, CA, USA
- Alan W. Black: The 9th ISCA Speech Synthesis Workshop, SSW 2016, Sunnyvale, CA, USA, September 13-15, 2016. ISCA 2016
Keynote Session 1
- Oriol Guasch: Large-scale finite element simulations of the physics of voice.
Oral Session 1: Prosody
- Mahsa Sadat Elyasi Langarani, Jan P. H. van Santen: Automatic, model-based detection of pause-less phrase boundaries from fundamental frequency and duration features. 1-6
- Rasmus Dall, Marcus Tomalin, Mirjam Wester: Synthesising Filled Pauses: Representation and Datamixing. 7-13
- Pierre-Edouard Honnet, Philip N. Garner: Emphasis recreation for TTS using intonation atoms. 14-20
- Eva Vanmassenhove, João P. Cabral, Fasih Haider: Prediction of Emotions from Text using Sentiment Analysis for Expressive Speech Synthesis. 21-26
Poster Session 1
- Yasuhiro Hamada, Nobutaka Ono, Shigeki Sagayama: Non-filter waveform generation from cepstrum using spectral phase reconstruction. 27-31
- Alexandros Lazaridis, Milos Cernak, Pierre-Edouard Honnet, Philip N. Garner: Investigating Spectral Amplitude Modulation Phase Hierarchy Features in Speech Synthesis. 32-37
- Mirjam Wester, Zhizheng Wu, Junichi Yamagishi: Multidimensional scaling of systems in the Voice Conversion Challenge 2016. 38-43
- Dong-Yan Huang, Lei Xie, Yvonne Siu Wa Lee, Jie Wu, Huaiping Ming, Xiaohai Tian, Shaofei Zhang, Chuang Ding, Mei Li, Nguyen Quy Hy, Minghui Dong, Haizhou Li: An Automatic Voice Conversion Evaluation Strategy Based on Perceptual Background Noise Distortion and Speaker Similarity. 44-51
- Yusuke Tajiri, Tomoki Toda: Nonaudible murmur enhancement based on statistical voice conversion and noise suppression with external noise monitoring. 52-58
- Igor Jauk, Antonio Bonafonte: Prosodic and Spectral iVectors for Expressive Speech Synthesis. 59-63
- Michael Pucher, Fernando Villavicencio, Junichi Yamagishi: Development of a statistical parametric synthesis system for operatic singing in German. 64-69
- Srikanth Ronanki, Siva Reddy Gangireddy, Bajibabu Bollepalli, Simon King: DNN-based Speech Synthesis for Indian Languages from ASCII text. 70-75
- Sunayana Sitaram, Sai Krishna Rallabandi, Shruti Rijhwani, Alan W. Black: Experiments with Cross-lingual Systems for Synthesis of Code-Mixed Text. 76-81
- Avni Rajpal, Hemant A. Patil: Jerk Minimization for Acoustic-To-Articulatory Inversion. 82-87
- Sunhee Kim: How to select a good voice for TTS. 88-92
- John Andersson, Sebastian Berlin, André Costa, Harald Berthelsen, Hanna Lindgren, Nikolaj Lindberg, Jonas Beskow, Jens Edlund, Joakim Gustafson: WikiSpeech - enabling open source text-to-speech for Wikipedia. 93-99
Keynote Session 2
- Alex Acero: Siri's voice gets deep learning.
Oral Session 2: Deep Learning in Speech Synthesis
- Manuel Sam Ribeiro, Oliver Watts, Junichi Yamagishi: Parallel and cascaded deep neural networks for text-to-speech synthesis. 100-105
- Keiichi Tokuda, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku: Temporal modeling in neural network based statistical parametric speech synthesis. 106-111
- Santiago Pascual, Antonio Bonafonte: Multi-output RNN-LSTM for multiple speaker speech synthesis with α-interpolation model. 112-117
- Xin Wang, Shinji Takaki, Junichi Yamagishi: A Comparative Study of the Performance of HMM, DNN, and RNN based Speech Synthesis Systems Trained on Very Large Speaker-Dependent Corpora. 118-121
Demo Session
- Nobuaki Minematsu, Daisuke Saito, Nobuyuki Nishizawa: Prosodic Reading Tutor of Japanese, Suzuki-kun: The first and only educational tool to teach the formal Japanese. 122
- Hideki Kawahara: Aliasing-free L-F model and its application to an interactive MATLAB tool and test signal generation for speech analysis procedures. 123
- Srikanth Ronanki, Zhizheng Wu, Oliver Watts, Simon King: A Demonstration of the Merlin Open Source Neural Network Speech Synthesis System. 124
- Aäron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew W. Senior, Koray Kavukcuoglu: WaveNet: A Generative Model for Raw Audio. 125
- Blaise Potard, Matthew P. Aylett, David A. Baude: Demo of Idlak Tangle, An Open Source DNN-Based Parametric Speech Synthesiser. 126
Poster Session 2
- Meet H. Soni, Hemant A. Patil: Non-intrusive Quality Assessment of Synthesized Speech using Spectral Features and Support Vector Regression. 127-133
- Sushant V. Rao, Nirmesh J. Shah, Hemant A. Patil: Novel Pre-processing using Outlier Removal in Voice Conversion. 134-139
- Zhaojie Luo, Tetsuya Takiguchi, Yasuo Ariki: Emotional Voice Conversion Using Neural Networks with Different Temporal Scales of F0 based on Wavelet Transform. 140-145
- Cassia Valentini-Botinhao, Xin Wang, Shinji Takaki, Junichi Yamagishi: Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech. 146-152
- Shinji Takaki, Sangjin Kim, Junichi Yamagishi: Speaker Adaptation of Various Components in Deep Neural Network based Speech Synthesis. 153-159
- Zhengchen Zhang, Fuxiang Wu, Chenyu Yang, Minghui Dong, Fugen Zhou: Mandarin Prosodic Phrase Prediction based on Syntactic Trees. 160-165
- Xin Wang, Shinji Takaki, Junichi Yamagishi: Investigating Very Deep Highway Networks for Parametric Speech Synthesis. 166-171
- Sivanand Achanta, Rambabu Banoth, Ayushi Pandey, Anandaswarup Vadapalli, Suryakanth V. Gangashetty: Contextual Representation using Recurrent Neural Network Hidden State for Statistical Parametric Speech Synthesis. 172-177
- Nobuyuki Nishizawa, Tomonori Yazaki: Wide Passband Design for Cosine-Modulated Filter Banks in Sinusoidal Speech Synthesis. 178-183
- Pallavi Baljekar, Alan W. Black: Utterance Selection Techniques for TTS Systems Using Found Speech. 184-189
- Andrew Wilkinson, Alok Parlikar, Sunayana Sitaram, Tim White, Alan W. Black, Suresh Bazaj: Open-Source Consumer-Grade Indic Text To Speech. 190-195
- Mei Li, Zhizheng Wu, Lei Xie: On the impact of phoneme alignment in DNN-based speech synthesis. 196-201
- Zhizheng Wu, Oliver Watts, Simon King: Merlin: An Open Source Neural Network Speech Synthesis System. 202-207
Keynote Session 3
- Quoc V. Le: End-to-end Learning for Text and Speech.
Oral Session 3: Analysis and Modeling for Speech Synthesis
- Jonas Beskow, Harald Berthelsen: A hybrid harmonics-and-bursts modelling approach to speech synthesis. 208-213
- Gilles Degottex, Pierre Lanchantin, Mark J. F. Gales: A Pulse Model in Log-domain for a Uniform Synthesizer. 214-220
- Hideki Kawahara, Yannis Agiomyrgiannakis, Heiga Zen: Using instantaneous frequency and aperiodicity detection to estimate F0 for high-quality speech synthesis. 221-228
- Slava Shechtman, Alexander Sorin: Wideband Harmonic Model: Alignment and Noise Modeling for High Quality Speech Synthesis. 229-234
