chldkato
Deep Learning Speech Synthesis (TTS) / Vocoder: GitHub Repositories and Papers
Tacotron
Tacotron: Towards End-to-End Speech Synthesis
A text-to-speech synthesis system typically consists of multiple stages, such as a text analysis frontend, an acoustic model and an audio synthesis module. Building these components often requires extensive domain expertise and may contain brittle design c
arxiv.org
https://github.com/keithito/tacotron
keithito/tacotron
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial) - keithito/tacotron
github.com
WaveNet vocoder
WaveNet: A Generative Model for Raw Audio
This paper introduces WaveNet, a deep neural network for generating raw audio waveforms. The model is fully probabilistic and autoregressive, with the predictive distribution for each audio sample conditioned on all previous ones; nonetheless we show that
arxiv.org
https://github.com/r9y9/wavenet_vocoder
r9y9/wavenet_vocoder
WaveNet vocoder. Contribute to r9y9/wavenet_vocoder development by creating an account on GitHub.
github.com
Tacotron2
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
This paper describes Tacotron 2, a neural network architecture for speech synthesis directly from text. The system is composed of a recurrent sequence-to-sequence feature prediction network that maps character embeddings to mel-scale spectrograms, followed
arxiv.org
https://github.com/Rayhane-mamah/Tacotron-2
Rayhane-mamah/Tacotron-2
DeepMind's Tacotron-2 Tensorflow implementation. Contribute to Rayhane-mamah/Tacotron-2 development by creating an account on GitHub.
github.com
WaveGlow
WaveGlow: A Flow-based Generative Network for Speech Synthesis
In this paper we propose WaveGlow: a flow-based network capable of generating high quality speech from mel-spectrograms. WaveGlow combines insights from Glow and WaveNet in order to provide fast, efficient and high-quality audio synthesis, without the need
arxiv.org
https://github.com/NVIDIA/waveglow
NVIDIA/waveglow
A Flow-based Generative Network for Speech Synthesis - NVIDIA/waveglow
github.com
Multi-speaker Tacotron
https://github.com/carpedm20/multi-speaker-tacotron-tensorflow
carpedm20/multi-speaker-tacotron-tensorflow
Multi-speaker Tacotron in TensorFlow. Contribute to carpedm20/multi-speaker-tacotron-tensorflow development by creating an account on GitHub.
github.com
Tacotron + WaveNet
https://github.com/hccho2/Tacotron-Wavenet-Vocoder
hccho2/Tacotron-Wavenet-Vocoder
Tacotron, Korean, Wavenet-Vocoder, Korean TTS. Contribute to hccho2/Tacotron-Wavenet-Vocoder development by creating an account on GitHub.
github.com
https://github.com/hccho2/Tacotron2-Wavenet-Korean-TTS
hccho2/Tacotron2-Wavenet-Korean-TTS
Korean TTS, Tacotron2, Wavenet. Contribute to hccho2/Tacotron2-Wavenet-Korean-TTS development by creating an account on GitHub.
github.com
DCTTS
Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention
This paper describes a novel text-to-speech (TTS) technique based on deep convolutional neural networks (CNN), without use of any recurrent units. Recurrent neural networks (RNN) have become a standard technique to model sequential data recently, and this
arxiv.org
https://github.com/Kyubyong/dc_tts
Kyubyong/dc_tts
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model - Kyubyong/dc_tts
github.com
https://github.com/Kyubyong/kss
Kyubyong/kss
Contribute to Kyubyong/kss development by creating an account on GitHub.
github.com
MelGAN
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
Previous works (Donahue et al., 2018a; Engel et al., 2019a) have found that generating coherent raw audio waveforms with GANs is challenging. In this paper, we show that it is possible to train GANs reliably to generate high quality coherent waveforms by i
arxiv.org
https://github.com/descriptinc/melgan-neurips
descriptinc/melgan-neurips
GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis - descriptinc/melgan-neurips
github.com
https://github.com/seungwonpark/melgan
seungwonpark/melgan
MelGAN vocoder (compatible with NVIDIA/tacotron2). Contribute to seungwonpark/melgan development by creating an account on GitHub.
github.com
Tacotron2 + WaveGlow
https://github.com/NVIDIA/tacotron2
NVIDIA/tacotron2
Tacotron 2 - PyTorch implementation with faster-than-realtime inference - NVIDIA/tacotron2
github.com
VocGAN
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
We present a novel high-fidelity real-time neural vocoder called VocGAN. A recently developed GAN-based vocoder, MelGAN, produces speech waveforms in real-time. However, it often produces a waveform that is insufficient in quality or inconsistent with acou
arxiv.org
rishikksh20/VocGAN
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network - rishikksh20/VocGAN
github.com
TFGAN
TFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis
Recently, GAN based speech synthesis methods, such as MelGAN, have become very popular. Compared to conventional autoregressive based methods, parallel structures based generators make waveform generation process fast and stable. However, the quality of ge
arxiv.org
rishikksh20/TFGAN
TFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis - rishikksh20/TFGAN
github.com
HiFi-GAN
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Several recent work on speech synthesis have employed generative adversarial networks (GANs) to produce raw waveforms. Although such methods improve the sampling efficiency and memory usage, their sample quality has not yet reached that of autoregressive a
arxiv.org
jik876/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis - jik876/hifi-gan
github.com
https://github.com/rishikksh20/HiFi-GAN
rishikksh20/HiFi-GAN
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis - rishikksh20/HiFi-GAN
github.com
WaveGrad
WaveGrad: Estimating Gradients for Waveform Generation
This paper introduces WaveGrad, a conditional model for waveform generation which estimates gradients of the data density. The model is built on prior work on score matching and diffusion probabilistic models. It starts from a Gaussian white noise signal a
arxiv.org
ivanvovk/WaveGrad
Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub. - ivanvovk/WaveGrad
github.com