Abstract: While transformers demonstrate outstanding performance across various audio tasks, their application to neural vocoders remains challenging. Neural vocoders require the generation of long ...
Abstract: This paper introduces V2Coder, a non-autoregressive vocoder based on hierarchical variational autoencoders (VAEs). The hierarchical VAE with hierarchically extended prior and approximate ...
RepCodec is a speech tokenization method for converting a speech waveform into a sequence of discrete semantic tokens. The main idea is to train a representation codec which learns a vector ...