This is a set of Python/PyTorch scripts and tools for various speech-processing projects.

It has been maintained by Xin Wang since 2021.

XW is a PyTorch newbie. Please feel free to give suggestions and feedback.


  • The repo is relatively large. Please use the --depth 1 option for fast cloning.
git clone --depth 1
  • Latest updates:
    1. Code, databases, and resources for the paper below were added. Please check project/10-asvspoof-vocoded-trn-ssl/

      Xin Wang, and Junichi Yamagishi. Can Large-scale vocoded spoofed data improve speech spoofing countermeasure with a self-supervised front end? ICASSP 2024

    2. Neural vocoders pretrained on VoxCeleb2 dev and other datasets are available in the tutorial notebook chapter_a3.ipynb

    3. Code, databases, and resources for the paper below were added. Please check project/09-asvspoof-vocoded-trn/ for more details.

      Xin Wang, and Junichi Yamagishi. Spoofed training data for speech spoofing countermeasure can be efficiently created using neural vocoders. Proc. ICASSP 2023, accepted.

    4. Code for the paper below was added. Please check project/08-asvspoof-activelearn for more details.

      Xin Wang, and Junichi Yamagishi. Investigating Active-Learning-Based Training Data Selection for Speech Spoofing Countermeasure. In Proc. SLT, accepted. 2023.

    5. Pointers to tutorials on neural vocoders were moved to ./tutorials/b1_neural_vocoder.

    6. All pre-trained models were moved to Zenodo.
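
To see what the --depth 1 option mentioned above does, the sketch below builds a throwaway local repository (a stand-in, not this repository), clones it shallowly, and then fetches the remaining history with git fetch --unshallow. The directory names and the three dummy commits are assumptions made only for illustration.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)

# Build a throwaway repository with three commits (stand-in for a large repo)
git init -q "$workdir/origin"
cd "$workdir/origin"
for i in 1 2 3; do
    echo "$i" > file.txt
    git add file.txt
    git -c user.name=demo -c user.email=demo@example.com \
        -c commit.gpgsign=false commit -q -m "commit $i"
done

# Shallow clone: only the latest commit is transferred.
# A file:// URL is used because --depth is ignored for plain local paths.
git clone -q --depth 1 "file://$workdir/origin" "$workdir/shallow"
cd "$workdir/shallow"
shallow_count=$(git rev-list --count HEAD)
echo "commits after shallow clone: $shallow_count"

# If the full history is needed later, it can be fetched afterwards
git fetch -q --unshallow
full_count=$(git rev-list --count HEAD)
echo "commits after unshallow: $full_count"
```

A shallow clone is enough to run the projects; the full history only matters when inspecting older commits.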


This repository contains a few projects and tutorials.


Folder                           Project
project/01-nsf                   Neural source-filter waveform models
project/05-nn-vocoders           Other neural waveform models including WaveNet, WaveGlow, and iLPCNet
project/03-asvspoof-mega         Speech spoofing countermeasures: a comparison of some popular countermeasures
project/06-asvspoof-ood          Speech spoofing countermeasures with confidence estimation
project/07-asvspoof-ssl          Speech spoofing countermeasures with pre-trained self-supervised-learning (SSL) speech feature extractor
project/08-asvspoof-activelearn  Speech spoofing countermeasures in an active learning framework
project/09-asvspoof-vocoded-trn  Speech spoofing countermeasures using vocoded speech as spoofed data

See project/ for an overview.


Folder             Status                   Contents
b1_neural_vocoder  readable and executable  tutorials on selected neural vocoders
b2_anti_spoofing   partially finished       tutorials on speech audio anti-spoofing
b3_voice_privacy   readable and executable  tutorials on voice privacy challenge baselines

See tutorials/ for an overview.

Python environment

The projects above use one of two environments:

For most of the projects, installing env.yml is sufficient:

# create environment
conda env create -f env.yml

# load environment (whose name is pytorch-1.7)
conda activate pytorch-1.7

For projects using SSL models, use ./ to install the dependency.

# make sure other conda envs are not loaded

# load
conda activate fairseq-pip2
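
Because the two environments should not be loaded at the same time, it can help to check which one is active before running a project. The sketch below relies on the CONDA_DEFAULT_ENV variable that conda sets on activation; check_conda_env is a hypothetical helper and not part of this repository.

```shell
# check_conda_env is a hypothetical helper, not part of the repository.
# It returns 0 if the named conda environment is the active one, 1 otherwise.
check_conda_env() {
    expected="$1"
    # conda sets CONDA_DEFAULT_ENV when an environment is activated
    active="${CONDA_DEFAULT_ENV:-}"
    if [ "$active" = "$expected" ]; then
        echo "OK: conda environment '$expected' is active"
        return 0
    fi
    echo "Expected '$expected' but the active environment is '${active:-none}'" >&2
    return 1
}

# Print a reminder instead of failing when the environment is not loaded
check_conda_env pytorch-1.7 || echo "run: conda activate pytorch-1.7"
```

For the SSL-based projects, the same check can be run with fairseq-pip2 as the argument.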

How to use

Most of the projects include a simple demonstration script. Take project/01-nsf/cyc-noise-nsf-4 as an example:

# cd into one project
cd project/01-nsf/cyc-noise-nsf-4

# add PYTHONPATH and activate conda environment
source ../../../ 

# run the script

The printed messages will show what is happening.

Detailed instructions are in the README of each project.
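
The "add PYTHONPATH" step above can be pictured with the minimal sketch below. It is only an illustration under stated assumptions: add_repo_to_pythonpath is a hypothetical function, the repository root is assumed to sit three levels above the project folder, and the environment script shipped with the repository may do more or work differently.

```shell
# add_repo_to_pythonpath is a hypothetical stand-in for the environment step;
# the script shipped with the repository may do more (or something different).
add_repo_to_pythonpath() {
    project_dir="${1:-.}"
    # Assumption: the repository root sits three levels above the project folder
    repo_root=$(cd "$project_dir/../../.." && pwd)
    # Prepend the root so that top-level folders such as core_scripts are importable
    export PYTHONPATH="${repo_root}${PYTHONPATH:+:$PYTHONPATH}"
}

# Demonstration on a throwaway tree that mirrors the layout described here
demo_root=$(mktemp -d)
mkdir -p "$demo_root/core_scripts" "$demo_root/project/01-nsf/cyc-noise-nsf-4"
add_repo_to_pythonpath "$demo_root/project/01-nsf/cyc-noise-nsf-4"
echo "PYTHONPATH=$PYTHONPATH"
```

Prepending rather than appending means the repository's modules take precedence over any same-named packages already on the path.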

Folder structure

Name                Function
./core_scripts      scripts (NumPy or PyTorch code) to manage the training process, data I/O, etc.
./core_modules      finalized PyTorch modules
./sandbox           new functions and modules to be tested
./project           project directories; each folder corresponds to one model for one dataset
./project/*/*/      script to load data and run training and inference
./project/*/*/      model definition based on PyTorch APIs
./project/*/*/      configurations for training/val/test set data
./project/*/*/*.sh  scripts to wrap the Python code

See misc/ for more instructions on the design and conventions of this repository.
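
Given the ./project/*/*/*.sh layout in the table above, the wrapper scripts of all projects can be listed with a plain glob. The sketch below is a hedged illustration: list_wrapper_scripts is a hypothetical helper, and the folder and file names in the demonstration tree are made up.

```shell
# list_wrapper_scripts is a hypothetical helper, not part of the repository.
# It prints every shell wrapper that matches the ./project/*/*/*.sh layout.
list_wrapper_scripts() {
    search_root="${1:-.}"
    for script in "$search_root"/project/*/*/*.sh; do
        # Skip the unexpanded pattern when nothing matches
        if [ -e "$script" ]; then
            printf '%s\n' "$script"
        fi
    done
}

# Demonstration on a throwaway tree (folder and file names are made up)
demo_root=$(mktemp -d)
mkdir -p "$demo_root/project/01-demo/model-a" "$demo_root/project/01-demo/model-b"
touch "$demo_root/project/01-demo/model-a/00_demo.sh"
touch "$demo_root/project/01-demo/model-b/01_train.sh"
touch "$demo_root/project/01-demo/model-b/notes.txt"
found=$(list_wrapper_scripts "$demo_root")
echo "$found"
```

Run from the repository root without an argument, the helper searches the current directory.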

Resources & links

By Xin Wang