Extract Genotypes

Based on the selected SNPs, extract genotypes from VCF files.

SNP selection using Pan-UKB

If the Pan-UKB GWAS is used for SNP selection, the combined_gwas.csv file from the pan_ukbiobank_gwas workflow is needed to select the SNPs and save their positions in data/selected_genotypes with the help of 1_rank_chr_pos.py. The script also writes a combined TSV with the positions from all chromosomes.
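
The selection step can be pictured with the minimal sketch below. It is not the code of 1_rank_chr_pos.py: the column names `chr`, `pos` and `pval`, the `top_n` cut-off, and the combined file name `genotypes_all.tsv` are assumptions made for illustration. Only the `genotypes_chr<N>.tsv` naming follows the file used in the manual example.

```python
import csv
from collections import defaultdict
from pathlib import Path


def write_selected_positions(gwas_csv, out_dir, top_n=100):
    """Rank SNPs by p-value and write CHROM<TAB>POS files for bcftools.

    Assumed (hypothetical) input columns: chr, pos, pval.
    """
    with open(gwas_csv, newline="") as handle:
        rows = list(csv.DictReader(handle))

    # Smallest p-values first, then keep the top_n strongest associations.
    rows.sort(key=lambda row: float(row["pval"]))
    selected = rows[:top_n]

    # Group positions by chromosome; a set drops duplicated positions.
    by_chrom = defaultdict(set)
    for row in selected:
        by_chrom[row["chr"]].add(int(row["pos"]))

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # One headerless, tab-separated file per chromosome, sorted by position.
    # The two-column CHROM/POS layout is accepted by
    # `bcftools query --regions-file`.
    for chrom, positions in by_chrom.items():
        with open(out_dir / f"genotypes_chr{chrom}.tsv", "w") as handle:
            for pos in sorted(positions):
                handle.write(f"{chrom}\t{pos}\n")

    # Combined file with the positions from all chromosomes
    # (file name is hypothetical).
    with open(out_dir / "genotypes_all.tsv", "w") as handle:
        for chrom in sorted(by_chrom, key=str):
            for pos in sorted(by_chrom[chrom]):
                handle.write(f"{chrom}\t{pos}\n")

    return {chrom: sorted(positions) for chrom, positions in by_chrom.items()}
```

The chromosome labels written to these files have to match the contig names in the VCF files, otherwise bcftools will not find any of the regions.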

Run the Snakemake workflow

Uses the bcftools/1.20 environment module on our cluster.

snakemake -c1 --use-envmodules -n  # dry-run
snakemake -c1 --use-envmodules  #  run with one core
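
The Snakefile itself is not reproduced in this README. As an assumption about what a single per-chromosome job amounts to, the sketch below builds the same `bcftools query` call that appears in the manual example, once per chromosome. The function name `bcftools_query_command` is hypothetical; the paths and the format string are copied from the manual example.

```python
# Paths and format string as used in the manual example of this README.
VCF_DIR = "/datasets/ukb_32683-AUDIT/genotype/cur/vcf"
REGIONS_DIR = "data/selected_genotypes"
# Raw string: bcftools itself interprets the \t and \n escapes.
QUERY_FORMAT = r"%CHROM\t%POS\t%REF\t%ALT[\t%GT]\n"


def bcftools_query_command(chrom):
    """Build the argument list for one chromosome (hypothetical helper).

    The command is only assembled here; the caller decides how to run it.
    """
    vcf = f"{VCF_DIR}/ukb32683_cal_chr{chrom}_v2.vcf.gz"
    regions = f"{REGIONS_DIR}/genotypes_chr{chrom}.tsv"
    return [
        "bcftools", "query",
        "--regions-file", regions,
        "--format", QUERY_FORMAT,
        vcf,
    ]
```

With the bcftools module loaded, such a list could be passed to `subprocess.run(cmd, stdout=handle, check=True)` with `handle` opened on the desired output file.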

Manual example

Uses the called VCF files under /datasets/ukb_32683-AUDIT/genotype/cur/vcf/ and the imputed VCF files under /datasets/ukb_32683-AUDIT/imputed_genotype/cur/vcf/.

# Load bcftools together with the perl and gsl modules
module load perl/5.38.0 gsl/2.5 bcftools/1.20
# Extract called genotypes at the selected positions on chromosome 1
# (the regions file must belong to the same chromosome as the VCF file)
VCF_FILE=/datasets/ukb_32683-AUDIT/genotype/cur/vcf/ukb32683_cal_chr1_v2.vcf.gz
bcftools query --regions-file data/selected_genotypes/genotypes_chr1.tsv "$VCF_FILE" --format '%CHROM\t%POS\t%REF\t%ALT[\t%GT]\n' > chr1_genotypes.tsv
# Extract imputed genotypes into a separate file so the called genotypes are not overwritten
VCF_FILE=/datasets/ukb_32683-AUDIT/imputed_genotype/cur/vcf/ukb32683_imp_chr1_v3.vcf.gz
bcftools query --regions-file data/selected_genotypes/genotypes_chr1.tsv "$VCF_FILE" --format '%CHROM\t%POS\t%REF\t%ALT[\t%GT]\n' > chr1_imputed_genotypes.tsv
# List the sample IDs in the order of the genotype columns
bcftools query --list-samples "$VCF_FILE" > samples.txt
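
Each line of chr1_genotypes.tsv holds CHROM, POS, REF and ALT followed by one GT string per sample, in the order listed in samples.txt. As a minimal sketch of how such a line could be read downstream (this is not the code of reformat_encoded.py; the function names and the `alt_counts` key are hypothetical), GT strings can be turned into counts of non-reference alleles:

```python
def alt_allele_count(gt):
    """Count non-reference alleles in a GT string such as '0/1' or '1|1'.

    Returns None when any allele is missing ('.').
    """
    alleles = gt.replace("|", "/").split("/")
    if any(allele == "." for allele in alleles):
        return None
    return sum(allele != "0" for allele in alleles)


def parse_query_line(line, samples):
    """Split one bcftools query output line into site fields and genotypes."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, ref, alt = fields[:4]
    genotypes = fields[4:]
    if len(genotypes) != len(samples):
        raise ValueError("number of genotypes does not match number of samples")
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt,
        # One entry per sample, keyed by the sample ID from samples.txt.
        "alt_counts": {
            sample: alt_allele_count(gt)
            for sample, gt in zip(samples, genotypes)
        },
    }
```

For example, `parse_query_line("1\t12345\tA\tG\t0/0\t0/1\t./.\n", ["s1", "s2", "s3"])` yields the counts 0, 1 and None for the three made-up samples. At multi-allelic sites every non-reference allele is counted alike, which may or may not be the desired encoding.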
