DM1_variant_repeat

Determining the variant repeat length of Myotonic Dystrophy type 1 (DM1) patients from raw sequences
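To illustrate what counting variant repeats can look like, here is a minimal toy sketch. It is not the method implemented in this repo. It assumes the repeat unit is CTG (the unit expanded in DM1), that the tract starts at the first CTG in the read, and that everything after that point is in-frame repeat sequence. The function names `count_triplets` and `variant_repeat_count` are hypothetical.

```python
from collections import Counter


def count_triplets(read, unit="CTG"):
    """Count in-frame triplets from the first occurrence of `unit` onward.

    Toy assumption: everything after the first `unit` belongs to the
    repeat tract (no flanking sequence, no sequencing errors).
    """
    read = read.upper()
    start = read.find(unit)
    if start == -1:
        return Counter()
    tract = read[start:]
    # Step through the tract three bases at a time, dropping any
    # incomplete triplet at the end.
    triplets = [tract[i:i + 3] for i in range(0, len(tract) - 2, 3)]
    return Counter(triplets)


def variant_repeat_count(read, unit="CTG"):
    """Number of in-frame triplets that differ from the pure repeat unit."""
    counts = count_triplets(read, unit)
    return sum(n for triplet, n in counts.items() if triplet != unit)


if __name__ == "__main__":
    example = "AACTGCTGCCGCTGCTG"  # made-up read, not real patient data
    print(count_triplets(example))
    print(variant_repeat_count(example))
```

Real reads contain flanking sequence and sequencing errors, which is why the code in this repo relies on alignment scoring, sliding windows and HMMs rather than plain triplet counting.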

Notable project files:

optimizations.txt lists all attempted changes/optimisations (description, test, reasoning...)
pyth_orig/ contains the original python implementation from https://github.com/picrin/science_of_alignment/blob/master/variant_repeat_algorithm.ipynb
scorecode/ contains the alignment scoring code. To compile, run ./compile.sh
scorevisualizers/ contains visualizers for the outputs of the alignment scoring code (distribution of results per parameter, param vs. param etc.). Run python3 somevisualizer.py to get usage, or add -h for a list of options.
sequences/ contains sample DNA sequences: real.txt is parsed from GitHub, garbagefree.txt is the same data after trimming, and raw/ contains the original raw data. For privacy reasons, they are not part of the repo. The raw data is on S3 in the dm1data bucket.
tests/ contains the output of testing different optimisations
outputs/ contains data-outputs
windowcode/ and windowvisualizers/ are a similar pair for an (un)supervised sliding-window approach
HMM/ contains code for the Hidden Markov Model approach
Tablet/ contains files used for visualizing our data in Tablet (early work in progress)

All tests were run on a single core of an Intel i7-5500U CPU @ 2.40GHz.

