Skip to content

Latest commit

 

History

History

seq2seq_dialogue_1

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 

Seq2Seq for Translation or Dialogue (1)

This is an implementation of Seq2Seq to make it easy to train the seq2seq model with any corpus for translation or dialogue task.

The original codes are available here: [suriyadeepan/easy_seq2seq].

If you using the provided sample dataset, the unicode error may happens. thus, before training, you should follow the suggestions of UnicodeError issue to solve it: [link], or using the following command to convert all of the training/testing corpus to utf-8 format:

$ iconv -f WINDOWS-1252 -t UTF-8//TRANSLIT old_files -o new_files

To train the model, Create temporary working directory prior to training:

$ mkdir working_dir

Then Download test/train data from Cornell Movie Dialog Corpus and convert to utf-8 format (as described before):

$ cd data/
$ ./pull_data.sh
$ iconv -f WINDOWS-1252 -t UTF-8//TRANSLIT old_files -o new_files

Finally training or testing:

# edit seq2seq.ini file to set: mode = train
python seq2seq_main.py
# edit seq2seq.ini file to set: mode = test
python seq2seq_main.py

Test on Python3.6 + tensorflow==0.12.1