Text generation recipe #258

Open · wants to merge 53 commits into base: master

Changes from 1 commit (of 53 total)

Commits
8194f3f
Integrate ULMFiT (initial)
Chandu-4444 Jul 19, 2022
7662230
Add Paragraph to train_classifier
Chandu-4444 Jul 19, 2022
1418e50
Add batchseq to pad batch (naive version)
Chandu-4444 Jul 21, 2022
b682105
Remove Project.toml changes.
Chandu-4444 Jul 21, 2022
c1064bc
Add vocab_size to TextClassificationTask
Chandu-4444 Jul 24, 2022
345ced1
Add `vocab_size` to encodings
Chandu-4444 Jul 25, 2022
46b6826
Test `batches` integration with model.
Chandu-4444 Jul 25, 2022
388e8ac
Update load_batchseq function.
Chandu-4444 Jul 25, 2022
8bfe705
Clean up useless code from TextModels.jl.
Chandu-4444 Jul 29, 2022
2482aaa
Update FastText/src/models/pretrain_lm.jl
Chandu-4444 Aug 2, 2022
8bc930d
Update FastText/src/models/dataloader.jl
Chandu-4444 Aug 2, 2022
3882b1d
Update FastText/src/models/custom_layers.jl
Chandu-4444 Aug 2, 2022
3057989
Update FastText/src/models/custom_layers.jl
Chandu-4444 Aug 2, 2022
307fde1
Add `reset!` for AWD_LSTM.
Chandu-4444 Aug 2, 2022
075a21e
Add `textlearner`.
Chandu-4444 Aug 8, 2022
f16ec2c
Complete text classification pipeline.
Chandu-4444 Aug 8, 2022
3469630
Update `LanguageModel` to use `Flux.reset!`.
Chandu-4444 Aug 18, 2022
974a622
Include models.jl file.
Chandu-4444 Aug 23, 2022
8e9f7aa
Start text generation recipe for `imdb`
Chandu-4444 Aug 23, 2022
0d44bbd
Update FastText/src/models/custom_layers.jl
Chandu-4444 Aug 23, 2022
4ad9a12
Update FastText/src/models/custom_layers.jl
Chandu-4444 Aug 23, 2022
080a018
Update FastText/src/models/custom_layers.jl
Chandu-4444 Aug 23, 2022
922334d
Update FastText/src/models/custom_layers.jl
Chandu-4444 Aug 23, 2022
ce34be1
Update FastText/src/models/custom_layers.jl
Chandu-4444 Aug 23, 2022
6cea902
Update FastText/src/models/custom_layers.jl
Chandu-4444 Aug 23, 2022
5159fb8
Update FastText/src/models/custom_layers.jl
Chandu-4444 Aug 23, 2022
c6b69f7
Update FastText/src/models/custom_layers.jl
Chandu-4444 Aug 23, 2022
164fe21
Update FastText/src/models/custom_layers.jl
Chandu-4444 Aug 24, 2022
1ce6df2
Add suggestions and improvements from the call.
Chandu-4444 Aug 26, 2022
712d9c8
Use previous `VarDrop` code for use in Colab.
Chandu-4444 Aug 28, 2022
6d6504b
Use NNlib for scalar indexing
Chandu-4444 Aug 29, 2022
5af6edb
Updates to Project.toml
Chandu-4444 Aug 31, 2022
6636393
Merge branch 'textmodel-integration' into text-generation-recipe
Chandu-4444 Aug 31, 2022
6cc9053
Update code to solve `getfield non-differentiable` error.
Chandu-4444 Sep 1, 2022
d0ee3a4
Add `TextGeneration` task
Chandu-4444 Sep 1, 2022
659e6d1
Modify type params for `LanguageModel` and
Chandu-4444 Sep 1, 2022
0a151b2
Update FastText/src/models/train_text_classifier.jl
Chandu-4444 Sep 1, 2022
12bc9ab
Update FastText/src/models/train_text_classifier.jl
Chandu-4444 Sep 1, 2022
35c345f
Update dtypes to avoid CuArray errors.
Chandu-4444 Sep 6, 2022
cceee46
Add callable TextClassifier
Chandu-4444 Sep 6, 2022
f7d51f6
Update FastText/src/models/custom_layers.jl
Chandu-4444 Sep 8, 2022
90c9a79
Update FastText/src/models/custom_layers.jl
Chandu-4444 Sep 8, 2022
7e7de6d
Update `Flux.reset!()`
Chandu-4444 Sep 12, 2022
aae4442
Merge branch 'textmodel-integration' into text-generation-recipe
Chandu-4444 Sep 13, 2022
c1418b3
Update few Flux.dropout functions.
Chandu-4444 Sep 13, 2022
2c04d19
Update code to avoid non-differentiable error
Chandu-4444 Sep 13, 2022
5b96c78
Merge branch 'textmodel-integration' into text-generation-recipe
Chandu-4444 Sep 16, 2022
9c60de6
Add batch generation for generation task.
Chandu-4444 Sep 19, 2022
3903bb2
Push to test on colab
Chandu-4444 Sep 21, 2022
d4aa13c
Add blockmodel for LanguageModel
Chandu-4444 Sep 21, 2022
fb69dc5
Fix `TextClassificationTask`
Chandu-4444 Sep 21, 2022
416a800
Replace `map` with `mapobs`
Chandu-4444 Sep 24, 2022
232f3bf
Update `onehot` encode for NumberVector
Chandu-4444 Sep 26, 2022
Test batches integration with model.
Chandu-4444 committed Jul 25, 2022

commit 46b6826766be4e81be8fb05a78aa70e66bfd59a2
1 change: 1 addition & 0 deletions FastText/src/FastText.jl
@@ -39,6 +39,7 @@ using DataDeps
 using BSON
 using TextAnalysis
 using MLUtils
+using Zygote


 include("recipes.jl")
2 changes: 1 addition & 1 deletion FastText/src/encodings/textpreprocessing.jl
@@ -70,7 +70,7 @@ function computevocabulary(data; vocab_size=40000)
 counter = 3

 for (k, v) in ordered_dict
-ordered_dict[k] = counter + 1
+ordered_dict[k] = counter
 counter = counter + 1
 end
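The change makes the assigned ids line up with `counter` instead of running one ahead of it. A minimal sketch of the effect (the token counts are made up; the `counter = 3` start suggests the first ids are reserved for special tokens):

```julia
using OrderedCollections   # assumed source of OrderedDict for this sketch

ordered_dict = OrderedDict("the" => 120, "movie" => 95, "plot" => 40)  # token => frequency
let counter = 3
    for (k, v) in ordered_dict
        ordered_dict[k] = counter   # was `counter + 1`, which skipped id 3 and shifted every id
        counter += 1
    end
end
ordered_dict   # OrderedDict("the" => 3, "movie" => 4, "plot" => 5)
```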

6 changes: 3 additions & 3 deletions FastText/src/models/pretrain_lm.jl
@@ -26,9 +26,9 @@ mutable struct LanguageModel
 layers :: Flux.Chain
 end

-function LanguageModel(load_pretrained::Bool=false, vocabpath::String=joinpath(@__DIR__,"vocabs/lm_vocab.csv");embedding_size::Integer=400, hid_lstm_sz::Integer=1150, out_lstm_sz::Integer=embedding_size,
+function LanguageModel(load_pretrained::Bool=false, task::Any = Nothing;embedding_size::Integer=400, hid_lstm_sz::Integer=1150, out_lstm_sz::Integer=embedding_size,
 embed_drop_prob::Float64 = 0.05, in_drop_prob::Float64 = 0.4, hid_drop_prob::Float64 = 0.5, layer_drop_prob::Float64 = 0.3, final_drop_prob::Float64 = 0.3)
-vocab = (string.(readdlm(vocabpath, ',')))[:, 1]
+vocab = task.encodings[3].vocab.keys
 de = gpu(DroppedEmbeddings(length(vocab), embedding_size, embed_drop_prob; init = (dims...) -> init_weights(0.1, dims...)))
 lm = LanguageModel(
 vocab,
@@ -45,7 +45,7 @@ function LanguageModel(load_pretrained::Bool=false, vocabpath::String=joinpath(@
 softmax
 )
 )
-load_pretrained && load_model!(lm, datadep"Pretrained ULMFiT Language Model/ulmfit_lm_en.bson")
+# load_pretrained && load_model!(lm, datadep"Pretrained ULMFiT Language Model/ulmfit_lm_en.bson")
 return lm
 end
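With this change the constructor no longer reads `vocabs/lm_vocab.csv`: the vocabulary comes from the task's third encoding, and loading the pretrained weights is disabled for now. A hedged usage sketch — the mock task below only mimics the `task.encodings[3].vocab.keys` access path and is not part of this PR:

```julia
struct MockVocab; keys::Vector{String}; end
struct MockEncoding; vocab::MockVocab; end
struct MockTask; encodings::Vector{Any}; end

# In the recipe the task would come from the imdb pipeline; this stand-in just has to
# expose `encodings[3].vocab.keys`, which the constructor uses to size DroppedEmbeddings.
task = MockTask([nothing, nothing, MockEncoding(MockVocab(["_unk_", "the", "movie", "plot"]))])
lm = LanguageModel(false, task)
```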

35 changes: 19 additions & 16 deletions FastText/src/models/train_text_classifier.jl
@@ -89,21 +89,23 @@ gen : data loader, which will give 'X' of the mini-batch in one call
tracked_steps : This is the number of tracked time-steps for Truncated Backprop thorugh time,
these will be last time-steps for which gradients will be calculated.
"""
function forward(tc::TextClassifier, gen::Channel, tracked_steps::Integer=32)
function forward(tc::TextClassifier, batches, tracked_steps::Integer=32)
# swiching off tracking
classifier = tc
X = take!(gen)
# println("X = $X")
# X = take!(gen)
X = batches[1][1]
l = length(X)
# Truncated Backprop through time
println("l = $l")
Zygote.ignore() do
for i=1:ceil(l/tracked_steps)-1 # Tracking is swiched off inside this loop
println("i = $i / $(ceil(l/tracked_steps)-1)")
(i == 1 && l%tracked_steps != 0) ? (last_idx = l%tracked_steps) : (last_idx = tracked_steps)
H = broadcast(x -> indices(x, classifier.vocab, "_unk_"), X[1:last_idx])
# H = broadcast(x -> indices(x, classifier.vocab, "_unk_"), X[1:last_idx])
H = X[1:last_idx]
H = classifier.rnn_layers.(H)
X = X[last_idx+1:end]
println(length(X))
end

println("Start shifting states")
@@ -125,7 +127,8 @@ function forward(tc::TextClassifier, gen::Channel, tracked_steps::Integer=32)
end
println("End shifting")
# last part of the sequecnes in X - Tracking is swiched on
H = broadcast(x -> tc.rnn_layers[1](indices(x, classifier.vocab, "_unk_")), X)
# H = broadcast(x -> tc.rnn_layers[1](indices(x, classifier.vocab, "_unk_")), X)
H = classifier.rnn_layers[1](X[1])
H = tc.rnn_layers[2:end].(H)
H = tc.linear_layers(H)
return H
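For orientation, the bookkeeping in `forward` keeps gradient tracking off for everything except the last `tracked_steps` time steps. A worked example of that arithmetic with illustrative values:

```julia
l, tracked_steps = 70, 32                        # sequence length, tracked window
n_untracked = Int(ceil(l / tracked_steps)) - 1   # 2 chunks run inside Zygote.ignore
# chunk 1: l % tracked_steps = 6 ≠ 0, so last_idx = 6  -> steps 1:6, no gradients
# chunk 2: last_idx = tracked_steps = 32               -> steps 7:38, no gradients
# the remaining 32 steps (39:70) run with tracking on, so only they contribute gradients
```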
@@ -144,20 +147,21 @@ classifier : Instance of TextClassifier
gen : 'Channel' [data loader], to give a mini-batch
tracked_steps : specifies the number of time-steps for which tracking is on
"""
function loss(classifier::TextClassifier, gen::Channel, tracked_steps::Integer=32)
H = forward(classifier, gen, tracked_steps)
Y = gpu(take!(gen))
l = crossentropy(H, Y)
function loss(classifier::TextClassifier, batches, tracked_steps::Integer=32)
H = forward(classifier, batches, tracked_steps)
# Y = gpu(take!(gen))
Y = batches[1][2]
l = Flux.Losses.crossentropy(H, Y)
# reset!(classifier.rnn_layers)
println("Loss = $l")
return l
end

function discriminative_step!(layers, classifier::TextClassifier, gen::Channel, tracked_steps::Integer, ηL::Float64, opts::Vector)
function discriminative_step!(layers, classifier::TextClassifier, batches, tracked_steps::Integer, ηL::Float64, opts::Vector)
@assert length(opts) == length(layers)
# Gradient calculation
println("Start grads")
grads = Zygote.gradient(() -> loss(classifier, gen, tracked_steps), get_trainable_params(layers))
grads = Zygote.gradient(() -> loss(classifier, batches, tracked_steps), get_trainable_params(layers))

println("Done grads")
# discriminative step
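`discriminative_step!` applies one optimiser per unfrozen layer group. A minimal Flux sketch of that pattern with stand-in layers and loss; the per-group learning-rate decay of 2.6 is the usual ULMFiT choice and an assumption here, not something shown in this diff:

```julia
using Flux, Zygote

groups = [Dense(4 => 4, relu), Dense(4 => 2)]          # stand-ins for the RNN / linear groups
ηL = 0.004                                             # learning rate of the last (top) group
opts = [Flux.Optimise.Descent(ηL / 2.6), Flux.Optimise.Descent(ηL)]

x = rand(Float32, 4, 8)
y = Flux.onehotbatch(rand(0:1, 8), 0:1)
ps = Flux.params(groups...)
grads = Zygote.gradient(() -> Flux.Losses.logitcrossentropy(Chain(groups...)(x), y), ps)

for (layer, opt) in zip(groups, opts)                  # each group updated with its own rate
    for p in Flux.params(layer)
        Flux.Optimise.update!(opt, p, grads[p])
    end
end
```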
@@ -179,9 +183,8 @@ end
It contains main training loops for training a defined classifer for specified classes and data.
Usage is discussed in the docs.
"""
function train_classifier!(classifier::TextClassifier=TextClassifier(), data = (loadrecipe()["imdb"]))
function train_classifier!(classifier::TextClassifier=TextClassifier(), batches=Nothing)

# dala_loader = imdb_classifier_data
classes = 2
hidden_layer_size = 50
stlr_cut_frac=0.1
@@ -201,8 +204,8 @@ function train_classifier!(classifier::TextClassifier=TextClassifier(), data = (

for epoch=1:epochs
println("Epoch: $epoch")
gen = data
num_of_iters = numobs(data)
gen = batches
num_of_iters = length(batches)
cut = num_of_iters * epochs * stlr_cut_frac
for iter=1:num_of_iters

@@ -216,7 +219,7 @@ function train_classifier!(classifier::TextClassifier=TextClassifier(), data = (
# Gradual-unfreezing Step with discriminative fine-tuning
unfreezed_layers, cur_opts = (epoch < length(trainable)) ? (trainable[end-epoch+1:end], opts[end-epoch+1:end]) : (trainable, opts)
println("start discriminative_step")
discriminative_step!(unfreezed_layers, classifier, gen, tracked_steps,ηL, cur_opts)
discriminative_step!(unfreezed_layers, classifier, batches, tracked_steps,ηL, cur_opts)

println("End discriminative_step")