
Deep Learning Techniques for Automatic Image Captioning

This repository accompanies the final-year thesis titled "Deep Learning Techniques for Automatic Image Captioning", completed under the supervision of Dr. Anamika Singh at Visvesvaraya National Institute of Technology, Nagpur. The repo contains the following implementations of image-captioning models, trained on the Flickr8k and Flickr30k datasets (a minimal architecture sketch follows the list):

  1. ResNet image encoder and RNN text decoder
  2. ResNet image encoder and GRU text decoder
  3. ResNet image encoder and LSTM text decoder
  4. ResNet image encoder and Transformer text decoder
  5. ResNet image encoder with a modified Transformer decoder
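
All of these models share a common encoder-decoder pattern: a pretrained ResNet backbone produces an image embedding that conditions an autoregressive text decoder. The sketch below is a minimal illustration of that pattern, assuming PyTorch and torchvision; the class names (`ResNetEncoder`, `LSTMDecoder`) and hyperparameters are illustrative and not the repository's actual code.

```python
# Minimal sketch (not the repository's actual code) of the shared
# encoder-decoder pattern, assuming PyTorch and torchvision.
import torch
import torch.nn as nn
import torchvision.models as models


class ResNetEncoder(nn.Module):
    """Extracts a fixed-size image feature with a pretrained ResNet backbone."""

    def __init__(self, embed_dim: int):
        super().__init__()
        resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        # Drop the classification head; keep the pooled 2048-d feature.
        self.backbone = nn.Sequential(*list(resnet.children())[:-1])
        self.proj = nn.Linear(resnet.fc.in_features, embed_dim)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        # Backbone treated as a frozen feature extractor in this sketch.
        with torch.no_grad():
            feats = self.backbone(images).flatten(1)   # (B, 2048)
        return self.proj(feats)                        # (B, embed_dim)


class LSTMDecoder(nn.Module):
    """Generates caption tokens conditioned on the image embedding."""

    def __init__(self, vocab_size: int, embed_dim: int, hidden_dim: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, image_emb: torch.Tensor, captions: torch.Tensor) -> torch.Tensor:
        # Prepend the image embedding as the first "token" of the sequence.
        tokens = self.embed(captions)                              # (B, T, E)
        inputs = torch.cat([image_emb.unsqueeze(1), tokens], dim=1)
        hidden, _ = self.lstm(inputs)
        return self.fc(hidden)                                     # (B, T+1, V)
```

Swapping `LSTMDecoder` for an `nn.RNN`-, `nn.GRU`-, or `nn.TransformerDecoder`-based module gives the other variants in the list above.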

Among these, the last model (ResNet image encoder with a modified Transformer decoder) is a plausible novelty: it makes the captioning training process converge considerably faster, and its loss settles at a lower value than that of its vanilla Transformer counterpart. All models were evaluated by computing METEOR, BLEU and ROUGE scores for the generated captions.
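
These metrics can be computed with off-the-shelf libraries. The snippet below is a hedged sketch (not necessarily the exact tooling used in the thesis), assuming the `nltk` and `rouge-score` packages; the reference and prediction strings are taken from the sample output shown further down.

```python
# Hedged sketch of caption evaluation, assuming nltk and rouge-score are installed.
# METEOR additionally needs: nltk.download("wordnet")
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.translate.meteor_score import meteor_score
from rouge_score import rouge_scorer

reference = "A snowboarder wearing a red jacket is boarding down the snowy hill ."
prediction = "snowboarder in all red coat slides riding down the slope hill ."

ref_tokens = reference.lower().split()
pred_tokens = prediction.lower().split()

# Sentence-level BLEU-4, smoothed so short captions do not score exactly zero.
bleu = sentence_bleu([ref_tokens], pred_tokens,
                     smoothing_function=SmoothingFunction().method1)

# METEOR expects pre-tokenized references and hypothesis.
meteor = meteor_score([ref_tokens], pred_tokens)

# ROUGE-L F-measure on the raw strings.
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
rouge_l = scorer.score(reference, prediction)["rougeL"].fmeasure

print(f"BLEU: {bleu:.3f}  METEOR: {meteor:.3f}  ROUGE-L: {rouge_l:.3f}")
```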

A part of this work, in the form of a literature survey of transformer-based image captioning, was presented in the paper titled "Attending to transformer: A survey on transformer-based image captioning", which was accepted at the 2nd International Conference on Paradigm Shifts in Communication, Embedded Systems, Machine Learning and Signal Processing.

Plots

Evaluation scores and training loss curves for the implemented models.

Sample Outputs

Prediction: snowboarder in all red coat slides riding down the slope hill . . . . . . . . . . . . . . . . . .

Reference: A snowboarder wearing a red jacket is boarding down the snowy hill .

Contributors

  • Kshitij Ambilduke
  • Thanmay Jayakumar
