Attention-Based Image Captioning Using DenseNet Features

Md Zakir Hossain*, Ferdous Sohel, Mohd Fairuz Shiratuddin, Hamid Laga, Mohammed Bennamoun

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)


We present an attention-based image captioning method that uses DenseNet features. Conventional image captioning methods rely on visual information from the whole scene to generate captions. Such methods often fail to capture information about salient objects and cannot generate semantically correct captions. We instead use an attention mechanism that focuses on the relevant parts of an image to generate a fine-grained description of it. Image features are extracted with DenseNet, and experiments are conducted on the MSCOCO dataset. The proposed method achieves scores of 53.6, 39.8, and 29.5 on the BLEU-2, BLEU-3, and BLEU-4 metrics, respectively, which are superior to those of state-of-the-art methods.
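
The abstract does not specify the form of the attention mechanism, so the following is only a minimal sketch of one common choice, additive soft attention over a grid of region features. The function name, the parameters W_f, W_h, and v, and the 49 x 1024 feature shape (a 7 x 7 grid, as a DenseNet-121-style backbone would produce for a 224 x 224 input) are all assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def soft_attention(features, hidden, W_f, W_h, v):
    """Additive soft attention over image regions (illustrative only).

    features: (L, D) array, one D-dimensional vector per image region
    hidden:   (H,) decoder hidden state at the current word
    W_f (D, A), W_h (H, A), v (A,): hypothetical learned parameters
    Returns (context, weights): context is (D,), weights is (L,).
    """
    # Score each region against the current decoder state.
    scores = np.tanh(features @ W_f + hidden @ W_h) @ v   # shape (L,)
    # Softmax over regions, shifted by the max for numerical stability.
    scores = scores - scores.max()
    weights = np.exp(scores) / np.exp(scores).sum()
    # Context vector: attention-weighted average of the region features.
    context = weights @ features                           # shape (D,)
    return context, weights

# Random stand-ins for real features and trained parameters (assumed shapes).
rng = np.random.default_rng(0)
L, D, H, A = 49, 1024, 512, 256
features = rng.standard_normal((L, D))
hidden = rng.standard_normal(H)
W_f = rng.standard_normal((D, A)) * 0.01
W_h = rng.standard_normal((H, A)) * 0.01
v = rng.standard_normal(A)

context, weights = soft_attention(features, hidden, W_f, W_h, v)
```

In a captioning decoder, a context vector of this kind would typically be recomputed at every step and fed to the language model together with the previous word, so that each generated word can draw on a different part of the image.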

Original language: English
Title of host publication: Neural Information Processing - 26th International Conference, ICONIP 2019, Proceedings
Editors: Tom Gedeon, Kok Wai Wong, Minho Lee
Number of pages: 9
ISBN (Print): 9783030368012
Publication status: Published - 2019
Externally published: Yes
Event: 26th International Conference on Neural Information Processing, ICONIP 2019 - Sydney, Australia
Duration: Dec 12 2019 to Dec 15 2019

Publication series

Name: Communications in Computer and Information Science
Volume: 1143 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937


Conference: 26th International Conference on Neural Information Processing, ICONIP 2019


  • Attention
  • DenseNet
  • Image captioning

ASJC Scopus subject areas

  • General Computer Science
  • General Mathematics

