M-FFN: multi-scale feature fusion network for image captioning

Prudviraj, Jeripothula and Vishnu, Chalavadi and C, Krishna Mohan (2022) M-FFN: multi-scale feature fusion network for image captioning. Applied Intelligence. ISSN 0924-669X

Full text not available from this repository.

Abstract

In this work, we present a novel multi-scale feature fusion network (M-FFN) for the image captioning task that incorporates discriminative features and scene contextual information of an image. We construct the network by combining spatial transformation and multi-scale feature pyramid networks through a feature fusion block to enrich spatial and global semantic information. In particular, the multi-scale feature pyramid network incorporates global contextual information by employing atrous convolutions on the top layers of a convolutional neural network (CNN), while the spatial transformation network is applied to the early layers of the CNN to remove intra-class variability caused by spatial transformations. The feature fusion block then integrates the global contextual information and the spatial features to encode the visual information of an input image. Moreover, a spatial-semantic attention module is incorporated to learn attentive contextual features that guide the captioning module. The efficacy of the proposed model is evaluated on the COCO dataset. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
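
As an illustration of the atrous (dilated) convolution idea mentioned in the abstract, the following is a minimal pure-Python sketch and not the authors' implementation. It works on a 1-D signal for brevity, and the function names `atrous_conv1d`, `receptive_field`, and `multi_scale_features`, the dilation rates (1, 2, 4), and fusion by per-position concatenation are all assumptions made for this example. It shows that spacing the kernel taps `rate` samples apart widens the receptive field without adding weights, which is how a multi-scale branch can gather broader context.

```python
from typing import List, Sequence


def atrous_conv1d(signal: Sequence[float], kernel: Sequence[float], rate: int) -> List[float]:
    """Dilated 1-D convolution (cross-correlation form, as used in CNNs).

    Zero padding keeps the output the same length as the input. Kernel taps
    are spaced `rate` samples apart, so the kernel covers more of the input
    while keeping the same number of weights.
    """
    if rate < 1:
        raise ValueError("rate must be >= 1")
    n, k = len(signal), len(kernel)
    center = k // 2
    out = []
    for i in range(n):
        acc = 0.0
        for j in range(k):
            idx = i + (j - center) * rate
            if 0 <= idx < n:  # taps that fall outside the signal read zero padding
                acc += kernel[j] * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, rate: int) -> int:
    """Number of input samples spanned by one dilated kernel."""
    return (kernel_size - 1) * rate + 1


def multi_scale_features(signal, kernel, rates=(1, 2, 4)):
    """Apply the same kernel at several rates and stack the results per position.

    Concatenation is a hypothetical stand-in for a fusion step; the paper's
    feature fusion block is not specified in the abstract.
    """
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [list(values) for values in zip(*branches)]


impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
features = multi_scale_features(impulse, [1, 1, 1])
```

With a 3-tap kernel, `receptive_field(3, 1)`, `receptive_field(3, 2)`, and `receptive_field(3, 4)` evaluate to 3, 5, and 9 samples respectively, so the branch with rate 4 already sees the whole 9-sample `impulse` signal from its center position.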

IITH Creators: C, Krishna Mohan (ORCiD: https://orcid.org/0000-0002-7316-0836)
Item Type: Article
Uncontrolled Keywords: Convolutional captioning; Image captioning; Language attributes; Language generation
Subjects: Computer science
Divisions: Department of Computer Science & Engineering
Depositing User: LibTrainee 2021
Date Deposited: 16 Jul 2022 10:55
Last Modified: 16 Jul 2022 10:55
URI: http://raiithold.iith.ac.in/id/eprint/9751
Publisher URL: http://doi.org/10.1007/s10489-022-03463-x
OA policy: https://v2.sherpa.ac.uk/id/publication/11767