Image Caption Generator Implementation

Volume: 10 | Issue: 02 | Year: 2024 | Subscription
International Journal of Image Processing and Pattern Recognition
Received Date: 05/06/2024
Acceptance Date: 06/22/2024
Published On: 2024-10-08

By: Sejal Jain, Saloni Agarwal, Sonam Gour, Sanjivani Sharma, and Shrishti Agarwal


Image caption generation takes an image as input and produces a descriptive text caption. The technology has applications in several fields. In medicine, automatically generated captions for medical images can aid diagnosis and make reporting more efficient, helping healthcare professionals interpret complex visuals quickly. In autonomous vehicles, image captioning helps a vehicle understand and communicate about its surroundings, which can improve safety and navigation. In journalism, captions generated for news images can improve reader comprehension and engagement. This paper gives an overview of the technologies that can be used to build an image caption generator with the Flickr8K dataset from Kaggle. The implementation uses tools such as OpenCV, which are widely used by leading technology companies such as Google and Microsoft. The paper also includes snapshots of the generated outputs to illustrate the model’s effectiveness. The primary aim of this implementation is to gain insight into the practical use of these tools and technologies in real-world projects.



