The Multilingual Eyes Multimodal Traveler’s App

Wilbert Villalobos, Yulia Kumar, J. Jenny Li

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

1 Scopus citation

Abstract

This paper presents an in-depth analysis of “The Multilingual Eyes Multimodal Traveler’s App” (MEMTA), a novel application in the realm of travel technology, leveraging advanced Artificial Intelligence (AI) capabilities. The core of MEMTA’s innovation lies in its integration of multimodal Large Language Models (LLMs), notably ChatGPT-4-Vision, to enhance navigational assistance and situational awareness for tourists and visually impaired individuals in diverse environments. The study rigorously evaluates how the incorporation of OpenAI’s Whisper and DALL-E 3 technologies augments the app’s proficiency in real-time, multilingual translation, pronunciation, and visual content generation, thereby significantly improving the user experience in various geographical settings. A key focus is placed on the development and impact of a custom GPT model, Susanin, designed specifically for the app, highlighting its advancements in Human-AI interaction and accessibility over standard LLMs. The paper thoroughly explores the practical applications of MEMTA, extending its utility beyond mere travel assistance to sectors such as robotics, virtual reality, and military operations, thus underscoring its multifaceted significance. Through this exploration, the study contributes novel insights into the fields of AI-enhanced travel, assistive technologies, and the broader scope of human-AI interaction.

Original language: English
Title of host publication: Proceedings of 9th International Congress on Information and Communication Technology - ICICT 2024
Editors: Xin-She Yang, Simon Sherratt, Nilanjan Dey, Amit Joshi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 565-575
Number of pages: 11
ISBN (Print): 9789819733040
State: Published - 2024
Event: 9th International Congress on Information and Communication Technology, ICICT 2024 - London, United Kingdom
Duration: 19 Feb 2024 - 22 Feb 2024

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 1004 LNNS
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389

Conference

Conference: 9th International Congress on Information and Communication Technology, ICICT 2024
Country/Territory: United Kingdom
City: London
Period: 19/02/24 - 22/02/24

Keywords

  • AI in travel
  • Assistive navigation technologies
  • Human-AI interaction in tourism
  • Multimodal LLMs
  • Real-time multilingual translation
