TY - GEN
T1 - Text-to-Sign Language Video Generation Using GANs, BERT, and Sora
AU - Kumar, Yulia
AU - Niu, Beining
AU - Lin, Mengtian
AU - Mudholker, Nidhi
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
AB - This paper explores the generation of American Sign Language (ASL) videos using Generative Adversarial Networks (GANs), BERT-based text embeddings, and a dataset comprising authentic and synthetic SL clips. The original Kaggle dataset was enriched by creating a manually crafted collection. OpenAI's Sora video generator was then employed to augment the dataset by producing synthetic videos using multimodal prompts. Researchers implemented and compared several GAN architectures, including unimodal and Feature-wise Linear Modulation (FiLM) 3D convolutional generators, which integrate text embeddings for modal fusion. While preliminary training was conducted, quantitative evaluation revealed significant challenges in generating realistic and coherent ASL videos. Current results highlight the complexities of ASL video synthesis and underscore the need for advanced ASL generative applications.
KW - GANs for ASL
KW - Sora
KW - Synthetic ASL Dataset
UR - https://www.scopus.com/pages/publications/105017765640
U2 - 10.1109/ISEC64801.2025.11147381
DO - 10.1109/ISEC64801.2025.11147381
M3 - Conference contribution
AN - SCOPUS:105017765640
T3 - 2025 15th IEEE Integrated STEM Education Conference, ISEC 2025
BT - 2025 15th IEEE Integrated STEM Education Conference, ISEC 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 15th IEEE Integrated STEM Education Conference, ISEC 2025
Y2 - 15 March 2025
ER -