Artificial intelligence-generated Arabic subtitles

insights from Veed.io's automatic speech recognition system of Jordanian Arabic

Authors

W. M. Akasheh; A. S. Haider; B. Al-Saideen; Y. Sahari

DOI:

https://doi.org/10.1590/1983-3652.2024.46952

Keywords:

Subtitles, Auto-generated subtitles, Automatic Speech Recognition, Linguistics, Jordanian Arabic

Abstract

This paper examines the errors that Veed.io's automatic speech recognition (ASR) system produces when transcribing Jordanian Arabic utterances into subtitles. It also proposes a new classification for subtitles generated with artificial intelligence technology. Combining qualitative and quantitative analyses, the study examines the types of errors and their impact on comprehension. Based on linguistic and phonetic analysis, the errors observed in the generated subtitles are categorised into three main types: deletions, substitutions, and insertions. The quantitative analysis yields a word error rate (WER) of 38.857% and shows that deletions are the most common type of error, followed by substitutions and insertions. The study recommends further research on ASR systems for Arabic dialects and advises subtitlers to be aware of the limitations of these systems and to edit and supervise their output appropriately.
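
The abstract does not state how the WER was calculated. A common definition is WER = (S + D + I) / N, where S, D, and I are the numbers of substituted, deleted, and inserted words and N is the number of words in the reference transcript. The sketch below illustrates that definition with a word-level edit-distance alignment. It is an assumption about the general technique, not the authors' procedure; the names `ErrorCounts` and `count_errors` and the English sample sentences are hypothetical and do not come from the study's data.

```python
from typing import NamedTuple


class ErrorCounts(NamedTuple):
    """Word-level error tallies for one reference/hypothesis pair."""
    substitutions: int
    deletions: int
    insertions: int
    reference_length: int

    @property
    def wer(self) -> float:
        """WER = (S + D + I) / N, with N the number of reference words."""
        errors = self.substitutions + self.deletions + self.insertions
        return errors / self.reference_length


def count_errors(reference: str, hypothesis: str) -> ErrorCounts:
    """Align two whitespace-tokenised transcripts and count error types."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # best[i][j] = (total edits, S, D, I) for aligning ref[:i] with hyp[:j].
    best = [[(0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        best[i][0] = (i, 0, i, 0)  # every reference word deleted
    for j in range(1, len(hyp) + 1):
        best[0][j] = (j, 0, 0, j)  # every hypothesis word inserted

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            t, s, d, n = best[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (t, s, d, n)          # match, no cost
            else:
                diagonal = (t + 1, s + 1, d, n)  # substitution
            t, s, d, n = best[i - 1][j]
            deletion = (t + 1, s, d + 1, n)
            t, s, d, n = best[i][j - 1]
            insertion = (t + 1, s, d, n + 1)
            # Fewest total edits wins; ties are broken by tuple order.
            best[i][j] = min(diagonal, deletion, insertion)

    _, s, d, n = best[len(ref)][len(hyp)]
    return ErrorCounts(s, d, n, len(ref))


# Made-up example: one substitution ("sat" -> "sit") and one deleted "the".
counts = count_errors("the cat sat on the mat", "the cat sit on mat")
print(counts, f"WER = {counts.wer:.3%}")
```

When several alignments have the same total cost, the split between substitutions, deletions, and insertions can differ between tools, so per-type counts are only comparable when the same alignment rule is used throughout.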

Published

2024-01-06

How to Cite

AKASHEH, W. M.; HAIDER, A. S.; AL-SAIDEEN, B.; SAHARI, Y. Artificial intelligence-generated Arabic subtitles: insights from Veed.io’s automatic speech recognition system of Jordanian Arabic. Texto Livre, Belo Horizonte-MG, v. 17, p. e46952, 2024. DOI: 10.1590/1983-3652.2024.46952. Available from: https://periodicos.ufmg.br/index.php/textolivre/article/view/46952. Visited on: 21 Nov. 2024.