Artificial intelligence-generated Arabic subtitles

insights from Veed.io's automatic speech recognition system for Jordanian Arabic

DOI:

https://doi.org/10.1590/1983-3652.2024.46952

Keywords:

Subtitles, Automatically generated subtitles, Automatic Speech Recognition, Linguistics, Jordanian Arabic

Abstract

This article examines the errors that Veed.io's automatic speech recognition (ASR) system produces when transcribing spoken Jordanian Arabic utterances into subtitles. It attempts to propose a new classification for subtitles built on artificial intelligence technology. Combining qualitative and quantitative analyses, the study examines the types of errors and their impact on comprehension. Based on linguistic and phonetic analysis, the errors observed in the generated subtitles are categorized into three main types: deletions, substitutions, and insertions. The quantitative analysis measures the word error rate (WER), which comes to 38.857%, and shows that deletions are the most common error type, followed by substitutions and insertions. The study recommends further research on ASR systems for Arabic dialects and advises subtitlers to be aware of these systems' limitations when using them, ensuring that they edit and supervise the output properly.
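To make the metric concrete: WER is conventionally defined as (S + D + I) / N, where S, D, and I are the numbers of substituted, deleted, and inserted words in the ASR output and N is the number of words in the reference transcript. The sketch below shows one common way to obtain these counts, a word-level edit-distance alignment. It is not the authors' tooling; the function name `word_error_rate`, the `WerResult` class, and the toy token sequences are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class WerResult:
    """Error counts from aligning an ASR hypothesis to a reference."""
    substitutions: int
    deletions: int
    insertions: int
    reference_length: int

    @property
    def wer(self) -> float:
        # WER = (S + D + I) / N, with N the number of reference words.
        errors = self.substitutions + self.deletions + self.insertions
        return errors / self.reference_length


def word_error_rate(reference: str, hypothesis: str) -> WerResult:
    """Align two whitespace-tokenized strings with word-level edit distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    rows, cols = len(ref) + 1, len(hyp) + 1
    # Each cell holds (total_edits, substitutions, deletions, insertions)
    # for aligning ref[:i] with hyp[:j].
    table = [[(0, 0, 0, 0)] * cols for _ in range(rows)]
    for i in range(1, rows):
        table[i][0] = (i, 0, i, 0)  # every reference word deleted
    for j in range(1, cols):
        table[0][j] = (j, 0, 0, j)  # every hypothesis word inserted

    for i in range(1, rows):
        for j in range(1, cols):
            if ref[i - 1] == hyp[j - 1]:
                table[i][j] = table[i - 1][j - 1]  # match, no cost
                continue
            t, s, d, n = table[i - 1][j - 1]
            substitution = (t + 1, s + 1, d, n)
            t, s, d, n = table[i - 1][j]
            deletion = (t + 1, s, d + 1, n)
            t, s, d, n = table[i][j - 1]
            insertion = (t + 1, s, d, n + 1)
            # Keep the cheapest path; ties are broken by tuple order.
            table[i][j] = min(substitution, deletion, insertion)

    _, s, d, n = table[-1][-1]
    return WerResult(s, d, n, len(ref))


if __name__ == "__main__":
    # Toy example: one of four reference words is missing from the output.
    result = word_error_rate("a b c d", "a c d")
    print(result, f"WER = {result.wer:.3%}")
```

When several alignments have the same total cost, the split between substitutions, deletions, and insertions can differ between tools, so per-type counts are only comparable when produced by the same alignment procedure. The overall WER is unaffected by that choice.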

Published

6 Jan. 2024

How to Cite

AKASHEH, W. M.; HAIDER, A. S.; AL-SAIDEEN, B.; SAHARI, Y. Artificial intelligence-generated Arabic subtitles: insights from Veed.io's automatic speech recognition system for Jordanian Arabic. Texto Livre, Belo Horizonte-MG, v. 17, p. e46952, 2024. DOI: 10.1590/1983-3652.2024.46952. Available from: https://periodicos.ufmg.br/index.php/textolivre/article/view/46952. Visited on: 27 Apr. 2024.