AI-SPEAK

"Sinteza govora na srpskom jeziku – poređenje end-to-end modela i modela sa jezičkom obradom teksta", YU INFO, 10-13. mart 2024, Kopaonik (link)

"Automatsko prepoznavanje govora na srpskom jeziku – poređenje sistema baziranog na whisper-u i konvencionalnog sistema sa jezičkim modelom", YU INFO, 10-13. mart 2024, Kopaonik (link)

"Whispered Speech Recognition Based on Audio Data Augmentation and Inverse Filtering", Applied Sciences - Basel, 2024, ISSN: 2076-3417, Special Issue "Speech Recognition and Natural Language Processing", Asad Abdi, Farid Meziane (Eds.), Vol. 14, No. 18: 8223, pp. 1-20, DOI: 10.3390/app14188223, MDPI (link)

"Basic Computational Geometry Applications in Computer Graphics", 9th Conference on Mathematics in Engineering: Theory and Applications, Novi Sad, May 31st-June 2nd, 2024 (link)

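„Razvoj govornih asistenata i njihove primene u pametnim kućama i pametnim gradovima“ [Development of voice assistants and their applications in smart homes and smart cities], Proceedings 2022/2023 „Serija stručnih predavanja posvećenih unapređenju projektovanja telekomunikacionih mreža i sistema“ [Series of expert lectures dedicated to improving the design of telecommunication networks and systems], Ed. Mirjana Jarić-Ćirić, FTTH udruženje Srbija, Belgrade, pp. 202-213, 2024 (link)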

"Multimodal Emotion Recognition Using Compressed Graph Neural Networks". In Proc. Speech and Computer (SPECOM 2024), Belgrade, Serbia, November 25-27, 2024. Eds: Alexey Karpov, Vlado Delić, pp. 109-121, Springer (link)

"End-to-End Speech Synthesis for the Serbian Language Based on Tacotron", In Proc. Speech and Computer (SPECOM 2024), Belgrade, Serbia, November 25-27, 2024. Eds: Alexey Karpov, Vlado Delić, pp. 219-229, Springer (link)

"Retrospective and Perspectives of TTS & STT Technology Development and Implementation for South Slavic Under-Resourced Languages", In Proc. Speech and Computer (SPECOM 2024), Belgrade, Serbia, November 25-27, 2024. Eds: Alexey Karpov, Vlado Delić, pp. 23-44, Springer (link)

"Probability Density Function Distance-Based Augmented CycleGAN for Image Domain Translation with Asymmetric Sample Size", In Mathematics 2025, Multidisciplinary Digital Publishing Institute (MDPI), ISSN: 2227-7390, Vol. 13, No. 9: 1406. (link)

"Transforming Faces Into Video Stories-VideoFace2.0", Proc. 14th Mediterranean Conference on Embedded Computing (MECO), pp. 251-254, Budva, Montenegro, ISBN 979-8-3315-1341-2, 2025. (link)

"Person detection and re-identification in open-world settings of retail stores and public spaces", Proc. 2nd Int. Sci. Conf. ALFATECH – Smart Cities and Modern Technologies 2025, Belgrade, Serbia, 2025. (link)

"Named Entity Recognition for Serbian Legal Documents: Design, Methodology and Dataset Development", Proc. 5th International Conference on Information Society and Technology (ICIST), Springer Lecture Notes in Networks and Systems, Kopaonik, Serbia, 2025. (link)

"Exploiting voice conversion in creating new TTS voices", Proc. 32nd International Conference on Systems, Signals and Image Processing (IWSSIP), Skopje, North Macedonia, 2025. (link)

"Snimanje bilingvalne baze AI-SPEAK za multimodalno prepoznavanje govora", 31 Nacionalna konferencija iz oblasti informaciono-komunikacionih tehnologija (YUinfo) 2025, Kopaonik, Srbija, 2025. (link)

"Uticaj augmentacija trening podataka na performanse doobučenog Whisper modela", 31 Nacionalna konferencija iz oblasti informaciono-komunikacionih tehnologija (YUinfo) 2025, Kopaonik, Srbija, 2025. (link)

D1.2

Implementation plan (link)

Quality control plan (link)

Dissemination plan (link)

D2.1

AI-SPEAK speech corpus (link)

D2.2

Report on the AI-SPEAK internet video corpus VideoBase (link)

M1.1

Kick-off meeting report (link)

M2.1

Project meeting report (link)

M2.2

Project meeting report (link)

M3.1

Project meeting report (link)
