Publication Activity
Record type:
conference proceedings paper (D)
Home Department:
Department of Informatics and Computers (31400)
Title:
Hands-Free VR
Citation:
Vazquez Fernandez, J. A., Lee, J. J., Serrano Vacca, S. A., Magana, A., Peša, R., Beneš, B. and Popescu, V. Hands-Free VR.
In:
20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2025: Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - HUCAPP, 2025-02-26, Porto.
Porto: Science and Technology Publications, Lda, 2025. pp. 533-542. ISBN 978-989-758-728-3.
Subtitle:
Publication year:
2025
Field:
Number of pages:
10
Page from:
533
Page to:
542
Form of publication:
Electronic version
ISBN code:
978-989-758-728-3
ISSN code:
2184-4321
Proceedings title:
Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - HUCAPP
Proceedings:
International
Publisher name:
Science and Technology Publications, Lda
Place of publishing:
Porto
Country of Publication:
Proceedings published abroad
Conference name:
20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2025
Conference venue:
Porto
Conference start date:
Type of event by nationality of participants:
Worldwide event
WoS code:
EID:
2-s2.0-105001963646
Key words in English:
Virtual Reality, Large Language Model, Retrieval-Augmented Generation, Speech-to-Text, Deep Learning
Annotation (original language and English):
The paper introduces Hands-Free VR, a voice-based natural-language interface for VR. The user gives a command using their voice, the speech audio data is converted to text using a speech-to-text deep learning model that is fine-tuned for robustness to word phonetic similarity and to spoken English accents, and the text is mapped to an executable VR command using a large language model that is robust to natural language diversity. Hands-Free VR was evaluated in a controlled within-subjects study (N = 22) that asked participants to find specific objects and to place them in various configurations. In the control condition participants used a conventional VR user interface to grab, carry, and position the objects using the handheld controllers. In the experimental condition participants used Hands-Free VR. The results confirm that: (1) Hands-Free VR is robust to spoken English accents, as English was not the first language of 20 of our participants, and to word phonetic similarity, correctly transcribing the voice command 96.71% of the time; (2) Hands-Free VR is robust to natural language diversity, correctly mapping the transcribed command to an executable command 97.83% of the time; (3) Hands-Free VR had a significant efficiency advantage over the conventional VR interface in terms of task completion time, total viewpoint translation, total view direction rotation, and total left and right hand translations; (4) Hands-Free VR received high user preference ratings in terms of ease of use, intuitiveness, ergonomics, reliability, and desirability.
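The annotation describes a two-stage pipeline: a speech-to-text model transcribes the spoken command, and a large language model maps the transcript to an executable VR command. The Python sketch below illustrates that flow under stated assumptions only: it uses an off-the-shelf Whisper model as a stand-in for the paper's fine-tuned speech-to-text model, and a general-purpose LLM API as a stand-in for the command mapper; the command set, prompt, and model names are illustrative, not the authors' implementation.

    # Minimal sketch of the two-stage pipeline: speech audio -> text -> VR command.
    # All names below (models, command set, prompt) are illustrative assumptions.
    import whisper               # open-source speech-to-text (pip install openai-whisper)
    from openai import OpenAI    # generic LLM client standing in for the command mapper

    # Hypothetical closed set of executable VR commands the mapper chooses from.
    VR_COMMANDS = ["grab(<object>)", "place(<object>, <location>)", "release(<object>)"]

    def transcribe(audio_path: str) -> str:
        """Convert spoken audio to text. The paper fine-tunes its STT model for
        robustness to English accents and phonetically similar words; here we
        simply run a stock Whisper model."""
        model = whisper.load_model("base")
        return model.transcribe(audio_path)["text"].strip()

    def map_to_command(utterance: str) -> str:
        """Map a free-form transcribed utterance to one executable command,
        tolerating natural-language diversity (e.g. 'pick up' vs. 'grab')."""
        client = OpenAI()
        prompt = (
            "Map the user's request to exactly one command from this list, "
            f"filling in the arguments: {VR_COMMANDS}\n"
            f"Request: {utterance}\nCommand:"
        )
        reply = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
        )
        return reply.choices[0].message.content.strip()

    if __name__ == "__main__":
        text = transcribe("command.wav")  # e.g. "put the red mug on the shelf"
        print(map_to_command(text))       # e.g. place(red mug, shelf)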