2024-09-27 22:10:59 +02:00
parent 8a85b14d3b
commit a9319210df
13 changed files with 51 additions and 47 deletions


@@ -17,6 +17,8 @@ We \textbf{evaluate in a user study}, using the \OST-\AR headset Microsoft HoloL
\noindentskip In the next sections, we first present the six visual hand renderings we considered and gathered from the literature. We then describe the experimental setup and design, the two manipulation tasks, and the metrics used. We present the results of the user study and discuss the implications of these results for the manipulation of \VOs directly with the hand in \AR.
\bigskip
\begin{subfigs}{hands}{The six visual hand renderings.}[
As seen by the user through the \AR headset during the two-finger grasping of a virtual cube.
][
@@ -25,7 +27,7 @@ We \textbf{evaluate in a user study}, using the \OST-\AR headset Microsoft HoloL
\item Rings on the fingertips \level{(Tips)}.
\item Thin outline of the hand \level{(Contour, Cont)}.
\item Fingers' joints and phalanges \level{(Skeleton, Skel)}.
\item Semi-transparent 3D hand model \level{(Mesh)}.
\item Semi-transparent \ThreeD hand model \level{(Mesh)}.
]
\subfig[0.22]{method/hands-none}
\subfig[0.22]{method/hands-occlusion}


@@ -5,30 +5,30 @@ We compared a set of the most popular visual hand renderings, as found in the li
Since we address hand-centered manipulation tasks, we only considered renderings including the fingertips (\secref[related_work]{grasp_types}).
Moreover, to keep the focus on the hand rendering itself, we used neutral semi-transparent grey meshes, consistent with the choices made in \cite{yoon2020evaluating,vanveldhuizen2021effect}.
All considered hand renderings are drawn following the tracked pose of the user's real hand.
However, while the real hand can of course penetrate virtual objects, the visual hand is always constrained by the virtual environment (\secref[related_work]{ar_virtual_hands}).
However, while the real hand can of course penetrate \VOs, the visual hand is always constrained by the \VE (\secref[related_work]{ar_virtual_hands}).
They are shown in \figref{hands} and described below, with an abbreviation in parentheses when needed.
\paragraph{None}
As a reference, we considered no visual hand rendering (\figref{method/hands-none}), as is common in \AR \cite{hettiarachchi2016annexing,blaga2017usability,xiao2018mrtouch,teng2021touch}.
Users have no information about hand tracking and no feedback about contact with the virtual objects, other than their movement when touched.
As virtual content is rendered on top of the real environment, the hand of the user can be hidden by the virtual objects when manipulating them (\secref[related_work]{ar_displays}).
Users have no information about hand tracking and no feedback about contact with the \VOs, other than their movement when touched.
As virtual content is rendered on top of the \RE, the hand of the user can be hidden by the \VOs when manipulating them (\secref[related_work]{ar_displays}).
\paragraph{Occlusion (Occl)}
To avoid the abovementioned undesired occlusions due to the virtual content being rendered on top of the real environment, we can carefully crop the former whenever it hides real content that should be visible \cite{macedo2023occlusion}, \eg the thumb of the user in \figref{method/hands-occlusion}.
To avoid the abovementioned undesired occlusions due to the virtual content being rendered on top of the \RE, we can carefully crop the former whenever it hides real content that should be visible \cite{macedo2023occlusion}, \eg the thumb of the user in \figref{method/hands-occlusion}.
This approach is frequent in works using \VST-\AR headsets \cite{knorlein2009influence,ha2014wearhand,piumsomboon2014graspshell,suzuki2014grasping,al-kalbani2016analysis}.
\paragraph{Tips}
This rendering shows small visual rings around the fingertips of the user (\figref{method/hands-tips}), highlighting the most important parts of the hand and contact with virtual objects during fine manipulation (\secref[related_work]{grasp_types}).
This rendering shows small visual rings around the fingertips of the user (\figref{method/hands-tips}), highlighting the most important parts of the hand and contact with \VOs during fine manipulation (\secref[related_work]{grasp_types}).
Unlike work using small spheres \cite{maisto2017evaluation,meli2014wearable,grubert2018effects,normand2018enlarging,schwind2018touch}, this ring rendering also provides information about the orientation of the fingertips.
\paragraph{Contour (Cont)}
This rendering is a \qty{1}{\mm} thick outline contouring the user's hands, providing information about the whole hand while leaving its inside visible.
Unlike the other renderings, it is not occluded by the virtual objects, as shown in \figref{method/hands-contour}.
Unlike the other renderings, it is not occluded by the \VOs, as shown in \figref{method/hands-contour}.
This rendering is less common in the literature than the previous ones \cite{kang2020comparative}.
\paragraph{Skeleton (Skel)}
@@ -39,24 +39,24 @@ It is widely used in \VR \cite{argelaguet2016role,schwind2018touch,chessa2019gra
\paragraph{Mesh}
This rendering is a 3D semi-transparent ($a=0.2$) hand model (\figref{method/hands-mesh}), which is common in \VR \cite{prachyabrued2014visual,argelaguet2016role,schwind2018touch,chessa2019grasping,yoon2020evaluating,vanveldhuizen2021effect}.
This rendering is a \ThreeD semi-transparent ($a=0.2$) hand model (\figref{method/hands-mesh}), which is common in \VR \cite{prachyabrued2014visual,argelaguet2016role,schwind2018touch,chessa2019grasping,yoon2020evaluating,vanveldhuizen2021effect}.
It can be seen as a filled version of the Contour hand rendering, thus partially covering the view of the real hand.
\section{User Study}
\label{method}
We aim to investigate whether the chosen visual hand rendering affects the performance and user experience of manipulating virtual objects with free hands in \AR.
We aim to investigate whether the chosen visual hand rendering affects the performance and user experience of manipulating \VOs with free hands in \AR.
\subsection{Manipulation Tasks and Virtual Scene}
\label{tasks}
Following the guidelines of \textcite{bergstrom2021how} for designing object manipulation tasks, we considered two variations of a 3D pick-and-place task, commonly found in interaction and manipulation studies \cite{prachyabrued2014visual,blaga2017usability,maisto2017evaluation,meli2018combining,vanveldhuizen2021effect}.
Following the guidelines of \textcite{bergstrom2021how} for designing object manipulation tasks, we considered two variations of a \ThreeD pick-and-place task, commonly found in interaction and manipulation studies \cite{prachyabrued2014visual,blaga2017usability,maisto2017evaluation,meli2018combining,vanveldhuizen2021effect}.
\subsubsection{Push Task}
\label{push-task}
The first manipulation task consists in pushing a virtual object along a real flat surface towards a target placed on the same plane (\figref{method/task-push}).
The virtual object to manipulate is a small \qty{50}{\mm} blue and opaque cube, while the target is a (slightly) bigger \qty{70}{\mm} blue and semi-transparent volume.
The first manipulation task consists in pushing a \VO along a real flat surface towards a target placed on the same plane (\figref{method/task-push}).
The \VO to manipulate is a small \qty{50}{\mm} blue and opaque cube, while the target is a (slightly) bigger \qty{70}{\mm} blue and semi-transparent volume.
At every repetition of the task, the cube to manipulate always spawns at the same place, on top of a real table in front of the user.
On the other hand, the target volume can spawn in eight different locations on the same table, located on a \qty{20}{\cm} radius circle centred on the cube, at \qty{45}{\degree} from each other (again \figref{method/task-push}).
Users are asked to push the cube towards the target volume using their fingertips in any way they prefer.
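For illustration, the eight candidate target positions can be written in table-plane coordinates $(x, y)$ centred on the spawn position of the cube. The angular offset of the first target (here placed on the $x$ axis) and the symbols $\mathbf{p}_k$, $\theta_k$ and $r$ are assumptions introduced for this sketch; they are not specified in the text:
\[
	\mathbf{p}_k = r \, (\cos \theta_k, \, \sin \theta_k), \qquad \theta_k = k \times \qty{45}{\degree}, \qquad k \in \{0, \dots, 7\}, \qquad r = \qty{20}{\cm}.
\]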
@@ -66,7 +66,7 @@ The task is considered completed when the cube is \emph{fully} inside the target
\subsubsection{Grasp Task}
\label{grasp-task}
The second manipulation task consists in grasping, lifting, and placing a virtual object in a target placed on a different (higher) plane (\figref{method/task-grasp}).
The second manipulation task consists in grasping, lifting, and placing a \VO in a target placed on a different (higher) plane (\figref{method/task-grasp}).
The cube to manipulate and target volume are the same as in the previous task.
However, this time, the target volume can spawn in eight different locations on a plane \qty{10}{\cm} \emph{above} the table, still located on a \qty{20}{\cm} radius circle at \qty{45}{\degree} from each other.
Users are asked to grasp, lift, and move the cube towards the target volume using their fingertips in any way they prefer.
@@ -97,7 +97,7 @@ Each condition was repeated three times.
To control learning effects, we counter-balanced the orders of the two manipulation tasks and visual hand renderings following a 6 \x 6 Latin square, leading to six blocks where the position of the target volume was in turn randomized.
This design led to a total of 2 manipulation tasks \x 6 visual hand renderings \x 8 targets \x 3 repetitions $=$ 288 trials per participant.
\subsection{Apparatus and Implementation}
\subsection{Apparatus}
\label{apparatus}
We used the \OST-\AR headset HoloLens~2, as described in \secref[vhar_system]{virtual_real_alignment}.
@@ -105,13 +105,13 @@ We used the \OST-\AR headset HoloLens~2, as described in \secref[vhar_system]{vi
It is also able to track the user's fingers.
We measured the latency of the hand tracking at \qty{15}{\ms}, independent of the hand movement speed.
The implementation of our experiment was done in C\# using Unity 2022.1, PhysX 4.1, and the Mixed Reality Toolkit (MRTK) 2.8\footnoteurl{https://learn.microsoft.com/windows/mixed-reality/mrtk-unity}.
The implementation of our experiment was done using Unity 2022.1, PhysX 4.1, and the Mixed Reality Toolkit (MRTK) 2.8\footnoteurl{https://learn.microsoft.com/windows/mixed-reality/mrtk-unity}.
The compiled application ran directly on the HoloLens~2 at \qty{60}{FPS}.
The default 3D hand model from MRTK was used for all visual hand renderings.
The default \ThreeD hand model from MRTK was used for all visual hand renderings.
By changing the material properties of this hand model, we were able to achieve the six renderings shown in \figref{hands}.
A calibration was performed for every participant, to best adapt the size of the visual hand rendering to their real hand.
A set of empirical tests enabled us to choose the best rendering characteristics in terms of transparency and brightness for the virtual objects and hand renderings, which were applied throughout the experiment.
A set of empirical tests enabled us to choose the best rendering characteristics in terms of transparency and brightness for the \VOs and hand renderings, which were applied throughout the experiment.
The hand tracking information provided by MRTK was used to construct a virtual articulated physics-enabled hand (\secref[related_work]{ar_virtual_hands}) using PhysX.
It featured 25 DoFs, including the fingers' proximal, middle, and distal phalanges.
@@ -121,10 +121,10 @@ As before, a set of empirical tests have been used to select the most effective
The room where the experiment was held had no windows, with one light source of \qty{800}{\lumen} placed \qty{70}{\cm} above the table.
This setup enabled good and consistent tracking of the user's fingers.
\subsection{Protocol}
\label{protocol}
\subsection{Procedure}
\label{procedure}
First, participants were given a consent form that briefed them about the tasks and the protocol of the experiment.
First, participants were given a consent form that briefed them about the tasks and the procedure of the experiment.
Then, participants were asked to sit comfortably in front of a table and wear the HoloLens~2 headset as shown in~\figref{tasks}, perform the calibration of the visual hand size as described in~\secref{apparatus}, and complete a \qty{2}{min} training to familiarize themselves with the \AR rendering and the two considered tasks.
During this training, we did not use any of the six hand renderings under evaluation, but rather a fully opaque white hand rendering that completely occluded the real hand of the user.
Participants were asked to carry out the two tasks as naturally and as fast as possible.


@@ -1,7 +1,7 @@
\section{Discussion}
\label{discussion}
We evaluated six visual hand renderings, as described in \secref{hands}, displayed on top of the real hand, in two virtual object manipulation tasks in \AR.
We evaluated six visual hand renderings, as described in \secref{hands}, displayed on top of the real hand, in two \VO manipulation tasks in \AR.
During the \level{Push} task, the \level{Skeleton} hand rendering was the fastest (\figref{results/Push-CompletionTime-Hand-Overall-Means}), as participants employed fewer and longer contacts to adjust the cube inside the target volume (\figref{results/Push-ContactsCount-Hand-Overall-Means} and \figref{results/Push-MeanContactTime-Hand-Overall-Means}).
Participants consistently used few and continuous contacts for all visual hand renderings (Fig. 3b), with fewer than ten trials, carried out by two participants, quickly completed with multiple discrete touches.
@@ -21,12 +21,12 @@ However, due to the latency of the hand tracking and the visual hand reacting to
The \level{Tips} rendering, which showed the contacts made on the virtual cube, was controversial as it received the minimum and the maximum score on every question.
Many participants reported difficulties in seeing the orientation of the visual fingers,
while others found that it gave them a better sense of the contact points and improved their concentration on the task.
This result is consistent with \textcite{saito2021contact}, who found that displaying the points of contact was beneficial for grasping a virtual object over an opaque visual hand overlay.
This result is consistent with \textcite{saito2021contact}, who found that displaying the points of contact was beneficial for grasping a \VO over an opaque visual hand overlay.
To summarize, when employing a visual hand rendering overlaying the real hand, participants were more performant and confident in manipulating virtual objects with bare hands in \AR.
To summarize, when employing a visual hand rendering overlaying the real hand, participants were more performant and confident in manipulating \VOs with bare hands in \AR.
These results contrast with similar manipulation studies conducted in non-immersive, on-screen \AR, where the presence of a visual hand rendering was found by participants to improve the usability of the interaction, but not their performance \cite{blaga2017usability,maisto2017evaluation,meli2018combining}.
Our results show the most effective visual hand rendering to be the \level{Skeleton} one.
Participants appreciated that it provided a detailed and precise view of the tracking of the real hand, without hiding or masking it.
Although the \level{Contour} and \level{Mesh} hand renderings were also highly rated, some participants felt that they were too visible and masked the real hand.
This result is in line with the results of virtual object manipulation in \VR of \textcite{prachyabrued2014visual}, who found that the most effective visual hand rendering was a double representation of both the real tracked hand and a visual hand physically constrained by the virtual environment.
This result is in line with the results of \VO manipulation in \VR of \textcite{prachyabrued2014visual}, who found that the most effective visual hand rendering was a double representation of both the real tracked hand and a visual hand physically constrained by the \VE.
This type of \level{Skeleton} rendering was also the one that provided the best sense of agency (control) in \VR \cite{argelaguet2016role,schwind2018touch}.


@@ -1,4 +1,4 @@
\chapter{Visual Rendering of the Hand for Manipulating Virtual Objects in Augmented Reality}
\chapter{Visual Rendering of the Hand for Manipulating Virtual Objects in AR}
\mainlabel{visual_hand}
\chaptertoc