Visual hand {rendering => augmentation}

2024-11-04 14:37:23 +01:00
parent 613e683902
commit 5dc3e33a15
15 changed files with 93 additions and 79 deletions

@@ -9,22 +9,22 @@ Some work has also investigated the visual feedback of the virtual hand in \AR,
\OST-\AR also has significant perceptual differences from \VR due to the lack of mutual occlusion between the hand and the virtual object in \OST-\AR (\secref[related_work]{ar_displays}), and due to the inherent delay between the user's hand movements and the result of the interaction simulation (\secref[related_work]{ar_virtual_hands}).
In this chapter, we investigate the \textbf{visual rendering of the virtual hand as augmentation of the real hand} for direct hand manipulation of virtual objects in \OST-\AR.
To this end, we selected in the literature and compared the most popular visual hand renderings used to interact with virtual objects in \AR.
To this end, we selected in the literature and compared the most popular visual hand augmentations used to interact with virtual objects in \AR.
The virtual hand is \textbf{displayed superimposed} on the user's hand with these visual renderings, providing \textbf{feedback on the tracking} of the real hand, as shown in \figref{hands}.
The movement of the virtual hand is also \textbf{constrained to the surface} of the virtual object, providing additional \textbf{feedback on the interaction} with the virtual object.
We \textbf{evaluate in a user study}, using the \OST-\AR headset Microsoft HoloLens~2, the effect of six visual hand renderings on the user performance and experience in two representative manipulation tasks: push-and-slide and grasp-and-place a virtual object directly with the hand.
We \textbf{evaluate in a user study}, using the \OST-\AR headset Microsoft HoloLens~2, the effect of six visual hand augmentations on the user performance and experience in two representative manipulation tasks: push-and-slide and grasp-and-place a virtual object directly with the hand.
\noindentskip The main contributions of this chapter are:
\begin{itemize}
\item A comparison from the literature of six common visual hand renderings used to interact with virtual objects in \AR.
\item A user study evaluating with 24 participants the performance and user experience of the six visual hand renderings as augmentation of the real hand during free and direct hand manipulation of virtual objects in \OST-\AR.
\item A comparison from the literature of six common visual hand augmentations used to interact with virtual objects in \AR.
\item A user study evaluating with 24 participants the performance and user experience of the six visual hand augmentations, superimposed on the real hand, during free and direct hand manipulation of virtual objects in \OST-\AR.
\end{itemize}
\noindentskip In the next sections, we first present the six visual hand renderings we considered and gathered from the literature. We then describe the experimental setup and design, the two manipulation tasks, and the metrics used. We present the results of the user study and discuss the implications of these results for the manipulation of virtual objects directly with the hand in \AR.
\noindentskip In the next sections, we first present the six visual hand augmentations we considered and gathered from the literature. We then describe the experimental setup and design, the two manipulation tasks, and the metrics used. We present the results of the user study and discuss the implications of these results for the manipulation of virtual objects directly with the hand in \AR.
\bigskip
\begin{subfigs}{hands}{The six visual hand renderings as augmentation of the real hands.}[
\begin{subfigs}{hands}{The six visual hand augmentations superimposed on the real hand.}[
As seen by the user through the \AR headset during the two-finger grasping of a virtual cube.
][
\item No visual rendering \level{(None)}.

@@ -1,7 +1,7 @@
\section{Visual Hand Renderings}
\section{Visual Hand Augmentations}
\label{hands}
We compared a set of the most popular visual hand renderings, as found in the literature \secref[related_work]{ar_visual_hands}.
We compared a set of the most popular visual hand augmentations, as found in the literature (\secref[related_work]{ar_visual_hands}).
Since we address hand-centered manipulation tasks, we only considered renderings including the fingertips (\secref[related_work]{grasp_types}).
Moreover, to keep the focus on the hand rendering itself, we used neutral semi-transparent grey meshes, consistent with the choices made in \cite{yoon2020evaluating,vanveldhuizen2021effect}.
All considered hand renderings are drawn following the tracked pose of the user's real hand.
@@ -11,7 +11,7 @@ They are shown in \figref{hands} and described below, with an abbreviation in br
\paragraph{None}
As a reference, we considered no visual hand rendering (\figref{method/hands-none}), as is common in \AR \cite{hettiarachchi2016annexing,blaga2017usability,xiao2018mrtouch,teng2021touch}.
As a reference, we considered no visual hand augmentation (\figref{method/hands-none}), as is common in \AR \cite{hettiarachchi2016annexing,blaga2017usability,xiao2018mrtouch,teng2021touch}.
Users have no information about hand tracking and no feedback about contact with the virtual objects, other than the objects' movement when touched.
As virtual content is rendered on top of the \RE, the hand of the user can be hidden by the virtual objects when manipulating them (\secref[related_work]{ar_displays}).
@@ -45,7 +45,7 @@ It can be seen as a filled version of the Contour hand rendering, thus partially
\section{User Study}
\label{method}
We aim to investigate whether the chosen visual hand rendering affects the performance and user experience of manipulating virtual objects with free hands in \AR.
We aim to investigate whether the chosen visual feedback of the virtual hand affects the performance and user experience of manipulating virtual objects with free hands in \AR.
\subsection{Manipulation Tasks and Virtual Scene}
\label{tasks}
@@ -89,13 +89,13 @@ As before, the task is considered completed when the cube is \emph{fully} inside
We analyzed the two tasks separately.
For each of them, we considered two independent within-subject variables:
\begin{itemize}
\item \factor{Hand}, consisting of the six possible visual hand renderings discussed in \secref{hands}: \level{None}, \level{Occlusion} (Occl), \level{Tips}, \level{Contour} (Cont), \level{Skeleton} (Skel), and \level{Mesh}.
\item \factor{Hand}, consisting of the six possible visual hand augmentations discussed in \secref{hands}: \level{None}, \level{Occlusion} (Occl), \level{Tips}, \level{Contour} (Cont), \level{Skeleton} (Skel), and \level{Mesh}.
\item \factor{Target}, consisting of the eight possible locations of the target volume, named from the participant's point of view and as shown in \figref{tasks}: right (\level{R}), right-back (\level{RB}), back (\level{B}), left-back (\level{LB}), left (\level{L}), left-front (\level{LF}), front (\level{F}) and right-front (\level{RF}).
\end{itemize}
Each condition was repeated three times.
To control learning effects, we counter-balanced the orders of the two manipulation tasks and visual hand renderings following a 6 \x 6 Latin square, leading to six blocks where the position of the target volume was in turn randomized.
This design led to a total of 2 manipulation tasks \x 6 visual hand renderings \x 8 targets \x 3 repetitions $=$ 288 trials per participant.
To control learning effects, we counterbalanced the order of the two manipulation tasks and the order of the six visual hand augmentations following a 6 \x 6 Latin square, leading to six blocks in which the position of the target volume was in turn randomized.
This design led to a total of 2 manipulation tasks \x 6 visual hand augmentations \x 8 targets \x 3 repetitions $=$ 288 trials per participant.
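For illustration, such a counterbalanced trial list can be generated along the following lines (a Python sketch under our own assumptions: a Williams-style balanced Latin square and per-participant seeding; the condition names mirror the factors above, but the study's exact procedure is not specified beyond the text):

    import random

    HANDS = ["None", "Occl", "Tips", "Cont", "Skel", "Mesh"]
    TARGETS = ["R", "RB", "B", "LB", "L", "LF", "F", "RF"]
    REPETITIONS = 3

    def latin_square_row(i, n=6):
        # Williams-style row for even n: each condition appears once per row,
        # and each condition precedes every other equally often across rows.
        row = [i % n]
        for j in range(1, n):
            offset = (j + 1) // 2 if j % 2 else -(j // 2)
            row.append((i + offset) % n)
        return row

    def participant_trials(pid):
        # Task order alternates across participants; Hand order follows the
        # Latin-square row; Target order is randomized within each Hand block.
        rng = random.Random(pid)  # reproducible per-participant seed (assumption)
        tasks = ["push", "grasp"] if pid % 2 == 0 else ["grasp", "push"]
        trials = []
        for task in tasks:
            for h in latin_square_row(pid):
                block = TARGETS * REPETITIONS
                rng.shuffle(block)
                trials += [(task, HANDS[h], target) for target in block]
        return trials

    assert len(participant_trials(0)) == 2 * 6 * 8 * 3  # = 288 trials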
\subsection{Apparatus}
\label{apparatus}
@@ -108,9 +108,9 @@ We measured the latency of the hand tracking at \qty{15}{\ms}, independent of th
Our experiment was implemented using Unity 2022.1, PhysX 4.1, and the Mixed Reality Toolkit (MRTK) 2.8.
The compiled application ran directly on the HoloLens~2 at \qty{60}{FPS}.
The default \ThreeD hand model from MRTK was used for all visual hand renderings.
The default \ThreeD hand model from MRTK was used for all visual hand augmentations.
By changing the material properties of this hand model, we were able to achieve the six renderings shown in \figref{hands}.
A calibration was performed for every participant, to best adapt the size of the visual hand rendering to their real hand.
A calibration was performed for every participant, to best adapt the size of the visual hand augmentation to their real hand.
A set of empirical tests enabled us to choose the best rendering characteristics in terms of transparency and brightness for the virtual objects and hand renderings, which were applied throughout the experiment.
The hand tracking information provided by MRTK was used to construct a virtual articulated physics-enabled hand (\secref[related_work]{ar_virtual_hands}) using PhysX.
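The exact coupling between the tracked hand and the physics simulation is engine-specific and not detailed in the text; the sketch below only illustrates the velocity-based following scheme commonly used for physics-enabled virtual hands (the PhalanxBody class and its fields are hypothetical stand-ins for PhysX rigid bodies):

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class PhalanxBody:
        # Hypothetical stand-in for one rigid body of the articulated hand.
        position: np.ndarray
        velocity: np.ndarray

    def follow_tracked_pose(body: PhalanxBody, tracked_position: np.ndarray, dt: float):
        # Instead of teleporting the phalanx onto the tracked pose (which would
        # tunnel through virtual objects), set the velocity needed to reach it
        # within one physics step; the solver then resolves any contacts.
        body.velocity = (tracked_position - body.position) / dt

Under such a scheme, the visual hand lags behind the real hand whenever a contact blocks it, producing the surface-constrained behavior described earlier.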
@@ -149,12 +149,12 @@ Inspired by \textcite[p.674]{laviolajr20173d}, we collected the following metric
\item \response{Completion Time}, defined as the time elapsed between the first contact with the virtual cube and its correct placement inside the target volume; as subjects were asked to complete the tasks as fast as possible, lower completion times mean better performance.
\item \response{Contacts}, defined as the number of separate times the user's hand makes contact with the virtual cube; in both tasks, a lower number of contacts means a smoother continuous interaction with the object.
\item \response{Time per Contact}, defined as the total time any part of the user's hand contacted the cube divided by the number of contacts; higher values mean that the user interacted with the object for longer non-interrupted periods of time.
\item \response{Grip Aperture} (solely for the grasp-and-place task), defined as the average distance between the thumb's fingertip and the other fingertips during the grasping of the cube; lower values indicate a greater finger interpenetration with the cube, resulting in a greater discrepancy between the real hand and the visual hand rendering constrained to the cube surfaces and showing how confident users are in their grasp \cite{prachyabrued2014visual, al-kalbani2016analysis, blaga2017usability, chessa2019grasping}.
\item \response{Grip Aperture} (solely for the grasp-and-place task), defined as the average distance between the thumb's fingertip and the other fingertips during the grasping of the cube; lower values indicate a greater finger interpenetration with the cube, \ie a greater discrepancy between the real hand and the visual hand augmentation constrained to the cube surfaces, and thus show how confident users are in their grasp \cite{prachyabrued2014visual, al-kalbani2016analysis, blaga2017usability, chessa2019grasping}.
\end{itemize}
Taken together, these measures provide an overview of the performance and usability of each of the visual hand renderings tested, as we hypothesized that they should influence the behavior and effectiveness of the participants.
Taken together, these measures provide an overview of the performance and usability of each visual hand augmentation, as we hypothesized that they should influence the behavior and effectiveness of the participants.
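As an illustration, these metrics could be computed offline from logged per-frame data along the following lines (a sketch; the boolean contact signal, sampling period, and array shapes are our assumptions, not the study's logging format):

    import numpy as np

    def contact_metrics(in_contact, dt):
        # in_contact: boolean per-frame flag (any hand part touching the cube),
        # sampled every dt seconds.
        in_contact = np.asarray(in_contact, dtype=int)
        rising_edges = np.diff(in_contact, prepend=0) == 1
        contacts = int(rising_edges.sum())              # number of separate contacts
        total_time = in_contact.sum() * dt              # total time in contact
        return contacts, total_time / max(contacts, 1)  # time per contact

    def grip_aperture(thumb_tip, finger_tips):
        # thumb_tip: (T, 3) positions; finger_tips: (T, 4, 3) positions of the
        # index-to-little fingertips, restricted to the frames of the grasp.
        distances = np.linalg.norm(finger_tips - thumb_tip[:, None, :], axis=-1)
        return distances.mean()                         # averaged over frames and fingers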
At the end of each task, participants were asked to rank the visual hand renderings according to their preference with respect to the considered task.
Participants also rated each visual hand rendering individually on six questions using a 7-item Likert scale (1=Not at all, 7=Extremely):
At the end of each task, participants were asked to rank the visual hand augmentations according to their preference with respect to the considered task.
Participants also rated each visual hand augmentation individually on six questions using a 7-point Likert scale (1=Not at all, 7=Extremely):
\begin{itemize}
\item \response{Difficulty}: How difficult were the tasks?
\item \response{Fatigue}: How fatiguing (mentally and physically) were the tasks?

@@ -1,7 +1,9 @@
\section{Results}
\label{results}
Results of each trial metrics were analyzed with an \ANOVA on a \LMM model, with the order of the two manipulation tasks and the six visual hand renderings (\factor{Order}), the visual hand renderings (\factor{Hand}), the target volume position (\factor{Target}), and their interactions as fixed effects and the \factor{Participant} as random intercept.
Each trial metric was analyzed with an \ANOVA on a \LMM, with the visual hand augmentations (\factor{Hand}), the target volume positions (\factor{Target}), and their interaction as within-participant fixed effects, and with by-participant random intercepts.
Depending on the data, different random-effect structures were tested.
Only the best-converging models, those with the lowest Akaike Information Criterion (AIC) values, are reported.
For every \LMM, the residuals were checked with a Q-Q plot to confirm normality.
On statistically significant effects, estimated marginal means of the \LMM were compared pairwise using Tukey's \HSD test.
Only significant results are reported.
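As a sketch of this pipeline, the model-selection step could look as follows in Python's statsmodels (the thesis does not state its analysis toolchain; the column names are assumptions, the AIC uses a rough parameter count, and the \ANOVA table and Tukey's \HSD on estimated marginal means would require additional post-hoc steps not shown here):

    import statsmodels.formula.api as smf

    def fit_best_lmm(df, response="CompletionTime"):
        # Candidate random-effect structures: by-participant random intercepts,
        # optionally with by-participant random slopes for Hand.
        candidates = {"intercepts": "~1", "slopes(Hand)": "~Hand"}
        fits = {}
        for name, re_formula in candidates.items():
            model = smf.mixedlm(f"{response} ~ Hand * Target", df,
                                groups=df["Participant"], re_formula=re_formula)
            fits[name] = model.fit(reml=False)  # ML fits so that AICs are comparable
        def aic(r):  # rough AIC: all estimated parameters counted equally
            return -2 * r.llf + 2 * r.params.size
        best = min(fits, key=lambda name: aic(fits[name]))
        return fits[best]                       # keep the lowest-AIC model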

@@ -3,7 +3,8 @@
\paragraph{Completion Time}
On the time to complete a trial, there were two statistically significant effects:
On the time to complete a trial,
a \LMM \ANOVA with by-participant random intercepts indicated two statistically significant effects:
\factor{Hand} (\anova{5}{3385}{5.5}, \pinf{0.001}, see \figref{results/Push-CompletionTime})
and \factor{Target} (\anova{7}{3385}{22.9}, \pinf{0.001}).
\level{Skeleton} was the fastest, faster than \level{None} (\percent{+18}, \p{0.005}), \level{Occlusion} (\percent{+26}, \pinf{0.001}), \level{Tips} (\percent{+22}, \pinf{0.001}), and \level{Contour} (\percent{+20}, \p{0.001}).
@@ -15,30 +16,32 @@ and (3) back \level{B} and \level{LB} targets were the slowest (\p{0.04}).
\paragraph{Contacts}
On the number of contacts, there were two statistically significant effects:
On the number of contacts,
a \LMM \ANOVA with by-participant random intercepts indicated two statistically significant effects:
\factor{Hand} (\anova{5}{3385}{6.2}, \pinf{0.001}, see \figref{results/Push-ContactsCount})
and \factor{Target} (\anova{7}{3385}{25.6}, \pinf{0.001}).
Fewer contacts were made with \level{Skeleton} than with \level{None} (\percent{-23}, \pinf{0.001}), \level{Occlusion} (\percent{-26}, \pinf{0.001}), \level{Tips} (\percent{-18}, \p{0.004}), and \level{Contour} (\percent{-15}, \p{0.02});
and fewer with \level{Mesh} than with \level{Occlusion} (\percent{-14}, \p{0.04});
This indicates how effective a visual hand rendering is: a lower result indicates a smoother ability to push and rotate properly the cube into the target, as one would probably do with a real cube.
This indicates how effective a visual hand augmentation is: a lower value means participants could push and rotate the cube into the target more smoothly, as one would probably do with a real cube.
Targets on the left (\level{L}, \level{LF}) and the right (\level{R}) were easier to reach than the back ones (\level{B}, \level{LB}, \pinf{0.001}).
\paragraph{Time per Contact}
On the mean time spent on each contact, there were two statistically significant effects:
On the mean time spent on each contact,
a \LMM \ANOVA with by-participant random intercepts indicated two statistically significant effects:
\factor{Hand} (\anova{5}{3385}{7.7}, \pinf{0.001}, see \figref{results/Push-MeanContactTime})
and \factor{Target} (\anova{7}{3385}{17.9}, \pinf{0.001}).
It was shorter with \level{None} than with \level{Skeleton} (\percent{-10}, \pinf{0.001}) and \level{Mesh} (\percent{-8}, \p{0.03});
and shorter with \level{Occlusion} than with \level{Tips} (\percent{-10}, \p{0.002}), \level{Contour} (\percent{-10}, \p{0.001}), \level{Skeleton} (\percent{-14}, \p{0.001}), and \level{Mesh} (\percent{-12}, \p{0.03}).
This result suggests that users pushed the virtual cube with more confidence with a visible visual hand rendering.
This result suggests that users pushed the virtual cube more confidently with a visible visual hand augmentation.
On the contrary, the lack of a visual hand forced the participants to pay more attention to the cube's reactions.
Targets on the left (\level{L}, \level{LF}) and the right (\level{R}) sides had a higher \response{Time per Contact} than all the other targets (\p{0.005}).
\begin{subfigs}{push_results}{Results of the push task performance metrics for each visual hand rendering.}[
\begin{subfigs}{push_results}{Results of the push task performance metrics for each visual hand augmentation.}[
Geometric means with bootstrap \percent{95} \CI
and Tukey's \HSD pairwise comparisons: *** is \pinf{0.001}, ** is \pinf{0.01}, and * is \pinf{0.05}.
][

@@ -3,14 +3,16 @@
\paragraph{Completion Time}
On the time to complete a trial, there was one statistically significant effect
On the time to complete a trial,
a \LMM \ANOVA with by-participant random intercepts indicated one statistically significant effect
of \factor{Target} (\anova{7}{3385}{34.3}, \pinf{0.001})
but not of \factor{Hand} (\anova{5}{3385}{1.7}, \p{0.1}).
Targets on the back and the left (\level{B}, \level{LB}, and \level{L}) were slower than targets on the front (\level{LF}, \level{F}, and \level{RF}, \p{0.003}), except for \level{RB} (right-back), which was also fast.
\paragraph{Contacts}
On the number of contacts, there were two statistically significant effects:
On the number of contacts,
a \LMM \ANOVA with by-participant random intercepts indicated two statistically significant effects:
\factor{Hand} (\anova{5}{3385}{4.9}, \pinf{0.001}, see \figref{results/Grasp-ContactsCount})
and \factor{Target} (\anova{7}{3385}{20.0}, \pinf{0.001}).
@@ -23,7 +25,8 @@ Targets on the back and left were more difficult (\level{B}, \level{LB}, and \le
\paragraph{Time per Contact}
On the mean time spent on each contact, there were two statistically significant effects:
On the mean time spent on each contact,
a \LMM \ANOVA with by-participant random intercepts indicated two statistically significant effects:
\factor{Hand} (\anova{5}{3385}{9.1}, \pinf{0.001}, see \figref{results/Grasp-MeanContactTime})
and \factor{Target} (\anova{7}{3385}{5.4}, \pinf{0.001}).
@@ -37,21 +40,20 @@ This time was the shortest on the front \level{F} than on the other target volum
\paragraph{Grip Aperture}
On the average distance between the thumb's fingertip and the other fingertips during grasping, there were two
statistically significant effects:
On the average distance between the thumb's fingertip and the other fingertips during grasping,
a \LMM \ANOVA with by-participant random intercepts and random slopes for \factor{Hand} indicated two statistically significant effects:
\factor{Hand} (\anova{5}{19}{6.7}, \pinf{0.001}, see \figref{results/Grasp-GripAperture})
and \factor{Target} (\anova{7}{3270}{4.1}, \pinf{0.001}).
\footnote{Note that the best converging \LMM (with the lowest Akaike Information Criterion value) had a by-participant random intercept (like all the other models in this study) and a by-participant random slope for the \factor{Hand} factor. The results reported are from this model, which explains the different degrees of freedom from the other models.}
It was smaller with \level{None} than with \level{Occlusion} (\pinf{0.001}), \level{Contour} (\pinf{0.001}), \level{Skeleton} (\pinf{0.001}), and \level{Mesh} (\pinf{0.001}).
%shorter with \level{Tips} than with \level{Occlusion} (\p{0.008}), \level{Contour} (\p{0.006}) and \level{Mesh} (\pinf{0.001});
%and shorter with \level{Skeleton} than with \level{Mesh} (\pinf{0.001}).
This result is an evidence of the lack of confidence of participants with no visual hand rendering: they grasped the cube more to secure it.
This result is evidence of participants' lack of confidence with no visual hand augmentation: they grasped the cube more tightly to secure it.
%The \level{Mesh} rendering seemed to have provided the most confidence to participants, maybe because it was the closest to the real hand.
The \response{Grip Aperture} was larger on the right-front (\level{RF}) target volume, indicating a higher confidence, than on the back and side targets (\level{R}, \level{RB}, \level{B}, \level{L}, \p{0.03}).
\begin{subfigs}{grasp_results}{Results of the grasp task performance metrics for each visual hand rendering.}[
\begin{subfigs}{grasp_results}{Results of the grasp task performance metrics for each visual hand augmentation.}[
Geometric means with bootstrap \percent{95} \CI
and Tukey's \HSD pairwise comparisons: *** is \pinf{0.001}, ** is \pinf{0.01}, and * is \pinf{0.05}.
][

@@ -11,10 +11,10 @@ Pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment were then us
This good ranking of the \level{Skeleton} rendering for the Push task is consistent with the Push trial results.
\item \response{Grasp task ranking}: \level{Occlusion} was ranked worse than \level{Contour} (\p{0.001}), \level{Skeleton} (\p{0.001}), and \level{Mesh} (\p{0.007});
\level{None} was ranked worse than \level{Skeleton} (\p{0.04}).
A complete visual hand rendering seemed to be preferred over no visual hand rendering when grasping.
A complete visual hand augmentation seemed to be preferred over no visual hand augmentation when grasping.
\end{itemize}
\begin{subfigs}{results_ranks}{Boxplots of the ranking for each visual hand rendering.}[
\begin{subfigs}{results_ranks}{Boxplots of the ranking for each visual hand augmentation.}[
Lower is better.
Pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment: ** is \pinf{0.01} and * is \pinf{0.05}.
][

@@ -1,7 +1,7 @@
\subsection{Questionnaire}
\label{questions}
\figref{results_questions} presents the questionnaire results for each visual hand rendering.
\figref{results_questions} presents the questionnaire results for each visual hand augmentation.
Friedman tests indicated that all questions had statistically significant differences (\pinf{0.001}).
Pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment were then applied to each question's results:
\begin{itemize}
@@ -17,9 +17,9 @@ In summary, \level{Occlusion} was worse than \level{Skeleton} for all questions,
The results of the \response{Difficulty}, \response{Performance}, and \response{Precision} questions are consistent with this.
Moreover, having no visible visual \factor{Hand} rendering was felt by users to be more fatiguing and less precise than having one.
Surprisingly, no clear consensus was found on \response{Rating}.
Each visual hand rendering, except for \level{Occlusion}, had simultaneously received the minimum and maximum possible notes.
Each visual hand augmentation, except for \level{Occlusion}, received both the minimum and maximum possible scores.
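For reference, the non-parametric pipeline used for the questionnaire (Friedman test, then pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment) can be sketched as follows, assuming one array of per-participant Likert scores per \factor{Hand} level, in the same participant order:

    from itertools import combinations
    from scipy.stats import friedmanchisquare, wilcoxon
    from statsmodels.stats.multitest import multipletests

    def question_posthoc(scores, alpha=0.05):
        # scores: dict mapping each Hand level to its per-participant ratings.
        chi2, p = friedmanchisquare(*scores.values())
        if p >= alpha:
            return []                               # no omnibus effect: stop here
        pairs = list(combinations(scores, 2))
        raw_p = [wilcoxon(scores[a], scores[b]).pvalue for a, b in pairs]
        adjusted = multipletests(raw_p, method="holm")[1]  # Holm-Bonferroni
        return [(a, b, p_adj) for (a, b), p_adj in zip(pairs, adjusted)
                if p_adj < alpha]                   # significant pairs only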
\begin{subfigs}{results_questions}{Boxplots of the questionnaire results for each visual hand rendering.}[
\begin{subfigs}{results_questions}{Boxplots of the questionnaire results for each visual hand augmentation.}[
Pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment: ** is \pinf{0.01} and * is \pinf{0.05}.
Lower is better for \textbf{(a)} difficulty and \textbf{(b)} fatigue.
Higher is better for \textbf{(c)} performance, \textbf{(d)} precision, \textbf{(e)} efficiency, and \textbf{(f)} rating.

@@ -1,21 +1,21 @@
\section{Discussion}
\label{discussion}
We evaluated six visual hand renderings, as described in \secref{hands}, displayed on top of the real hand, in two virtual object manipulation tasks in \AR.
We evaluated six visual hand augmentations, as described in \secref{hands}, displayed on top of the real hand, in two virtual object manipulation tasks in \AR.
During the \level{Push} task, the \level{Skeleton} hand rendering was the fastest (\figref{results/Push-CompletionTime}), as participants employed fewer and longer contacts to adjust the cube inside the target volume (\figref{results/Push-ContactsCount} and \figref{results/Push-MeanContactTime}).
Participants consistently used few and continuous contacts for all visual hand renderings (Fig. 3b), with only less than ten trials, carried out by two participants, quickly completed with multiple discrete touches.
Participants consistently used few, continuous contacts for all visual hand augmentations (Fig. 3b); fewer than ten trials, carried out by two participants, were instead quickly completed with multiple discrete touches.
However, during the \level{Grasp} task, despite no difference in \response{Completion Time}, providing no visible hand rendering (\level{None} and \level{Occlusion} renderings) led to more failed grasps or cube drops (\figref{results/Grasp-ContactsCount} and \figref{results/Grasp-MeanContactTime}).
Indeed, participants found the \level{None} and \level{Occlusion} renderings less effective (\figref{results/Ranks-Grasp}) and less precise (\figref{results_questions}).
To understand whether the participants' previous experience might have played a role, we carried out an additional statistical analysis considering \VR experience as a between-subjects factor, \ie \VR novices vs. \VR experts (\enquote{I use it every week}, see \secref{participants}).
We found no statistically significant differences when comparing the considered metrics between \VR novices and experts.
All visual hand renderings showed \response{Grip Apertures} close to the size of the virtual cube, except for the \level{None} rendering (\figref{results/Grasp-GripAperture}), with which participants applied stronger grasps, \ie less distance between the fingertips.
Having no visual hand rendering, but only the reaction of the cube to the interaction as feedback, made participants less confident in their grip.
All visual hand augmentations showed \response{Grip Apertures} close to the size of the virtual cube, except for the \level{None} rendering (\figref{results/Grasp-GripAperture}), with which participants applied stronger grasps, \ie a smaller distance between the fingertips.
Having no visual hand augmentation, but only the reaction of the cube to the interaction as feedback, made participants less confident in their grip.
This result contrasts with the wrongly estimated grip apertures observed by \textcite{al-kalbani2016analysis} in an exocentric VST-AR setup.
Also, while some participants found the absence of visual hand rendering more natural, many of them commented on the importance of having feedback on the tracking of their hands, as observed by \textcite{xiao2018mrtouch} in a similar immersive OST-AR setup.
Also, while some participants found the absence of visual hand augmentation more natural, many of them commented on the importance of having feedback on the tracking of their hands, as observed by \textcite{xiao2018mrtouch} in a similar immersive OST-AR setup.
Yet, participants' opinions of the visual hand renderings were mixed on many questions, except for the \level{Occlusion} one, which was perceived less effective than more \enquote{complete} visual hands such as \level{Contour}, \level{Skeleton}, and \level{Mesh} hands (\figref{results_questions}).
Yet, participants' opinions of the visual hand augmentations were mixed on many questions, except for the \level{Occlusion} one, which was perceived as less effective than more \enquote{complete} visual hands such as the \level{Contour}, \level{Skeleton}, and \level{Mesh} hands (\figref{results_questions}).
However, due to the latency of the hand tracking and of the visual hand reacting to the cube, almost all participants thought the \level{Occlusion} rendering to be a \enquote{shadow} of the real hand on the cube.
The \level{Tips} rendering, which showed the contacts made on the virtual cube, was controversial as it received the minimum and the maximum score on every question.
@@ -23,10 +23,10 @@ Many participants reported difficulties in seeing the orientation of the visual
while others found that it gave them a better sense of the contact points and improved their concentration on the task.
This result is consistent with \textcite{saito2021contact}, who found that displaying the points of contacts was beneficial for grasping a virtual object over an opaque visual hand overlay.
To summarize, when employing a visual hand rendering overlaying the real hand, participants were more performant and confident in manipulating virtual objects with bare hands in \AR.
These results contrast with similar manipulation studies, but in non-immersive, on-screen \AR, where the presence of a visual hand rendering was found by participants to improve the usability of the interaction, but not their performance \cite{blaga2017usability,maisto2017evaluation,meli2018combining}.
Our results show the most effective visual hand rendering to be the \level{Skeleton} one.
To summarize, with visual feedback of the virtual hand overlaid on the real hand, participants were more effective and confident in manipulating virtual objects with bare hands in \AR.
These results contrast with those of similar manipulation studies in non-immersive, on-screen \AR, where the presence of a visual hand augmentation was found by participants to improve the usability of the interaction, but not their performance \cite{blaga2017usability,maisto2017evaluation,meli2018combining}.
Our results show the most effective visual hand augmentation to be the \level{Skeleton} one.
Participants appreciated that it provided a detailed and precise view of the tracking of the real hand, without hiding or masking it.
Although the \level{Contour} and \level{Mesh} hand renderings were also highly rated, some participants felt that they were too visible and masked the real hand.
This result is in line with the results of virtual object manipulation in \VR of \textcite{prachyabrued2014visual}, who found that the most effective visual hand rendering was a double representation of both the real tracked hand and a visual hand physically constrained by the \VE.
This result is in line with the results of virtual object manipulation in \VR of \textcite{prachyabrued2014visual}, who found that the most effective visual hand augmentation was a double representation of both the real tracked hand and a visual hand physically constrained by the \VE.
This type of \level{Skeleton} rendering was also the one that provided the best sense of agency (control) in \VR \cite{argelaguet2016role,schwind2018touch}.

@@ -4,8 +4,8 @@
In this chapter, we addressed the challenge of touching, grasping and manipulating virtual objects directly with the hand in immersive \OST-\AR.
To do so, we proposed to evaluate visual renderings of the virtual hand as augmentation of the real hand.
Superimposed on the user's hand, these visual renderings provide feedback from the virtual hand, which tracks the real hand and simulates the interaction with virtual objects as a proxy.
We first selected and compared the six most popular visual hand renderings used to interact with virtual objects in \AR.
Then, in a user study with 24 participants and an immersive \OST-\AR headset, we evaluated the effect of these six visual hand renderings on the user performance and experience in two representative manipulation tasks.
We first selected and compared the six most popular visual hand augmentations used to interact with virtual objects in \AR.
Then, in a user study with 24 participants and an immersive \OST-\AR headset, we evaluated the effect of these six visual hand augmentations on the user performance and experience in two representative manipulation tasks.
Our results showed that a visual hand augmentation improved the performance, perceived effectiveness and confidence of participants compared to no augmentation.
A skeleton rendering, which provided a detailed view of the tracked joints and phalanges while not hiding the real hand, was the most performant and effective.
@@ -15,7 +15,7 @@ This is consistent with similar manipulation studies in \VR and in non-immersive
This study suggests that a \ThreeD visual hand augmentation is important in \AR when interacting through a virtual hand technique, particularly when it involves precise finger movements in relation to virtual content, \eg \ThreeD windows, buttons and sliders, or more complex tasks, such as stacking or assembly.
A minimal but detailed rendering of the virtual hand that does not hide the real hand, such as the skeleton rendering we evaluated, seems to be the best compromise between the richness and effectiveness of the feedback.
%Still, users should be able to choose and adapt the visual hand rendering to their preferences and needs.
%Still, users should be able to choose and adapt the visual hand augmentation to their preferences and needs.
In addition to visual augmentation of the hand, direct manipulation of virtual objects with the hand can also benefit from wearable haptic feedback.
In the next chapter, we explore in a user study two wearable vibrotactile contact feedback devices, placed at four positions on the hand so as not to cover the fingertips.