Rename \mainchapter => \chaptertoc

This commit is contained in:
2024-06-27 00:02:35 +02:00
parent 05bc6c77d6
commit 2d23eb9a16
14 changed files with 618 additions and 8 deletions

View File

@@ -1,5 +1,8 @@
 \chapter{Related Work}
 \mainlabel{related_work}
+\chaptertoc
 \section{Haptics}
 \subsection{The Sense of Touch}

View File

@@ -1,6 +1,8 @@
-\mainchapter{Augmenting the Texture Perception of Tangible Surfaces in Augmented Reality using Vibrotactile Haptics}
+\chapter{Augmenting the Texture Perception of Tangible Surfaces in Augmented Reality using Vibrotactile Haptics}
 \mainlabel{ar_textures}
+\chaptertoc
 \input{1-introduction}
 \input{2-experiment}
 \input{3-results}

View File

@@ -1,6 +1,8 @@
-\mainchapter{Perception of Visual-Haptic Texture Augmentation in Augmented and Virtual Reality}
+\chapter{Perception of Visual-Haptic Texture Augmentation in Augmented and Virtual Reality}
 \mainlabel{xr_perception}
+\chaptertoc
 \input{1-introduction}
 \input{2-related-work}
 \input{3-method}

View File

@@ -0,0 +1,67 @@
\section{Introduction}
\label{1_introduction}
\begin{subfigswide}{hands}{%
Experiment \#1. The six considered visual hand renderings, as seen by the user through the AR headset
during the two-finger grasping of a virtual cube.
%
From left to right: %
no visual rendering \emph{(None)}, %
cropped virtual content to {enable} hand-cube occlusion \emph{(Occlusion, Occl)}, %
rings on the fingertips \emph{(Tips)}, %
thin outline of the hand \emph{(Contour, Cont)}, %
fingers' joints and phalanges \emph{(Skeleton, Skel)}, and %
semi-transparent 3D hand model \emph{(Mesh)}.
}
\subfig[0.15]{3-hands-none}[None]
\subfig[0.15]{3-hands-occlusion}[Occlusion (Occl)]
\subfig[0.15]{3-hands-tips}[Tips]
\subfig[0.15]{3-hands-contour}[Contour (Cont)]
\subfig[0.15]{3-hands-skeleton}[Skeleton (Skel)]
\subfig[0.15]{3-hands-mesh}[Mesh]
\end{subfigswide}
\noindent \IEEEPARstart{A}{ugmented} reality (AR) integrates virtual content into our real-world surroundings, giving the illusion of one unique environment and promising natural and seamless interactions with real and virtual objects.
%
Virtual object manipulation is particularly critical for useful and effective AR usage, such as in medical applications, training, or entertainment~\cite{laviolajr20173d, kim2018revisiting}.
%
Hand tracking technologies~\cite{xiao2018mrtouch}, grasping techniques~\cite{holl2018efficient}, and real-time physics engines permit users to directly manipulate virtual objects with their bare hands as if they were real~\cite{piumsomboon2014graspshell}, without requiring controllers~\cite{krichenbauer2018augmented}, gloves~\cite{prachyabrued2014visual}, or predefined gesture techniques~\cite{piumsomboon2013userdefined, ha2014wearhand}.
%
Optical see-through AR (OST-AR) head-mounted displays (HMDs), such as the Microsoft HoloLens 2 or the Magic Leap, are particularly suited for this type of direct hand interaction~\cite{kim2018revisiting}.
However, there are still several haptic and visual limitations that affect manipulation in OST-AR, degrading the user experience.
%
For example, it is difficult to estimate the position of one's hand in relation to virtual content because mutual occlusion between the hand and the virtual object is often lacking~\cite{macedo2023occlusion}, the depth of virtual content is underestimated~\cite{diaz2017designing, peillard2019studying}, and hand tracking still has a noticeable latency~\cite{xiao2018mrtouch}.
%
Similarly, it is challenging to ensure confident and realistic contact with a virtual object due to the lack of haptic feedback and the intangibility of the virtual environment, which of course cannot apply physical constraints on the hand~\cite{maisto2017evaluation, meli2018combining, lopes2018adding, teng2021touch}.
%
These limitations also make it difficult to confidently move a grasped object towards a target~\cite{maisto2017evaluation, meli2018combining}.
To address these haptic and visual limitations, we investigate two types of sensory feedback that are known to improve virtual interactions with hands, but have not been studied together in an AR context: visual hand rendering and delocalized haptic rendering.
%
A few works explored the effect of a visual hand rendering on interactions in AR by simulating mutual occlusion between the real hand and virtual objects~\cite{ha2014wearhand, piumsomboon2014graspshell, al-kalbani2016analysis}, or displaying a 3D virtual hand model, semi-transparent~\cite{ha2014wearhand, piumsomboon2014graspshell} or opaque~\cite{blaga2017usability, yoon2020evaluating, saito2021contact}.
%
Indeed, some visual hand renderings are known to improve interactions or user experience in virtual reality (VR), where the real hand is not visible~\cite{prachyabrued2014visual, argelaguet2016role, grubert2018effects, schwind2018touch, vanveldhuizen2021effect}.
%
However, the role of a visual hand rendering superimposed on, and seen over, the real tracked hand has not yet been investigated in AR.
%
In parallel, several studies have demonstrated that wearable haptics can significantly improve interaction performance and user experience in AR~\cite{maisto2017evaluation, meli2018combining, sarac2022perceived}.
%
But haptic rendering for AR remains a challenge as it is difficult to provide rich and realistic haptic sensations while limiting their negative impact on hand tracking~\cite{pacchierotti2016hring} and keeping the fingertips and palm free to interact with the real environment~\cite{lopes2018adding, teng2021touch, sarac2022perceived, palmer2022haptic}.
%
Therefore, the haptic feedback of the fingertip contact with the virtual environment needs to be rendered elsewhere on the hand, but it is unclear which positioning should be preferred or which type of haptic feedback is best suited for manipulating virtual objects in AR.
%
A final question is whether one or the other of these (haptic or visual) hand renderings should be preferred~\cite{maisto2017evaluation, meli2018combining}, or whether a combined visuo-haptic rendering is beneficial for users.
%
In fact, either hand rendering alone might provide sufficient sensory cues for the efficient manipulation of virtual objects in AR, or, conversely, the two might prove to be complementary.
In this paper, we investigate the role of the visuo-haptic rendering of the hand during 3D manipulation of virtual objects in OST-AR.
%
We consider two representative manipulation tasks: pushing-and-sliding a virtual object, and grasping-and-placing it.
%
The main contributions of this work are:
%
\begin{itemize}
\item a first human subject experiment evaluating the performance and user experience of six visual hand renderings superimposed on the real hand; %
\item a second human subject experiment evaluating the performance and user experience of visuo-haptic hand renderings by comparing two vibrotactile contact techniques provided at four delocalized positions on the hand and combined with the two most representative visual hand renderings established in the first experiment.
\end{itemize}

View File

@@ -0,0 +1,234 @@
\section{Experiment \#1: Visual Rendering of the Hand in AR}
\label{3_method}
\noindent This first experiment aims to analyze whether the chosen visual hand rendering affects the performance and user experience of manipulating virtual objects with bare hands in AR.
\subsection{Visual Hand Renderings}
\label{3_hands}
We compared a set of the most popular visual hand renderings, as also presented in \secref{2_hands}.
%
Since we address hand-centered manipulation tasks, we only considered renderings including the fingertips.
%
Moreover, so as to keep the focus on the hand rendering itself, we used neutral semi-transparent grey meshes, consistent with the choices made in~\cite{yoon2020evaluating, vanveldhuizen2021effect}.
%
All considered hand renderings are drawn following the tracked pose of the user's real hand.
%
However, while the real hand can of course penetrate virtual objects, the visual hand is always constrained by the virtual environment.
\subsubsection{None~(\figref{hands-none})}
\label{3_hands_none}
As a reference, we considered no visual hand rendering, as is common in AR~\cite{hettiarachchi2016annexing, blaga2017usability, xiao2018mrtouch, teng2021touch}.
%
Users have no information about hand tracking and no feedback about contact with the virtual objects, other than their movement when touched.
%
As virtual content is rendered on top of the real environment, the hand of the user can be hidden by the virtual objects when manipulating them (see \secref{2_hands}).
\subsubsection{Occlusion (Occl,~\figref{hands-occlusion})}
\label{3_hands_occlusion}
To avoid the abovementioned undesired occlusions due to the virtual content being rendered on top of the real environment, we can carefully crop the former whenever it hides real content that should be visible~\cite{macedo2023occlusion}, \eg the thumb of the user in \figref{hands-occlusion}.
%
This approach is frequent in works using VST-AR headsets~\cite{knorlein2009influence, ha2014wearhand, piumsomboon2014graspshell, suzuki2014grasping, al-kalbani2016analysis}.
\subsubsection{Tips (\figref{hands-tips})}
\label{3_hands_tips}
This rendering shows small visual rings around the fingertips of the user, highlighting the most important parts of the hand and contact with virtual objects during fine manipulation.
%
Unlike work using small spheres~\cite{maisto2017evaluation, meli2014wearable, grubert2018effects, normand2018enlarging, schwind2018touch}, this ring rendering also provides information about the orientation of the fingertips.
\subsubsection{Contour (Cont,~\figref{hands-contour})}
\label{3_hands_contour}
This rendering is a {1-mm-thick} outline contouring the user's hands, providing information about the whole hand while leaving its inside visible.
%
Unlike the other renderings, it is not occluded by the virtual objects, as shown in \figref{hands-contour}.
%
This rendering is less common in the literature than the previous ones~\cite{kang2020comparative}.
\subsubsection{Skeleton (Skel,~\figref{hands-skeleton})}
\label{3_hands_skeleton}
This rendering schematically renders the joints and phalanges of the fingers with small spheres and cylinders, respectively, leaving the outside of the hand visible.
%
It can be seen as an extension of the Tips rendering that includes the complete finger articulations.
%
It is widely used in VR~\cite{argelaguet2016role, schwind2018touch, chessa2019grasping} and AR~\cite{blaga2017usability, yoon2020evaluating}, as it is considered simple yet rich and comprehensive.
\subsubsection{Mesh (\figref{hands-mesh})}
\label{3_hands_mesh}
This rendering is a 3D semi-transparent ($\alpha=0.2$) hand model, which is common in VR~\cite{prachyabrued2014visual, argelaguet2016role, schwind2018touch, chessa2019grasping, yoon2020evaluating, vanveldhuizen2021effect}.
%
It can be seen as a filled version of the Contour hand rendering, thus partially covering the view of the real hand.
\subsection{Manipulation Tasks and Virtual Scene}
\label{3_tasks}
\begin{subfigs}{3_tasks}{%
Experiment \#1. The two manipulation tasks: %
(a) pushing a virtual cube along a table towards a target placed on the same surface; %
(b) grasping and lifting a virtual cube towards a target placed on a 20-cm-higher plane. %
Both pictures show the cube to manipulate in the middle (5-cm-edge and opaque) and the eight possible targets to
reach (7-cm-edge volume and semi-transparent). %
Only one target at a time was shown during the experiments.%
}
\subfig[0.23]{3-task-push}[Push task]
\subfig[0.23]{3-task-grasp}[Grasp task]
\end{subfigs}
Following the guidelines of \citeauthorcite{bergstrom2021how} for designing object manipulation tasks, we considered two variations of a 3D pick-and-place task, commonly found in interaction and manipulation studies~\cite{prachyabrued2014visual, maisto2017evaluation, meli2018combining, blaga2017usability, vanveldhuizen2021effect}.
\subsubsection{Push Task}
\label{push-task}
The first manipulation task consists in pushing a virtual object along a real flat surface towards a target placed on the same plane (see \figref{3-task-push}).
%
The virtual object to manipulate is a small \qty{50}{\mm} blue and opaque cube, while the target is a (slightly) bigger \qty{70}{\mm} blue and semi-transparent volume.
%
At every repetition of the task, the cube to manipulate always spawns at the same place, on top of a real table in front of the user.
%
On the other hand, the target volume can spawn in eight different locations on the same table, located on a \qty{20}{\cm} radius circle centered on the cube, at \qty{45}{\degree} from each other (see again \figref{3-task-push}).
%
Users are asked to push the cube towards the target volume using their fingertips in any way they prefer.
%
In this task, the cube cannot be lifted.
%
The task is considered completed when the cube is \emph{fully} inside the target volume.
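%
For illustration, the eight possible target locations lie on a \qty{20}{\cm} radius circle around the cube spawn position, \qty{45}{\degree} apart. The following minimal Python sketch (the coordinate frame and variable names are ours, not part of the actual implementation) makes this layout explicit:
\begin{verbatim}
import numpy as np

# Hypothetical table-plane frame centered on the cube spawn position (units: meters).
cube_center = np.array([0.0, 0.0])
angles = np.deg2rad(np.arange(0, 360, 45))   # E, NE, N, NW, W, SW, S, SE
targets = cube_center + 0.20 * np.column_stack((np.cos(angles), np.sin(angles)))
\end{verbatim}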
\subsubsection{Grasp Task}
\label{grasp-task}
The second manipulation task consists in grasping, lifting, and placing a virtual object in a target placed on a different (higher) plane (see \figref{3-task-grasp}).
%
The cube to manipulate and the target volume are the same as in the previous task. However, this time, the target volume can spawn in eight different locations on a plane \qty{10}{\cm} \emph{above} the table, still located on a \qty{20}{\cm} radius circle at \qty{45}{\degree} from each other.
%
Users are asked to grasp, lift, and move the cube towards the target volume using their fingertips in any way they prefer.
%
As before, the task is considered completed when the cube is \emph{fully} inside the volume.
\subsection{Experimental Design}
\label{3_design}
We analyzed the two tasks separately. For each of them, we considered two independent within-subject variables:
%
\begin{itemize}
\item \emph{Visual Hand Renderings}, consisting of the six possible renderings discussed in \secref{3_hands}: None, Occlusion (Occl), Tips, Contour (Cont), Skeleton (Skel), and Mesh.
\item \emph{Target}, consisting of the eight possible locations of the target volume, named after the cardinal points and shown in \figref{3_tasks}: E, NE, N, NW, W, SW, S, and SE.
\end{itemize}
%
Each condition was repeated three times.
%
To control for learning effects, we counterbalanced the order of the two manipulation tasks and of the visual hand renderings following a 6 \x 6 Latin square, leading to six blocks in which the position of the target volume was in turn randomized.
%
This design led to a total of 2 manipulation tasks \x 6 visual hand renderings \x 8 targets \x 3 repetitions $=$ 288 trials per participant.
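%
As an illustration of this counterbalancing, the following minimal Python sketch builds the per-participant trial lists (the cyclic Latin-square construction and the names used here are ours, and only one of several valid ways to realize such a design):
\begin{verbatim}
import random
from itertools import product

hands = ["None", "Occl", "Tips", "Cont", "Skel", "Mesh"]
targets = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]

# One row of a cyclic 6 x 6 Latin square gives the block order of the hand
# renderings for a given participant.
def hand_order(participant_index):
    return [hands[(participant_index + j) % 6] for j in range(6)]

# Within each block, the 8 targets x 3 repetitions = 24 trials are randomized,
# giving 6 x 24 = 144 trials per task, i.e. 288 trials per participant overall.
def block_trials(hand):
    trials = list(product([hand], targets, range(3)))
    random.shuffle(trials)
    return trials
\end{verbatim}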
\subsection{Apparatus and Implementation}
\label{3_apparatus}
We used the Microsoft HoloLens~2 OST-AR headset.
%
It is capable of rendering virtual content within a horizontal field of view of \qty{43}{\degree} and a vertical one of \qty{29}{\degree}. It can also track the environment as well as the user's fingers.
%
We measured the latency of the hand tracking at \qty{15}{\ms}, independent of the hand movement speed.
The implementation of our experiment was done in C\# using Unity 2022.1, PhysX 4.1, and the Mixed Reality Toolkit (MRTK) 2.8\footnoteurl{https://learn.microsoft.com/windows/mixed-reality/mrtk-unity}.
%
The compiled application ran directly on the HoloLens~2 at \qty{60}{FPS}.
The default 3D hand model from MRTK was used for all visual hand renderings.
%
By changing the material properties of this hand model, we were able to achieve the six renderings shown in \figref{hands}.
%
A calibration was performed for every participant, so as to best adapt the size of the visual hand rendering to their real hand.
%
A set of empirical tests enabled us to choose the best rendering characteristics in terms of transparency and brightness for the virtual objects and hand renderings, which were applied throughout the experiment.
The hand tracking information provided by MRTK was used to construct a virtual articulated physics-enabled hand using PhysX.
%
It featured 25 DoFs, including the fingers' proximal, middle, and distal phalanges.
%
To allow effective (and stable) physical interactions between the hand and the virtual cube to manipulate, we implemented an approach similar to that of \citeauthorcite{borst2006spring}, where a series of virtual springs with high stiffness are used to couple the physics-enabled hand with the tracked hand.
%
As before, a set of empirical tests was used to select the most effective physical characteristics in terms of mass, elastic constant, friction, damping, and collider size and shape for the (tracked) virtual hand interaction model.
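%
As a rough formalization of this coupling (the gains $k$ and $b$ below are illustrative symbols, not the exact values used in our implementation), each physics-enabled segment of the hand is pulled towards its tracked pose by a damped spring,
%
\begin{equation*}
\mathbf{f} = k \left( \mathbf{x}_{\mathrm{tracked}} - \mathbf{x}_{\mathrm{physics}} \right) - b \, \dot{\mathbf{x}}_{\mathrm{physics}},
\end{equation*}
%
with an analogous torsional spring acting on each segment's orientation: a high stiffness $k$ keeps the physics-enabled hand close to the tracked hand in free space, while contacts with the virtual cube can still constrain it.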
The room where the experiment was held had no windows, with one light source of \qty{800}{\lumen} placed \qty{70}{\cm} above the table.
%
This setup enabled a good and consistent tracking of the user's fingers.
\subsection{Protocol}
\label{3_protocol}
First, participants were given a consent form that briefed them about the tasks and the protocol of the experiment.
%
Then, participants were asked to sit comfortably in front of a table and wear the HoloLens~2 headset as shown in \figref{3_tasks}, perform the calibration of the visual hand size as described in \secref{3_apparatus}, and complete a 2-minute training session to familiarize themselves with the AR rendering and the two considered tasks.
%
During this training, we did not use any of the six hand renderings under test, but rather a fully-opaque white hand rendering that completely occluded the real hand of the user.
Participants were asked to carry out the two tasks as naturally and as fast as possible.
%
Similarly to~\cite{prachyabrued2014visual, maisto2017evaluation, blaga2017usability, vanveldhuizen2021effect}, we only allowed the use of the dominant hand.
%
The experiment took around 1 hour and 20 minutes to complete.
\subsection{Participants}
\label{3_participants}
Twenty-four subjects participated in the study (eight aged between 18 and 24, fourteen aged between 25 and 34, and two aged between 35 and 44; 22~males, 1~female, 1~preferred not to say).
%
None of the participants reported any deficiencies in their visual perception abilities.
%
Two subjects were left-handed, while the twenty-two others were right-handed; they all used their dominant hand during the trials.
%
Ten subjects had significant experience with VR (\enquote{I use it every week}), while the fourteen others reported little to no experience with VR.
%
Two subjects had significant experience with AR (\enquote{I use it every week}), while the twenty-two others reported little to no experience with AR.
%
Participants signed an informed consent form, including the declaration of having no conflict of interest.
\subsection{Collected Data}
\label{3_metrics}
Inspired by \citeauthorcite{laviolajr20173d}, we collected the following metrics during the experiment.
%
(i) The task \emph{Completion Time}, defined as the time elapsed between the very first contact with the virtual cube and its correct placement inside the target volume; as subjects were asked to complete the tasks as fast as possible, lower completion times mean better performance.
%
(ii) The number of \emph{Contacts}, defined as the number of separate times the user's hand makes contact with the virtual cube; in both tasks, a lower number of contacts means a smoother continuous interaction with the object.
%
Finally, (iii) the mean \emph{Time per Contact}, defined as the total time any part of the user's hand contacted the cube divided by the number of contacts; higher values mean that the user interacted with the object for longer non-interrupted periods of time.
%
Solely for the grasp-and-place task, we also measured the (iv) \emph{Grip Aperture}, defined as the average distance between the thumb's fingertip and the other fingertips during the grasping of the cube;
%
lower values indicate a greater finger interpenetration with the cube, \ie a greater discrepancy between the real hand and the visual hand rendering constrained to the cube surfaces, and thus reflect how confident users are in their grasp~\cite{prachyabrued2014visual, al-kalbani2016analysis, blaga2017usability, chessa2019grasping}.
%
Taken together, these measures provide an overview of the performance and usability of each of the visual hand renderings tested, as we hypothesized that they should influence the behavior and effectiveness of the participants.
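%
To make these definitions concrete, the following minimal Python sketch (the data layout and function names are hypothetical, not our actual logging code) computes the four measures from a per-trial log of contact intervals and fingertip positions:
\begin{verbatim}
import numpy as np

def contact_metrics(contact_intervals, placement_time):
    # contact_intervals: list of (start, end) times, in seconds, during which any
    # part of the hand touched the cube; placement_time: time at which the cube
    # was fully inside the target volume.
    first_contact = min(start for start, _ in contact_intervals)
    completion_time = placement_time - first_contact           # (i)   Completion Time
    contacts = len(contact_intervals)                           # (ii)  Contacts
    total_contact = sum(end - start for start, end in contact_intervals)
    time_per_contact = total_contact / contacts                 # (iii) Time per Contact
    return completion_time, contacts, time_per_contact

def grip_aperture(thumb_tip, other_tips):
    # (iv) Grip Aperture: mean distance between the thumb tip (n_samples x 3) and
    # the other fingertips (n_samples x n_fingers x 3) while the cube is grasped.
    thumb_tip, other_tips = np.asarray(thumb_tip), np.asarray(other_tips)
    return np.linalg.norm(other_tips - thumb_tip[:, None, :], axis=-1).mean()
\end{verbatim}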
At the end of each task, participants were asked to rank the visual hand renderings according to their preference with respect to the considered task.
%
Participants also rated each visual hand rendering individually on six questions using a 7-point Likert scale (1=Not at all, 7=Extremely):
%
\emph{(Difficulty)} How difficult were the tasks? %
\emph{(Fatigue)} How fatiguing (mentally and physically) were the tasks? %
\emph{(Precision)} How precise were you in performing the tasks? %
\emph{(Efficiency)} How fast/efficient do you think you were in performing the tasks? %
\emph{({Rating})} How much do you like each visual hand?
Finally, participants were encouraged to comment out loud on the conditions throughout the experiment, as well as in an open-ended question at its end, so as to gather additional qualitative information.

View File

@@ -0,0 +1,55 @@
\subsubsection{Push Task}
\label{3_push}
\subsubsubsection{Completion Time}
\label{3_push_tct}
On the time to complete a trial, there were two statistically significant effects: %
Hand (\anova{5}{2868}{24.8}, \p[<]{0.001}, see \figref{3-Push-CompletionTime-Hand-Overall-Means}) %
and Target (\anova{7}{2868}{5.9}, \p[<]{0.001}).
%
Skeleton was the fastest: completion times were longer with None (\qty{+18}{\%}, \p{0.005}), Occlusion (\qty{+26}{\%}, \p[<]{0.001}), Tips (\qty{+22}{\%}, \p[<]{0.001}), and Contour (\qty{+20}{\%}, \p{0.001}).
%
Three groups of target volumes were identified:
%
(1) the side targets E, W, and SW were the fastest;
%
(2) the back and front targets NE, S, and SE were slower (\p{0.003});
%
and (3) the back targets N and NW were the slowest (\p{0.04}).
\subsubsubsection{Contacts}
\label{3_push_contacts_count}
On the number of contacts, there were two statistically significant effects: %
Hand (\anova{5}{2868}{6.7}, \p[<]{0.001}, see \figref{3-Push-ContactsCount-Hand-Overall-Means}) %
and Target (\anova{7}{2868}{27.8}, \p[<]{0.001}).
%
\figref{3-Push-ContactsCount-Hand-Overall-Means} shows the Contacts for each Hand.
%
Fewer contacts were made with Skeleton than with None (\qty{-23}{\%}, \p[<]{0.001}), Occlusion (\qty{-26}{\%}, \p[<]{0.001}), Tips (\qty{-18}{\%}, \p{0.004}), and Contour (\qty{-15}{\%}, \p{0.02});
%
and fewer with Mesh than with Occlusion (\qty{-14}{\%}, \p{0.04}).
%
This measure indicates how effective a visual hand rendering is: a lower count indicates a smoother, more continuous push, rotating the cube properly into the target as one would probably do with a real cube.
%
Targets on the left (W) and the right (E, SW) were easier to reach than the back ones (N, NW, \p[<]{0.001}).
\subsubsubsection{Time per Contact}
\label{3_push_time_per_contact}
On the mean time spent on each contact, there were two statistically significant effects: %
Hand (\anova{5}{2868}{8.4}, \p[<]{0.001}, see \figref{3-Push-MeanContactTime-Hand-Overall-Means}) %
and Target (\anova{7}{2868}{19.4}, \p[<]{0.001}).
%
It was shorter with None than with Skeleton (\qty{-10}{\%}, \p[<]{0.001}) and Mesh (\qty{-8}{\%}, \p{0.03});
%
and shorter with Occlusion than with Tips (\qty{-10}{\%}, \p{0.002}), Contour (\qty{-10}{\%}, \p{0.001}), Skeleton (\qty{-14}{\%}, \p{0.001}), and Mesh (\qty{-12}{\%}, \p{0.03}).
%
This result suggests that users pushed the virtual cube with more confidence with a visible visual hand rendering.
%
On the contrary, the lack of a visual hand forced the participants to pay more attention to the cube's reactions.
%
Targets on the left (W, SW) and the right (E) sides had a higher Time per Contact than all the other targets (\p{0.005}).

View File

@@ -0,0 +1,70 @@
\subsubsection{Grasp Task}
\label{3_grasp}
\subsubsubsection{Completion Time}
\label{3_grasp_tct}
On the time to complete a trial, there was one statistically significant effect %
of Target (\anova{7}{2868}{37.2}, \p[<]{0.001}) %
but not of Hand (\anova{5}{2868}{1.8}, \p{0.1}, see \figref{3-Grasp-CompletionTime-Hand-Overall-Means}).
%
Targets on the back and the left (N, NW, and W) were slower than targets on the front (SW, S, and SE, \p{0.003}), except for NE (back-right), which was also fast.
\subsubsubsection{Contacts}
\label{3_grasp_contacts_count}
On the number of contacts, there were two statistically significant effects: %
Hand (\anova{5}{2868}{5.2}, \p[<]{0.001}, see \figref{3-Grasp-ContactsCount-Hand-Overall-Means}) %
and Target (\anova{7}{2868}{21.2}, \p[<]{0.001}).
%
Fewer contacts were made with Tips than with None (\qty{-13}{\%}, \p{0.02}) and Occlusion (\qty{-15}{\%}, \p{0.004});
%
and fewer with Mesh than with None (\qty{-15}{\%}, \p{0.006}) and Occlusion (\qty{-17}{\%}, \p{0.001}).
%
This result suggests that having no visible visual hand increased the number of failed grasps or cube drops.
%
Surprisingly, however, only Tips and Mesh were statistically significantly better, not Contour or Skeleton.
%
Targets on the back and left were more difficult (N, NW, and W) than targets on the front (SW, S, and SE, \p[<]{0.001}).
\subsubsubsection{Time per Contact}
\label{3_grasp_time_per_contact}
On the mean time spent on each contact, there were two statistically significant effects: %
Hand (\anova{5}{2868}{9.6}, \p[<]{0.001}, see \figref{3-Grasp-MeanContactTime-Hand-Overall-Means}) %
and Target (\anova{7}{2868}{5.6}, \p[<]{0.001}).
%
It was shorter with None than with Tips (\qty{-15}{\%}, \p[<]{0.001}), Skeleton (\qty{-11}{\%}, \p{0.001}) and Mesh (\qty{-11}{\%}, \p{0.001});
%
shorter with Occlusion than with Tips (\qty{-10}{\%}, \p[<]{0.001}), Skeleton (\qty{-8}{\%}, \p{0.05}), and Mesh (\qty{-8}{\%}, \p{0.04});
%
shorter with Contour than with Tips (\qty{-8}{\%}, \p[<]{0.001}).
%
As in the Push task, the lack of a visual hand increased the number of failed grasps or cube drops.
%
The Tips rendering seemed to provide some of the best feedback for grasping, perhaps because it conveys both the position and orientation of the tracked fingertips.
%
This time was shortest for the front target S compared to all the other target volumes (\p[<]{0.001}).
\subsubsubsection{Grip Aperture}
\label{3_grasp_grip_aperture}
On the average distance between the thumb's fingertip and the other fingertips during grasping, there were two
statistically significant effects: %
Hand (\anova{5}{2868}{35.8}, \p[<]{0.001}, see \figref{3-Grasp-GripAperture-Hand-Overall-Means}) %
and Target (\anova{7}{2868}{3.7}, \p[<]{0.001}).
%
It was shorter with None than with Occlusion (\p[<]{0.001}), Tips (\p[<]{0.001}), Contour (\p[<]{0.001}), Skeleton (\p[<]{0.001}) and Mesh (\p[<]{0.001});
%
shorter with Tips than with Occlusion (\p{0.008}), Contour (\p{0.006}) and Mesh (\p[<]{0.001});
%
and shorter with Skeleton than with Mesh (\p[<]{0.001}).
%
This result is evidence of the participants' lack of confidence when no visual hand rendering was shown: they gripped the cube more tightly to secure it.
%
The Mesh rendering seemed to provide the most confidence to participants, perhaps because it was the closest to the real hand.
%
The Grip Aperture was larger for the SE (bottom-right) target volume than for the back and side targets (E, NE, N, W, \p{0.03}), indicating a higher confidence.

View File

@@ -0,0 +1,32 @@
\subsubsection{Ranking}
\label{3_ranks}
\begin{subfigs}{3_ranks}{%
Experiment \#1. Boxplots of the ranking (lower is better) of each visual hand rendering
%
and pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment:
%
** is \p[<]{0.01} and * is \p[<]{0.05}.
}
\subfig[0.24]{3-Ranks-Push}[Push Task]
\subfig[0.24]{3-Ranks-Grasp}[Grasp Task]
\end{subfigs}
\figref{3_ranks} shows the ranking of each visual hand rendering for the Push and Grasp tasks.
%
Friedman tests indicated that both rankings showed statistically significant differences (\p[<]{0.001}).
%
Pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment were then applied to both ranking results (see \secref{3_metrics}; an illustrative analysis sketch is given after this list):
\begin{itemize}
\item \textit{Push Ranking}: Occlusion was ranked worse than Contour (\p{0.005}), Skeleton (\p{0.02}), and Mesh (\p{0.03});
%
Tips was ranked worse than Skeleton (\p{0.02}).
%
This good ranking of the Skeleton rendering for the Push task is consistent with the Push trial results.
\item \textit{Grasp Ranking}: Occlusion was ranked worse than Contour (\p{0.001}), Skeleton (\p{0.001}), and Mesh (\p{0.007});
%
None was ranked worse than Skeleton (\p{0.04}).
%
A complete visual hand rendering seemed to be preferred over no visual hand rendering when grasping.
\end{itemize}
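For reference, the following minimal Python sketch illustrates this non-parametric analysis (the function and variable names are ours; \texttt{ranks} is assumed to be a participants $\times$ renderings matrix):
\begin{verbatim}
from itertools import combinations
from scipy.stats import friedmanchisquare, wilcoxon
from statsmodels.stats.multitest import multipletests

def ranking_tests(ranks, names):
    # ranks: one row per participant, one column per visual hand rendering.
    _, p_friedman = friedmanchisquare(*(ranks[:, j] for j in range(ranks.shape[1])))
    pairs = list(combinations(range(ranks.shape[1]), 2))
    p_raw = [wilcoxon(ranks[:, i], ranks[:, j]).pvalue for i, j in pairs]
    _, p_holm, _, _ = multipletests(p_raw, method="holm")  # Holm-Bonferroni
    return p_friedman, {(names[i], names[j]): p for (i, j), p in zip(pairs, p_holm)}
\end{verbatim}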

View File

@@ -0,0 +1,40 @@
\subsubsection{Questionnaire}
\label{3_questions}
\begin{subfigswide}{3_questions}{%
Experiment \#1. Boxplots of the questionnaire results of each visual hand rendering
%
and pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment: ** is \p[<]{0.01} and * is \p[<]{0.05}.
%
Lower is better for Difficulty and Fatigue. Higher is better for Precision, Efficiency, and Rating.
}
\subfig[0.19]{3-Question-Difficulty}
\subfig[0.19]{3-Question-Fatigue}
\subfig[0.19]{3-Question-Precision}
\subfig[0.19]{3-Question-Efficiency}
\subfig[0.19]{3-Question-Rating}
\end{subfigswide}
\figref{3_questions} presents the questionnaire results for each visual hand rendering.
%
Friedman tests indicated that all questions had statistically significant differences (\p[<]{0.001}).
%
Pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment were then applied to the results of each question (see \secref{3_metrics}):
\begin{itemize}
\item \textit{Difficulty}: Occlusion was considered more difficult than Contour (\p{0.02}), Skeleton (\p{0.01}), and Mesh (\p{0.03}).
\item \textit{Fatigue}: None was found more fatiguing than Mesh (\p{0.04}); and Occlusion more fatiguing than Skeleton (\p{0.02}) and Mesh (\p{0.02}).
\item \textit{Precision}: None was considered less precise than Skeleton (\p{0.02}) and Mesh (\p{0.02}); and Occlusion less precise than Contour (\p{0.02}), Skeleton (\p{0.006}), and Mesh (\p{0.02}).
\item \textit{Efficiency}: Occlusion was found less efficient than Contour (\p{0.01}), Skeleton (\p{0.02}), and Mesh (\p{0.02}).
\item \textit{{Rating}}: Occlusion was rated lower than Contour (\p{0.02}) and Skeleton (\p{0.03}).
\end{itemize}
In summary, Occlusion was worse than Skeleton for all questions, and worse than Contour and Mesh on 5 of the 6 questions.
%
The results of the Difficulty, Efficiency, and Precision questions are consistent in this respect.
%
Moreover, having no visible visual hand rendering was felt by users to be more fatiguing and less precise than having one.
%
Surprisingly, no clear consensus was found on Rating.
%
Each visual hand rendering, except for Occlusion, simultaneously received the minimum and maximum possible scores.

View File

@@ -0,0 +1,46 @@
\subsection{Results}
\label{3_results}
\begin{subfigs}{3_push_results}{%
Experiment \#1: Push task.
%
Geometric means with bootstrap 95~\% confidence interval for each visual hand rendering
%
and Tukey's HSD pairwise comparisons: *** is \p[<]{0.001}, ** is \p[<]{0.01}, and * is \p[<]{0.05}.
}
\subfig[0.24]{3-Push-CompletionTime-Hand-Overall-Means}[Time to complete a trial.]
\subfig[0.24]{3-Push-ContactsCount-Hand-Overall-Means}[Number of contacts with the cube.]
\hspace*{10mm}
\subfig[0.24]{3-Push-MeanContactTime-Hand-Overall-Means}[Mean time spent on each contact.]
\end{subfigs}
\begin{subfigswide}{3_grasp_results}{%
Experiment \#1: Grasp task.
%
Geometric means with bootstrap 95~\% confidence interval for each visual hand rendering
%
and Tukey's HSD pairwise comparisons: *** is \p[<]{0.001}, ** is \p[<]{0.01}, and * is \p[<]{0.05}.
}
\subfig[0.24]{3-Grasp-CompletionTime-Hand-Overall-Means}[Time to complete a trial.]
\subfig[0.24]{3-Grasp-ContactsCount-Hand-Overall-Means}[Number of contacts with the cube.]
\subfig[0.24]{3-Grasp-MeanContactTime-Hand-Overall-Means}[Mean time spent on each contact.]
\subfig[0.24]{3-Grasp-GripAperture-Hand-Overall-Means}[\centering Distance between thumb and the other fingertips when grasping.]
\end{subfigswide}
Results of each trial measure were analyzed with a linear mixed model (LMM), with the order of the two manipulation tasks and the six visual hand renderings (Order), the visual hand renderings (Hand), the target volume position (Target), and their interactions as fixed effects, and the Participant as random intercept.
%
For every LMM, residuals were tested with a Q-Q plot to confirm normality.
%
On statistically significant effects, estimated marginal means of the LMM were compared pairwise using Tukey's HSD test.
%
Only significant results were reported.
Because the Completion Time, Contacts, and Time per Contact measures were Gamma-distributed, they were first log-transformed to approximate a normal distribution.
%
Their analysis results are reported anti-logged, corresponding to geometric means of the measures.
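As an illustration of this analysis pipeline, the following minimal Python sketch runs on synthetic data (the column names are ours; the Order factor and the Tukey HSD pairwise comparisons on the estimated marginal means are omitted for brevity):
\begin{verbatim}
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

hands = ["None", "Occl", "Tips", "Cont", "Skel", "Mesh"]
targets = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]
rng = np.random.default_rng(0)

# Synthetic long-format data: 24 participants x 6 renderings x 8 targets x 3 reps.
rows = [(p, h, t) for p in range(24) for h in hands for t in targets for _ in range(3)]
df = pd.DataFrame(rows, columns=["Participant", "Hand", "Target"])
df["CompletionTime"] = rng.gamma(shape=2.0, scale=2.0, size=len(df))

# Gamma-distributed measures are log-transformed before fitting the LMM
# (Participant as random intercept).
df["LogTime"] = np.log(df["CompletionTime"])
result = smf.mixedlm("LogTime ~ C(Hand) * C(Target)", df,
                     groups=df["Participant"]).fit()

# Anti-logging the fitted values yields geometric means of the original measure.
print(np.exp(result.fittedvalues).groupby(df["Hand"]).mean())
\end{verbatim}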
\input{content/3_2_1_push}
\input{content/3_2_2_grasp}
\input{content/3_2_3_ranks}
\input{content/3_2_4_questions}

View File

@@ -0,0 +1,58 @@
\subsection{Discussion}
\label{3_discussion}
We evaluated six visual hand renderings, as described in \secref{3_hands}, displayed on top of the real hand, in two virtual object manipulation tasks in AR.
During the Push task, the Skeleton hand rendering was the fastest (see \figref{3-Push-CompletionTime-Hand-Overall-Means}), as participants employed fewer and longer contacts to adjust the cube inside the target volume (see \figref{3-Push-ContactsCount-Hand-Overall-Means} and \figref{3-Push-MeanContactTime-Hand-Overall-Means}).
%
Participants consistently used few and continuous contacts with all visual hand renderings (see Fig. 3b); fewer than ten trials, all carried out by two participants, were instead quickly completed with multiple discrete touches.
%
However, during the Grasp task, despite no difference in completion time, providing no visible hand rendering (None and Occlusion renderings) led to more failed grasps or cube drops (see \figref{3-Grasp-CompletionTime-Hand-Overall-Means} and \figref{3-Grasp-MeanContactTime-Hand-Overall-Means}).
%
Indeed, participants found the None and Occlusion renderings less effective (see \figref{3-Ranks-Grasp}) and less precise (see \figref{3_questions}).
%
To understand whether the participants' previous experience might have played a role, we also carried out an additional statistical analysis considering VR experience as an additional between-subjects factor, \ie VR novices vs. VR experts (\enquote{I use it every week}, see \secref{3_participants}).
%
We found no statistically significant differences when comparing the considered metrics between VR novices and experts.
Interestingly, all visual hand renderings showed grip apertures very close to the size of the virtual cube, except for the None rendering (see \figref{3-Grasp-GripAperture-Hand-Overall-Means}), with which participants applied stronger grasps, \ie less distance between the fingertips.
%
Having no visual hand rendering, but only the reaction of the cube to the interaction as feedback, made participants less confident in their grip.
%
This result contrasts with the wrongly estimated grip apertures observed by \citeauthorcite{al-kalbani2016analysis} in an exocentric VST-AR setup.
%
Also, while some participants found the absence of visual hand rendering more natural, many of them commented on the importance of having feedback on the tracking of their hands, as observed by \citeauthorcite{xiao2018mrtouch} in a similar immersive OST-AR setup.
Yet, participants' opinions of the visual hand renderings were mixed on many questions, except for the Occlusion one, which was perceived as less effective than more \enquote{complete} visual hands such as the Contour, Skeleton, and Mesh hands (see \figref{3_questions}).
%
However, due to the latency of the hand tracking and of the visual hand reacting to the cube, almost all participants perceived the Occlusion rendering as a \enquote{shadow} of the real hand on the cube.
The Tips rendering, which showed the contacts made on the virtual cube, was controversial as it received the minimum and the maximum score on every question.
%
Many participants reported difficulties in seeing the orientation of the visual fingers,
%
while others found that it gave them a better sense of the contact points and improved their concentration on the task.
%
This result is consistent with \citeauthorcite{saito2021contact}, who found that displaying the contact points was more beneficial for grasping a virtual object than an opaque visual hand overlay.
To summarize, when employing a visual hand rendering overlaying the real hand, participants performed better and were more confident when manipulating virtual objects with their bare hands in AR.
%
These results contrast with similar manipulation studies, but in non-immersive, on-screen AR, where the presence of a visual hand rendering was found by participants to improve the usability of the interaction, but not their performance~\cite{blaga2017usability,maisto2017evaluation,meli2018combining}.
%
Our results show the most effective visual hand rendering to be the Skeleton one. Participants appreciated that it provided a detailed and precise view of the tracking of the real hand, without hiding or masking it.
%
Although the Contour and Mesh hand renderings were also highly rated, some participants felt that they were too visible and masked the real hand.
%
This result is in line with the results of virtual object manipulation in VR of \citeauthorcite{prachyabrued2014visual}, who found that the most effective visual hand rendering was a double representation of both the real tracked hand and a visual hand physically constrained by the virtual environment.
%
This type of Skeleton rendering was also the one that provided the best sense of agency (control) in VR~\cite{argelaguet2016role, schwind2018touch}.
Of course, these results have some limitations, as they only address a limited set of manipulation tasks and visual hand characteristics, evaluated in a specific OST-AR setup.
%
The two manipulation tasks were also limited to placing a virtual cube in predefined target volumes.
%
Testing a wider range of virtual objects and more ecological tasks, \eg stacking or assembly, as well as considering bimanual manipulation, would ensure a greater applicability of the results obtained in this work.
%
Similarly, a broader experimental study might shed light on the role of gender and age, as our subject pool was not sufficiently diverse in this respect.
%
However, we believe that the results presented here provide a rather interesting overview of the most promising approaches in AR manipulation.

View File

@@ -1,2 +1,4 @@
-\mainchapter{Visual Rendering of the Hand in Augmented Reality}
+\chapter{Visual Rendering of the Hand in Augmented Reality}
 \mainlabel{visual-hand}
+\chaptertoc

View File

@@ -1,2 +1,4 @@
-\mainchapter{Visuo-Haptic Rendering of the Hand in Augmented Reality}
+\chapter{Visuo-Haptic Rendering of the Hand in Augmented Reality}
 \mainlabel{visuo-haptic-hand}
+\chaptertoc

View File

@@ -77,10 +77,7 @@
 \includefrom{#1}{#2}% and relative paths \input in the chapter
 }
-\newcommand{\mainchapter}[1]{%
-\chapter{#1}%
-%
-% Print the table of contents for the chapter
+\newcommand{\chaptertoc}{% Print the table of contents for the chapter
 \vspace*{-1cm}%
 \localtableofcontents%
 \par\noindent\rule{\textwidth}{0.4pt}%