From 4a8ee35edecee2e17622c09d8163ba000043b734 Mon Sep 17 00:00:00 2001 From: Erwan Normand Date: Thu, 27 Jun 2024 22:51:32 +0200 Subject: [PATCH] Add visual-hand chapter --- 1-introduction/related-work/1-hands.tex | 28 +++---- 3-manipulation/visual-hand/1-introduction.tex | 24 +++--- 3-manipulation/visual-hand/2-method.tex | 78 +++++++++---------- 3-manipulation/visual-hand/3-1-push.tex | 30 +++---- 3-manipulation/visual-hand/3-2-grasp.tex | 42 +++++----- 3-manipulation/visual-hand/3-3-ranks.tex | 16 ++-- 3-manipulation/visual-hand/3-4-questions.tex | 22 +++--- 3-manipulation/visual-hand/3-results.tex | 31 ++++---- 3-manipulation/visual-hand/4-discussion.tex | 22 +++--- 9 files changed, 144 insertions(+), 149 deletions(-) diff --git a/1-introduction/related-work/1-hands.tex b/1-introduction/related-work/1-hands.tex index 4c13f80..f091256 100644 --- a/1-introduction/related-work/1-hands.tex +++ b/1-introduction/related-work/1-hands.tex @@ -6,31 +6,31 @@ This Section summarizes the state of the art in visual hand rendering and (weara \subsection{Visual Hand Rendering in AR} \label{2_hands} -Mutual visual occlusion between a virtual object and the real hand, \ie hiding the virtual object when the real hand is in front of it and hiding the real hand when it is behind the virtual object, is often presented as natural and realistic, enhancing the blending of real and virtual environments~\cite{piumsomboon2014graspshell, al-kalbani2016analysis}. +Mutual visual occlusion between a virtual object and the real hand, \ie hiding the virtual object when the real hand is in front of it and hiding the real hand when it is behind the virtual object, is often presented as natural and realistic, enhancing the blending of real and virtual environments~\autocite{piumsomboon2014graspshell, al-kalbani2016analysis}. % -In video see-through AR (VST-AR), this could be solved as a masking problem by combining the image of the real world captured by a camera and the generated virtual image~\cite{macedo2023occlusion}. +In video see-through AR (VST-AR), this could be solved as a masking problem by combining the image of the real world captured by a camera and the generated virtual image~\autocite{macedo2023occlusion}. % In OST-AR, this is more difficult because the virtual environment is displayed as a transparent 2D image on top of the 3D real world, which cannot be easily masked~\autocite{macedo2023occlusion}. % -Moreover, in VST-AR, the grip aperture and depth positioning of virtual objects often seem to be wrongly estimated~\cite{al-kalbani2016analysis, maisto2017evaluation}. +Moreover, in VST-AR, the grip aperture and depth positioning of virtual objects often seem to be wrongly estimated~\autocite{al-kalbani2016analysis, maisto2017evaluation}. % However, this effect has yet to be verified in an OST-AR setup. An alternative is to render the virtual objects and the hand semi-transparents, so that they are partially visible even when one is occluding the other, \eg in \figref{hands-none} the real hand is behind the virtual cube but still visible. % -Although perceived as less natural, this seems to be preferred to a mutual visual occlusion in VST-AR~\cite{buchmann2005interaction, ha2014wearhand, piumsomboon2014graspshell} and VR~\cite{vanveldhuizen2021effect}, but has not yet been evaluated in OST-AR. 
+Although perceived as less natural, this seems to be preferred to a mutual visual occlusion in VST-AR~\autocite{buchmann2005interaction, ha2014wearhand, piumsomboon2014graspshell} and VR~\autocite{vanveldhuizen2021effect}, but has not yet been evaluated in OST-AR. % However, this effect still causes depth conflicts that make it difficult to determine if one's hand is behind or in front of a virtual object, \eg in \figref{hands-none} the thumb is in front of the virtual cube, but it appears to be behind it. In VR, as the user is fully immersed in the virtual environment and cannot see their real hands, it is necessary to represent them virtually. % -It is known that the virtual hand representation has an impact on perception, interaction performance, and preference of users~\cite{prachyabrued2014visual, argelaguet2016role, grubert2018effects, schwind2018touch}. +It is known that the virtual hand representation has an impact on perception, interaction performance, and preference of users~\autocite{prachyabrued2014visual, argelaguet2016role, grubert2018effects, schwind2018touch}. % In a pick-and-place task in VR, \textcite{prachyabrued2014visual} found that the virtual hand representation whose motion was constrained to the surface of the virtual objects performed the worst, while the virtual hand representation following the tracked human hand (thus penetrating the virtual objects), performed the best, even though it was rather disliked. % The authors also observed that the best compromise was a double rendering, showing both the tracked hand and a hand rendering constrained by the virtual environment. % -It has also been shown that over a realistic avatar, a skeleton rendering (similar to \figref{hands-skeleton}) can provide a stronger sense of being in control~\cite{argelaguet2016role} and that minimalistic fingertip rendering (similar to \figref{hands-tips}) can be more effective in a typing task~\cite{grubert2018effects}. +It has also been shown that over a realistic avatar, a skeleton rendering (similar to \figref{hands-skeleton}) can provide a stronger sense of being in control~\autocite{argelaguet2016role} and that minimalistic fingertip rendering (similar to \figref{hands-tips}) can be more effective in a typing task~\autocite{grubert2018effects}. In AR, as the real hand of a user is visible but not physically constrained by the virtual environment, adding a visual hand rendering that can physically interact with virtual objects would achieve a similar result to the promising double-hand rendering of \textcite{prachyabrued2014visual}. % @@ -38,7 +38,7 @@ Additionally, \textcite{kahl2021investigation} showed that a virtual object over % This suggests that a visual hand rendering superimposed on the real hand could be helpful, but should not impair users. -Few works have explored the effect of visual hand rendering in AR~\cite{blaga2017usability, maisto2017evaluation, krichenbauer2018augmented, yoon2020evaluating, saito2021contact}. +Few works have explored the effect of visual hand rendering in AR~\autocite{blaga2017usability, maisto2017evaluation, krichenbauer2018augmented, yoon2020evaluating, saito2021contact}. % For example, \textcite{blaga2017usability} evaluated a skeleton rendering in several virtual object manipulations against no visual hand overlay. 
% @@ -54,12 +54,12 @@ To the best of our knowledge, evaluating the role of a visual rendering of the h \label{2_haptics} Different haptic feedback systems have been explored to improve interactions in AR, including % -grounded force feedback devices~\cite{bianchi2006high, jeon2009haptic, knorlein2009influence}, % -exoskeletons~\cite{lee2021wearable}, % -tangible objects~\cite{hettiarachchi2016annexing, detinguy2018enhancing, salazar2020altering, normand2018enlarging, xiao2018mrtouch}, and % -wearable haptic devices~\cite{pacchierotti2016hring, lopes2018adding, pezent2019tasbi, teng2021touch}. +grounded force feedback devices~\autocite{bianchi2006high, jeon2009haptic, knorlein2009influence}, % +exoskeletons~\autocite{lee2021wearable}, % +tangible objects~\autocite{hettiarachchi2016annexing, detinguy2018enhancing, salazar2020altering, normand2018enlarging, xiao2018mrtouch}, and % +wearable haptic devices~\autocite{pacchierotti2016hring, lopes2018adding, pezent2019tasbi, teng2021touch}. -Wearable haptics seems particularly suited for this context, as it takes into account many of the AR constraints, \eg limited impact on hand tracking performance and reduced impairment of the senses and ability of the users to interact with real content~\cite{pacchierotti2016hring, maisto2017evaluation, lopes2018adding, meli2018combining, pezent2019tasbi, teng2021touch, kourtesis2022electrotactile, marchal2022virtual}. +Wearable haptics seems particularly suited for this context, as it takes into account many of the AR constraints, \eg limited impact on hand tracking performance and reduced impairment of the senses and ability of the users to interact with real content~\autocite{pacchierotti2016hring, maisto2017evaluation, lopes2018adding, meli2018combining, pezent2019tasbi, teng2021touch, kourtesis2022electrotactile, marchal2022virtual}. % For example, \textcite{pacchierotti2016hring} designed a haptic ring providing pressure and skin stretch sensations to be worn at the proximal finger phalanx, so as to improve the hand tracking during a pick-and-place task. % @@ -67,7 +67,7 @@ For example, \textcite{pacchierotti2016hring} designed a haptic ring providing p % \textcite{teng2021touch} presented Touch\&Fold, a haptic device attached to the nail that provides pressure and texture sensations when interacting with virtual content, but also folds away when the user interacts with real objects, leaving the fingertip free. % -This approach was also perceived as more realistic than providing sensations directly on the nail, as in~\cite{ando2007fingernailmounted}. +This approach was also perceived as more realistic than providing sensations directly on the nail, as in~\autocite{ando2007fingernailmounted}. % Each of these haptic devices provided haptic feedback about fingertip interactions with the virtual content on other parts of the hand. % @@ -81,7 +81,7 @@ Results proved that moving the haptic feedback away from the point(s) of contact % In pick-and-place tasks in AR involving both virtual and real objects, \textcite{maisto2017evaluation} and \textcite{meli2018combining} showed that having a haptic {rendering of the} fingertip interactions with the virtual objects led to better performance and perceived effectiveness than having only a visual rendering of the hand, similar to \figref{hands-tips}. % -Moreover, employing the haptic ring of~\cite{pacchierotti2016hring} on the proximal finger phalanx led to an improved performance with respect to more standard fingertip haptic devices~\cite{chinello2020modular}. 
+Moreover, employing the haptic ring of~\autocite{pacchierotti2016hring} on the proximal finger phalanx led to an improved performance with respect to more standard fingertip haptic devices~\autocite{chinello2020modular}. % However, the measured difference in performance could be attributed to either the device or the device position (proximal vs fingertip), or both. % diff --git a/3-manipulation/visual-hand/1-introduction.tex b/3-manipulation/visual-hand/1-introduction.tex index 663cf3a..68d94e5 100644 --- a/3-manipulation/visual-hand/1-introduction.tex +++ b/3-manipulation/visual-hand/1-introduction.tex @@ -1,5 +1,5 @@ \section{Introduction} -\label{introduction} +\label{sec:introduction} \begin{subfigswide}{hands}{% Experiment \#1. The six considered visual hand renderings, as seen by the user through the AR headset @@ -23,35 +23,35 @@ Augmented reality (AR) integrates virtual content into our real-world surroundings, giving the illusion of one unique environment and promising natural and seamless interactions with real and virtual objects. % -Virtual object manipulation is particularly critical for useful and effective AR usage, such as in medical applications, training, or entertainment~\cite{laviolajr20173d, kim2018revisiting}. +Virtual object manipulation is particularly critical for useful and effective AR usage, such as in medical applications, training, or entertainment~\autocite{laviolajr20173d, kim2018revisiting}. % -Hand tracking technologies~\cite{xiao2018mrtouch}, grasping techniques~\cite{holl2018efficient}, and real-time physics engines permit users to directly manipulate virtual objects with their bare hands as if they were real~\cite{piumsomboon2014graspshell}, without requiring controllers~\cite{krichenbauer2018augmented}, gloves~\cite{prachyabrued2014visual}, or predefined gesture techniques~\cite{piumsomboon2013userdefined, ha2014wearhand}. +Hand tracking technologies~\autocite{xiao2018mrtouch}, grasping techniques~\autocite{holl2018efficient}, and real-time physics engines permit users to directly manipulate virtual objects with their bare hands as if they were real~\autocite{piumsomboon2014graspshell}, without requiring controllers~\autocite{krichenbauer2018augmented}, gloves~\autocite{prachyabrued2014visual}, or predefined gesture techniques~\autocite{piumsomboon2013userdefined, ha2014wearhand}. % -Optical see-through AR (OST-AR) head-mounted displays (HMDs), such as the Microsoft HoloLens 2 or the Magic Leap, are particularly suited for this type of direct hand interaction~\cite{kim2018revisiting}. +Optical see-through AR (OST-AR) head-mounted displays (HMDs), such as the Microsoft HoloLens 2 or the Magic Leap, are particularly suited for this type of direct hand interaction~\autocite{kim2018revisiting}. However, there are still several haptic and visual limitations that affect manipulation in OST-AR, degrading the user experience. % -For example, it is difficult to estimate the position of one's hand in relation to a virtual content because mutual occlusion between the hand and the virtual object is often lacking~\cite{macedo2023occlusion}, the depth of virtual content is underestimated~\cite{diaz2017designing, peillard2019studying}, and hand tracking still has a noticeable latency~\cite{xiao2018mrtouch}. 
+For example, it is difficult to estimate the position of one's hand in relation to virtual content because mutual occlusion between the hand and the virtual object is often lacking~\autocite{macedo2023occlusion}, the depth of virtual content is underestimated~\autocite{diaz2017designing, peillard2019studying}, and hand tracking still has a noticeable latency~\autocite{xiao2018mrtouch}.
%
-Similarly, it is challenging to ensure confident and realistic contact with a virtual object due to the lack of haptic feedback and the intangibility of the virtual environment, which of course cannot apply physical constraints on the hand~\cite{maisto2017evaluation, meli2018combining, lopes2018adding, teng2021touch}.
+Similarly, it is challenging to ensure confident and realistic contact with a virtual object due to the lack of haptic feedback and the intangibility of the virtual environment, which of course cannot apply physical constraints on the hand~\autocite{maisto2017evaluation, meli2018combining, lopes2018adding, teng2021touch}.
%
-These limitations also make it difficult to confidently move a grasped object towards a target~\cite{maisto2017evaluation, meli2018combining}.
+These limitations also make it difficult to confidently move a grasped object towards a target~\autocite{maisto2017evaluation, meli2018combining}.

To address these haptic and visual limitations, we investigate two types of sensory feedback that are known to improve virtual interactions with hands, but have not been studied together in an AR context: visual hand rendering and delocalized haptic rendering.
%
-A few works explored the effect of a visual hand rendering on interactions in AR by simulating mutual occlusion between the real hand and virtual objects~\cite{ha2014wearhand, piumsomboon2014graspshell, al-kalbani2016analysis}, or displaying a 3D virtual hand model, semi-transparent~\cite{ha2014wearhand, piumsomboon2014graspshell} or opaque~\cite{blaga2017usability, yoon2020evaluating, saito2021contact}.
+A few works explored the effect of a visual hand rendering on interactions in AR by simulating mutual occlusion between the real hand and virtual objects~\autocite{ha2014wearhand, piumsomboon2014graspshell, al-kalbani2016analysis}, or displaying a 3D virtual hand model, semi-transparent~\autocite{ha2014wearhand, piumsomboon2014graspshell} or opaque~\autocite{blaga2017usability, yoon2020evaluating, saito2021contact}.
%
-Indeed, some visual hand renderings are known to improve interactions or user experience in virtual reality (VR), where the real hand is not visible~\cite{prachyabrued2014visual, argelaguet2016role, grubert2018effects, schwind2018touch, vanveldhuizen2021effect}.
+Indeed, some visual hand renderings are known to improve interactions or user experience in virtual reality (VR), where the real hand is not visible~\autocite{prachyabrued2014visual, argelaguet2016role, grubert2018effects, schwind2018touch, vanveldhuizen2021effect}.
%
However, the role of a visual hand rendering superimposed and seen above the real tracked hand has not yet been investigated in AR.
%
-Conjointly, several studies have demonstrated that wearable haptics can significantly improve interactions performance and user experience in AR~\cite{maisto2017evaluation, meli2018combining, sarac2022perceived}.
+Conjointly, several studies have demonstrated that wearable haptics can significantly improve interaction performance and user experience in AR~\autocite{maisto2017evaluation, meli2018combining, sarac2022perceived}.
%
-But haptic rendering for AR remains a challenge as it is difficult to provide rich and realistic haptic sensations while limiting their negative impact on hand tracking~\cite{pacchierotti2016hring} and keeping the fingertips and palm free to interact with the real environment~\cite{lopes2018adding, teng2021touch, sarac2022perceived, palmer2022haptic}.
+But haptic rendering for AR remains a challenge as it is difficult to provide rich and realistic haptic sensations while limiting their negative impact on hand tracking~\autocite{pacchierotti2016hring} and keeping the fingertips and palm free to interact with the real environment~\autocite{lopes2018adding, teng2021touch, sarac2022perceived, palmer2022haptic}.
%
Therefore, the haptic feedback of the fingertip contact with the virtual environment needs to be rendered elsewhere on the hand, but it is unclear which positioning should be preferred or which type of haptic feedback is best suited for manipulating virtual objects in AR.
%
-A final question is whether one or the other of these (haptic or visual) hand renderings should be preferred~\cite{maisto2017evaluation, meli2018combining}, or whether a combined visuo-haptic rendering is beneficial for users.
+A final question is whether one or the other of these (haptic or visual) hand renderings should be preferred~\autocite{maisto2017evaluation, meli2018combining}, or whether a combined visuo-haptic rendering is beneficial for users.
%
In fact, both hand renderings can provide sufficient sensory cues for efficient manipulation of virtual objects in AR, or conversely, they can be shown to be complementary.

diff --git a/3-manipulation/visual-hand/2-method.tex b/3-manipulation/visual-hand/2-method.tex
index 0fa8e53..c141148 100644
--- a/3-manipulation/visual-hand/2-method.tex
+++ b/3-manipulation/visual-hand/2-method.tex
@@ -1,79 +1,79 @@
-\section{Experiment \#1: Visual Rendering of the Hand in AR}
-\label{method}
+\section{User Study}
+\label{sec:method}

-\noindent This first experiment aims to analyze whether the chosen visual hand rendering affects the performance and user experience of manipulating virtual objects with bare hands in AR.
+This first experiment aims to analyze whether the chosen visual hand rendering affects the performance and user experience of manipulating virtual objects with bare hands in AR.

\subsection{Visual Hand Renderings}
-\label{hands}
+\label{sec:hands}

-We compared a set of the most popular visual hand renderings, as also presented in \secref{2_hands}.
+We compared a set of the most popular visual hand renderings.%, as also presented in \secref{hands}.
%
Since we address hand-centered manipulation tasks, we only considered renderings including the fingertips.
%
-Moreover, as to keep the focus on the hand rendering itself, we used neutral semi-transparent grey meshes, consistent with the choices made in~\cite{yoon2020evaluating, vanveldhuizen2021effect}.
+Moreover, so as to keep the focus on the hand rendering itself, we used neutral semi-transparent grey meshes, consistent with the choices made in~\autocite{yoon2020evaluating, vanveldhuizen2021effect}.
%
All considered hand renderings are drawn following the tracked pose of the user's real hand.
%
However, while the real hand can of course penetrate virtual objects, the visual hand is always constrained by the virtual environment.
-\subsubsection{None~(\figref{hands-none})} -\label{hands_none} +\subsubsection{None~(\figref{method/hands-none})} +\label{sec:hands_none} -As a reference, we considered no visual hand rendering, as is common in AR~\cite{hettiarachchi2016annexing, blaga2017usability, xiao2018mrtouch, teng2021touch}. +As a reference, we considered no visual hand rendering, as is common in AR~\autocite{hettiarachchi2016annexing, blaga2017usability, xiao2018mrtouch, teng2021touch}. % Users have no information about hand tracking and no feedback about contact with the virtual objects, other than their movement when touched. % -As virtual content is rendered on top of the real environment, the hand of the user can be hidden by the virtual objects when manipulating them (see \secref{2_hands}). +As virtual content is rendered on top of the real environment, the hand of the user can be hidden by the virtual objects when manipulating them (see \secref{hands}). -\subsubsection{Occlusion (Occl,~\figref{hands-occlusion})} -\label{hands_occlusion} +\subsubsection{Occlusion (Occl,~\figref{method/hands-occlusion})} +\label{sec:hands_occlusion} -To avoid the abovementioned undesired occlusions due to the virtual content being rendered on top of the real environment, we can carefully crop the former whenever it hides real content that should be visible~\cite{macedo2023occlusion}, \eg the thumb of the user in \figref{hands-occlusion}. +To avoid the abovementioned undesired occlusions due to the virtual content being rendered on top of the real environment, we can carefully crop the former whenever it hides real content that should be visible~\autocite{macedo2023occlusion}, \eg the thumb of the user in \figref{method/hands-occlusion}. % -This approach is frequent in works using VST-AR headsets~\cite{knorlein2009influence, ha2014wearhand, piumsomboon2014graspshell, suzuki2014grasping, al-kalbani2016analysis} . +This approach is frequent in works using VST-AR headsets~\autocite{knorlein2009influence, ha2014wearhand, piumsomboon2014graspshell, suzuki2014grasping, al-kalbani2016analysis} . -\subsubsection{Tips (\figref{hands-tips})} -\label{hands_tips} +\subsubsection{Tips (\figref{method/hands-tips})} +\label{sec:hands_tips} This rendering shows small visual rings around the fingertips of the user, highlighting the most important parts of the hand and contact with virtual objects during fine manipulation. % -Unlike work using small spheres~\cite{maisto2017evaluation, meli2014wearable, grubert2018effects, normand2018enlarging, schwind2018touch}, this ring rendering also provides information about the orientation of the fingertips. +Unlike work using small spheres~\autocite{maisto2017evaluation, meli2014wearable, grubert2018effects, normand2018enlarging, schwind2018touch}, this ring rendering also provides information about the orientation of the fingertips. -\subsubsection{Contour (Cont,~\figref{hands-contour})} -\label{hands_contour} +\subsubsection{Contour (Cont,~\figref{method/hands-contour})} +\label{sec:hands_contour} This rendering is a {1-mm-thick} outline contouring the user's hands, providing information about the whole hand while leaving its inside visible. % -Unlike the other renderings, it is not occluded by the virtual objects, as shown in \figref{hands-contour}. +Unlike the other renderings, it is not occluded by the virtual objects, as shown in \figref{method/hands-contour}. % -This rendering is not as usual as the previous others in the literature~\cite{kang2020comparative}. 
+This rendering is less common in the literature than the previous ones~\autocite{kang2020comparative}.

-\subsubsection{Skeleton (Skel,~\figref{hands-skeleton})}
-\label{hands_skeleton}
+\subsubsection{Skeleton (Skel,~\figref{method/hands-skeleton})}
+\label{sec:hands_skeleton}

This rendering schematically depicts the joints and phalanges of the fingers with small spheres and cylinders, respectively, leaving the outside of the hand visible.
%
It can be seen as an extension of the Tips rendering to include the complete finger articulations.
%
-It is widely used in VR~\cite{argelaguet2016role, schwind2018touch, chessa2019grasping} and AR~\cite{blaga2017usability, yoon2020evaluating}, as it is considered simple yet rich and comprehensive.
+It is widely used in VR~\autocite{argelaguet2016role, schwind2018touch, chessa2019grasping} and AR~\autocite{blaga2017usability, yoon2020evaluating}, as it is considered simple yet rich and comprehensive.

-\subsubsection{Mesh (\figref{hands-mesh})}
-\label{hands_mesh}
+\subsubsection{Mesh (\figref{method/hands-mesh})}
+\label{sec:hands_mesh}

-This rendering is a 3D semi-transparent ($a=0.2$) hand model, which is common in VR~\cite{prachyabrued2014visual, argelaguet2016role, schwind2018touch, chessa2019grasping, yoon2020evaluating, vanveldhuizen2021effect}.
+This rendering is a 3D semi-transparent ($a=0.2$) hand model, which is common in VR~\autocite{prachyabrued2014visual, argelaguet2016role, schwind2018touch, chessa2019grasping, yoon2020evaluating, vanveldhuizen2021effect}.
%
It can be seen as a filled version of the Contour hand rendering, thus partially covering the view of the real hand.

\subsection{Manipulation Tasks and Virtual Scene}
-\label{tasks}
+\label{sec:tasks}

\begin{subfigs}{tasks}{%
  Experiment \#1. The two manipulation tasks:
  %
@@ -87,11 +87,11 @@ It can be seen as a filled version of the Contour hand rendering, thus partially
  \subfig[0.23]{method/task-grasp}[Grasp task]
\end{subfigs}

-Following the guidelines of \textcite{bergstrom2021how} for designing object manipulation tasks, we considered two variations of a 3D pick-and-place task, commonly found in interaction and manipulation studies~\cite{prachyabrued2014visual, maisto2017evaluation, meli2018combining, blaga2017usability, vanveldhuizen2021effect}.
+Following the guidelines of \textcite{bergstrom2021how} for designing object manipulation tasks, we considered two variations of a 3D pick-and-place task, commonly found in interaction and manipulation studies~\autocite{prachyabrued2014visual, maisto2017evaluation, meli2018combining, blaga2017usability, vanveldhuizen2021effect}.

\subsubsection{Push Task}
-\label{push-task}
+\label{sec:push-task}

The first manipulation task consists in pushing a virtual object along a real flat surface towards a target placed on the same plane (see \figref{method/task-push}).
%
@@ -109,7 +109,7 @@ The task is considered completed when the cube is \emph{fully} inside the target

\subsubsection{Grasp Task}
-\label{grasp-task}
+\label{sec:grasp-task}

The second manipulation task consists in grasping, lifting, and placing a virtual object in a target placed on a different (higher) plane (see \figref{method/task-grasp}).
%
@@ -121,7 +121,7 @@ As before, the task is considered completed when the cube is \emph{fully} inside

\subsection{Experimental Design}
-\label{design}
+\label{sec:design}

We analyzed the two tasks separately.
For each of them, we considered two independent, within-subject, variables: % @@ -139,7 +139,7 @@ This design led to a total of 2 manipulation tasks \x 6 visual hand renderings \ \subsection{Apparatus and Implementation} -\label{apparatus} +\label{sec:apparatus} We used the OST-AR headset HoloLens~2. % @@ -153,7 +153,7 @@ The compiled application ran directly on the HoloLens~2 at \qty{60}{FPS}. The default 3D hand model from MRTK was used for all visual hand renderings. % -By changing the material properties of this hand model, we were able to achieve the six renderings shown in \figref{hands}. +By changing the material properties of this hand model, we were able to achieve the six renderings shown in \figref{method/hands}. % A calibration was performed for every participant, so as to best adapt the size of the visual hand rendering to their real hand. % @@ -173,7 +173,7 @@ This setup enabled a good and consistent tracking of the user's fingers. \subsection{Protocol} -\label{protocol} +\label{sec:protocol} First, participants were given a consent form that briefed them about the tasks and the protocol of the experiment. % @@ -183,13 +183,13 @@ During this training, we did not use any of the six hand renderings we want to t Participants were asked to carry out the two tasks as naturally and as fast as possible. % -Similarly to~\cite{prachyabrued2014visual, maisto2017evaluation, blaga2017usability, vanveldhuizen2021effect}, we only allowed the use of the dominant hand. +Similarly to~\autocite{prachyabrued2014visual, maisto2017evaluation, blaga2017usability, vanveldhuizen2021effect}, we only allowed the use of the dominant hand. % The experiment took around 1 hour and 20 minutes to complete. \subsection{Participants} -\label{participants} +\label{sec:participants} Twenty-four subjects participated in the study (eight aged between 18 and 24, fourteen aged between 25 and 34, and two aged between 35 and 44; 22~males, 1~female, 1~preferred not to say). % @@ -205,7 +205,7 @@ Participants signed an informed consent, including the declaration of having no \subsection{Collected Data} -\label{metrics} +\label{sec:metrics} Inspired by \textcite{laviolajr20173d}, we collected the following metrics during the experiment. % @@ -217,7 +217,7 @@ Finally, (iii) the mean \emph{Time per Contact}, defined as the total time any p % Solely for the grasp-and-place task, we also measured the (iv) \emph{Grip Aperture}, defined as the average distance between the thumb's fingertip and the other fingertips during the grasping of the cube; % -lower values indicate a greater finger interpenetration with the cube, resulting in a greater discrepancy between the real hand and the visual hand rendering constrained to the cube surfaces and showing how confident users are in their grasp~\cite{prachyabrued2014visual, al-kalbani2016analysis, blaga2017usability, chessa2019grasping}. +lower values indicate a greater finger interpenetration with the cube, resulting in a greater discrepancy between the real hand and the visual hand rendering constrained to the cube surfaces and showing how confident users are in their grasp~\autocite{prachyabrued2014visual, al-kalbani2016analysis, blaga2017usability, chessa2019grasping}. % Taken together, these measures provide an overview of the performance and usability of each of the visual hand renderings tested, as we hypothesized that they should influence the behavior and effectiveness of the participants. 
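As an illustration of the Grip Aperture definition above, one possible formalization is sketched below; the thumb-tip position $\mathbf{p}_0(t)$, the other fingertip positions $\mathbf{p}_i(t)$, and the number of grasp frames $F$ are notation introduced here for clarity and are not taken from the original text:

% Possible formalization of the Grip Aperture metric (assumed notation).
\begin{equation*}
  \mathrm{GripAperture}
  = \frac{1}{F} \sum_{t=1}^{F} \frac{1}{4} \sum_{i=1}^{4}
    \bigl\lVert \mathbf{p}_0(t) - \mathbf{p}_i(t) \bigr\rVert .
\end{equation*}

Under this reading, values below the cube size indicate that the tracked fingers interpenetrate the virtual cube, matching the interpretation of lower values given above.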
diff --git a/3-manipulation/visual-hand/3-1-push.tex b/3-manipulation/visual-hand/3-1-push.tex index 19e5e2c..8472ec1 100644 --- a/3-manipulation/visual-hand/3-1-push.tex +++ b/3-manipulation/visual-hand/3-1-push.tex @@ -1,14 +1,14 @@ \subsubsection{Push Task} -\label{3_push} +\label{sec:push} \subsubsubsection{Completion Time} -\label{3_push_tct} +\label{sec:push_tct} On the time to complete a trial, there were two statistically significant effects: % -Hand (\anova{5}{2868}{24.8}, \p[<]{0.001}, see \figref{3-Push-ContactsCount-Hand-Overall-Means}) % -and Target (\anova{7}{2868}{5.9}, \p[<]{0.001}). +Hand (\anova{5}{2868}{24.8}, \pinf{0.001}, see \figref{results/Push-ContactsCount-Hand-Overall-Means}) % +and Target (\anova{7}{2868}{5.9}, \pinf{0.001}). % -Skeleton was the fastest, more than None (\qty{+18}{\%}, \p{0.005}), Occlusion (\qty{+26}{\%}, \p[<]{0.001}), Tips (\qty{+22}{\%}, \p[<]{0.001}), and Contour (\qty{+20}{\%}, \p{0.001}). +Skeleton was the fastest, more than None (\qty{+18}{\%}, \p{0.005}), Occlusion (\qty{+26}{\%}, \pinf{0.001}), Tips (\qty{+22}{\%}, \pinf{0.001}), and Contour (\qty{+20}{\%}, \p{0.001}). % Three groups of targets volumes were identified: % @@ -20,31 +20,31 @@ and (3) back N and NW targets were the slowest (\p{0.04}). \subsubsubsection{Contacts} -\label{3_push_contacts_count} +\label{sec:push_contacts_count} On the number of contacts, there were two statistically significant effects: % -Hand (\anova{5}{2868}{6.7}, \p[<]{0.001}, see \figref{3-Push-ContactsCount-Hand-Overall-Means}) % -and Target (\anova{7}{2868}{27.8}, \p[<]{0.001}). +Hand (\anova{5}{2868}{6.7}, \pinf{0.001}, see \figref{results/Push-ContactsCount-Hand-Overall-Means}) % +and Target (\anova{7}{2868}{27.8}, \pinf{0.001}). % -\figref{3-Push-ContactsCount-Hand-Overall-Means} shows the Contacts for each Hand. +\figref{results/Push-ContactsCount-Hand-Overall-Means} shows the Contacts for each Hand. % -Less contacts were made with Skeleton than with None (\qty{-23}{\%}, \p[<]{0.001}), Occlusion (\qty{-26}{\%}, \p[<]{0.001}), Tips (\qty{-18}{\%}, \p{0.004}), and Contour (\qty{-15}{\%}, \p{0.02}); +Less contacts were made with Skeleton than with None (\qty{-23}{\%}, \pinf{0.001}), Occlusion (\qty{-26}{\%}, \pinf{0.001}), Tips (\qty{-18}{\%}, \p{0.004}), and Contour (\qty{-15}{\%}, \p{0.02}); % and less with Mesh than with Occlusion (\qty{-14}{\%}, \p{0.04}). % This indicates how effective a visual hand rendering is: a lower result indicates a smoother ability to push and rotate properly the cube into the target, as one would probably do with a real cube. % -Targets on the left (W) and the right (E, SW) were easier to reach than the back ones (N, NW, \p[<]{0.001}). +Targets on the left (W) and the right (E, SW) were easier to reach than the back ones (N, NW, \pinf{0.001}). \subsubsubsection{Time per Contact} -\label{3_push_time_per_contact} +\label{sec:push_time_per_contact} On the mean time spent on each contact, there were two statistically significant effects: % -Hand (\anova{5}{2868}{8.4}, \p[<]{0.001}, see \figref{3-Push-MeanContactTime-Hand-Overall-Means}) % -and Target (\anova{7}{2868}{19.4}, \p[<]{0.001}). +Hand (\anova{5}{2868}{8.4}, \pinf{0.001}, see \figref{results/Push-MeanContactTime-Hand-Overall-Means}) % +and Target (\anova{7}{2868}{19.4}, \pinf{0.001}). 
% -It was shorter with None than with Skeleton (\qty{-10}{\%}, \p[<]{0.001}) and Mesh (\qty{-8}{\%}, \p{0.03}); +It was shorter with None than with Skeleton (\qty{-10}{\%}, \pinf{0.001}) and Mesh (\qty{-8}{\%}, \p{0.03}); % and shorter with Occlusion than with Tips (\qty{-10}{\%}, \p{0.002}), Contour (\qty{-10}{\%}, \p{0.001}), Skeleton (\qty{-14}{\%}, \p{0.001}), and Mesh (\qty{-12}{\%}, \p{0.03}). % diff --git a/3-manipulation/visual-hand/3-2-grasp.tex b/3-manipulation/visual-hand/3-2-grasp.tex index a11549d..44d530e 100644 --- a/3-manipulation/visual-hand/3-2-grasp.tex +++ b/3-manipulation/visual-hand/3-2-grasp.tex @@ -1,22 +1,22 @@ \subsubsection{Grasp Task} -\label{3_grasp} +\label{sec:grasp} \subsubsubsection{Completion Time} -\label{3_grasp_tct} +\label{sec:grasp_tct} On the time to complete a trial, there was one statistically significant effect % -of Target (\anova{7}{2868}{37.2}, \p[<]{0.001}) % -but not of Hand (\anova{5}{2868}{1.8}, \p{0.1}, see \figref{3-Grasp-CompletionTime-Hand-Overall-Means}). +of Target (\anova{7}{2868}{37.2}, \pinf{0.001}) % +but not of Hand (\anova{5}{2868}{1.8}, \p{0.1}, see \figref{results/Grasp-CompletionTime-Hand-Overall-Means}). % Targets on the back and the left (N, NW, and W) were slower than targets on the front (SW, S, and SE, \p{0.003}) {except for} NE (back-right) which was also fast. \subsubsubsection{Contacts} -\label{3_grasp_contacts_count} +\label{sec:grasp_contacts_count} On the number of contacts, there were two statistically significant effects: % -Hand (\anova{5}{2868}{5.2}, \p[<]{0.001}, see \figref{3-Grasp-ContactsCount-Hand-Overall-Means}) % -and Target (\anova{7}{2868}{21.2}, \p[<]{0.001}). +Hand (\anova{5}{2868}{5.2}, \pinf{0.001}, see \figref{results/Grasp-ContactsCount-Hand-Overall-Means}) % +and Target (\anova{7}{2868}{21.2}, \pinf{0.001}). % Less contacts were made with Tips than with None (\qty{-13}{\%}, \p{0.02}) and Occlusion (\qty{-15}{\%}, \p{0.004}); % @@ -26,42 +26,42 @@ This result suggests that having no visible visual hand increased the number of % But, surprisingly, only Tips and Mesh were statistically significantly better, not Contour nor Skeleton. % -Targets on the back and left were more difficult (N, NW, and W) than targets on the front (SW, S, and SE, \p[<]{0.001}). +Targets on the back and left were more difficult (N, NW, and W) than targets on the front (SW, S, and SE, \pinf{0.001}). \subsubsubsection{Time per Contact} -\label{3_grasp_time_per_contact} +\label{sec:grasp_time_per_contact} On the mean time spent on each contact, there were two statistically significant effects: % -Hand (\anova{5}{2868}{9.6}, \p[<]{0.001}, see \figref{3-Grasp-MeanContactTime-Hand-Overall-Means}) % -and Target (\anova{7}{2868}{5.6}, \p[<]{0.001}). +Hand (\anova{5}{2868}{9.6}, \pinf{0.001}, see \figref{results/Grasp-MeanContactTime-Hand-Overall-Means}) % +and Target (\anova{7}{2868}{5.6}, \pinf{0.001}). 
% -It was shorter with None than with Tips (\qty{-15}{\%}, \p[<]{0.001}), Skeleton (\qty{-11}{\%}, \p{0.001}) and Mesh (\qty{-11}{\%}, \p{0.001}); +It was shorter with None than with Tips (\qty{-15}{\%}, \pinf{0.001}), Skeleton (\qty{-11}{\%}, \p{0.001}) and Mesh (\qty{-11}{\%}, \p{0.001}); % -shorter with Occlusion than with Tips (\qty{-10}{\%}, \p[<]{0.001}), Skeleton (\qty{-8}{\%}, \p{0.05}), and Mesh (\qty{-8}{\%}, \p{0.04}); +shorter with Occlusion than with Tips (\qty{-10}{\%}, \pinf{0.001}), Skeleton (\qty{-8}{\%}, \p{0.05}), and Mesh (\qty{-8}{\%}, \p{0.04}); % -shorter with Contour than with Tips (\qty{-8}{\%}, \p[<]{0.001}). +shorter with Contour than with Tips (\qty{-8}{\%}, \pinf{0.001}). % As for the Push task, the lack of visual hand increased the number of failed grasps or cube drops. % The Tips rendering seemed to provide one of the best feedback for the grasping, maybe thanks to the fact that it provides information about both position and rotation of the tracked fingertips. % -This time was the shortest on the front S than on the other target volumes (\p[<]{0.001}). +This time was the shortest on the front S than on the other target volumes (\pinf{0.001}). \subsubsubsection{Grip Aperture} -\label{3_grasp_grip_aperture} +\label{sec:grasp_grip_aperture} On the average distance between the thumb's fingertip and the other fingertips during grasping, there were two statistically significant effects: % -Hand (\anova{5}{2868}{35.8}, \p[<]{0.001}, see \figref{3-Grasp-GripAperture-Hand-Overall-Means}) % -and Target (\anova{7}{2868}{3.7}, \p[<]{0.001}). +Hand (\anova{5}{2868}{35.8}, \pinf{0.001}, see \figref{results/Grasp-GripAperture-Hand-Overall-Means}) % +and Target (\anova{7}{2868}{3.7}, \pinf{0.001}). % -It was shorter with None than with Occlusion (\p[<]{0.001}), Tips (\p[<]{0.001}), Contour (\p[<]{0.001}), Skeleton (\p[<]{0.001}) and Mesh (\p[<]{0.001}); +It was shorter with None than with Occlusion (\pinf{0.001}), Tips (\pinf{0.001}), Contour (\pinf{0.001}), Skeleton (\pinf{0.001}) and Mesh (\pinf{0.001}); % -shorter with Tips than with Occlusion (\p{0.008}), Contour (\p{0.006}) and Mesh (\p[<]{0.001}); +shorter with Tips than with Occlusion (\p{0.008}), Contour (\p{0.006}) and Mesh (\pinf{0.001}); % -and shorter with Skeleton than with Mesh (\p[<]{0.001}). +and shorter with Skeleton than with Mesh (\pinf{0.001}). % This result is an evidence of the lack of confidence of participants with no visual hand rendering: they grasped the cube more to secure it. % diff --git a/3-manipulation/visual-hand/3-3-ranks.tex b/3-manipulation/visual-hand/3-3-ranks.tex index 8332d91..d83260c 100644 --- a/3-manipulation/visual-hand/3-3-ranks.tex +++ b/3-manipulation/visual-hand/3-3-ranks.tex @@ -1,22 +1,22 @@ \subsubsection{Ranking} -\label{3_ranks} +\label{sec:ranks} -\begin{subfigs}{3_ranks}{% +\begin{subfigs}{ranks}{% Experiment \#1. Boxplots of the ranking (lower is better) of each visual hand rendering % and pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment: % - ** is \p[<]{0.01} and * is \p[<]{0.05}. + ** is \pinf{0.01} and * is \pinf{0.05}. } - \subfig[0.24]{3-Ranks-Push}[Push Task] - \subfig[0.24]{3-Ranks-Grasp}[Grasp Task] + \subfig[0.24]{results/Ranks-Push}[Push Task] + \subfig[0.24]{results/Ranks-Grasp}[Grasp Task] \end{subfigs} -\figref{3_ranks} shows the ranking of each visual hand rendering for the Push and Grasp tasks. +\figref{ranks} shows the ranking of each visual hand rendering for the Push and Grasp tasks. 
%
-Friedman tests indicated that both ranking had statistically significant differences (\p[<]{0.001}).
+Friedman tests indicated that both rankings had statistically significant differences (\pinf{0.001}).
%
-Pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment were then used on both ranking results (see \secref{3_metrics}):
+Pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment were then used on both ranking results (see \secref{metrics}):
\begin{itemize}
  \item \textit{Push Ranking}: Occlusion was ranked lower than Contour (\p{0.005}), Skeleton (\p{0.02}), and Mesh (\p{0.03});
diff --git a/3-manipulation/visual-hand/3-4-questions.tex b/3-manipulation/visual-hand/3-4-questions.tex
index 258003a..20364c8 100644
--- a/3-manipulation/visual-hand/3-4-questions.tex
+++ b/3-manipulation/visual-hand/3-4-questions.tex
@@ -1,25 +1,25 @@
 \subsubsection{Questionnaire}
-\label{3_questions}
+\label{sec:questions}

-\begin{subfigswide}{3_questions}{%
+\begin{subfigswide}{questions}{%
  Experiment \#1. Boxplots of the questionnaire results of each visual hand rendering
  %
-  and pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment: ** is \p[<]{0.01} and * is \p[<]{0.05}.
+  and pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment: ** is \pinf{0.01} and * is \pinf{0.05}.
  %
  Lower is better for Difficulty and Fatigue. Higher is better for Precision, Efficiency, and Rating.
}
-  \subfig[0.19]{3-Question-Difficulty}
-  \subfig[0.19]{3-Question-Fatigue}
-  \subfig[0.19]{3-Question-Precision}
-  \subfig[0.19]{3-Question-Efficiency}
-  \subfig[0.19]{3-Question-Rating}
+  \subfig[0.19]{results/Question-Difficulty}
+  \subfig[0.19]{results/Question-Fatigue}
+  \subfig[0.19]{results/Question-Precision}
+  \subfig[0.19]{results/Question-Efficiency}
+  \subfig[0.19]{results/Question-Rating}
\end{subfigswide}

-\figref{3_questions} presents the questionnaire results for each visual hand rendering.
+\figref{questions} presents the questionnaire results for each visual hand rendering.
%
-Friedman tests indicated that all questions had statistically significant differences (\p[<]{0.001}).
+Friedman tests indicated that all questions had statistically significant differences (\pinf{0.001}).
%
-Pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment were then used each question results (see \secref{3_metrics}):
+Pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment were then used on each question's results (see \secref{metrics}):
\begin{itemize}
  \item \textit{Difficulty}: Occlusion was considered more difficult than Contour (\p{0.02}), Skeleton (\p{0.01}), and Mesh (\p{0.03}).
diff --git a/3-manipulation/visual-hand/3-results.tex b/3-manipulation/visual-hand/3-results.tex
index 28b7d6a..2cc8558 100644
--- a/3-manipulation/visual-hand/3-results.tex
+++ b/3-manipulation/visual-hand/3-results.tex
@@ -1,30 +1,30 @@
-\subsection{Results}
-\label{3_results}
+\section{Results}
+\label{sec:results}

-\begin{subfigs}{3_push_results}{%
+\begin{subfigs}{push_results}{%
  Experiment \#1: Push task.
  %
  Geometric means with bootstrap 95~\% confidence interval for each visual hand rendering
  %
-  and Tukey's HSD pairwise comparisons: *** is \p[<]{0.001}, ** is \p[<]{0.01}, and * is \p[<]{0.05}.
+  and Tukey's HSD pairwise comparisons: *** is \pinf{0.001}, ** is \pinf{0.01}, and * is \pinf{0.05}.
}
-  \subfig[0.24]{3-Push-CompletionTime-Hand-Overall-Means}[Time to complete a trial.]
-  \subfig[0.24]{3-Push-ContactsCount-Hand-Overall-Means}[Number of contacts with the cube.]
+ \subfig[0.24]{results/Push-CompletionTime-Hand-Overall-Means}[Time to complete a trial.] + \subfig[0.24]{results/Push-ContactsCount-Hand-Overall-Means}[Number of contacts with the cube.] \hspace*{10mm} - \subfig[0.24]{3-Push-MeanContactTime-Hand-Overall-Means}[Mean time spent on each contact.] + \subfig[0.24]{results/Push-MeanContactTime-Hand-Overall-Means}[Mean time spent on each contact.] \end{subfigs} -\begin{subfigswide}{3_grasp_results}{% +\begin{subfigswide}{grasp_results}{% Experiment \#1: Grasp task. % Geometric means with bootstrap 95~\% confidence interval for each visual hand rendering % - and Tukey's HSD pairwise comparisons: *** is \p[<]{0.001}, ** is \p[<]{0.01}, and * is \p[<]{0.05}. + and Tukey's HSD pairwise comparisons: *** is \pinf{0.001}, ** is \pinf{0.01}, and * is \pinf{0.05}. } - \subfig[0.24]{3-Grasp-CompletionTime-Hand-Overall-Means}[Time to complete a trial.] - \subfig[0.24]{3-Grasp-ContactsCount-Hand-Overall-Means}[Number of contacts with the cube.] - \subfig[0.24]{3-Grasp-MeanContactTime-Hand-Overall-Means}[Mean time spent on each contact.] - \subfig[0.24]{3-Grasp-GripAperture-Hand-Overall-Means}[\centering Distance between thumb and the other fingertips when grasping.] + \subfig[0.24]{results/Grasp-CompletionTime-Hand-Overall-Means}[Time to complete a trial.] + \subfig[0.24]{results/Grasp-ContactsCount-Hand-Overall-Means}[Number of contacts with the cube.] + \subfig[0.24]{results/Grasp-MeanContactTime-Hand-Overall-Means}[Mean time spent on each contact.] + \subfig[0.24]{results/Grasp-GripAperture-Hand-Overall-Means}[\centering Distance between thumb and the other fingertips when grasping.] \end{subfigswide} Results of each trials measure were analyzed with a linear mixed model (LMM), with the order of the two manipulation tasks and the six visual hand renderings (Order), the visual hand renderings (Hand), the target volume position (Target), and their interactions as fixed effects and the Participant as random intercept. @@ -39,8 +39,3 @@ Because Completion Time, Contacts, and Time per Contact measure results were Gam transformed with a log to approximate a normal distribution. % Their analysis results are reported anti-logged, corresponding to geometric means of the measures. - -\input{content/3_2_1_push} -\input{content/3_2_2_grasp} -\input{content/3_2_3_ranks} -\input{content/3_2_4_questions} diff --git a/3-manipulation/visual-hand/4-discussion.tex b/3-manipulation/visual-hand/4-discussion.tex index 96c1014..2b5d60d 100644 --- a/3-manipulation/visual-hand/4-discussion.tex +++ b/3-manipulation/visual-hand/4-discussion.tex @@ -1,21 +1,21 @@ -\subsection{Discussion} -\label{3_discussion} +\section{Discussion} +\label{sec:discussion} -We evaluated six visual hand renderings, as described in \secref{3_hands}, displayed on top of the real hand, in two virtual object manipulation tasks in AR. +We evaluated six visual hand renderings, as described in \secref{hands}, displayed on top of the real hand, in two virtual object manipulation tasks in AR. -During the Push task, the Skeleton hand rendering was the fastest (see \figref{3-Push-CompletionTime-Hand-Overall-Means}), as participants employed fewer and longer contacts to adjust the cube inside the target volume (see \figref{3-Push-ContactsCount-Hand-Overall-Means} and \figref{3-Push-MeanContactTime-Hand-Overall-Means}). 
+During the Push task, the Skeleton hand rendering was the fastest (see \figref{results/Push-CompletionTime-Hand-Overall-Means}), as participants employed fewer and longer contacts to adjust the cube inside the target volume (see \figref{results/Push-ContactsCount-Hand-Overall-Means} and \figref{results/Push-MeanContactTime-Hand-Overall-Means}). % Participants consistently used few and continuous contacts for all visual hand renderings (see Fig. 3b), with only less than ten trials, carried out by two participants, quickly completed with multiple discrete touches. % -However, during the Grasp task, despite no difference in completion time, providing no visible hand rendering (None and Occlusion renderings) led to more failed grasps or cube drops (see \figref{3-Grasp-CompletionTime-Hand-Overall-Means} and \figref{3-Grasp-MeanContactTime-Hand-Overall-Means}). +However, during the Grasp task, despite no difference in completion time, providing no visible hand rendering (None and Occlusion renderings) led to more failed grasps or cube drops (see \figref{results/Grasp-CompletionTime-Hand-Overall-Means} and \figref{results/Grasp-MeanContactTime-Hand-Overall-Means}). % -Indeed, participants found the None and Occlusion renderings less effective (see \figref{3-Ranks-Grasp}) and less precise (see \figref{3_questions}). +Indeed, participants found the None and Occlusion renderings less effective (see \figref{results/Ranks-Grasp}) and less precise (see \figref{questions}). % -To understand whether the participants' previous experience might have played a role, we also carried out an additional statistical analysis considering VR experience as an additional between-subjects factor, \ie VR novices vs. VR experts (\enquote{I use it every week}, see \secref{3_participants}). +To understand whether the participants' previous experience might have played a role, we also carried out an additional statistical analysis considering VR experience as an additional between-subjects factor, \ie VR novices vs. VR experts (\enquote{I use it every week}, see \secref{participants}). % We found no statistically significant differences when comparing the considered metrics between VR novices and experts. -Interestingly, all visual hand renderings showed grip apertures very close to the size of the virtual cube, except for the None rendering (see \figref{3-Grasp-GripAperture-Hand-Overall-Means}), with which participants applied stronger grasps, \ie less distance between the fingertips. +Interestingly, all visual hand renderings showed grip apertures very close to the size of the virtual cube, except for the None rendering (see \figref{results/Grasp-GripAperture-Hand-Overall-Means}), with which participants applied stronger grasps, \ie less distance between the fingertips. % Having no visual hand rendering, but only the reaction of the cube to the interaction as feedback, made participants less confident in their grip. % @@ -23,7 +23,7 @@ This result contrasts with the wrongly estimated grip apertures observed by \tex % Also, while some participants found the absence of visual hand rendering more natural, many of them commented on the importance of having feedback on the tracking of their hands, as observed by \textcite{xiao2018mrtouch} in a similar immersive OST-AR setup. 
-Yet, participants' opinions of the visual hand renderings were mixed on many questions, except for the Occlusion one, which was perceived less effective than more \enquote{complete} visual hands such as Contour, Skeleton, and Mesh hands (see \figref{3_questions}).
+Yet, participants' opinions of the visual hand renderings were mixed on many questions, except for the Occlusion one, which was perceived as less effective than more \enquote{complete} visual hands such as Contour, Skeleton, and Mesh (see \figref{questions}).
%
However, due to the latency of the hand tracking and the visual hand reacting to the cube, almost all participants thought that the Occlusion rendering was a \enquote{shadow} of the real hand on the cube.

@@ -37,7 +37,7 @@ This result are consistent with \textcite{saito2021contact}, who found that disp
To summarize, when employing a visual hand rendering overlaying the real hand, participants were more performant and confident in manipulating virtual objects with bare hands in AR.
%
-These results contrast with similar manipulation studies, but in non-immersive, on-screen AR, where the presence of a visual hand rendering was found by participants to improve the usability of the interaction, but not their performance~\cite{blaga2017usability,maisto2017evaluation,meli2018combining}.
+These results contrast with similar manipulation studies in non-immersive, on-screen AR, where the presence of a visual hand rendering was found by participants to improve the usability of the interaction, but not their performance~\autocite{blaga2017usability,maisto2017evaluation,meli2018combining}.
%
Our results show the most effective visual hand rendering to be the Skeleton one{. Participants appreciated that} it provided a detailed and precise view of the tracking of the real hand{, without} hiding or masking it.
%
Although the Contour and Mesh hand renderings were also highly rated, some parti
%
This result is in line with the results of virtual object manipulation in VR of \textcite{prachyabrued2014visual}, who found that the most effective visual hand rendering was a double representation of both the real tracked hand and a visual hand physically constrained by the virtual environment.
%
-This type of Skeleton rendering was also the one that provided the best sense of agency (control) in VR~\cite{argelaguet2016role, schwind2018touch}.
+This type of Skeleton rendering was also the one that provided the best sense of agency (control) in VR~\autocite{argelaguet2016role, schwind2018touch}.

These results have of course some limitations as they only address limited types of manipulation tasks and visual hand characteristics, evaluated in a specific OST-AR setup.
%
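For reference, the per-trial analysis described in the Results section (a linear mixed model on log-transformed responses, with Order, Hand, Target, and their interactions as fixed effects and Participant as a random intercept, followed by pairwise comparisons) could be reproduced along the following lines. This is a minimal sketch only: the chapter does not state which statistical software was used, and the file and column names below are assumptions.

# Hypothetical re-implementation of the per-trial analysis; all names are illustrative.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.multicomp import pairwise_tukeyhsd

trials = pd.read_csv("trials.csv")  # assumed: one row per trial

# Log-transform the skewed (Gamma-like) response so the LMM residuals are closer to normal.
trials["LogTime"] = np.log(trials["CompletionTime"])

# Fixed effects: Order, Hand, Target and their interactions; random intercept per Participant.
lmm = smf.mixedlm("LogTime ~ Order * Hand * Target",
                  data=trials, groups=trials["Participant"]).fit()
print(lmm.summary())

# Indicative pairwise comparison of the renderings on the log response.
# (The chapter applies Tukey's HSD to the fitted model; this simpler call ignores the random effect.)
print(pairwise_tukeyhsd(trials["LogTime"], trials["Hand"]))

# Anti-logging estimated means with np.exp() yields the geometric means reported in the figures.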