Merge branch 'main' of https://gitlab.inria.fr/whar/projects/thesis

2024-09-17 17:32:22 +02:00
parent 4f849e2038 24c91cf49b
commit e950a1d1b3
16 changed files with 600 additions and 383 deletions
--- a/1-introduction/related-work/3-augmented-reality.tex
+++ b/1-introduction/related-work/3-augmented-reality.tex
@@ -1,23 +1,24 @@
 \section{Principles and Capabilities of AR}
 \label{augmented_reality}

-The first \AR headset was invented by \textcite{sutherland1968headmounted}: With the technology available at the time, it was already capable of displaying virtual objects at a fixed point in space in real time, giving the user the illusion that the content was present in the room (\figref{sutherland1968headmounted}).
+The first \AR headset was invented by \textcite{sutherland1968headmounted}: With the technology available at the time, it was already capable of displaying \VOs at a fixed point in space in real time, giving the user the illusion that the content was present in the room.
 Fixed to the ceiling, the headset displayed a stereoscopic (one image per eye) perspective projection of the virtual content on a transparent screen, taking into account the user's position, and thus already following the interaction loop presented in \figref[introduction]{interaction-loop}.

-\begin{subfigs}{sutherland1968headmounted}{Photos of the first \AR system~\cite{sutherland1968headmounted}. }[
-        \item The \AR headset.
-        \item Wireframe \ThreeD virtual objects were displayed registered in the real environment (as if there were part of it).
-    ]
-    \subfigsheight{45mm}
-    \subfig{sutherland1970computer3}
-    \subfig{sutherland1970computer2}
-\end{subfigs}
+%\begin{subfigs}{sutherland1968headmounted}{Photos of the first \AR system~\cite{sutherland1968headmounted}. }[
+%        \item The \AR headset.
+%        \item Wireframe \ThreeD \VOs were displayed registered in the real environment (as if they were part of it).
+%    ]
+%    \subfigsheight{45mm}
+%    \subfig{sutherland1970computer3}
+%    \subfig{sutherland1970computer2}
+%\end{subfigs}


 \subsection{What is Augmented Reality?}
-\label{ar_definition}
+\label{what_is_ar}

-\paragraph{A Definition}
+\subsubsection{A Definition}
+\label{ar_definition}

 The system of \cite{sutherland1968headmounted} already fulfilled the first formal definition of \AR, proposed by \textcite{azuma1997survey} in the first survey of the domain:
 \begin{enumerate}[label=(\arabic*)]
@@ -36,7 +37,9 @@ Yet, most of the research have focused on visual augmentations, and the term \AR

 %For example, \textcite{milgram1994taxonomy} proposed a taxonomy of \MR experiences based on the degree of mixing real and virtual environments, and \textcite{skarbez2021revisiting} revisited this taxonomy to include the user's perception of the experience.

-\paragraph{Applications}
+
+\subsubsection{Applications}
+\label{ar_applications}

 Advances in technology, research and development have enabled many uses of \AR, including medical, educational, industrial, navigation, collaboration and entertainment applications~\cite{dey2018systematic}.
 For example, \AR can help surgeons to visualize \ThreeD images of the brain overlaid on the patient's head prior to or during surgery, \eg in \figref{watanabe2016transvisible}~\cite{watanabe2016transvisible}, or help students learn complex concepts and phenomena such as optics or chemistry~\cite{bousquet2024reconfigurable}.
@@ -59,7 +62,7 @@ Yet, the user experience in \AR is still highly dependent on the display used.
 \end{subfigs}


-\subsection{AR Displays and Perception}
+\subsubsection{AR Displays and Perception}
 \label{ar_displays}

 \cite{bimber2005spatial}
@@ -80,12 +83,14 @@ Using a VST-AR headset have notable consequences, as the "real" view of the envi

 % billinghurst2021grand

-\subsection{Presence and Embodiment in AR}
-\label{ar_presence}
+
+\subsubsection{Presence and Embodiment in AR}
+\label{ar_presence_embodiment}

 Despite the clear and acknowledged definition presented in \secref{ar_definition} and the viewpoint of this thesis that \AR and \VR are two types of \MR experience with different levels of mixing real and virtual environments, as presented in \secref[introduction]{visuo_haptic_augmentations}, there is still a debate on how to define \AR and \MR as well as how to characterize and categorize such experiences~\cite{speicher2019what,skarbez2021revisiting}.

 \paragraph{Presence}
+\label{ar_presence}

 Presence is one of the key concepts to characterize a \VR experience.
 \AR and \VR are both essentially illusions as the virtual content does not physically exist but is just digitally simulated and rendered to the user's perception through a user interface and the user's senses.
@@ -97,14 +102,14 @@ It doesn't mean that the virtual events are realistic, but that they are plausib
 A third strong illusion in \VR is the \SoE, which is the illusion that the virtual body is one's own~\cite{slater2022separate,guy2023sense}.

 Presence in \AR is far less defined and studied than in \VR~\cite{tran2024survey}, but it will be useful to design, evaluate and discuss our contributions in the next chapters.
-Thereby, \textcite{slater2022separate} proposed to invert \PI to what we can call \enquote{object illusion}, \ie the sense of the virtual object to \enquote{feels here} in the \RE (\figref{presence-ar}).
+In this regard, \textcite{slater2022separate} proposed to invert \PI to what we can call \enquote{object illusion}, \ie the sense that the \VO \enquote{feels here} in the \RE (\figref{presence-ar}).
 As in \VR, \VOs must be visible from different angles when the user moves their head, but they must also, and this is more difficult, be consistent with the \RE, \eg occlude or be occluded by real objects~\cite{macedo2023occlusion}, cast shadows or reflect light.
 The \PSI can be applied to \AR as is, but the \VOs must additionally have knowledge of the \RE and react to it accordingly.
 \textcite{skarbez2021revisiting} also named \PI for \AR \enquote{immersion} and \PSI \enquote{coherence}, and these terms will be used in the remainder of this thesis.

 \begin{subfigs}{presence}{The sense of immersion in virtual and augmented environments. Adapted from \textcite{stevens2002putting}. }[
        \item Place Illusion (PI) is the sense of the user of \enquote{being there} in the \VE.
-        \item Objet illusion is the sense of the virtual object to \enquote{feels here} in the \RE.
+        \item Object illusion is the sense that the \VO \enquote{feels here} in the \RE.
    ]
    \subfigsheight{35mm}
    \subfig{presence-vr}
@@ -112,17 +117,21 @@ The \PSI can be applied to \AR as is, but the \VOs must additionally have knowle
 \end{subfigs}

 \paragraph{Embodiment}
+\label{ar_embodiment}

 Like presence, \SoE in \AR is a recent topic and little is known about its effect on the user experience~\cite{genay2021virtual}.


 \subsection{Direct Hand Manipulation in AR}
+\label{ar_interaction}
+
+Both \AR/\VR and haptic systems are able to render \VOs and environments as sensations displayed to the user's senses.
+A user must also be able in turn to manipulate the \VOs and environments to complete the interaction loop (\figref[introduction]{interaction-loop}), \eg through a hand-held controller, a tangible object, or even directly with the hands.
+An \emph{interaction technique} is then required to map the user inputs to actions on the \VE~\cite{laviola20173d}.

-Both \AR/\VR and haptic systems are able to render virtual objects and environments as sensations displayed to the user's senses.
-However, as presented in \figref[introduction]{interaction-loop}, the user must be able to manipulate the virtual objects and environments to complete the loop through a \UI, \eg using a hand-held controller, a tangible object, or even directly with the hands.
-A \emph{interaction technique} is then required to map these user inputs to actions on the \VE~\cite{laviola20173d}.

 \subsubsection{User Interfaces and Interaction Techniques}
+\label{interaction_techniques}

 For a user to interact with a computer system, they first perceive the state of the system and then act on it with inputs through a \UI.
 An input \UI can be either an \emph{active sensing} device that is physically held or worn, such as a mouse, a touchscreen, or a hand-held controller, or a \emph{passive sensing} device that does not require any physical contact, such as an eye tracker, voice recognition, or hand tracking.
@@ -132,15 +141,17 @@ Choosing useful and efficient \UIs and interaction techniques is crucial for the

 \fig[0.5]{interaction-technique}{An interaction technique maps user inputs to actions within a computer system. Adapted from \textcite{billinghurst2005designing}.}

-\paragraph{Tasks}
+
+\subsubsection{Tasks}
+\label{ve_tasks}

 \textcite{laviola20173d} classify interaction techniques into three categories based on the tasks they enable users to perform: manipulation, navigation, and system control.
 \textcite{hertel2021taxonomy} proposed a revised taxonomy of interaction techniques specifically for immersive \AR.

 The \emph{manipulation tasks} are the most fundamental tasks in \AR and \VR systems, and the basic blocks for more complex interactions.
-\emph{Selection} is the identification or acquisition of a specific virtual object, \eg pointing at a target as in \figref{grubert2015multifi}, touching a button with a finger, or grasping an object with a hand.
+\emph{Selection} is the identification or acquisition of a specific \VO, \eg pointing at a target as in \figref{grubert2015multifi}, touching a button with a finger, or grasping an object with a hand.
 \emph{Positioning} and \emph{rotation} of a selected object are respectively the change of its position and orientation in \ThreeD space.
-It is also common to \emph{resize} a virtual object to change its size.
+It is also common to \emph{resize} a \VO to change its size.
 These three tasks are geometric (rigid) manipulations of the object: they do not change its shape.

 The \emph{navigation tasks} are the movements of the user within the \VE.
@@ -149,20 +160,6 @@ Wayfinding is the cognitive planning of the movement such as pathfinding or rout

 The \emph{system control tasks} are changes in the system state through commands or menus such as creation, deletion, or modification of objects, \eg as in \figref{roo2017onea}. It is also the input of text, numbers, or symbols.

-\paragraph{Reducing the Physical-Virtual Gap}
-
-In \AR and \VR, the state of the system is displayed to the user as a \VE seen spatially in 3D.
-Within an immersive and portable \AR system, this \VE is experienced at a 1:1 scale and as an integral part of the \RE.
-The rendering gap between the physical and virtual elements, as described on the interaction loop in \figref[introduction]{interaction-loop}, is thus experienced as very narrow or even not consciously perceived by the user.
-This manifests as a sense of presence of the virtual, as presented in \secref{ar_presence}.
-
-As the physical-virtual rendering gap is reduced, we could expect a similar and seamless interaction with the \VE as with a physical environment that \cite{jacob2008realitybased} called \emph{reality based interactions}.
-As of today, an immersive \AR system track itself with the user in \ThreeD, using tracking sensors and pose estimation algorithms~\cite{marchand2016pose}, \eg as in \figref{newcombe2011kinectfusion}.
-It enables to register the \VE with the \RE and the user simply moves themselves to navigate within the virtual content.
-%This tracking and mapping of the user and \RE into the \VE is named the \enquote{extent of world knowledge} by \textcite{skarbez2021revisiting}, \ie to what extent the \AR system knows about the \RE and is able to respond to changes in it.
-However, direct hand manipulation of the virtual content is a challenge that requires specific interaction techniques~\cite{billinghurst2021grand}.
-Such \emph{reality based interaction}~\cite{jacob2008realitybased} in immersive \AR is often achieved using two interaction techniques: \emph{tangible objects} and \emph{virtual hands}~\cite{billinghurst2015survey,hertel2021taxonomy}.
-
 \begin{subfigs}{interaction-techniques}{Interaction techniques in \AR. }[
         \item Spatial selection of a virtual item on an extended display using a hand-held smartphone~\cite{grubert2015multifi}.
        \item Displaying as an overlay registered on the \RE the route to follow~\cite{grubert2017pervasive}.
@@ -176,7 +173,25 @@ Such \emph{reality based interaction}~\cite{jacob2008realitybased} in immersive
    \subfig{newcombe2011kinectfusion}
 \end{subfigs}

-\paragraph{Manipulating with Tangibles}
+
+\subsubsection{Reducing the Physical-Virtual Gap}
+\label{physical-virtual-gap}
+
+In \AR and \VR, the state of the system is displayed to the user as a \VE seen spatially in 3D.
+Within an immersive and portable \AR system, this \VE is experienced at a 1:1 scale and as an integral part of the \RE.
+The rendering gap between the physical and virtual elements, as described on the interaction loop in \figref[introduction]{interaction-loop}, is thus experienced as very narrow or even not consciously perceived by the user.
+This manifests as a sense of presence of the virtual, as presented in \secref{ar_presence}.
+
+As the physical-virtual rendering gap is reduced, we could expect a seamless interaction with the \VE, similar to that with a physical environment, which \textcite{jacob2008realitybased} called \emph{reality based interactions}.
+As of today, an immersive \AR system tracks itself together with the user in \ThreeD, using tracking sensors and pose estimation algorithms~\cite{marchand2016pose}, \eg as in \figref{newcombe2011kinectfusion}.
+This makes it possible to register the \VE with the \RE, and the user simply moves to navigate within the virtual content.
+%This tracking and mapping of the user and \RE into the \VE is named the \enquote{extent of world knowledge} by \textcite{skarbez2021revisiting}, \ie to what extent the \AR system knows about the \RE and is able to respond to changes in it.
+However, direct hand manipulation of the virtual content is a challenge that requires specific interaction techniques~\cite{billinghurst2021grand}.
+Such \emph{reality based interaction}~\cite{jacob2008realitybased} in immersive \AR is often achieved using two interaction techniques: \emph{tangible objects} and \emph{virtual hands}~\cite{billinghurst2015survey,hertel2021taxonomy}.
+
+
+\subsubsection{Manipulating with Tangibles}
+\label{ar_tangibles}

 As \AR integrates visual virtual content into the \RE perception, it can involve real surrounding objects as a \UI: to visually augment them, \eg by superimposing a visual texture~\cite{gupta2020replicate}, and to use them as physical proxies to support the interaction with \VOs~\cite{ishii1997tangible}.
 According to \textcite{billinghurst2005designing}, each \VO is coupled with a tangible object, and the \VO is physically manipulated via the tangible object, providing direct, efficient and seamless interaction with both the real and virtual content.
@@ -193,83 +208,115 @@ Similarly, in immersive \OST-\AR,

 Three problems:
 a tangible is needed for each object, which raises the problem of the association, which does not always work (\cite{hettiarachchi2016annexing}), and of the number of tangibles required,
-and the visual object may not match the haptic sensations of the manipulated tangible (\cite{tinguy2019how}).
+and the visual object may not match the haptic sensations of the manipulated tangible (\cite{detinguy2019how}).
 This is why using wearables to modify the cutaneous sensations of the tangible is a solution that works in \VR (\cite{detinguy2018enhancing,salazar2020altering}) and could be adapted to \AR.
 But, specific to \AR as opposed to \VR, the tangible and the hand are visible, at least partially, even when hidden by a virtual object: how will the haptic augmentation work in \AR compared to \VR? Are there perceptual biases? Users see their own hand touching the tangible, whereas in \VR it is hidden, so is the illusion potentially stronger in \VR?

-\paragraph{Manipulating with Virtual Hands}

-So-called \enquote{natural} interaction techniques are those that allow the user to directly use their body movements as an input interface to the \AR/\VR system~\cite{billinghurst2015survey}.
-The hand allows us to manipulate real everyday objects with strength and precision (\secref{hand_anatomy}), and virtual hand interaction techniques are therefore the most natural way to manipulate virtual objects~\cite{laviola20173d}.
-Initially tracked by motion capture devices in the form of gloves or controllers, a user's hands can now be tracked in real time with cameras and computer vision algorithms natively integrated into \AR headsets~\cite{tong2023survey}.
+\subsubsection{Manipulating with Virtual Hands}
+\label{ar_virtual_hands}
+
+Natural \UIs allow the user to use their body movements directly as inputs to the \VE~\cite{billinghurst2015survey}.
+Our hands allow us to manipulate real everyday objects with both strength and precision (\secref{grasp_types}), hence virtual hand interaction techniques seem the most natural way to manipulate \VOs~\cite{laviola20173d}.
+Hands were initially tracked by active sensing devices such as gloves or controllers, but they can now be tracked in real time using cameras and computer vision algorithms natively integrated into \AR/\VR headsets~\cite{tong2023survey}.

 The user's hand is thus tracked and reconstructed in the \VE as a \emph{virtual hand}~\cite{billinghurst2015survey,laviola20173d}.
 The simplest models represent the hand as a rigid 3D object following the movements of the real hand with \qty{6}{\DoF} (position and orientation in space)~\cite{talvas2012novel}.
-An alternative is to represent only the fingertips, which makes it possible to perform oppositions between the fingers (\secref{grasp_types}).
-Finally, the most common techniques represent the whole hand skeleton as an articulated kinematic model:
-Each virtual phalanx is then represented with certain \DoFs relative to the previous phalanx (\secref{hand_anatomy}).
+An alternative is to represent only the fingertips, as in \figref{lee2007handy}, or even to represent the hand as a point cloud (\figref{hilliges2012holodesk_1}).
+Finally, the most common techniques represent the whole hand skeleton as an articulated kinematic model (\secref{hand_anatomy}):
+Each virtual phalanx is then represented with certain rotational \DoFs relative to the previous phalanx~\cite{borst2006spring}.

-Several techniques exist to simulate the contacts and the interaction of the virtual hand model with virtual objects~\cite{laviola20173d}.
-Techniques with a heuristic approach use rules to determine the selection, manipulation and release of an object~\cite{kim2015physicsbased}.
-A selection is made, for example, by performing with the hand a predefined gesture on the object, such as a type of grasp (\secref{grasp_types})~\cite{piumsomboon2013userdefined}.
-Physics-based techniques simulate the forces at the contact points of the model with the object.
+The user's hand is therefore tracked and reconstructed as a \emph{virtual hand} model in the \VE~\cite{billinghurst2015survey,laviola20173d}.
+The simplest models represent the hand as a rigid 3D object that follows the movements of the real hand with \qty{6}{\DoF} (position and orientation in space)~\cite{talvas2012novel}.
+An alternative is to model only the fingertips (\figref{lee2007handy}) or the whole hand (\figref{hilliges2012holodesk_1}) as points.
+The most common technique is to reconstruct all the phalanges of the hand in an articulated kinematic model (\secref{hand_anatomy})~\cite{borst2006spring}.

+The contacts between the virtual hand model and the \VOs are then simulated using heuristic or physics-based techniques~\cite{laviola20173d}.
+Heuristic techniques use rules to determine the selection, manipulation and release of a \VO (\figref{piumsomboon2013userdefined_1}).
+But they produce unrealistic behaviour and are limited to the cases predicted by the rules.
+Physics-based techniques simulate forces at the contact points between the virtual hand and the \VO.
+In particular, \textcite{borst2006spring} have proposed an articulated kinematic model in which each phalanx is a rigid body simulated with the god-object~\cite{zilles1995constraintbased} method: the virtual phalanx follows the movements of the real phalanx, but remains constrained to the surface of the virtual objects during contact. The forces acting on the object are calculated as a function of the distance between the real and virtual hands (\figref{borst2006spring}).
+More advanced techniques simulate the friction phenomena described in \secref{friction}~\cite{talvas2013godfinger} and finger deformations~\cite{talvas2015aggregate}, allowing highly accurate and realistic interactions, but which can be difficult to compute in real time.

-Nevertheless, the main problem of natural hand interaction in a \VE, besides hand detection, is the lack of physical constraints on the movement of the hand and fingers, which makes actions tiring (\cite{hincapie-ramos2014consumed}), imprecise (without haptic feedback, one does not know whether one is touching the virtual object) and difficult (likewise, without haptic feedback one does not feel the object slipping, and there is no confirmation that it is firmly in hand). On the one hand, interaction techniques are still necessary, and haptic feedback adapted to the interaction constraints of \AR is essential for a good user experience.
+\begin{subfigs}{virtual-hand}{Virtual hand interactions in \AR. }[
+        \item Fingertip tracking that enables selecting a \VO by opening the hand~\cite{lee2007handy}.
+        \item Physics-based hand-object interactions with a virtual hand made of numerous small rigid-body spheres~\cite{hilliges2012holodesk}.
+        \item Grasping a \VO through gestures when the fingers are detected as opposing on it~\cite{piumsomboon2013userdefined}.
+        \item A kinematic hand model with rigid-body phalanges (in beige) following the real tracked hand (in green) but kept physically constrained to the \VO. Applied forces are displayed as red arrows~\cite{borst2006spring}.
+    ]
+    \subfigsheight{37mm}
+    \subfig{lee2007handy}
+    \subfig{hilliges2012holodesk_1}
+    \subfig{piumsomboon2013userdefined_1}
+    \subfig{borst2006spring}
+\end{subfigs}

-This can also be difficult to understand: "\cite{chan2010touching} propose combining continuous feedback, so that users can locate the tracking of their body, with discrete feedback to confirm their actions." A visual rendering and display of the hands is continuous feedback; a brief colour change or haptic feedback is discrete feedback. But this combination has not been evaluated.
-
-\cite{piumsomboon2013userdefined} : user-defined gestures for manipulation of virtual objects in AR.
-\cite{piumsomboon2014graspshell} : direct hand manipulation of virtual objects in immersive AR vs vocal commands.
-
-Occlusion problems: virtual objects must always be visible, either by using a transparent rather than opaque virtual hand, or by displaying their outlines if the hand hides them \cite{piumsomboon2014graspshell}.
+However, the lack of physical constraints on the user's hand movements makes manipulation actions tiring~\cite{hincapie-ramos2014consumed}.
+While the fingers of the user traverse the virtual object, a physics-based virtual hand remains in contact with the object, a discrepancy that may degrade the user's performance in \VR~\cite{prachyabrued2012virtual}.
+Finally, in the absence of haptic feedback on each finger, it is difficult to estimate the contact and forces exerted by the fingers on the object during grasping and manipulation~\cite{maisto2017evaluation,meli2018combining}.
+While a visual rendering of the virtual hand in \VR can compensate for these issues~\cite{prachyabrued2014visual}, the visual and haptic rendering of the virtual hand, or their combination, in \AR is under-researched.


 \subsection{Visual Rendering of Hands in AR}
+\label{ar_visual_hands}

-In VR, as the user is fully immersed in the \VE and cannot see their real hands, it is necessary to represent them virtually.
-Virtual hand rendering is also known to influence how an object is grasped in VR~\cite{prachyabrued2014visual,blaga2020too} and AR, or even how real bumps and holes are perceived in VR~\cite{schwind2018touch}, but its effect on the perception of a haptic texture augmentation has not yet been investigated.
-It is known that the virtual hand representation has an impact on perception, interaction performance, and preference of users~\cite{prachyabrued2014visual, argelaguet2016role, grubert2018effects, schwind2018touch}.
-In a pick-and-place task in VR, \textcite{prachyabrued2014visual} found that the virtual hand representation whose motion was constrained to the surface of the virtual objects performed the worst, while the virtual hand representation following the tracked human hand (thus penetrating the virtual objects), performed the best, even though it was rather disliked.
-The authors also observed that the best compromise was a double rendering, showing both the tracked hand and a hand rendering constrained by the virtual environment.
-It has also been shown that over a realistic avatar, a skeleton rendering  can provide a stronger sense of being in control~\cite{argelaguet2016role} and that minimalistic fingertip rendering can be more effective in a typing task~\cite{grubert2018effects}.
+In \VR, as the user is fully immersed in the \VE and cannot see their real hands, it is necessary to represent them virtually.
+When interacting using a physics-based virtual hand method (\secref{ar_virtual_hands}), the visual rendering of the virtual hand has an influence on the perception, interaction performance, and preference of users~\cite{prachyabrued2014visual,argelaguet2016role,grubert2018effects,schwind2018touch}.
+In a pick-and-place manipulation task in \VR, \textcite{prachyabrued2014visual} and \textcite{canales2019virtual} found that the visual hand rendering whose motion was constrained to the surface of the \VOs, similarly to \textcite{borst2006spring} (\enquote{Outer Hand} in \figref{prachyabrued2014visual}), performed the worst, while the visual hand rendering following the tracked human hand (thus penetrating the \VOs, \enquote{Inner Hand} in \figref{prachyabrued2014visual}) performed the best, even though it was rather disliked.
+\textcite{prachyabrued2014visual} also observed that the best compromise was a double rendering, showing both the virtual hand and the tracked hand (\enquote{2-Hand} in \figref{prachyabrued2014visual}).
+While a realistic human hand rendering increases the sense of ownership~\cite{lin2016need}, a skeleton-like rendering provides a stronger sense of control~\cite{argelaguet2016role}, and a minimalistic fingertip rendering reduces errors in typing text~\cite{grubert2018effects}.
+A visual hand rendering in a \VE also seems to affect how one grasps an object~\cite{blaga2020too}, or how real bumps and holes are perceived~\cite{schwind2018touch}.

-\fig{prachyabrued2014visual}{Effect of different hand renderings on a pick-and-place task in VR~\cite{prachyabrued2014visual}.}
+\fig{prachyabrued2014visual}{Visual hand renderings affect user experience in \VR~\cite{prachyabrued2014visual}.}

-\cite{hilliges2012holodesk}
-\cite{chan2010touching} : cues for touching (selection) virtual objects.
+As presented in \secref{ar_displays}, a user sees their hands in \AR, and the mutual occlusion between the hands and the \VOs is a common issue, \ie hiding the \VO when the real hand is in front of it and hiding the real hand when it is behind the \VO.
+For example, in \figref{hilliges2012holodesk_2}, the user is pinching a virtual cube in \OST-\AR with their thumb and index fingers, but while the index is behind the cube, it is seen as in front of it.
+While in \VST-\AR, this could be solved as a masking problem by combining the real and virtual images~\cite{battisti2018seamless}, \eg in \figref{suzuki2014grasping}, in \OST-\AR, this is much more difficult because the \VE is displayed as a transparent \TwoD image on top of the \ThreeD \RE, which cannot be easily masked~\cite{macedo2023occlusion}.
+%Yet, even in \VST-\AR,

-Mutual visual occlusion between a virtual object and the real hand, \ie hiding the virtual object when the real hand is in front of it and hiding the real hand when it is behind the virtual object, is often presented as natural and realistic, enhancing the blending of real and virtual environments~\cite{piumsomboon2014graspshell, al-kalbani2016analysis}.
-In video see-through AR (VST-AR), this could be solved as a masking problem by combining the image of the real world captured by a camera and the generated virtual image~\cite{macedo2023occlusion}.
-In OST-AR, this is more difficult because the virtual environment is displayed as a transparent 2D image on top of the 3D real world, which cannot be easily masked~\cite{macedo2023occlusion}.
-Moreover, in VST-AR, the grip aperture and depth positioning of virtual objects often seem to be wrongly estimated~\cite{al-kalbani2016analysis, maisto2017evaluation}.
-However, this effect has yet to be verified in an OST-AR setup.
-
-An alternative is to render the virtual objects and the hand semi-transparents, so that they are partially visible even when one is occluding the other, \eg the real hand is behind the virtual cube but still visible.
-Although perceived as less natural, this seems to be preferred to a mutual visual occlusion in VST-AR~\cite{buchmann2005interaction,ha2014wearhand,piumsomboon2014graspshell} and VR~\cite{vanveldhuizen2021effect}, but has not yet been evaluated in OST-AR.
-However, this effect still causes depth conflicts that make it difficult to determine if one's hand is behind or in front of a virtual object, \eg the thumb is in front of the virtual cube, but it appears to be behind it.
-
-In AR, as the real hand of a user is visible but not physically constrained by the virtual environment, adding a visual hand rendering that can physically interact with virtual objects would achieve a similar result to the promising double-hand rendering of \textcite{prachyabrued2014visual}.
-Additionally, \textcite{kahl2021investigation} showed that a virtual object overlaying a tangible object in OST-AR can vary in size without worsening the users' experience nor the performance.
+As the \VE is intangible and the user's hand is visible in \AR, adding a visual rendering of the virtual hand that is physically constrained to the \VOs would achieve a similar result to the promising double-hand rendering of \textcite{prachyabrued2014visual}.
+Additionally, \textcite{kahl2021investigation} showed that a \VO overlaying a tangible object in \OST-\AR can vary in size without worsening the users' experience or their performance when manipulating it.
 This suggests that a visual hand rendering superimposed on the real hand could be helpful, but should not impair users.

-Few works have explored the effect of visual hand rendering in AR~\cite{blaga2017usability, maisto2017evaluation, krichenbauer2018augmented, yoon2020evaluating, saito2021contact}.
-For example, \textcite{blaga2017usability} evaluated a skeleton rendering in several virtual object manipulations against no visual hand overlay.
-Performance did not improve, but participants felt more confident with the virtual hand.
-However, the experiment was carried out on a screen, in a non-immersive AR scenario.
-\textcite{saito2021contact} found that masking the real hand with a textured 3D opaque virtual hand did not improve performance in a reach-to-grasp task but displaying the points of contact on the virtual object did.
-To the best of our knowledge, evaluating the role of a visual rendering of the hand displayed \enquote{and seen} directly above real tracked hands in immersive OST-AR has not been explored, particularly in the context of virtual object manipulation.
+An alternative is to render the \VOs and the virtual hand semi-transparent, so that they are partially visible even when one is occluding the other (\figref{buchmann2005interaction}).
+Although perceived as less natural, this seems to be preferred to a mutual visual occlusion in \VST-\AR~\cite{buchmann2005interaction,ha2014wearhand,piumsomboon2014graspshell} and \VR~\cite{vanveldhuizen2021effect}, but has not yet been evaluated in \OST-\AR.
+However, this effect still causes depth conflicts that make it difficult to determine if one's hand is behind or in front of a \VO, \eg the thumb is in front of the virtual cube, but could be perceived to be behind it.

-But this raises the question of the hand representation, which has shown effects on performance and user experience in VR but remains little studied in AR.
+Few works have compared different visual hand renderings in \AR, let alone in combination with wearable haptic feedback.
+In non-immersive \VST-\AR, \textcite{blaga2017usability} compared a skeleton-like rendering against no visual hand rendering in direct hand manipulation: while user performance did not improve, participants felt more confident with the virtual hand (\figref{blaga2017usability}).
+%\textcite{krichenbauer2018augmented} found participants \percent{22} faster in immersive \VST-\AR than in \VR in the same pick-and-place manipulation task.
+%No visual hand rendering was used in \VR while the real hand was visible in \AR.
+In a collaboration task in immersive \OST-\AR \vs \VR, \textcite{yoon2020evaluating} showed that a realistic human hand rendering of the remote partner was preferred over a low-polygon hand and a skeleton-like hand.
+\textcite{genay2021virtual} found that the \SoE was stronger with a robotic hand overlay in \OST-\AR when the environment contained \VOs (\figref{genay2021virtual}).
+Finally, \textcite{maisto2017evaluation} and \textcite{meli2018combining} compared visual and haptic rendering of the hand in \AR, as detailed in the next section (\secref{vhar_rings}).
+Taken together, these results suggest that a visual hand rendering in \AR could improve the user experience and performance in direct hand manipulation tasks, but the best rendering is still to be determined.
+%\cite{chan2010touching} : cues for touching (selection) \VOs.
+%\textcite{saito2021contact} found that masking the real hand with a textured 3D opaque virtual hand did not improve performance in a reach-to-grasp task but displaying the points of contact on the \VO did.
+%To the best of our knowledge, evaluating the role of a visual rendering of the hand displayed \enquote{and seen} directly above real tracked hands in immersive OST-AR has not been explored, particularly in the context of \VO manipulation.

+\begin{subfigs}{visual-hands}{Visual hand renderings in \AR. }[
+        \item Grasping a \VO in \OST-\AR with no visual hand rendering~\cite{hilliges2012holodesk}.
+        \item Simulated mutual occlusion between the grasping hand and the \VO in \VST-\AR~\cite{suzuki2014grasping}.
+        \item Grasping a real object with a semi-transparent hand in \VST-\AR~\cite{buchmann2005interaction}.
+        \item Skeleton rendering overlaying the real hand in \VST-\AR~\cite{blaga2017usability}.
+        \item Robotic rendering overlaying the real hands in \OST-\AR~\cite{genay2021virtual}.
+    ]
+    \subfigsheight{29mm}
+    \subfig{hilliges2012holodesk_2}
+    \subfig{suzuki2014grasping}
+    \subfig{buchmann2005interaction}
+    \subfig{blaga2017usability}
+    \subfig{genay2021virtual}
+    %\subfig{yoon2020evaluating}
+\end{subfigs}

 \subsection{Conclusion}
 \label{ar_conclusion}

-\AR systems integrate virtual objects into the visual perception as if they were part of the \RE.
+\AR systems integrate \VOs into the visual perception as if they were part of the \RE.
 \AR headsets now enable real-time tracking of the head and hands, and high-quality display of virtual content, while being portable and mobile.
 They enable highly immersive \AEs that users can explore with a strong sense of the presence of the virtual content.
-But without a direct and seamless interaction with the virtual objects using the hands, the coherence of the \AE experience is compromised.
-In particular, there is a lack of mutual occlusion and interaction cues between hands and virtual objects in \OST-\AR that could be mitigated by visual rendering of the hand.
-A common alternative approach is to use tangible objects as proxies for interaction with virtual objects, but this raises concerns about their number and association with virtual objects, as well as consistency with the visual rendering.
-In this context, the use of wearable haptic systems worn on the hand seems to be a promising solution both for improving direct hand manipulation of virtual objects and for coherent visuo-haptic augmentation of touched tangible objects.
+But without a direct and seamless interaction with the \VOs using the hands, the coherence of the \AE experience is compromised.
+In particular, there is a lack of mutual occlusion and interaction cues between hands and virtual content while manipulating \VOs in \OST-\AR that could be mitigated by visual rendering of the hand.
+A common alternative approach is to use tangible objects as proxies for interaction with \VOs, but this raises concerns about their number and association with \VOs, as well as consistency with the visual rendering.
+In this context, the use of wearable haptic systems worn on the hand seems to be a promising solution both for improving direct hand manipulation of \VOs and for coherent visuo-haptic augmentation of touched tangible objects.
--- a/1-introduction/related-work/4-visuo-haptic-ar.tex
+++ b/1-introduction/related-work/4-visuo-haptic-ar.tex
@@ -29,17 +29,17 @@ Thus, the overall perception can be modified by changing one of the modalities,
 % The ability to discriminate whether two stimuli are simultaneous is important to determine whether stimuli should be bound together and form a single multisensory perceptual object. diluca2019perceptual

 Similarly but in VR, \textcite{degraen2019enhancing} combined visual textures with different passive haptic hair-like structures that were touched with the finger to induce a larger set of visuo-haptic material perceptions.
-\textcite{gunther2022smooth} studied in a complementary way how the visual rendering of a virtual object touching the arm with a tangible object influenced the perception of roughness.
+\textcite{gunther2022smooth} studied in a complementary way how the visual rendering of a \VO touching the arm with a tangible object influenced the perception of roughness.
 Likewise, visual textures were combined in VR with various tangible objects to induce a larger set of visuo-haptic material perceptions, in both active touch~\cite{degraen2019enhancing} and passive touch~\cite{gunther2022smooth} contexts.
 A common finding of these studies is that haptic sensations seem to dominate the perception of roughness, suggesting that a smaller set of haptic textures can support a larger set of visual textures.

 \subsubsection{Pseudo-Haptic Feedback}
 \label{pseudo_haptic}

-% Visual feedback in VR and AR is known to influence haptic perception [13]. The phenomenon of ”visual dominance” was notably observed when estimating the stiffness of virtual objects. L´ecuyer et al. [13] based their ”pseudo-haptic feedback” approach on this notion of visual dominance gaffary2017ar
+% Visual feedback in VR and AR is known to influence haptic perception [13]. The phenomenon of ”visual dominance” was notably observed when estimating the stiffness of \VOs. L´ecuyer et al. [13] based their ”pseudo-haptic feedback” approach on this notion of visual dominance gaffary2017ar

 A few works have also used pseudo-haptic feedback to change the perception of haptic stimuli to create richer feedback by deforming the visual representation of a user input~\cite{ujitoko2021survey}.
-For example, different levels of stiffness can be simulated on a grasped virtual object with the same passive haptic device~\cite{achibet2017flexifingers} or
+For example, different levels of stiffness can be simulated on a grasped \VO with the same passive haptic device~\cite{achibet2017flexifingers} or
 the perceived softness of tangible objects can be altered by superimposing in AR a virtual texture that deforms when pressed by the hand~\cite{punpongsanon2015softar}, or in combination with vibrotactile rendering in VR~\cite{choi2021augmenting}.

 \cite{ban2012modifying}
@@ -63,9 +63,9 @@ Even before manipulating a visual representation to induce a haptic sensation, s
 \subsubsection{Perception of Visuo-Haptic Rendering in AR and VR}
 \label{AR_vs_VR}

-Some studies have investigated the visuo-haptic perception of virtual objects in \AR and \VR.
+Some studies have investigated the visuo-haptic perception of \VOs in \AR and \VR.
 They have shown how the latency of the visual rendering of an object with haptic feedback or the type of environment (\VE or \RE) can affect the perception of an identical haptic rendering.
-Indeed, there are indeed inherent and unavoidable latencies in the visual and haptic rendering of virtual objects, and the visual-haptic feedback may not appear to be simultaneous.
+Indeed, there are inherent and unavoidable latencies in the visual and haptic rendering of \VOs, and the visual-haptic feedback may not appear to be simultaneous.

 In an immersive \VST-\AR setup, \textcite{knorlein2009influence} rendered a virtual piston using force-feedback haptics that participants pressed directly with their hand (\figref{visuo-haptic-stiffness}).
 In a \TAFC task, participants pressed two pistons and indicated which was stiffer.
@@ -93,8 +93,8 @@ Therefore, a haptic delay (positive $\Delta t$) increases the perceived stiffnes
 In a similar \TAFC user study, participants compared perceived stiffness of virtual pistons in \OST-\AR and \VR~\cite{gaffary2017ar}.
 However, the force-feedback device and the participant's hand were not visible (\figref{gaffary2017ar}).
 The reference piston was judged to be stiffer when seen in \VR than in \AR, without participants noticing this difference, and more force was exerted on the piston overall in \VR.
-This suggests that the haptic stiffness of virtual objects feels \enquote{softer} in an \AE than in a full \VE.
-%Two differences that could be worth investigating with the two previous studies are the type of \AR (visuo or optical) and to see the hand touching the virtual object.
+This suggests that the haptic stiffness of \VOs feels \enquote{softer} in an \AE than in a full \VE.
+%Two differences that could be worth investigating with the two previous studies are the type of \AR (visuo or optical) and to see the hand touching the \VO.

 \begin{subfigs}{gaffary2017ar}{Perception of haptic stiffness in \OST-\AR \vs \VR~\cite{gaffary2017ar}. }[
         \item Experimental setup: a virtual piston was pressed with a force-feedback device placed to the side of the participant.
@@ -125,15 +125,16 @@ A first reason is that they permanently cover the fingertip and affect the inter
 Another category of actuators relies on systems that cannot be considered as portable, such as REVEL~\cite{bau2012revel}, which provides friction sensations with reverse electrovibration but requires modifying the real objects to augment, or Electrical Muscle Stimulation (EMS) devices~\cite{lopes2018adding}, which provide kinesthetic feedback by contracting the muscles.

 \subsubsection{Nail-Mounted Devices}
+\label{vhar_nails}

 \textcite{ando2007fingernailmounted} were the first to propose this approach that they experimented with a voice-coil mounted on the index nail (\figref{ando2007fingernailmounted}).
 The sensation of crossing edges of a virtual patterned texture (\secref{texture_rendering}) on a real sheet of paper was rendered with \qty{20}{\ms} vibration impulses at \qty{130}{\Hz}.
 Participants were able to match the virtual patterns to their real counterparts of height \qty{0.25}{\mm} and width \qtyrange{1}{10}{\mm}, but systematically overestimated the virtual width to be \qty{4}{\mm} longer.

-This approach was later extended by \textcite{teng2021touch} with Touch\&Fold, a haptic device mounted on the nail but able to unfold its end-effector on demand to make contact with the fingertip when touching virtual objects (\figref{teng2021touch}).
+This approach was later extended by \textcite{teng2021touch} with Touch\&Fold, a haptic device mounted on the nail but able to unfold its end-effector on demand to make contact with the fingertip when touching \VOs (\figref{teng2021touch}).
 This moving platform also contains a \LRA (\secref{moving_platforms}) and provides contact pressure (\qty{0.34}{\N} force) and texture (\qtyrange{150}{190}{\Hz} bandwidth) sensations.
 %The whole system is very compact (\qtyproduct{24 x 24 x 41}{\mm}), lightweight (\qty{9.5}{\g}), and fully portable by including a battery and Bluetooth wireless communication. \qty{20}{\ms} for the Bluetooth
-When touching virtual objects in \OST-\AR with the index finger, this device was found to be more realistic overall (5/7) than vibrations with a \LRA at \qty{170}{\Hz} on the nail (3/7).
+When touching \VOs in \OST-\AR with the index finger, this device was found to be more realistic overall (5/7) than vibrations with a \LRA at \qty{170}{\Hz} on the nail (3/7).
 Still, there is a high (\qty{92}{\ms}) latency for the folding mechanism and this design is not suitable for augmenting real tangible objects.

 % teng2021touch: (5.27+3.03+5.23+5.5+5.47)/5 = 4.9
@@ -158,21 +159,23 @@ However, as for \textcite{teng2021touch}, finger speed was not taken into accoun
 \end{subfigs}

 \subsubsection{Ring Belt Devices}
+\label{vhar_rings}

-The haptic ring belt devices of \textcite{minamizawa2007gravity} and \textcite{pacchierotti2016hring}, presented in \secref{belt_actuators}, have been employed to improve the manipulation of real and virtual objects in \AR.
+The haptic ring belt devices of \textcite{minamizawa2007gravity} and \textcite{pacchierotti2016hring}, presented in \secref{belt_actuators}, have been employed to improve the manipulation of \VOs in \AR, which is a fundamental task in a \VE (\secref{ar_interaction}).

 In a \VST-\AR setup, \textcite{scheggi2010shape} explored the effect of rendering the weight (\secref{weight_rendering}) of a virtual cube placed on a real surface and held with the thumb, index, and middle fingers (\figref{scheggi2010shape}).
 The middle phalanx of each of these fingers was equipped with a haptic ring of \textcite{minamizawa2007gravity}.
-However, no proper user study was conducted to evaluate this feedback.% on the manipulation of the cube.
+%However, no proper user study was conducted to evaluate this feedback.% on the manipulation of the cube.
 %that simulated the weight of the cube.
 %A virtual cube that could push on the cube was manipulated with the other hand through a force-feedback device.
-%\textcite{scheggi2010shape} report that \percent{80} of the participants appreciated the weight feedback.
+\textcite{scheggi2010shape} report that 12 out of 15 participants found the haptic weight feedback essential to feel the presence of the virtual cube.

-In pick-and-place tasks in non-immersive \VST-\AR involving both virtual and real objects (\figref{maisto2017evaluation}), \textcite{maisto2017evaluation} and \textcite{meli2018combining} compared the effects of providing haptic feedback about contacts at the fingertips using either the haptic ring of \textcite{pacchierotti2016hring}, or on the proximal phalanx, the moving platform of \textcite{chinello2020modular} on the fingertip.
-They showed that the haptic feedback improved the performance (completion time), reduced the exerted force on the cubes over a visual feedback alone.
+In a pick-and-place task in non-immersive \VST-\AR involving direct hand manipulation of both virtual and real objects (\figref{maisto2017evaluation}), \textcite{maisto2017evaluation} and \textcite{meli2018combining} compared the effects of providing haptic or visual feedback about fingertip-object contacts.
+They compared the haptic ring of \textcite{pacchierotti2016hring} on the proximal phalanx, the moving platform of \textcite{chinello2020modular} on the fingertip, and a visual rendering of the tracked fingertips as virtual points.
+They showed that the haptic feedback improved the completion time and reduced the force exerted on the cubes compared to the visual feedback (\figref{ar_visual_hands}).
 The haptic ring was also perceived by users to be more effective than the moving platform.
 However, the measured difference in performance could be attributed to either the device or the device position (proximal vs fingertip), or both.
-These two studies were also conducted in non-immersive setups, where users looked at a screen displaying the visual interactions, and only compared haptic and visual feedback, but did not examine them together.
+These two studies were also conducted in non-immersive setups, where users looked at a screen displaying the visual interactions, and only compared haptic and visual rendering of the hand-object contacts, but did not examine them together.

 \begin{subfigs}{ar_rings}{Wearable haptic ring devices for \AR. }[
        \item Rendering weight of a virtual cube placed on a real surface~\cite{scheggi2010shape}.
@@ -184,6 +187,7 @@ These two studies were also conducted in non-immersive setups, where users looke
 \end{subfigs}

 \subsubsection{Wrist Bracelet Devices}
+\label{vhar_bracelets}

 With their \enquote{Tactile And Squeeze Bracelet Interface} (Tasbi), already mentioned in \secref{belt_actuators}, \textcite{pezent2019tasbi} and \textcite{pezent2022design} explored the use of a wrist-worn bracelet actuator.
 It is capable of providing a uniform pressure sensation (up to \qty{15}{\N} and \qty{10}{\Hz}) and vibration with six \LRAs (\qtyrange{150}{200}{\Hz} bandwidth).
--- a/1-introduction/related-work/figures/achibet2017flexifingers.jpg
+++ b/1-introduction/related-work/figures/achibet2017flexifingers.jpg
--- a/1-introduction/related-work/figures/blaga2017usability.jpg
+++ b/1-introduction/related-work/figures/blaga2017usability.jpg
--- a/1-introduction/related-work/figures/borst2006spring.jpg
+++ b/1-introduction/related-work/figures/borst2006spring.jpg
--- a/1-introduction/related-work/figures/buchmann2005interaction.jpg
+++ b/1-introduction/related-work/figures/buchmann2005interaction.jpg
--- a/1-introduction/related-work/figures/genay2021virtual.jpg
+++ b/1-introduction/related-work/figures/genay2021virtual.jpg
--- a/1-introduction/related-work/figures/hilliges2012holodesk_1.jpg
+++ b/1-introduction/related-work/figures/hilliges2012holodesk_1.jpg
--- a/1-introduction/related-work/figures/hilliges2012holodesk_2.jpg
+++ b/1-introduction/related-work/figures/hilliges2012holodesk_2.jpg
--- a/1-introduction/related-work/figures/lee2007handy.jpg
+++ b/1-introduction/related-work/figures/lee2007handy.jpg
--- a/1-introduction/related-work/figures/pacchierotti2015cutaneous.jpg
+++ b/1-introduction/related-work/figures/pacchierotti2015cutaneous.jpg
--- a/1-introduction/related-work/figures/piumsomboon2013userdefined_1.jpg
+++ b/1-introduction/related-work/figures/piumsomboon2013userdefined_1.jpg
--- a/1-introduction/related-work/figures/suzuki2014grasping.jpg
+++ b/1-introduction/related-work/figures/suzuki2014grasping.jpg
--- a/1-introduction/related-work/figures/yoon2020evaluating.jpg
+++ b/1-introduction/related-work/figures/yoon2020evaluating.jpg
--- a/references.bib
+++ b/references.bib