\section{Manipulating Objects with the Hands in AR}
\label{augmented_reality}

\AR devices generate and integrate virtual content into the user's perception of their real environment (\RE), creating the illusion of the \emph{presence} of the virtual \cite{azuma1997survey,skarbez2021revisiting}.
Immersive systems such as headsets leave the hands free to interact with virtual objects, promising natural and intuitive interactions similar to those with everyday real objects \cite{billinghurst2021grand,hertel2021taxonomy}.

\subsection{What is Augmented Reality?}
\label{what_is_ar}

\subsubsection{A Definition of AR}
\label{ar_definition}

The first formal definition of \AR was proposed by \textcite{azuma1997survey}: an \AR system must (1) combine real and virtual, (2) be interactive in real time, and (3) register real and virtual\footnotemark.
Each of these characteristics is essential: the real-virtual combination distinguishes \AR from \VR, a movie with integrated digital content is not interactive, and a \TwoD overlay like an image filter is not registered.
This definition also has two key aspects: it focuses not on a technology or method but on the user's perspective of the system experience, and it does not specify a particular human sense, \ie the augmentation can be auditory \cite{yang2022audio}, haptic \cite{bhatia2024augmenting}, or even olfactory \cite{brooks2021stereosmell} or gustatory \cite{brooks2023taste}.
Yet, most research has focused on visual augmentation, and the term \AR (without a prefix) is almost always understood as visual \AR.
\footnotetext{This third characteristic has been slightly adapted to use the version of \textcite{marchand2016pose}; the original definition was \enquote{registered in \ThreeD}.}
\begin{subfigs}{ar_applications}{Examples of \AR applications. }[][
\item Visuo-haptic surgery training with cutting into virtual soft tissues \cite{harders2009calibration}.

\subsubsection{AR Displays}
\label{ar_displays}

To experience virtual content combined and registered with the \RE, an output device that displays the \VE to the user is necessary.
There is a large variety of \AR displays, with different methods of combining the real and virtual content and different locations in the \RE or on the user \cite[p.126]{billinghurst2015survey}.
In \emph{\VST-\AR}, the virtual images are superimposed on images of the \RE captured by a camera \cite{marchand2016pose}, and the combined real-virtual image is displayed on a screen to the user, as illustrated in \figref{itoh2022indistinguishable_vst}. An example is shown in \figref{hartl2013mobile}.
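
To make this combination concrete, the following minimal sketch (in Python with NumPy; the function and variable names are ours, not from the cited works) illustrates the per-pixel \enquote{over} compositing of a rendered virtual layer onto a camera frame, as a \VST display could perform it:

\begin{verbatim}
import numpy as np

def composite_vst(camera_rgb, virtual_rgb, virtual_alpha):
    # camera_rgb:    (H, W, 3) camera image of the RE, values in [0, 1].
    # virtual_rgb:   (H, W, 3) rendered image of the VE.
    # virtual_alpha: (H, W, 1) coverage mask from the renderer, in [0, 1].
    # Standard "over" operator: the virtual layer replaces the real
    # image wherever the renderer drew something.
    return virtual_alpha * virtual_rgb + (1.0 - virtual_alpha) * camera_rgb
\end{verbatim}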
\emph{Spatial \AR} usually consists of projection-based displays placed at a fixed location (\figref{roo2017inner}), but it can also take the form of \OST or \VST \emph{fixed windows} (\figref{lee2013spacetop}).
Alternatively, \AR displays can be \emph{hand-held}, like a \VST smartphone (\figref{hartl2013mobile}), or body-attached, like a micro-projector used as a flashlight \cite[p.141]{billinghurst2015survey}.
Finally, \AR displays can be head-worn like \VR \emph{headsets} or glasses, providing a highly immersive and portable experience.
\fig[0.75]{roo2017one_1}{Locations of \AR displays from eye-worn to spatially projected. Adapted by \textcite{roo2017one} from \textcite{bimber2005spatial}.}
\subsubsection{Presence and Embodiment in AR}
\label{ar_presence_embodiment}
Presence and embodiment are two key concepts that characterize the user experience in \AR and \VR.
While there is a large literature on these topics in \VR, they are less defined and studied for \AR \cite{genay2022being,tran2024survey}.
These concepts will be useful for the design, evaluation, and discussion of our contributions:
This does not mean that the virtual events are realistic, \ie that they reproduce the real world with high fidelity \cite{skarbez2017survey}, but that they are believable and coherent with the user's expectations.
In the same way, a film can be plausible even if it is not realistic, such as a cartoon or a science-fiction movie.
For \AR, \textcite{slater2022separate} proposed to invert place illusion into what we can call \enquote{object illusion}, \ie the sense that the virtual object \enquote{feels here} in the \RE (\figref{presence-ar}).
As with \VR, virtual objects must be able to be seen from different angles by moving the head, but also, and this is more difficult, appear coherent enough with the \RE \cite{skarbez2021revisiting}, \eg occlude or be occluded by real objects \cite{macedo2023occlusion}, cast shadows, or reflect light.
Plausibility can be applied to \AR as is, but the virtual objects must additionally have knowledge of the \RE and react accordingly to be, again, perceived as behaving coherently with the real world \cite{skarbez2021revisiting}.
\begin{subfigs}{presence}{
The sense of immersion in virtual and augmented environments. Adapted from \textcite{stevens2002putting}.
\subsection{Direct Hand Manipulation in AR}
\label{ar_interaction}
A user in \AR must be able to interact with the virtual content to fulfil the second point of \textcite{azuma1997survey}'s definition (\secref{ar_definition}) and complete our proposed visuo-haptic interaction loop (\figref[introduction]{interaction-loop}).
In all examples of \AR applications shown in \secref{ar_applications}, the user interacts with the \VE using their hands, either directly or through a physical interface.
\subsubsection{User Inputs and Interaction Techniques}
\label{interaction_techniques}

Similarly, an output \UI renders and displays the state of the system to the user (such as an \AR/\VR display, \secref{ar_displays}, or a haptic actuator, \secref{wearable_haptic_devices}).
Input \UIs can rely on either \emph{active sensing}, with a held or worn device such as a mouse, a touch screen, or a hand-held controller, or \emph{passive sensing}, which does not require contact, such as eye trackers, voice recognition, or hand tracking \cite[p.294]{laviolajr20173d}.
The captured information from the sensors is then translated into actions within the computer system by an \emph{interaction technique}.
For example, a cursor on a screen can be moved either with a mouse or with the arrow keys of a keyboard, and a two-finger swipe on a touchscreen can be used to scroll or zoom an image.
Choosing useful and efficient \UIs and interaction techniques is crucial for the user experience and the tasks that can be performed within the system.
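
As a minimal illustration of this mapping (a hypothetical sketch in Python, not an API from the cited works), the same two-finger swipe input can be bound to two different interaction techniques:

\begin{verbatim}
# Two interaction techniques mapping the same captured input
# (the vertical displacement of a two-finger swipe) to different
# actions within the system.
def scroll_technique(swipe_dy, view):
    view["offset_y"] += swipe_dy             # the swipe pans the image

def zoom_technique(swipe_dy, view):
    view["zoom"] *= 1.0 + 0.005 * swipe_dy   # the swipe zooms the image

view = {"offset_y": 0.0, "zoom": 1.0}
scroll_technique(24.0, view)  # which action occurs depends only on
zoom_technique(24.0, view)    # the chosen interaction technique
\end{verbatim}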
\subsubsection{Tasks with Virtual Environments}
\label{ve_tasks}
As the gap between real and virtual rendering is reduced, one could expect interaction with the \VE to be as similar and seamless as with the \RE, which \textcite{jacob2008realitybased} called \emph{reality-based interactions}.
As of today, an immersive \AR system tracks itself and the user in \ThreeD, using tracking sensors and pose estimation algorithms \cite{marchand2016pose}.
This enables the \VE to be registered with the \RE, and the user simply moves to navigate within the virtual content.
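
As an illustration, registration can be sketched as a change of coordinate frames (in Python with NumPy; the names and single-point example are ours): the tracker estimates the display pose every frame, and each point of the \VE, defined in world coordinates, is re-expressed in the display frame before rendering.

\begin{verbatim}
import numpy as np

def to_display_frame(T_world_display, p_world):
    # T_world_display: estimated 4x4 pose of the display in the world
    # frame, re-estimated every frame by the tracking system.
    # p_world: (3,) point of the VE expressed in world coordinates.
    p_h = np.append(p_world, 1.0)            # homogeneous coordinates
    return (np.linalg.inv(T_world_display) @ p_h)[:3]
\end{verbatim}

Because the pose is updated continuously, the virtual content appears anchored in the \RE while the user moves.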
However, direct hand manipulation of virtual content is a challenge that requires specific interaction techniques \cite{billinghurst2021grand}.
It is often achieved using two interaction techniques: \emph{tangible objects} and \emph{virtual hands} \cite[p.165]{billinghurst2015survey}.
\subsubsection{Manipulating with Tangible Objects}
\label{ar_tangibles}
As \AR integrates visual virtual content into the perception of the \RE, it can involve surrounding real objects as \UIs: either to visually augment them (\figref{roo2017inner}), or to use them as physical proxies to support interaction with virtual objects \cite{ishii1997tangible}.
Each virtual object is coupled to a real object and physically manipulated through it, providing a direct, efficient and seamless interaction with both the real and virtual content \cite{billinghurst2005designing}.
The real objects are called \emph{tangible} in this usage context.
Methods have been developed to automatically pair and adapt the virtual objects for rendering with available tangibles of similar shape and size \cite{hettiarachchi2016annexing,jain2023ubitouch} (\figref{jain2023ubitouch}).
The issue with these \emph{space-multiplexed} interfaces is the large number and variety of tangibles required.
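
Such automatic pairing can be cast as an assignment problem between virtual objects and available tangibles, as in the following minimal sketch (Python with NumPy and SciPy; the size-based cost is a deliberately crude stand-in for the shape descriptors used in the cited works):

\begin{verbatim}
import numpy as np
from scipy.optimize import linear_sum_assignment

def pair_tangibles(virtual_sizes, tangible_sizes):
    # virtual_sizes:  (N, 3) bounding-box dimensions of virtual objects.
    # tangible_sizes: (M, 3) dimensions of available tangibles, M >= N.
    # Cost of each pairing: difference in bounding-box dimensions.
    cost = np.linalg.norm(
        virtual_sizes[:, None, :] - tangible_sizes[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)  # optimal one-to-one pairing
    return dict(zip(rows.tolist(), cols.tolist()))
\end{verbatim}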
\subsubsection{Manipulating with Virtual Hands}
\label{ar_virtual_hands}
Initially tracked with active sensing devices such as gloves or controllers, hands can now be tracked in real time using passive sensing (\secref{interaction_techniques}) and computer vision algorithms natively integrated into \AR/\VR headsets \cite{tong2023survey}.
Our hands allow us to manipulate everyday real objects (\secref{grasp_types}); hence, virtual hand interaction techniques seem to be the most natural way to manipulate virtual objects \cite[p.400]{laviolajr20173d}.
\subsection{Visual Feedback of Virtual Hands in AR}
\label{ar_visual_hands}
When interacting with a physics-based virtual hand method (\secref{ar_virtual_hands}) in \VR, the visual feedback of the virtual hand influences the perception, interaction performance, and preference of users \cite{prachyabrued2014visual,argelaguet2016role,grubert2018effects,schwind2018touch}.
In a pick-and-place manipulation task in \VR, \textcite{prachyabrued2014visual} and \textcite{canales2019virtual} found that a visual hand feedback whose motion was constrained to the surface of the virtual objects, similar to that of \textcite{borst2006spring} (\enquote{Outer Hand} in \figref{prachyabrued2014visual}), performed the worst, while the visual hand feedback following the tracked human hand (thus penetrating the virtual objects, \enquote{Inner Hand} in \figref{prachyabrued2014visual}) performed the best, though it was rather disliked.
\textcite{prachyabrued2014visual} also found that the best compromise was a double feedback, showing both the virtual hand and the tracked hand (\enquote{2-Hand} in \figref{prachyabrued2014visual}).
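
The \enquote{Outer Hand} behavior can be sketched as a proxy constrained to stay outside the virtual object, in the spirit of the spring model of \textcite{borst2006spring} (a minimal single-sphere illustration in Python with NumPy; a physics engine would handle arbitrary shapes):

\begin{verbatim}
import numpy as np

def outer_hand_position(tracked_pos, center, radius):
    # tracked_pos: (3,) tracked real hand position (may penetrate).
    # center, radius: a sphere standing in for the virtual object.
    offset = tracked_pos - center
    dist = max(np.linalg.norm(offset), 1e-9)
    if dist >= radius:
        return tracked_pos           # outside: follow the tracked hand
    # Inside: project the displayed hand back onto the surface, as if
    # pulled by a stretched spring toward the tracked hand.
    return center + offset / dist * radius
\end{verbatim}

The \enquote{Inner Hand} feedback simply returns the tracked position, and the \enquote{2-Hand} feedback displays both.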
\fig{prachyabrued2014visual}{Visual hand feedback affects user experience in \VR \cite{prachyabrued2014visual}.}
Conversely, a user sees their own hands in \AR, and the mutual occlusion between the hands and the virtual objects is a common issue (\secref{ar_displays}): the virtual object should be hidden where the real hand is in front of it, and the real hand should be hidden where it is behind the virtual object (\figref{hilliges2012holodesk_2}).
While in \VST-\AR this can be solved as a masking problem by combining the real and virtual images \cite{battisti2018seamless}, \eg in \figref{suzuki2014grasping}, in \OST-\AR it is much more difficult because the \VE is displayed as a transparent \TwoD image on top of the \ThreeD \RE, which cannot be easily masked \cite{macedo2023occlusion}.
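
In \VST-\AR, this masking amounts to a per-pixel depth test between the real and virtual images, as in this minimal sketch (Python with NumPy; names are ours, and a real system would obtain the real depth from a depth sensor or a tracked hand mesh):

\begin{verbatim}
import numpy as np

def composite_with_occlusion(real_rgb, real_depth, virt_rgb, virt_depth):
    # real_depth: (H, W) depth of the RE (e.g., from a depth camera).
    # virt_depth: (H, W) depth of the rendered VE, +inf where nothing
    # virtual was drawn. Show the virtual pixel only where the virtual
    # surface is closer, so the real hand occludes virtual objects
    # behind it, and is occluded by virtual objects in front of it.
    virtual_in_front = virt_depth < real_depth
    return np.where(virtual_in_front[..., None], virt_rgb, real_rgb)
\end{verbatim}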
Since the \VE is intangible, adding a visual feedback of the virtual hand in \AR that is physically constrained to the virtual objects would achieve a similar result to the double-hand feedback of \textcite{prachyabrued2014visual}.
A virtual object overlaying a real object in \OST-\AR can vary in size and shape without degrading user experience or manipulation performance \cite{kahl2021investigation,kahl2023using}.
Few works have compared different visual feedback of the virtual hand in \AR, or combined it with wearable haptic feedback.
Rendering the real hand as a semi-transparent hand in \VST-\AR is perceived as less natural, but seems to be preferred to a mutual visual occlusion for interaction with real and virtual objects \cite{buchmann2005interaction,piumsomboon2014graspshell}.
Similarly, \textcite{blaga2017usability} evaluated direct hand manipulation in non-immersive \VST-\AR with a skeleton-like rendering \vs no visual hand feedback: while user performance did not improve, participants felt more confident with the virtual hand (\figref{blaga2017usability}).
In a collaborative task in immersive \OST-\AR \vs \VR, \textcite{yoon2020evaluating} showed that a realistic human hand rendering of the remote partner was the most preferred over a low-polygon hand and a skeleton-like hand.
\textcite{genay2021virtual} found that the sense of embodiment with a robotic hand overlay in \OST-\AR was stronger when the environment contained both real and virtual objects (\figref{genay2021virtual}).
Finally, \textcite{maisto2017evaluation} and \textcite{meli2018combining} compared the visual and haptic feedback of the hand in \VST-\AR, as detailed in the next section (\secref{vhar_rings}).
Taken together, these results suggest that a visual augmentation of the hand in \AR could improve usability and performance in direct hand manipulation tasks, but the best rendering has yet to be determined.
\begin{subfigs}{visual-hands}{Visual feedback of the virtual hand in \AR. }[][
\item Grasping a virtual object in \OST-\AR with no visual hand feedback \cite{hilliges2012holodesk}.
\subfig{buchmann2005interaction}
\subfig{blaga2017usability}
\subfig{genay2021virtual}
\end{subfigs}
\subsection{Conclusion}