Complete related work

2024-09-23 03:09:45 +02:00
parent ee2b739ddb
commit 0495afd60c
12 changed files with 213 additions and 180 deletions

@@ -205,8 +205,8 @@ In an immersive and portable \AR system, this \VE is experienced at a 1:1 scale
The rendering gap between the real and virtual elements, as depicted in the interaction loop in \figref[introduction]{interaction-loop}, is thus experienced as very narrow, or even not consciously perceived by the user.
This manifests as a sense of presence of the virtual, as described in \secref{ar_presence}.
As the gap between real and virtual rendering is reduced, one could expect a similar and seamless interaction with the \VE as with a \RE, which \textcite{jacob2008realitybased} called \emph{reality-based interactions}.
As of today, an immersive \AR system tracks itself, and thus the user, in \ThreeD, using tracking sensors and pose estimation algorithms \cite{marchand2016pose}, \eg as in \figref{newcombe2011kinectfusion}.
This enables the \VE to be registered with the \RE, so the user simply moves to navigate within the virtual content.
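For illustration (using our own notation, not that of the cited works), if the tracker estimates the pose of the headset camera in the world frame as a homogeneous transform $\mathbf{T}_{wc}$, a \VO anchored at a fixed world pose $\mathbf{T}_{wo}$ is rendered each frame at the camera-relative pose
\[
\mathbf{T}_{co} = \mathbf{T}_{wc}^{-1} \, \mathbf{T}_{wo} ,
\]
so that the \VO appears to stay fixed in the \RE while the user moves around it.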
%This tracking and mapping of the user and \RE into the \VE is named the \enquote{extent of world knowledge} by \textcite{skarbez2021revisiting}, \ie to what extent the \AR system knows about the \RE and is able to respond to changes in it.
However, direct hand manipulation of virtual content is a challenge that requires specific interaction techniques \cite{billinghurst2021grand}.
@@ -218,18 +218,18 @@ It is often achieved using two interaction techniques: \emph{tangible objects} a
As \AR integrates visual virtual content into the perception of the \RE, it can involve surrounding real objects as \UIs: to visually augment them, \eg by superimposing visual textures \cite{roo2017inner} (\figref{roo2017inner}), and to use them as physical proxies to support interaction with \VOs \cite{ishii1997tangible}.
According to \textcite{billinghurst2005designing}, each \VO is coupled to a tangible object, and the \VO is physically manipulated through the tangible object, providing a direct, efficient and seamless interaction with both the real and virtual content.
This technique is similar to mapping the movements of a mouse to a virtual cursor on a screen.
Methods have been developed to automatically pair and adapt the \VOs for rendering with available tangibles of similar shape and size \cite{hettiarachchi2016annexing,jain2023ubitouch} (\figref{jain2023ubitouch}).
The issue with these \emph{space-multiplexed} interfaces is the large number and variety of tangibles required.
An alternative is to use a single \emph{universal} tangible object like a hand-held controller, such as a cube \cite{issartel2016tangible} or a sphere \cite{englmeier2020tangible}.
These \emph{time-multiplexed} interfaces require interaction techniques that allow the user to pair the tangible with any \VO, \eg by placing the tangible into the \VO and pressing the fingers \cite{issartel2016tangible} (\figref{issartel2016tangible}), similar to a real grasp (\secref{grasp_types}).
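In both cases, the pairing can be summarised (in our own illustrative notation) as maintaining a rigid transform between the tangible and its \VO: at pairing time, the offset $\mathbf{T}_{to}$ between the tracked tangible pose $\mathbf{T}_{wt}$ and the \VO pose $\mathbf{T}_{wo}$ is stored, and the \VO then follows the tangible each frame as
\[
\mathbf{T}_{wo} = \mathbf{T}_{wt} \, \mathbf{T}_{to} ,
\]
much like the fixed mapping between a mouse and its virtual cursor.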
Still, the virtual visual rendering and the tangible haptic sensations can be inconsistent.
This is especially true in \OST-\AR, since the \VOs are inherently slightly transparent, allowing the paired tangibles to be seen through them.
In a pick-and-place task with tangibles of different shapes, a difference in size \cite{kahl2021investigation} (\figref{kahl2021investigation}) and shape \cite{kahl2023using} (\figref{kahl2023using}) between the tangibles and the \VOs did not affect user performance or presence, and small variations (\percent{\sim 10} for size) were not even noticed by the users.
This suggests the feasibility of using simplified tangibles in \AR whose spatial properties (\secref{object_properties}) abstract those of the \VOs.
Similarly, in \secref{tactile_rendering} we described how a material property (\secref{object_properties}) of a touched tangible can be modified using wearable haptic devices \cite{detinguy2018enhancing,salazar2020altering}: this could be used to render coherent visuo-haptic material perceptions of tangibles touched directly with the hand in \AR.
\begin{subfigs}{ar_applications}{Manipulating \VOs with tangibles. }[
\item Ubi-Touch paired the movements and screw interaction of a virtual drill with a real vaporizer held by the user \cite{jain2023ubitouch}.
@@ -248,8 +248,8 @@ Similarly, we described in \secref{tactile_rendering} how a material property (\
\subsubsection{Manipulating with Virtual Hands}
\label{ar_virtual_hands}
Natural \UIs allow the user to use their body movements directly as inputs to the \VE \cite{billinghurst2015survey}.
Our hands allow us to manipulate real everyday objects with both strength and precision (\secref{grasp_types}), so virtual hand interaction techniques seem to be the most natural way to manipulate virtual objects \cite{laviola20173d}.
While hands were initially tracked by active sensing devices such as gloves or controllers, it is now possible to track them in real time using cameras and computer vision algorithms natively integrated into \AR/\VR headsets \cite{tong2023survey}.
The user's hand is therefore tracked and reconstructed as a \emph{virtual hand} model in the \VE \cite{billinghurst2015survey,laviola20173d}.
@@ -259,16 +259,18 @@ The most common technique is to reconstruct all the phalanges of the hand in an
The contacts between the virtual hand model and the \VOs are then simulated using heuristic or physics-based techniques \cite{laviola20173d}.
Heuristic techniques use rules to determine the selection, manipulation and release of a \VO (\figref{piumsomboon2013userdefined_1}).
However, they produce unrealistic behaviour and are limited to the cases predicted by the rules.
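As a minimal sketch of such a rule (a hypothetical pinch heuristic of our own, not taken from the cited works, assuming a \VO handle exposing \texttt{grabbed}, \texttt{contains} and \texttt{position}):
\begin{verbatim}
from math import dist  # Euclidean distance, Python >= 3.8

GRAB_DIST, RELEASE_DIST = 0.02, 0.05   # metres; assumed thresholds

def update_grasp(vo, thumb_tip, index_tip):
    # vo: assumed VO handle; pinch to select, open to release,
    # and a grabbed VO simply follows the pinch midpoint.
    pinch = dist(thumb_tip, index_tip)
    if not vo.grabbed and pinch < GRAB_DIST and vo.contains(thumb_tip):
        vo.grabbed = True                      # selection
    elif vo.grabbed and pinch > RELEASE_DIST:
        vo.grabbed = False                     # release
    if vo.grabbed:                             # manipulation
        vo.position = [(a + b) / 2 for a, b in zip(thumb_tip, index_tip)]
\end{verbatim}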
Physics-based techniques simulate forces at the points of contact between the virtual hand and the \VO.
In particular, \textcite{borst2006spring} have proposed an articulated kinematic model in which each phalanx is a rigid body simulated with the god-object method \cite{zilles1995constraintbased}:
The virtual phalanx follows the movements of the real phalanx, but remains constrained to the surface of the virtual objects during contact.
The forces acting on the object are calculated as a function of the distance between the real and virtual hands (\figref{borst2006spring}).
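In a simplified form (our notation; the gains are illustrative and not taken from \textcite{borst2006spring}), each phalanx is coupled to its tracked counterpart by a spring-damper, so that the force applied at a contact is
\[
\mathbf{f} = k_{p} \left( \mathbf{p}_{r} - \mathbf{p}_{v} \right) - k_{d} \, \dot{\mathbf{p}}_{v} ,
\]
where $\mathbf{p}_{r}$ and $\mathbf{p}_{v}$ are the real and virtual phalanx positions and $k_{p}$, $k_{d}$ are stiffness and damping gains, with an analogous torsional coupling for orientation: the deeper the real hand penetrates the \VO, the larger the real-virtual offset and thus the applied force.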
More advanced techniques simulate friction phenomena \cite{talvas2013godfinger} and finger deformations \cite{talvas2015aggregate}, allowing highly accurate and realistic interactions, but these can be difficult to compute in real time.
\begin{subfigs}{virtual-hand}{Manipulating \VOs with virtual hands. }[
\item A fingertip tracking technique that allows the user to select a \VO by opening the hand \cite{lee2007handy}.
\item Physics-based hand-object manipulation with a virtual hand made of many small rigid-body spheres \cite{hilliges2012holodesk}.
\item Grasping a \VO through gestures when the fingers are detected as opposing on it \cite{piumsomboon2013userdefined}.
\item A kinematic hand model with rigid-body phalanges (in beige) that follows the real tracked hand (in green) but remains physically constrained to the \VO. Applied forces are shown as red arrows \cite{borst2006spring}.
]
\subfigsheight{37mm}
\subfig{lee2007handy}
@@ -278,7 +280,7 @@ More advanced techniques simulate the friction phenomena \cite{talvas2013godfing
\end{subfigs}
However, the lack of physical constraints on the user's hand movements makes manipulation actions tiring \cite{hincapie-ramos2014consumed}.
While the user's fingers traverse the virtual object, a physics-based virtual hand remains in contact with the object, a discrepancy that may degrade the user's performance in \VR \cite{prachyabrued2012virtual}.
Finally, in the absence of haptic feedback on each finger, it is difficult to estimate the contact and forces exerted by the fingers on the object during grasping and manipulation \cite{maisto2017evaluation,meli2018combining}.
While a visual rendering of the virtual hand in \VR can compensate for these issues \cite{prachyabrued2014visual}, the visual and haptic rendering of the virtual hand in \AR, or their combination, is under-researched.
@@ -286,16 +288,16 @@ While a visual rendering of the virtual hand in \VR can compensate for these iss
\subsection{Visual Rendering of Hands in AR}
\label{ar_visual_hands}
In \VR, since the user is fully immersed in the \VE and cannot see their real hands, it is necessary to represent them virtually (\secref{ar_embodiment}).
When interacting with a physics-based virtual hand method (\secref{ar_virtual_hands}), the visual rendering of the virtual hand has an influence on perception, interaction performance, and preference of users \cite{prachyabrued2014visual,argelaguet2016role,grubert2018effects,schwind2018touch}.
In a pick-and-place manipulation task in \VR, \textcite{prachyabrued2014visual} and \textcite{canales2019virtual} found that a visual hand rendering whose motion was constrained to the surface of the \VOs, similar to \textcite{borst2006spring} (\enquote{Outer Hand} in \figref{prachyabrued2014visual}), performed the worst, while a visual hand rendering following the tracked human hand (thus penetrating the \VOs, \enquote{Inner Hand} in \figref{prachyabrued2014visual}) performed the best, though it was rather disliked.
\textcite{prachyabrued2014visual} also found that the best compromise was a double rendering, showing both the virtual hand and the tracked hand (\enquote{2-Hand} in \figref{prachyabrued2014visual}).
While a realistic rendering of the human hand increased the sense of ownership \cite{lin2016need}, a skeleton-like rendering provided a stronger sense of agency \cite{argelaguet2016role} (\secref{ar_embodiment}), and a minimalist fingertip rendering reduced typing errors \cite{grubert2018effects}.
A visual hand rendering in the \VE also seems to affect how one grasps an object \cite{blaga2020too}, or how real bumps and holes are perceived \cite{schwind2018touch}.
\fig{prachyabrued2014visual}{Visual hand renderings affect user experience in \VR \cite{prachyabrued2014visual}.}
Conversely, a user sees their own hands in \AR, and the mutual occlusion between the hands and the \VOs is a common issue (\secref{ar_displays}), \ie hiding the \VO when the real hand is in front of it, and hiding the real hand when it is behind the \VO (\figref{hilliges2012holodesk_2}).
%For example, in \figref{hilliges2012holodesk_2}, the user is pinching a virtual cube in \OST-\AR with their thumb and index fingers, but while the index is behind the cube, it is seen as in front of it.
While in \VST-\AR, this could be solved as a masking problem by combining the real and virtual images \cite{battisti2018seamless}, \eg in \figref{suzuki2014grasping}, in \OST-\AR, this is much more difficult because the \VE is displayed as a transparent \TwoD image on top of the \ThreeD \RE, which cannot be easily masked \cite{macedo2023occlusion}.
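In \VST-\AR, this masking can be written per pixel (a minimal sketch, assuming a binary hand-segmentation mask $M$):
\[
I_{\mathrm{out}}(x) = M(x) \, I_{\mathrm{real}}(x) + \bigl( 1 - M(x) \bigr) \, I_{\mathrm{virt}}(x) ,
\]
where $I_{\mathrm{real}}$ is the camera image, $I_{\mathrm{virt}}$ is the virtual content composited over it, and $M(x) = 1$ where the hand lies in front of the \VO; no such final compositing step exists in \OST-\AR, where the optical combiner can only add light to the direct view of the \RE.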
%Yet, even in \VST-\AR,
@@ -304,19 +306,19 @@ While in \VST-\AR, this could be solved as a masking problem by combining the re
%Although perceived as less natural, this seems to be preferred to a mutual visual occlusion in \VST-\AR \cite{buchmann2005interaction,ha2014wearhand,piumsomboon2014graspshell} and \VR \cite{vanveldhuizen2021effect}, but has not yet been evaluated in \OST-\AR.
%However, this effect still causes depth conflicts that make it difficult to determine if one's hand is behind or in front of a \VO, \eg the thumb is in front of the virtual cube, but could be perceived to be behind it.
Since the \VE is intangible, adding a visual rendering of the virtual hand in \AR that is physically constrained to the \VOs would achieve a similar result to the promising double-hand rendering of \textcite{prachyabrued2014visual}.
A \VO overlaying a tangible object in \OST-\AR can vary in size and shape without degrading user experience or manipulation performance \cite{kahl2021investigation,kahl2023using}.
This suggests that a visual hand rendering superimposed on the real hand as a partial avatarization (\secref{ar_embodiment}) might be helpful without impairing the user.
Few works have compared different visual hand renderings in \AR or with wearable haptic feedback.
Rendering the real hand as a semi-transparent hand in \VST-\AR is perceived as less natural but seems to be preferred to a mutual visual occlusion for interaction with real and virtual objects \cite{buchmann2005interaction,piumsomboon2014graspshell}.
%Although perceived as less natural, this seems to be preferred to a mutual visual occlusion in \VST-\AR \cite{buchmann2005interaction,ha2014wearhand,piumsomboon2014graspshell} and \VR \cite{vanveldhuizen2021effect}, but has not yet been evaluated in \OST-\AR.
Similarly, \textcite{blaga2017usability} evaluated direct hand manipulation in non-immersive \VST-\AR with a skeleton-like rendering \vs no visual hand rendering: while user performance did not improve, participants felt more confident with the virtual hand (\figref{blaga2017usability}).
%\textcite{krichenbauer2018augmented} found that participants were \percent{22} faster in immersive \VST-\AR than in \VR in the same pick-and-place manipulation task, but no visual hand rendering was used in \VR while the real hand was visible in \AR.
In a collaborative task in immersive \OST-\AR \vs \VR, \textcite{yoon2020evaluating} showed that a realistic human hand rendering was preferred over a low-polygon hand and a skeleton-like hand for the remote partner.
\textcite{genay2021virtual} found that the \SoE with robotic hand overlays in \OST-\AR was stronger when the environment contained both real and virtual objects (\figref{genay2021virtual}).
Finally, \textcite{maisto2017evaluation} and \textcite{meli2018combining} compared the visual and haptic rendering of the hand in \VST-\AR, as detailed in the next section (\secref{vhar_rings}).
Taken together, these results suggest that a visual rendering of the hand in \AR could improve usability and performance in direct hand manipulation tasks, but the best rendering has yet to be determined.
%\cite{chan2010touching} : cues for touching (selection) \VOs.
%\textcite{saito2021contact} found that masking the real hand with a textured 3D opaque virtual hand did not improve performance in a reach-to-grasp task but displaying the points of contact on the \VO did.
%To the best of our knowledge, evaluating the role of a visual rendering of the hand displayed \enquote{and seen} directly above real tracked hands in immersive OST-AR has not been explored, particularly in the context of \VO manipulation.
@@ -340,10 +342,10 @@ Taken together, these results suggest that a visual hand rendering in \AR could
\subsection{Conclusion}
\label{ar_conclusion}
\AR systems integrate virtual content into the user's perception as if it were part of the \RE.
\AR headsets now enable real-time tracking of the head and hands, and high-quality display of virtual content, while being portable and mobile.
They enable highly immersive \AEs that users can explore with a strong sense of the presence of the virtual content.
However, without a direct and seamless interaction with the \VOs using the hands, the coherence of the \AE experience is compromised.
In particular, there is a lack of mutual occlusion and interaction cues between the hands and virtual content when manipulating \VOs in \OST-\AR that could be mitigated by a visual rendering of the hand.
A common alternative approach is to use tangible objects as proxies for interaction with \VOs, but this raises concerns about their consistency with the visual rendering.
In this context, the use of wearable haptic systems worn on the hand seems to be a promising solution both for improving direct hand manipulation of \VOs and for coherent visuo-haptic augmentation of touched tangible objects.