\section{Results}
\label{results}

\subsection{Textures Matching}
\label{results_matching}

\paragraph{Confusion Matrix}
\label{results_matching_confusion_matrix}

\comans{JG}{For the two-sample Chi-Squared tests in the matching task, the number of samples reported is 540 due to 20 participants conducting 3 trials for 9 textures each. However, this would only hold true if the repetitions per participant would be independent and not correlated (and then, one could theoretically also run 10 participants with 6 trials each, or 5 participants with 12 trials each). If they are not independent, this would lead to an artificial inflated sample size and Type I error. If the trials are not independent (please double check), I suggest either aggregating data on the participant level or to use alternative models that account for the within-subject correlation (as was done in other chapters).}{Data of the three confusion matrices have been aggregated on the participant level and analyzed using a Poisson regression.}
\figref{results/matching_confusion_matrix} shows the confusion matrix of the \level{Matching} task, \ie the proportion of times each \response{Haptic Texture} was selected in response to the presentation of each \factor{Visual Texture}.
To determine which haptic textures were selected most often, the repetitions of the trials were first aggregated by counting the number of selections per participant for each (\factor{Visual Texture}, \response{Haptic Texture}) pair.
An \ANOVA based on a Poisson regression (no overdispersion was detected) indicated a statistically significant effect on the number of selections of the interaction \factor{Visual Texture} \x \response{Haptic Texture} (\chisqr{64}{180}{414}, \pinf{0.001}).
Post-hoc pairwise comparisons using Tukey's \HSD test then indicated statistically significant differences for the following visual textures:
\begin{itemize}
\item With \level{Sandpaper~320}, \level{Coffee Filter} was selected more often than the other haptic textures (\ztest{3.4}, \pinf{0.05} each) except \level{Plastic Mesh~1} and \level{Terra Cotta}.
\item With \level{Terra Cotta}, \level{Coffee Filter} was selected more often than the others (\ztest{3.4}, \pinf{0.05} each) except \level{Plastic Mesh~1} and \level{Terra Cotta}.
\item With \level{Coffee Filter}, \level{Coffee Filter} was selected more often than the others (\ztest{4.0}, \pinf{0.01} each) except \level{Terra Cotta}.
\end{itemize}
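
As a sketch of the model behind this analysis, and assuming the canonical log link (the link function and any participant-level term are not stated here), the expected number of selections $\mu_{pvh}$ by participant $p$ of haptic texture $h$ for visual texture $v$ could be written as follows, where $\alpha_v$, $\beta_h$ and $\gamma_{vh}$ are illustrative names for the \factor{Visual Texture}, \response{Haptic Texture} and interaction coefficients:
\[
\log \mu_{pvh} = \beta_0 + \alpha_v + \beta_h + \gamma_{vh},
\qquad
y_{pvh} \sim \mathrm{Poisson}(\mu_{pvh})
\]
With nine visual and nine haptic textures, the interaction term carries $(9-1) \times (9-1) = 64$ parameters, which is consistent with the value of 64 reported above if it denotes the degrees of freedom of the test.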
\fig[0.85]{results/matching_confusion_matrix}{Confusion matrix of the \level{Matching} task results.}[
The presented visual textures are shown as columns and the selected haptic textures as rows.
The number in a cell is the proportion of times the corresponding haptic texture was selected in response to the presentation of the corresponding visual texture.
The diagonal represents the expected correct answers.
]

Many mistakes were made: the expected haptic texture was selected on average only \percent{20} of the time for five of the visual textures, and only around \percent{5} of the time for (visual) \level{Sandpaper~100}, \level{Brick~2}, and \level{Sandpaper~320}.
Only the haptic \level{Coffee Filter} was selected correctly more than half of the time (\percent{57}), and it was also frequently matched with the visual \level{Sandpaper~320} and \level{Terra Cotta} (around \percent{44} each).
Similarly, the haptic textures \level{Sandpaper~320} and \level{Plastic Mesh~1} were selected for four and three visual textures, respectively (around \percent{25} each).
Additionally, the Spearman correlations between the trials were computed for each participant, and only 21 out of 60 were statistically significant (\pinf{0.05}), with a mean \spearman{0.52} \ci{0.43}{0.59}.

These results indicate that the participants hesitated between several haptic textures for a given visual texture, as also reported in several comments, with some haptic textures being favored while others were almost never selected.
Another explanation could be that the participants had difficulty estimating the roughness of the visual textures.
Indeed, many participants explained that they tried to identify or imagine the roughness of a given visual texture and then to select the most plausible haptic texture, in terms of frequency and/or amplitude of vibrations.

\paragraph{Completion Time}

To verify that the difficulty was the same for all the visual textures in the \level{Matching} task, the \response{Completion Time} of a trial was analyzed.
As the \response{Completion Time} results were Gamma distributed, they were log-transformed to approximate a normal distribution.
An \ANOVA based on a \LMM on the log \response{Completion Time} with the \factor{Visual Texture} as fixed effect and the participant as random intercept was performed.
Normality was verified with a QQ-plot of the model residuals.
No statistically significant effect of \factor{Visual Texture} was found (\anova{8}{512}{1.9}, \p{0.06}) on \response{Completion Time} (\geomean{44}{\s} \ci{42}{46}), suggesting a similar difficulty and participant behaviour for all the visual textures.
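
As a sketch of the model described above (the symbols are illustrative and not taken from the analysis itself), the \response{Completion Time} $t_{pvi}$ of trial $i$ by participant $p$ with visual texture $v$ can be modelled on the log scale with a fixed effect $\beta_v$ of \factor{Visual Texture} and a random intercept $u_p$ per participant, assuming the usual normal distributions for $u_p$ and for the residual $\varepsilon_{pvi}$:
\[
\log t_{pvi} = \beta_0 + \beta_v + u_p + \varepsilon_{pvi},
\qquad
u_p \sim \mathcal{N}(0, \sigma_u^2),
\quad
\varepsilon_{pvi} \sim \mathcal{N}(0, \sigma^2)
\]
Back-transforming a mean computed on the log scale yields a geometric mean, which is why the central value above is reported as such.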
\subsection{Textures Ranking}
\label{results_ranking}

\figref{results/rankings_modality} presents the results of the three rankings of the haptic textures alone, the visual textures alone, and the visuo-haptic texture pairs.
For each ranking, a Friedman test was performed with post-hoc Wilcoxon signed-rank tests and Holm-Bonferroni adjustment.

\fig[1]{results/rankings_modality}{Means with bootstrap \percent{95} \CI of the \level{Ranking} task results for each \factor{Modality}.}[
Shown for the haptic textures alone (left), the visual textures alone (center) and the visuo-haptic texture pairs (right).
The order of the visual textures on the x-axis differs between modalities.
A lower rank means that the texture was considered rougher, a higher rank means smoother.
Wilcoxon signed-rank tests with Holm-Bonferroni adjustment: all comparisons were statistically significantly different (\pinf{0.05}) except when marked with an \enquote{X}.
]

\paragraph{Haptic Textures Ranking}

Almost all the texture pairs in the haptic textures ranking results were statistically significantly different (\chisqr{8}{20}{146}, \pinf{0.001}; \pinf{0.05} for each comparison; see \figref{results/rankings_modality}, left).
However, no difference was found between the pairs (\level{Metal Mesh}, \level{Sandpaper~100}), (\level{Cork}, \level{Brick~2}), (\level{Cork}, \level{Sandpaper~320}), (\level{Plastic Mesh~1}, \level{Velcro Hooks}), and (\level{Plastic Mesh~1}, \level{Terra Cotta}).
Average Kendall's Tau correlations between the participants indicated a high consensus (\kendall{0.82} \ci{0.81}{0.84}), showing that participants perceived the roughness of the haptic textures similarly.
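
For reference, Kendall's Tau between two rankings of $n$ textures is commonly defined from the number of concordant pairs $C$ and discordant pairs $D$, \ie the pairs of textures ordered the same way, or the opposite way, in both rankings. The variant below assumes rankings without ties, and the notation is illustrative:
\[
\tau = \frac{C - D}{n(n-1)/2}
\]
A value of $1$ corresponds to identical rankings and $-1$ to fully reversed rankings; the consensus values reported in this section average this coefficient between participants.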
\paragraph{Visual Textures Ranking}

Most of the texture pairs in the visual textures ranking results were also statistically significantly different (\chisqr{8}{20}{119}, \pinf{0.001}; \pinf{0.05} for each comparison; see \figref{results/rankings_modality}, center), except for the following.
No difference was found between \level{Plastic Mesh~1} and any of \level{Metal Mesh}, \level{Brick~2}, \level{Sandpaper~100}, \level{Cork}, and \level{Velcro Hooks}, nor between \level{Velcro Hooks} and any of \level{Sandpaper~100}, \level{Cork}, and \level{Brick~2}.
No difference was found either between the pairs (\level{Metal Mesh}, \level{Cork}), (\level{Sandpaper~100}, \level{Brick~2}), (\level{Sandpaper~320}, \level{Terra Cotta}) and (\level{Sandpaper~320}, \level{Coffee Filter}).
Even though the consensus was high (\kendall{0.61} \ci{0.58}{0.64}), the roughness of the visual textures was more difficult to estimate, in particular for \level{Plastic Mesh~1} and \level{Velcro Hooks}.

\paragraph{Visuo-Haptic Textures Ranking}

Also, almost all the texture pairs in the visuo-haptic textures ranking results were statistically significantly different (\chisqr{8}{20}{140}, \pinf{0.001}; \pinf{0.05} for each comparison; see \figref{results/rankings_modality}, right).
However, no difference was found between the textures for each of the following groups: \{\level{Sandpaper~100}, \level{Cork}\}; \{\level{Cork}, \level{Brick~2}\}; and \{\level{Plastic Mesh~1}, \level{Velcro Hooks}, \level{Sandpaper~320}\}.
The consensus between the participants was also high (\kendall{0.77} \ci{0.74}{0.79}).

Finally, the similarity of the three rankings of each participant was calculated (\figref{results/rankings_texture}).
The \textit{Visuo-Haptic Textures Ranking} was on average highly similar to the \textit{Haptic Textures Ranking} (\kendall{0.79} \ci{0.72}{0.86}) and moderately similar to the \textit{Visual Textures Ranking} (\kendall{0.48} \ci{0.39}{0.56}).
A Wilcoxon signed-rank test indicated that this difference was statistically significant (\wilcoxon{190}, \p{0.002}).
These results indicate that the haptic and visual modalities were integrated, the resulting roughness ranking lying between the two rankings of the modalities alone, but with haptics predominating.

\fig[1]{results/rankings_texture}{Means with bootstrap \percent{95} \CI of the \level{Ranking} task results for each \factor{Visual Texture}.}[
A lower rank means that the texture was considered rougher, a higher rank means smoother.
]

\subsection{Perceived Similarity of Visual and Haptic Textures}
\label{results_clusters}

The high level of agreement between participants on the three haptic, visual and visuo-haptic rankings in the \level{Ranking} task (\secref{results_ranking}), as well as the similarity of the within-participant rankings, suggests that participants perceived the roughness of the textures similarly, but differed in their strategies for matching the haptic and visual textures in the \level{Matching} task (\secref{results_matching}).

To further investigate the perceived similarity of the haptic and visual textures, and to identify groups of textures that were perceived as similar on the \level{Matching} task, a correspondence analysis and a hierarchical clustering were performed on the matching task confusion matrix (\figref{results/matching_confusion_matrix}).

\paragraph{Correspondence Analysis}

The correspondence analysis captured \percent{60} and \percent{29} of the variance in the first and second dimensions, respectively, with each of the remaining dimensions accounting for less than \percent{5}.
\figref{results/matching_correspondence_analysis} shows the first two dimensions with the 18 haptic and visual textures.
The first dimension was similar to the rankings (\figref{results/rankings_texture}), distributing the textures according to their perceived roughness.
The second dimension seems to oppose textures that were perceived as hard to those perceived as softer, as also reported by participants.
Stiffness is indeed an important perceptual dimension of a material (\secref[related_work]{hardness}).
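
For reference, in a correspondence analysis the share of variance (inertia) carried by dimension $k$ is the ratio of its eigenvalue $\lambda_k$ to the sum of all eigenvalues. The notation below is illustrative, with $K$ the number of dimensions, at most $\min(9, 9) - 1 = 8$ for a $9 \times 9$ confusion matrix:
\[
\mathrm{share}_k = \frac{\lambda_k}{\sum_{j=1}^{K} \lambda_j}
\]
The first two dimensions thus jointly account for about \percent{89} of the variance, so a two-dimensional map retains most of the structure of the confusion matrix.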
\fig[1]{results/matching_correspondence_analysis}{Correspondence analysis of the confusion matrix of the \level{Matching} task.}[
The closer the haptic and visual textures are, the more similar they were judged.
The first dimension (horizontal axis) explains \percent{60} of the variance, the second dimension (vertical axis) explains \percent{29} of the variance.
The confusion matrix is shown in \figref{results/matching_confusion_matrix}.
]

\paragraph{Hierarchical Clustering}

\figref{results_clusters} shows the dendrograms of the two hierarchical clusterings of the haptic and visual textures, constructed using the Euclidean distance and Ward's method on squared distances.

The four identified haptic texture clusters were: \enquote{Roughest} \{\level{Metal Mesh}, \level{Sandpaper~100}, \level{Brick~2}, \level{Cork}\}; \enquote{Rougher} \{\level{Sandpaper~320}, \level{Velcro Hooks}\}; \enquote{Smoother} \{\level{Plastic Mesh~1}, \level{Terra Cotta}\}; \enquote{Smoothest} \{\level{Coffee Filter}\} (\figref{results/clusters_haptic}).
Similar to the haptic ranks (\figref{results/rankings_modality}, left), the clusters could be named according to their perceived roughness.
This also suggests that the participants compared and ranked the haptic textures during the \level{Matching} task to select the one that best matched the given visual texture.

The five identified visual texture clusters were: \enquote{Roughest} \{\level{Metal Mesh}\}; \enquote{Rougher} \{\level{Sandpaper~100}, \level{Brick~2}, \level{Velcro Hooks}\}; \enquote{Medium} \{\level{Cork}, \level{Plastic Mesh~1}\}; \enquote{Smoother} \{\level{Sandpaper~320}, \level{Terra Cotta}\}; \enquote{Smoothest} \{\level{Coffee Filter}\} (\figref{results/clusters_visual}).
They are also easily identifiable on the visual ranking results, which made it possible to name them as well.

\begin{subfigs}{results_clusters}{Dendrograms of the hierarchical clusterings of the confusion matrix of the \level{Matching} task.}[
Computed with the Euclidean distance and Ward's method on squared distances.
The height of the dendrograms represents the distance between the clusters.
][
\item For the haptic textures.
\item For the visual textures.
]
\subfig[0.48]{results/clusters_haptic}
\subfig[0.48]{results/clusters_visual}
\end{subfigs}
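
For reference, Ward's method merges at each step the two clusters $A$ and $B$ whose union least increases the within-cluster sum of squares. With centroids $c_A$ and $c_B$ and cluster sizes $|A|$ and $|B|$ (illustrative notation, not taken from the analysis itself), this increase is:
\[
\Delta(A, B) = \frac{|A| \, |B|}{|A| + |B|} \, \| c_A - c_B \|^2
\]
This is why the criterion is expressed on squared Euclidean distances. How the merge cost maps to the dendrogram height depends on the implementation used.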
\paragraph{Confusion Matrices of Clusters}

Based on these results, two alternative confusion matrices were constructed.
Similarly to \secref{results_matching}, an \ANOVA based on a Poisson regression was performed for each confusion matrix on the number of selections, followed by post-hoc pairwise comparisons using Tukey's \HSD test.
No overdispersion was detected on the Poisson regressions.

\figref{results/haptic_visual_clusters_confusion_matrices} (left) shows the confusion matrix of the \level{Matching} task with visual texture clusters and the proportion of haptic texture clusters selected in response.
There was a statistically significant effect on the number of selections of the interaction visual texture cluster \x haptic texture cluster (\chisqr{12}{180}{324}, \pinf{0.001}), and statistically significant differences for the following visual clusters:
\begin{itemize}
\item With \enquote{Roughest}, the haptic cluster \enquote{Roughest} was the most selected (\ztest{4.6}, \pinf{0.001}).
\item With \enquote{Rougher}, \enquote{Smoothest} was the least selected (\ztest{-4.0}, \pinf{0.001}) and \enquote{Rougher} was selected more often than \enquote{Smoother} (\ztest{-3.4}, \pinf{0.001}).
\item With \enquote{Medium}, \enquote{Rougher} and \enquote{Smoother} were both (\ztest{4.5}, \pinf{0.001}) selected more often than \enquote{Roughest} and \enquote{Smoothest}.
\item With \enquote{Smoother}, \enquote{Smoother} (\ztest{4.2}, \pinf{0.001}) and \enquote{Smoothest} (\ztest{4.7}, \pinf{0.001}) were both selected more often than \enquote{Roughest} and \enquote{Rougher}.
\item With \enquote{Smoothest}, \enquote{Smoother} (\ztest{2.6}, \p{0.05}) and \enquote{Smoothest} (\ztest{3.9}, \pinf{0.001}) were both selected more often than \enquote{Roughest} and \enquote{Rougher}.
\end{itemize}

\figref{results/haptic_visual_clusters_confusion_matrices} (right) shows the confusion matrix of the \level{Matching} task with visual texture ranks and the proportion of haptic texture clusters selected in response.
There was a statistically significant effect on the number of selections of the visual texture rank \x haptic texture cluster interaction (\chisqr{24}{180}{340}, \pinf{0.001}), and statistically significant differences for the following visual texture ranks:
\begin{itemize}
\item Rank 0: the haptic cluster \enquote{Roughest} was the most selected (\ztest{4.5}, \pinf{0.001}).
\item Ranks 1, 2 and 3: \enquote{Smoothest} was the least selected (\ztest{-3.0}, \p{0.04}).
\item Rank 4: \enquote{Rougher} was selected more often than \enquote{Roughest} and \enquote{Smoothest} (\ztest{3.0}, \p{0.03}).
\item Rank 5: \enquote{Rougher} and \enquote{Smoother} were both (\ztest{4.5}, \p{0.01}) selected more often than \enquote{Roughest} and \enquote{Smoothest}.
\item Rank 6: \enquote{Smoother} was selected more often than \enquote{Roughest} (\ztest{3.2}, \p{0.006}).
\item Rank 7: \enquote{Smoother} and \enquote{Smoothest} were both (\ztest{3.4}, \p{0.04}) selected more often than \enquote{Roughest} and \enquote{Rougher}.
\item Rank 8: \enquote{Smoother} and \enquote{Smoothest} were both (\ztest{3.2}, \p{0.04}) selected more often than \enquote{Roughest} and \enquote{Rougher}.
\end{itemize}

\fig{results/haptic_visual_clusters_confusion_matrices}{
Confusion matrices of the visual texture clusters (left) or ranks (right) with the corresponding haptic texture clusters selected in proportion.
}[]

\subsection{Questionnaire}
\label{results_questions}

\figref{results_questions} presents the questionnaire results of the \level{Matching} and \level{Ranking} tasks.
A non-parametric \ANOVA on \ART models was used for the \response{Difficulty} and \response{Realism} question results.
The other question results were analyzed using Wilcoxon signed-rank tests, with Holm-Bonferroni adjustment.
The results are shown as mean $\pm$ standard deviation.

On \response{Difficulty}, there were statistically significant effects of \factor{Task} (\anova{1}{57}{13}, \pinf{0.001}) and of \factor{Modality} (\anova{1}{57}{8}, \p{0.007}), but no interaction effect \factor{Task} \x \factor{Modality} (\anova{1}{57}{2}, \ns).
The \level{Ranking} task was found easier (\num{2.9 \pm 1.2}) than the \level{Matching} task (\num{3.9 \pm 1.5}), and the haptic textures were found easier to discriminate (\num{3.0 \pm 1.3}) than the visual ones (\num{3.8 \pm 1.5}).

Both haptic and visual textures were judged moderately realistic for both tasks (\num{4.2 \pm 1.3}), with no statistically significant effect of \factor{Task}, \factor{Modality} or their interaction on \response{Realism}.
No statistically significant effects of \factor{Task} on \response{Textures Match} and \response{Uncomfort} were found either.
The coherence of the texture pairs was considered moderate (\num{4.6 \pm 1.2}) and the haptic device was not perceived as uncomfortable (\num{2.4 \pm 1.4}).

\begin{subfigs}{results_questions}{Boxplots of the questionnaire results by \factor{Modality} and by \factor{Task}.}[
Pairwise Wilcoxon signed-rank tests with Holm-Bonferroni adjustment: * is \pinf{0.05}, ** is \pinf{0.01} and *** is \pinf{0.001}.
Lower is better for Difficulty and Uncomfortable; higher is better for Realism and Textures Match.
][
\item By \factor{Modality}.
\item By \factor{Task}.
]
\subfigsheight{70mm}
\subfig{results/questions_modalities}
\subfig{results/questions_tasks}
\end{subfigs}