Fix vhar textures results

2025-04-08 17:07:04 +02:00
parent 65ab354b62
commit 5f9638c718
20 changed files with 63 additions and 36 deletions

@@ -7,27 +7,28 @@
\paragraph{Confusion Matrix}
\label{results_matching_confusion_matrix}
\figref{results/matching_confusion_matrix} shows the confusion matrix of the \level{Matching} task with the visual textures and the proportion of haptic texture selected in response, \ie the proportion of times the corresponding haptic texture was selected in response to the presentation of the corresponding visual texture.
A two-sample Pearson Chi-Squared test (\chisqr{64}{540}{420}, \pinf{0.001}) and Holm-Bonferroni adjusted binomial tests indicated that the following (\factor{Visual Texture}, \response{Haptic Texture}) pairs have proportion selections statistically significantly higher than chance (\ie \percent{11} each):
\figref{results/matching_confusion_matrix} shows the confusion matrix of the \level{Matching} task with the visual textures and the proportion of haptic texture selected in response, \ie the proportion of times the corresponding \response{Haptic Texture} was selected in response to the presentation of the corresponding \factor{Visual Texture}.
To determine which haptic textures were selected most often, the repetitions of the trials were first aggregated by counting the number of selections per participant for each (\factor{Visual Texture}, \response{Haptic Texture}) pair.
An \ANOVA based on mixed Poisson regression indicated a statistically significant effect on the number of selections of the interaction \factor{Visual Texture} \x\ \response{Haptic Texture} (\chisqr{64}{180}{414}, \pinf{0.001}).
No overdispersion was detected on the Poisson regression.
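As a sketch of the model form, with notation introduced here only for illustration and assuming a random intercept per participant (the random-effects structure is an assumption), the number of selections $n_{pvh}$ by participant $p$ of haptic texture $h$ for visual texture $v$ can be written as:
\[
	n_{pvh} \sim \mathrm{Poisson}(\lambda_{pvh}), \qquad
	\log \lambda_{pvh} = \beta_0 + \alpha_v + \gamma_h + (\alpha\gamma)_{vh} + u_p, \qquad
	u_p \sim \mathcal{N}(0, \sigma_u^2).
\]
With nine visual and nine haptic textures, the interaction term $(\alpha\gamma)_{vh}$ has $(9-1) \times (9-1) = 64$ degrees of freedom, which appears to match the 64 reported for the test, and a uniformly random selection would give each haptic texture a proportion of $1/9 \approx 0.11$.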
Post-hoc pairwise comparisons using Tukey's \HSD test then indicated that there were statistically significant differences for the following visual textures:
\begin{itemize}
\item (\level{Sandpaper~320}, \level{Coffee Filter}), (\level{Terra Cotta}, \level{Coffee Filter}), and (\level{Coffee Filter}, \level{Coffee Filter}) (\pinf{0.001} each);
\item (\level{Cork}, \level{Sandpaper~320}), (\level{Brick~2}, \level{Plastic Mesh~1}), (\level{Brick~2}, \level{Sandpaper~320}), (\level{Plastic Mesh~1}, \level{Sandpaper~320}), and (\level{Sandpaper~320}, \level{Plastic Mesh~1}) (\pinf{0.01}); and
\item (\level{Metal Mesh}, \level{Cork}), (\level{Cork}, \level{Velcro Hooks}), (\level{Velcro Hooks}, \level{Plastic Mesh~1}), (\level{Velcro Hooks}, \level{Sandpaper~320}), and (\level{Coffee Filter}, \level{Terra Cotta}) (\pinf{0.05} each).
\item With \level{Sandpaper~320}, \level{Coffee Filter} was selected more often than the other haptic textures (\ztest{3.4}, \pinf{0.05} each) except \level{Plastic Mesh~1} and \level{Terra Cotta}.
\item With \level{Terra Cotta}, \level{Coffee Filter} was selected more often than the others (\ztest{3.4}, \pinf{0.05} each) except \level{Plastic Mesh~1} and \level{Terra Cotta}.
\item With \level{Coffee Filter}, \level{Coffee Filter} was selected more often than the others (\ztest{4.0}, \pinf{0.01} each) except \level{Terra Cotta}.
\end{itemize}
Except for one visual texture (\level{Sandpaper~100}) and 4 haptic textures (\level{Metal Mesh}, \level{Sandpaper~100}, \level{Brick~2}, and \level{Terra Cotta}), all haptic and visual textures were matched statistically significantly higher than chance with at least one visual and haptic texture, respectively.
However, many mistakes were made: the expected haptic texture was selected on average only \percent{20} of the time for five of the visual textures, and even around \percent{5} for (visual) \level{Sandpaper~100}, \level{Brick~2}, and \level{Sandpaper~320}.
Only haptic \level{Coffee Filter} was correctly selected \percent{59} of the time, and was also particularly matched with the visual \level{Sandpaper~320} and \level{Terra Cotta} (around \percent{45} each).
Similarly, the haptic textures \level{Sandpaper~320} and \level{Plastic Mesh~1} were also selected for four and three visual textures, respectively (around \percent{25} each).
Additionally, the Spearman correlations between the trials were computed for each participant and only 21 out of 60 were statistically significant (\pinf{0.05}), with a mean \spearman{0.52} (\ci{0.43}{0.59}).
\fig[0.82]{results/matching_confusion_matrix}{Confusion matrix of the \level{Matching} task.}[
\fig[0.85]{results/matching_confusion_matrix}{Confusion matrix of the \level{Matching} task results.}[%
With the presented visual textures as columns and the selected haptic texture in proportion as rows.
The number in a cell is the proportion of times the corresponding haptic texture was selected in response to the presentation of the corresponding visual texture.
The diagonal represents the expected correct answers.
Holm-Bonferroni adjusted binomial test results are marked in bold when the proportion is higher than chance (\ie more than \percent{11}, \pinf{0.05}).
]
Many mistakes were made: the expected haptic texture was selected on average only \percent{20} of the time for five of the visual textures, and even around \percent{5} for (visual) \level{Sandpaper~100}, \level{Brick~2}, and \level{Sandpaper~320}.
Only haptic \level{Coffee Filter} was correctly selected \percent{57} of the time, and was also particularly matched with the visual \level{Sandpaper~320} and \level{Terra Cotta} (around \percent{44} each).
Similarly, the haptic textures \level{Sandpaper~320} and \level{Plastic Mesh~1} were also selected for four and three visual textures, respectively (around \percent{25} each).
Additionally, the Spearman correlations between the trials were computed for each participant and only 21 out of 60 were statistically significant (\pinf{0.05}), with a mean \spearman{0.52} \ci{0.43}{0.59}.
These results indicate that the participants hesitated between several haptic textures for a given visual texture, as also reported in several comments, some haptic textures being more favored while some others were almost not selected at all.
Another explanation could be that the participants had difficulty estimating the roughness of the visual textures.
Indeed, many participants explained that they tried to identify or imagine the roughness of a given visual texture and then to select the most plausible haptic texture, in terms of frequency and/or amplitude of vibrations.
@@ -36,35 +37,49 @@ Indeed, many participants explained that they tried to identify or imagine the r
To verify that the difficulty with all the visual textures was the same on the \level{Matching} task, the \response{Completion Time} of a trial was analyzed.
As the \response{Completion Time} results were Gamma distributed, they were transformed with a log to approximate a normal distribution.
A \LMM on the log \response{Completion Time} with the \factor{Visual Texture} as fixed effect and the participant as random intercept was performed.
An \ANOVA based on a \LMM on the log \response{Completion Time} with the \factor{Visual Texture} as fixed effect and the participant as random intercept was performed.
Normality was verified with a QQ-plot of the model residuals.
No statistically significant effect of \factor{Visual Texture} was found (\anova{8}{512}{1.9}, \p{0.06}) on \response{Completion Time} (\geomean{44}{\s} \ci{42}{46}), indicating an equal difficulty and participant behaviour for all the visual textures.
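In equation form, and with notation introduced here only for illustration, the model described above can be sketched as:
\[
	\log T_{pvi} = \beta_0 + \alpha_v + u_p + \varepsilon_{pvi}, \qquad
	u_p \sim \mathcal{N}(0, \sigma_u^2), \qquad
	\varepsilon_{pvi} \sim \mathcal{N}(0, \sigma^2),
\]
where $T_{pvi}$ is the completion time of repetition $i$ by participant $p$ with visual texture $v$, $\alpha_v$ is the fixed effect of the visual texture, and $u_p$ is the participant random intercept.
Because the analysis is carried out on the log scale, back-transforming a mean gives a geometric mean, $\exp\bigl(\frac{1}{N} \sum_{k=1}^{N} \log T_k\bigr) = \bigl(\prod_{k=1}^{N} T_k\bigr)^{1/N}$, which is presumably why the completion time is summarized this way.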
\subsection{Textures Ranking}
\label{results_ranking}
\figref{results/ranking_mean_ci} presents the results of the three rankings of the haptic textures alone, the visual textures alone, and the visuo-haptic texture pairs.
\figref{results/rankings_modality} presents the results of the three rankings of the haptic textures alone, the visual textures alone, and the visuo-haptic texture pairs.
For each ranking, a Friedman test was performed with post-hoc Wilcoxon signed-rank tests and Holm-Bonferroni adjustment.
\fig[1]{results/rankings_modality}{Means with bootstrap \percent{95} \CI of the \level{Ranking} task results for each \factor{Modality}.}[%
Shown for the haptic textures alone (left), the visual textures alone (center) and the visuo-haptic texture pairs (right).
The order of the visual textures on the x-axis differs between modalities.
A lower rank means that the texture was considered rougher, a higher rank means smoother.
Wilcoxon signed-rank tests and Holm-Bonferroni adjustment: all comparisons were statistically significantly different (\pinf{0.05}) except when marked with an \enquote{X}.
]
\paragraph{Haptic Textures Ranking}
Almost all the texture pairs in the haptic textures ranking results were statistically significantly different (\chisqr{8}{20}{146}, \pinf{0.001}; \pinf{0.05} for each comparison), except between (\level{Metal Mesh}, \level{Sandpaper~100}), (\level{Cork}, \level{Brick~2}), (\level{Cork}, \level{Sandpaper~320}) (\level{Plastic Mesh~1}, \level{Velcro Hooks}), and (\level{Plastic Mesh~1}, \level{Terra Cotta}).
Almost all the texture pairs in the haptic textures ranking results were statistically significantly different (\chisqr{8}{20}{146}, \pinf{0.001}; \pinf{0.05} for each comparison; see \figref{results/rankings_modality}, left).
However, no difference was found between the pairs (\level{Metal Mesh}, \level{Sandpaper~100}), (\level{Cork}, \level{Brick~2}), (\level{Cork}, \level{Sandpaper~320}), (\level{Plastic Mesh~1}, \level{Velcro Hooks}), and (\level{Plastic Mesh~1}, \level{Terra Cotta}).
Average Kendall's Tau correlations between the participants indicated a high consensus (\kendall{0.82} \ci{0.81}{0.84}), showing that participants perceived the roughness of the haptic textures similarly.
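To give a sense of scale, and assuming complete rankings without ties (in which case the usual variants of Kendall's Tau coincide), the coefficient between two rankings of $n$ items is:
\[
	\tau = \frac{C - D}{n (n - 1) / 2},
\]
where $C$ and $D$ are the numbers of concordant and discordant pairs of items.
With $n = 9$ textures there are $36$ pairs, so $D = 36 \, (1 - \tau) / 2$: a value of $\tau = 0.82$ corresponds to about $3$ of the $36$ pairs being ordered differently between two participants.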
\paragraph{Visual Textures Ranking}
Most of the texture pairs in the visual textures ranking results were also statistically significantly different (\chisqr{8}{20}{119}, \pinf{0.001}; \pinf{0.05} for each comparison), except for the following groups: \{\level{Metal Mesh}, \level{Cork}, \level{Plastic Mesh~1}\}; \{\level{Sandpaper~100}, \level{Brick~2}, \level{Plastic Mesh~1}, \level{Velcro Hooks}\}; \{\level{Cork}, \level{Velcro Hooks}\}; \{\level{Sandpaper~320}, \level{Terra Cotta}\}; and \{\level{Sandpaper~320}, \level{Coffee Filter}\}.
Most of the texture pairs in the visual textures ranking results were also statistically significantly different (\chisqr{8}{20}{119}, \pinf{0.001}; \pinf{0.05} for each comparison; see \figref{results/rankings_modality}, center), except for the following.
No difference was found between \level{Plastic Mesh~1} and any of \level{Metal Mesh}, \level{Brick~2}, \level{Sandpaper~100}, \level{Cork}, and \level{Velcro Hooks};
nor between \level{Velcro Hooks} and any of \level{Sandpaper~100}, \level{Cork}, and \level{Brick~2}.
Likewise, no difference was found between the pairs (\level{Metal Mesh}, \level{Cork}), (\level{Sandpaper~100}, \level{Brick~2}), (\level{Sandpaper~320}, \level{Terra Cotta}), and (\level{Sandpaper~320}, \level{Coffee Filter}).
Even though the consensus was high (\kendall{0.61} \ci{0.58}{0.64}), the roughness of the visual textures was more difficult to estimate, in particular for \level{Plastic Mesh~1} and \level{Velcro Hooks}.
\paragraph{Visuo-Haptic Textures Ranking}
Also, almost all the texture pairs in the visuo-haptic textures ranking results were statistically significantly different (\chisqr{8}{20}{140}, \pinf{0.001}; \pinf{0.05} for each comparison), except for the following groups: \{\level{Sandpaper~100}, \level{Cork}\}; \{\level{Cork}, \level{Brick~2}\}; and \{\level{Plastic Mesh~1}, \level{Velcro Hooks}, \level{Sandpaper~320}\}.
Also, almost all the texture pairs in the visuo-haptic textures ranking results were statistically significantly different (\chisqr{8}{20}{140}, \pinf{0.001}; \pinf{0.05} for each comparison; see \figref{results/rankings_modality}, right).
However, no difference was found between the textures for each of the following groups: \{\level{Sandpaper~100}, \level{Cork}\}; \{\level{Cork}, \level{Brick~2}\}; and \{\level{Plastic Mesh~1}, \level{Velcro Hooks}, \level{Sandpaper~320}\}.
The consensus between the participants was also high (\kendall{0.77} \ci{0.74}{0.79}).
Finally, calculating the similarity of the three rankings of each participant, the \textit{Visuo-Haptic Textures Ranking} was on average highly similar to the \textit{Haptic Textures Ranking} (\kendall{0.79} \ci{0.72}{0.86}) and moderately to the \textit{Visual Textures Ranking} (\kendall{0.48} \ci{0.39}{0.56}).
Finally, the similarity of the three rankings of each participant was calculated (see \figref{results/rankings_texture}).
The \textit{Visuo-Haptic Textures Ranking} was on average highly similar to the \textit{Haptic Textures Ranking} (\kendall{0.79} \ci{0.72}{0.86}) and moderately similar to the \textit{Visual Textures Ranking} (\kendall{0.48} \ci{0.39}{0.56}).
A Wilcoxon signed-rank test indicated that this difference was statistically significant (\wilcoxon{190}, \p{0.002}).
These results indicate that the haptic and visual modalities were integrated, the resulting roughness ranking lying between the two rankings of the modalities alone, but with haptics predominating.
\fig[0.7]{results/ranking_mean_ci}{Means with bootstrap \percent{95} \CI of the three rankings of the haptic textures alone, the visual textures alone, and the visuo-haptic texture pairs. }[
\fig[1]{results/rankings_texture}{Means with bootstrap \percent{95} \CI of the \level{Ranking} task results for each \factor{Visual Texture}.}[%
A lower rank means that the texture was considered rougher, a higher rank means smoother.
]
@@ -73,17 +88,17 @@ These results indicate that the two haptic and visual modalities were integrated
The high level of agreement between participants on the three haptic, visual and visuo-haptic rankings in the \level{Ranking} task (\secref{results_ranking}), as well as the similarity of the within-participant rankings, suggest that participants perceived the roughness of the textures similarly, but differed in their strategies for matching the haptic and visual textures in the \level{Matching} task (\secref{results_matching}).
To further investigate the perceived similarity of the haptic and visual textures and to identify groups of textures that were perceived as similar on the \level{Matching} task, a correspondence analysis and a hierarchical clustering were performed on the matching task confusion matrix (\figref{results/matching_confusion_matrix}).
To further investigate the perceived similarity of the haptic and visual textures, and to identify groups of textures that were perceived as similar on the \level{Matching} task, a correspondence analysis and a hierarchical clustering were performed on the matching task confusion matrix (\figref{results/matching_confusion_matrix}).
\paragraph{Correspondence Analysis}
The correspondence analysis captured \percent{60} and \percent{29} of the variance in the first and second dimensions, respectively, with the remaining dimensions each accounting for less than \percent{5}.
\figref{results/matching_correspondence_analysis} shows the first two dimensions with the 18 haptic and visual textures.
The first dimension was similar to the rankings (\figref{results/ranking_mean_ci}), distributing the textures according to their perceived roughness.
The first dimension was similar to the rankings (\figref{results/rankings_texture}), distributing the textures according to their perceived roughness.
It seems that the second dimension opposed textures that were perceived as hard with those perceived as softer, as also reported by participants.
Stiffness is indeed an important perceptual dimension of a material (\secref[related_work]{hardness}).% \cite{okamoto2013psychophysical,culbertson2014modeling}.
Stiffness is indeed an important perceptual dimension of a material (\secref[related_work]{hardness}).
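As a reminder of the standard formulation (the notation is introduced here for illustration, and the software settings actually used are not stated), correspondence analysis of a table of counts $N$ with grand total $n$ decomposes the standardized residuals of the correspondence matrix $P = N / n$:
\[
	S = D_r^{-1/2} \, (P - r c^\top) \, D_c^{-1/2} = U \Sigma V^\top,
	\qquad r = P \mathbf{1}, \quad c = P^\top \mathbf{1},
\]
where $D_r$ and $D_c$ are the diagonal matrices of the row and column masses $r$ and $c$, and $\Sigma$ holds the singular values $\sigma_k$.
The share of variance (inertia) carried by dimension $k$ is $\sigma_k^2 / \sum_j \sigma_j^2$, which is the quantity behind the \percent{60} and \percent{29} given for the first two dimensions.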
\fig[1]{results/matching_correspondence_analysis}{Correspondence analysis of the confusion matrix of the \level{Matching} task.}[
\fig[1]{results/matching_correspondence_analysis}{Correspondence analysis of the confusion matrix of the \level{Matching} task.}[%
The closer the haptic and visual textures are, the more similar they were judged.
The first dimension (horizontal axis) explains \percent{60} of the variance, the second dimension (vertical axis) explains \percent{29} of the variance.
The confusion matrix is shown in \figref{results/matching_confusion_matrix}.
@@ -93,11 +108,11 @@ Stiffness is indeed an important perceptual dimension of a material (\secref[rel
\figref{results_clusters} shows the dendrograms of the two hierarchical clusterings of the haptic and visual textures, constructed using the Euclidean distance and Ward's method on squared distances.
The four identified haptic texture clusters were: "Roughest" \{\level{Metal Mesh}, \level{Sandpaper~100}, \level{Brick~2}, \level{Cork}\}; "Rougher" \{\level{Sandpaper~320}, \level{Velcro Hooks}\}; "Smoother" \{\level{Plastic Mesh~1}, \level{Terra Cotta}\}; "Smoothest" \{\level{Coffee Filter}\} (\figref{results/clusters_haptic}).
Similar to the haptic ranks (\figref{results/ranking_mean_ci}), the clusters could have been named according to their perceived roughness.
The four identified haptic texture clusters were: \enquote{Roughest} \{\level{Metal Mesh}, \level{Sandpaper~100}, \level{Brick~2}, \level{Cork}\}; \enquote{Rougher} \{\level{Sandpaper~320}, \level{Velcro Hooks}\}; \enquote{Smoother} \{\level{Plastic Mesh~1}, \level{Terra Cotta}\}; \enquote{Smoothest} \{\level{Coffee Filter}\} (\figref{results/clusters_haptic}).
Similar to the haptic ranks (\figref{results/rankings_modality}, left), the clusters could have been named according to their perceived roughness.
It also shows that the participants compared and ranked the haptic textures during the \level{Matching} task to select the one that best matched the given visual texture.
The five identified visual texture clusters were: "Roughest" \{\level{Metal Mesh}\}; "Rougher" \{\level{Sandpaper~100}, \level{Brick~2}, \level{Velcro Hooks}\}; "Medium" \{\level{Cork}, \level{Plastic Mesh~1}\}; "Smoother" \{\level{Sandpaper~320}, \level{Terra Cotta}\}; "Smoothest" \{\level{Coffee Filter}\} (\figref{results/clusters_visual}).
The five identified visual texture clusters were: \enquote{Roughest} \{\level{Metal Mesh}\}; \enquote{Rougher} \{\level{Sandpaper~100}, \level{Brick~2}, \level{Velcro Hooks}\}; \enquote{Medium} \{\level{Cork}, \level{Plastic Mesh~1}\}; \enquote{Smoother} \{\level{Sandpaper~320}, \level{Terra Cotta}\}; \enquote{Smoothest} \{\level{Coffee Filter}\} (\figref{results/clusters_visual}).
They are also easily identifiable on the visual ranking results, which also made it possible to name them.
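For the dendrograms described above, the standard statement of Ward's criterion (given here as a reminder, with notation introduced for illustration) is that, at each step, the two clusters $A$ and $B$ whose fusion least increases the within-cluster sum of squares are merged:
\[
	\Delta(A, B) = \frac{|A| \, |B|}{|A| + |B|} \, \| \mu_A - \mu_B \|^2,
\]
where $\mu_A$ and $\mu_B$ are the cluster centroids and $\| \cdot \|$ is the Euclidean norm.
Assuming each texture is represented by its row or column of the confusion matrix, textures with similar selection profiles are therefore merged first.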
\begin{subfigs}{results_clusters}{Dendrograms of the hierarchical clusterings of the confusion matrix of the \level{Matching} task.}[
@@ -114,21 +129,33 @@ They are also easily identifiable on the visual ranking results, which also made
\paragraph{Confusion Matrices of Clusters}
Based on these results, two alternative confusion matrices were constructed.
Similarly to \secref{results_matching}, an \ANOVA based on mixed Poisson regression was performed for each confusion matrix on the number of selections, followed by post-hoc pairwise comparisons using Tukey's \HSD test. No overdispersion was detected on the Poisson regressions.
\figref{results/haptic_visual_clusters_confusion_matrices} (left) shows the confusion matrix of the \level{Matching} task with visual texture clusters and the proportion of haptic texture clusters selected in response.
A two-sample Pearson Chi-Squared test (\chisqr{16}{540}{353}, \pinf{0.001}) and Holm-Bonferroni adjusted binomial tests indicated that the following (Visual Cluster, Haptic Cluster) pairs have proportion selections statistically significantly higher than chance (\ie \percent{20} each): %
(Roughest, Roughest), (Rougher, Rougher), (Medium, Rougher), (Medium, Smoother), (Smoother, Smoother), (Smoother, Smoothest), and (Smoothest, Smoothest) (\pinf{0.005} each).
There was a statistically significant effect on the number of selections of the interaction visual texture cluster \x\ haptic texture cluster (\chisqr{12}{180}{324}, \pinf{0.001}), and statistically significant differences for the following visual clusters:
\begin{itemize}
\item With \enquote{Roughest}, the haptic cluster \enquote{Roughest} was selected most often (\ztest{4.6}, \pinf{0.001}).
\item With \enquote{Rougher}, \enquote{Smoothest} was selected least often (\ztest{-4.0}, \pinf{0.001}) and \enquote{Rougher} was selected more often than \enquote{Smoother} (\ztest{-3.4}, \pinf{0.001}).
\item With \enquote{Medium}, \enquote{Rougher} and \enquote{Smoother} were both selected more often (\ztest{4.5}, \pinf{0.001}) than \enquote{Roughest} and \enquote{Smoothest}.
\item With \enquote{Smoother}, \enquote{Smoother} (\ztest{4.2}, \pinf{0.001}) and \enquote{Smoothest} (\ztest{4.7}, \pinf{0.001}) were both selected more often than \enquote{Roughest} and \enquote{Rougher}.
\item With \enquote{Smoothest}, \enquote{Smoother} (\ztest{2.6}, \p{0.05}) and \enquote{Smoothest} (\ztest{3.9}, \pinf{0.001}) were both selected more often than \enquote{Roughest} and \enquote{Rougher}.
\end{itemize}
\figref{results/haptic_visual_clusters_confusion_matrices} (right) shows the confusion matrix of the \level{Matching} task with visual texture ranks and the proportion of haptic texture clusters selected in response.
A two-sample Pearson Chi-Squared test (\chisqr{24}{540}{342}, \pinf{0.001}) and Holm-Bonferroni adjusted binomial tests indicated that the following (Visual Texture Rank, Haptic Cluster) pairs have proportion selections statistically significantly higher than chance: %
(0, Roughest); (1, Rougher); (2, Rougher); (3, Rougher); (4, Rougher); (5, Smoother); (6, Smoother); (7, Smoothest); and (8, Smoothest) (\pinf{0.05} each).
This shows that the participants consistently identified the roughness of each visual texture and selected the corresponding haptic texture cluster.
There was a statistically significant effect on the number of selections of the visual texture rank \x\ haptic texture cluster interaction (\chisqr{24}{180}{340}, \pinf{0.001}), and statistically significant differences for the following visual texture ranks:
\begin{itemize}
\item Rank 0: the haptic cluster \enquote{Roughest} was selected most often (\ztest{4.5}, \pinf{0.001}).
\item Ranks 1, 2 and 3: \enquote{Smoothest} was selected least often (\ztest{-3.0}, \p{0.04}).
\item Rank 4: \enquote{Rougher} was selected more often than \enquote{Roughest} and \enquote{Smoothest} (\ztest{3.0}, \p{0.03}).
\item Rank 5: \enquote{Rougher} and \enquote{Smoother} were both selected more often (\ztest{4.5}, \p{0.01}) than \enquote{Roughest} and \enquote{Smoothest}.
\item Rank 6: \enquote{Smoother} was selected more often than \enquote{Roughest} (\ztest{3.2}, \p{0.006}).
\item Rank 7: \enquote{Smoother} and \enquote{Smoothest} were both selected more often (\ztest{3.4}, \p{0.04}) than \enquote{Roughest} and \enquote{Rougher}.
\item Rank 7: \enquote{Smoother} and \enquote{Smoothest} were both selected more often (\ztest{3.2}, \p{0.04}) than \enquote{Roughest} and \enquote{Rougher}.
\end{itemize}
\fig{results/haptic_visual_clusters_confusion_matrices}{
Confusion matrices of the visual texture (left) or rank (right) with the corresponding haptic texture clusters selected in proportion.
}[
Holm-Bonferroni adjusted binomial test results are marked in bold when the proportion is higher than chance (\ie more than \percent{20}, \pinf{0.05}).
]
}[]
\subsection{Questionnaire}
\label{results_questions}
@@ -152,7 +179,7 @@ The coherence of the texture pairs was considered moderate (\num{4.6 \pm 1.2}) a
\item By \factor{Modality}.
\item By \factor{Task}.
]
\subfigsheight{75mm}
\subfigsheight{70mm}
\subfig{results/questions_modalities}
\subfig{results/questions_tasks}
\end{subfigs}