Findings

Undergraduate students were the primary source of subjects, although about 20% of the pool were over the age of 24. Seventy-three percent of the subjects were male. Eighty-three percent were in science/technology/engineering/math (STEM) majors. In all, 112 subjects’ simulations were used for analysis, generating 6664 data entry events for timing and accuracy analysis, and 1662 situational awareness assessment data points.

Each of the 112 subjects received 72 audio messages, which would optimally result in 8064 data entry events. Of the 6664 entries (82.6% of the optimal number) that were recorded, only 5339 (80.1%) could be matched to an audio message delivered within the prior 30 seconds, and so could be used for timing and accuracy assessment.
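The percentages above follow directly from the reported counts. A minimal arithmetic check, using only the figures stated in the text (the variable names are illustrative):

```python
# Counts reported in the text above.
subjects = 112
messages_per_subject = 72
recorded_entries = 6664
matched_entries = 5339

# Number of entries if every audio message had produced one entry.
optimal_entries = subjects * messages_per_subject

# Share of the optimal number that was actually recorded.
recorded_share = 100 * recorded_entries / optimal_entries

# Share of recorded entries that could be matched to a message.
matched_share = 100 * matched_entries / recorded_entries

print(optimal_entries)           # 8064
print(round(recorded_share, 1))  # 82.6
print(round(matched_share, 1))   # 80.1
```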

There Is a Significant Effect on the Speed of Data Entry Correlated with the Different Interface Designs

A series of analyses of variance (ANOVAs) comparing the timings across the three interface variants at the slow message arrival rate showed that the time needed to enter data was significantly reduced with the interface supplements of color-coded backgrounds and icons, as well as the text transcriptions (p = 0.0494). However, the significant difference in timing was not evident when the message arrival rate was increased (p = 0.6076, Figure 16.4).
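For readers unfamiliar with the test, the F statistic behind a one-way ANOVA can be sketched as follows. This is only an illustration of the general technique: the study's actual models (including any covariates) are not specified here, the function name is hypothetical, and the timing values are invented for the example and are not the study's data.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical response times in seconds (NOT the study's data).
timings = [
    [6.0, 7.0, 8.0],  # interface 1
    [5.0, 6.0, 7.0],  # interface 2
    [4.0, 5.0, 6.0],  # interface 3
]
f_stat, df_b, df_w = one_way_anova_f(timings)
print(f_stat, df_b, df_w)
```

The p-value is then obtained by comparing F against the F distribution with the two degrees of freedom, which a statistics package would normally handle.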

The Validity of the Data Collected Quickly Is Maximized in Interface #2

Accuracy of the data entries was measured in two distinct ways: entries that could be matched to audio messages delivered in the 30 seconds prior to the entry event, and errant entry events that could not be matched. A gross analysis comparing the accurate and errant reporting rates across the three interface variants was run, finding a slightly improved accuracy rate for the interface that included text transcriptions. All other comparisons were flat, with accuracy rates in the mid-50% range and errant reports in the upper-20% range. The remaining percentages for each interface represent audio messages that were not reported at all (Figures 16.5 and 16.6).[1]

FIGURE 16.4 Session aggregated response timing (in seconds) across experiment conditions; mean values are surrounded by 95% confidence intervals.

FIGURE 16.5 Accurate reporting rates (based on announcements) by interface.

The percentage of active errors was used as the primary metric of accuracy. Active errors were defined as data entries that could not be matched with an audio message delivered in the 30 seconds prior to the entry event. This metric excluded errors of omission, which might have been explained simply by the pace of the arriving messages.
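The 30-second matching rule can be sketched as below. The text does not state how an entry was compared with a message, so this sketch assumes that an entry matches when its content equals that of a not-yet-reported message delivered within the prior 30 seconds; the function name and tuple layout are likewise hypothetical.

```python
WINDOW_SECONDS = 30

def classify_entries(messages, entries, window=WINDOW_SECONDS):
    """messages, entries: lists of (time_in_seconds, content) tuples.

    Returns (matched, active_errors, omitted):
      matched       -- (entry_time, content, latency) for matched entries
      active_errors -- entries with no matching message in the window
      omitted       -- messages that were never reported
    """
    matched, active_errors = [], []
    reported = set()  # indices of messages already matched to an entry
    for entry_time, entry_content in entries:
        hit = None
        for i, (msg_time, msg_content) in enumerate(messages):
            if (i not in reported
                    and msg_content == entry_content
                    and 0 <= entry_time - msg_time <= window):
                hit = i
                break
        if hit is None:
            active_errors.append((entry_time, entry_content))
        else:
            reported.add(hit)
            latency = entry_time - messages[hit][0]
            matched.append((entry_time, entry_content, latency))
    omitted = [m for i, m in enumerate(messages) if i not in reported]
    return matched, active_errors, omitted

# Invented example data, for illustration only.
messages = [(0, "A"), (20, "B"), (40, "C")]
entries = [(8, "A"), (35, "X"), (75, "C")]
matched, active_errors, omitted = classify_entries(messages, entries)
percent_invalid = 100 * len(active_errors) / len(entries)
```

In this example the entry for "C" arrives 35 seconds after its message, so it counts as an active error and the message itself counts as an omission.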

FIGURE 16.6 Errant reporting rates (based on announcements) by interface.

The percentage of invalid reports during the slow period was minimized (though not significantly so) with the third interface variant, whereas in the fast period, errors were minimized (this time with statistical significance: p = 0.0049) using the second interface variant. The difference between the second and third interface designs is the addition of a text transcription of the audio messages. A possible interpretation is that during the slow period, participants were able to verify the messages using both the audio and the text, whereas at the fast rate of message arrival they struggled to continue to use both, a strategy which backfired and caused more errors (Figure 16.7).[2]

The Rate of Errors When Measuring Situational Awareness across the Three Interfaces Was Just under Statistical Significance; However, the Magnitude of the Errors Was Significantly Higher for Interface #3

The SAGAT measures across the three interface variants did not show a significant difference in the number of accurate SA-Level-1 reports (p = 0.0580). However, the difference in the magnitude of errors, when they were made (roughly 25% of the time), was highly significant (p < 0.0001). Interface #3, with the transcriptions of the audio messages, showed a near doubling of the average difference between the reported values and the actual values (Figure 16.8).
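The two quantities discussed here, how often an SA report was wrong and how far off it was when wrong, can be separated as in the sketch below. It assumes the error magnitude is the signed difference (reported minus actual), as the caption of Figure 16.8 suggests; the function name and sample values are hypothetical and do not come from the study.

```python
from statistics import mean

def sa_error_summary(reports):
    """reports: list of (reported_value, actual_value) pairs.

    Returns (error_rate, mean_error), where mean_error averages
    reported - actual over the erroneous reports only.
    """
    errors = [reported - actual for reported, actual in reports
              if reported != actual]
    error_rate = len(errors) / len(reports)
    mean_error = mean(errors) if errors else 0.0
    return error_rate, mean_error

# Invented example: one of four reports is wrong, and it is off by 2.
reports = [(3, 3), (5, 5), (4, 2), (7, 7)]
rate, magnitude = sa_error_summary(reports)
print(rate, magnitude)
```

Keeping the two measures apart shows how an interface can leave the error rate roughly unchanged while still increasing the size of the errors that do occur.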

Throughout the Analysis, Gender Was the Most Often Significant Covariate; Other Covariates Can Be Correlated with Gender

Throughout the analysis of the collected data, the most common significant covariate was gender. Other covariates that appeared are all commonly correlated with gender: time spent playing video games, cognitive style (a measure of brain-side dominance), and so on. However, the size and composition of this study’s sample group (112 valid participants, 27% female, and 77% STEM majors) call any would-be conclusions regarding gender into question.

FIGURE 16.7 Percent of reports which are invalid, by experiment group; mean values are surrounded by 95% confidence intervals; the fast period measures are significantly different.

FIGURE 16.8 Mean values surrounded by 95% confidence intervals for level-1 situation awareness error magnitudes (Reported values minus actual values) across interface variants.

  • [1] The slightly improved accuracy of interface #3 was not statistically significant in this gross (all data per interface, without timings broken out) analysis. Other tests of significance were not assessed on these values in favor of the more detailed analysis to follow, with the message timings being included.
  • [2] Covariate analysis showed gender to be a highly significant factor (p = 0.0002). Subsequent analysis showed that both genders were more accurate with the addition of graphical cues to the interface, but the improvement was much more dramatic for males. However, the addition of the text transcriptions in interface #3 negated the improvement for women, nearly doubling the error rate.
 