Viewpoint dependence versus orientation dependence

Related to many of the topics above is the debate over viewpoint dependence in spatial representations. Viewpoint dependence was originally debated (and continues to be debated) as a property of visual object representations, and that term has been used interchangeably with “orientation dependence” (Biederman, 1987; Biederman & Gerhardstein, 1993, 1995; Tarr, 1995; Tarr & Pinker, 1989, 1990, 1991). For visual object recognition, viewpoint and orientation have very similar connotations; however, the implications for spatial cognition may be different, particularly with respect to the role of vision and other modalities in the representation.

A viewpoint-dependent representation of space denotes a representation that is specific with regard to both the location and orientation of the observer at the time of encoding. Implicit in this type of representation is the need for visual experience.

That is, space is represented with respect to a learned viewpoint. Data from scene recognition experiments support this kind of highly visual, view-specific representation of spatial information (Christou & Bülthoff, 1999; Diwadkar & McNamara, 1997; Shelton & McNamara, 2001b, 2004a, 2004b; Shelton & Pippitt, 2007; Waller, 2006). For example, Waller (2006) asked participants to learn scenes of objects and compared recognition for images that were taken from the same viewpoint to those that were translated forward, backward, or laterally. Recognition of forward and lateral translations was slower and less accurate than recognition of the original image, suggesting that participants recognized the specific learned viewpoint better than translated viewpoints. In addition, Shelton and McNamara (2004a) investigated scene recognition following navigational learning from different perspectives. The results suggested that the degree of visual similarity from study to test was associated with the speed of scene recognition, indicating fastest recognition for the exact viewpoint seen during encoding (details of this study are discussed later in this chapter). Taken together, such results support viewpoint-dependent representations.

Scene recognition is a visual matching task, and viewpoint dependence denotes the capture of spatial information from a specified view—implied to be a visually experienced view of the space. As noted above, however, humans have the capacity to learn and represent spatial information from multiple modalities with equivalent access to that information after learning, raising questions about how viewpoint dependence might be defined in other modalities. Even if we relax the dependence on a visual view, a viewpoint still denotes a stationary position and heading. This necessity for experiencing space from a static position may apply to vision and possibly audition, but it cannot account for other forms of learning. For example, Yamamoto and Shelton (2005) compared visual learning to proprioceptive learning (broadly defined) of room-sized layouts. As shown in Figure 2.4A, viewpoint for the visually learned space is easily defined by the stationary position and heading of the observer. In contrast, for the proprioceptively learned space, the spatial information must be learned from the movements by changing positions along a path, in this case, while maintaining the same heading in space (Figure 2.4B). As a result, the “viewpoint” is constantly changing, and these dynamics make defining the viewpoint in viewpoint dependence complicated for nonvisual modalities.

An alternative to viewpoint dependence for spatial representations is orientation dependence. Orientation dependence refers to a broader concept of accessing a spatial memory from a particular orientation in space. In an orientation-dependent representation, there is greater emphasis placed on the heading in space than on the exact position of the observer. Alignment effects provide strong support for orientation dependence in spatial memory acquired in vision (e.g., Easton & Sholl, 1995; Holmes & Sholl, 2005; McNamara, 2003; McNamara, Rump, & Werner, 2003; Roskos-Ewoldsen, McNamara, Shelton, & Carr, 1998; Shelton & McNamara, 1997, 2001a, 2001b, 2004a, 2004b; Sholl & Nolin, 1997; Yamamoto & Shelton, in press) and other modalities (Shelton & McNamara, 2001b, 2004a, 2004b; Yamamoto & Shelton, 2005, 2007, 2009). For example, Shelton and McNamara (2001a) had participants learn room-sized layouts and tested memory with judgments of relative

FIGURE 2.4 Schematics of learning conditions used in Yamamoto & Shelton (2005).

A. Visual learning. 0° is the stationary view, and dashed lines indicate the direction to each object from the viewpoint. B. Proprioceptive learning (blindfolded walking) from a single orientation. Dashed line shows the path. Gray arrows show a vector field corresponding to the common orientation maintained throughout encoding.

direction. Across multiple experiments, the results revealed that participants had preferential access to one orientation over all novel orientations and even some previously learned orientations. These results were taken as an indication that the representation was dependent on a preferred orientation in the space.

The key difference between viewpoint dependence and orientation dependence lies in the flexibility for retrieving information from different positions within a preferred orientation. In both orientation- and viewpoint-dependent representations, there should be preferential access to the orientation of the representation. Only in viewpoint-dependent representations, however, would a cost also be expected for changes in position within the preferred orientation. Although Waller (2006) showed some evidence for a cost in scene recognition after translations, it was not clear for all types of translations. For imagined judgments about locations and directions, the evidence is even less clear. Studies on the role of physical movement in imagining new locations and headings suggest that rotations but not translations improve performance relative to a no-movement, imagine-only baseline (Presson & Montello, 1994; Rieser, 1989). These results indicate the possibility that mentally translating a viewpoint can be done with very little cost. However, there has been some limited evidence for a cost in mental translations (Easton & Sholl, 1995; Tlauka, 2006). For example, Tlauka (2006) asked participants to learn an array of objects that included three possible viewing positions in addition to the actual learning position. The additional viewing positions were the to-be-imagined positions for the test and reflected different combinations of rotation and translation from the actual learned viewpoint. The results revealed that judgments from positions with imagined rotations were more than 200 ms slower than those from the original viewpoint or translated views, but the lateral translations (without rotation) also incurred about a 90-ms cost in response latency relative to the original viewpoint. It is notable, however, that there were no differences between the rotational conditions based on whether they included forward translations or forward + lateral translations.
Taken together, these findings suggest that rotations are computationally more demanding than translations, as predicted by orientation dependence, but they do not completely discount some degree of viewpoint specificity as well.

Although the evidence is not conclusive with regard to viewpoint versus orientation dependence, positing orientation dependence has certain advantages. First, orientation dependence can more readily accommodate multiple modalities without having to establish different principles across modalities—an important issue given that different modalities can support equivalent performance. As illustrated in Figure 2.4B, for example, while it is difficult to give a strict definition of viewpoint dependence in proprioceptive learning, orientation dependence is readily defined. Even if we accept that viewpoint need not be strictly visual, viewpoint dependence in proprioceptive and haptic learning would still require specifying a mechanism by which a viewpoint might be selected from the many learned positions throughout learning. For haptic learning, one can use the position of learning as a virtual viewpoint on the space. That is, the extension of the arms to each object originates from a particular position, and moving about the space would cause the origin of this proprioceptive information to shift. Such viewpoint dependence for haptic learning accounts for the observation of small but significant translation effects in haptics (Klatzky, 1999). For proprioception from blindfolded walking, this notion of a viewpoint selection may be more akin to finding some canonical position for representing the space. Such canonical positions have already been suggested by Waller (2006) to account for the observation that some translations had an effect when others did not in visual learning.

A second potential advantage of orientation dependence is that it is consonant with theories of spatial representation that posit non-egocentric/environmentally centered reference frames. Unlike viewpoint dependence, which seems to suggest a largely egocentric (learned-position) basis for representation (e.g., Tlauka, 2006), orientation dependence does not require that the preferred orientation be a directly experienced orientation. As such, orientation dependence can more readily accommodate observations of non-egocentric orientations emerging as the preferred orientations in memory (e.g., Mou, Liu, & McNamara, 2009; Mou & McNamara, 2002; Mou, Zhao, & McNamara, 2007). For example, Mou and McNamara (2002) asked participants to learn room-sized object displays that had strong intrinsic structure when observed from a view that was 45° away from the learning position. If participants were alerted to the structure, the 45° view would become the preferred orientation for memory retrieval. Mou and McNamara suggested that this reflected the selection of an intrinsic reference frame that could be based on either egocentric experience or salient structures in the environment.

Returning to visual memory, viewpoint dependence reflects representational constructs that are more analogous to the type of coding one would expect for visual information. That is, we have a point of origin (namely, the eyes) from which we observe the world visually, and viewpoint dependence suggests a similar anchoring position. Orientation dependence is less directly tied to notions of visual coding and may be more commensurate with supramodal theories of spatial information. For example, the principal reference theory (e.g., McNamara & Valiquette, 2004; Shelton & McNamara, 2001a; Werner & Schmidt, 1999), upon which the intrinsic theories have been built, suggests that any environmental learning will begin with the selection of a principal orientation, without regard for the degree to which it can be tied to vision. However, the principal reference theory and other supramodal theories are agnostic with regard to how experience might cause this supramodal system to be more tuned for and/or more readily connected to visual inputs. As such, they cannot discount some prominent role for vision as the primary input or as an intermediary for other modalities.


In the preceding sections, we have outlined some of the major issues and debates surrounding the properties of spatial representations and how they might be related to vision and visual memory. The jury is still out on a number of these issues, reflecting the lack of a unifying theory in the spatial cognition literature. The balance of the data supports the claim that sighted individuals rely heavily on visual information for spatial learning. However, the data also highlight the ability of humans, blind or sighted, to use many other sources of input to acquire spatial information.
