Form and Function
Abstract and Keywords
This chapter addresses the question of how we can account for spatial language. Perceptual features, functional features, schemas, context, and affordances are among the bases proposed. It is argued that all can be operative. A survey of research from a variety of entity domains, especially natural kinds, artifacts, bodies, scenes, events, abstract categories, and design, and of relational domains, especially spatial relations, shows that perceptual features, especially form or structure, allow inferences to function, forming perceptual–functional units or affordances. Language abets inferences from form to function. These perceptual–functional units account for the coherence of category features and provide the basis for causal reasoning.
When questions in cognition are formulated, ‘Is it Theory A or Theory B (think: parallel vs. serial; top-down or bottom-up; prototype vs. central tendency; and here, perceptual vs. functional)?’, my usual inclination is to answer: ‘Both. And something else we haven’t thought of.’ Cognition is biological, and biological systems are partially redundant, partially overlapping. Just take the visual system in the brain as an example. To call it ‘a’ system is to vastly oversimplify; ‘it’ is a complex of overlapping systems. Even for ‘what’ and ‘where’ (e.g. Ungerleider and Mishkin, 1982), or, alternatively, ‘what’ and ‘how’ (e.g. Milner and Goodale, 1995), there are many visual and spatial systems in the brain. This is not to say that the various systems are identical; rather like so-called synonyms, they have similarities with differences. And so for functional vs. perceptual in language: both. And more.
21.2 Accounting for Spatial Language
The challenge we have been given in the workshop is no more nor less than the basis for the meaning of spatial language. Several alternatives were proposed: perceptual features, functional features, schemas, contexts, affordances. What are these, and how are they different? Are they truly alternatives, or might each be operative? Each of these concepts, like most useful concepts, is slippery, hard to pin down. They have meanings that are situated, hence switch with situations; indeed, some of them are used to describe exactly that phenomenon of meanings. Consonant with that view, the uses of these concepts should become clearer as they are analyzed. I believe it is easy to show that all of these alternatives are operative in using spatial language, and not just spatial language.
A priori, spatial language refers to aspects of space. A priori, language serves functions, to inform or affect behavior; language is situated, context bound, dependent on schemas shared by participants. Consider in. Of a bowl full of cherries, we can say: ‘the cherries are in the bowl’ when the cherries are contained by the bowl, that is, the cherries have a certain spatial relation with respect to the bowl. So perceptual, spatial properties determine, at least in part, at least sometimes, use of spatial language. Now suppose, to parasketch an example of Garrod and Simon (1988), this is a grand display of cherries, extending above the edge of the bowl, with a sprig of cherry blossoms on top. We can say the sprig of cherry blossoms is in the bowl, as it is supported by the bowl, controlled by the bowl, functionally dependent on the bowl in the sense that the bowl’s location determines the sprig’s location. So functional properties determine, at least in part, at least sometimes, use of spatial language. In fact, evidence abounds showing that both spatial and functional properties affect comprehension and production of spatial language (e.g. Carlson-Radvansky, Covey, and Lattanzi, 1999; Carlson-Radvansky and Irwin, 1993, 1994; Carlson-Radvansky and Logan, 1997; Carlson-Radvansky and Radvansky, 1996; Coventry, 1998; Coventry, Prat-Sala, and Richards, 2001; Coventry, Carmichael, and Garrod, 1994; Franklin, Henkel, and Zangas, 1995; Franklin and Tversky, 1990; Hayward and Tarr, 1995; Logan and Sadler, 1996; Regier, 1996, 1997). What’s more, as will be seen, spatial and functional properties are often not only correlated, but perceived to be causally related, used to make and justify causal inferences.
Schema has been used by Lakoff and his collaborators among others to explicate some spatial language (e.g. Lakoff, 1987). The claim is that terms like over have several different senses, each corresponding to a situated set of spatial relations. More generally, schema is used to refer to a knowledge structure having certain attributes and values, some of them presupposed or default (e.g. Rumelhart, 1980). People may have schemas not only for senses of words like prepositions but also for the things and situations they experience and talk about. Returning to our example, it would be odd to say, ‘the cherries are in the city’, even if it were a fact that the city contains the cherries. This is because cherries and cities bear no schematic relations. Indeed, the cherries are technically in many containers, a room, a building, a neighborhood, a country, and so on, but these large remote containers are not normally used as reference frames for such small objects as cherries. Unlike bowls, they are not perceived as pertinent. Context can change all that. If hidden in this bowl of cherries were a rare diamond, if the bowl had been spirited away, if it were fitted with a GPS locator, then knowing that the bowl of cherries is in this city and not in some other place would be crucial, and make sense as an utterance. Of course, all language is bound to context and shared schemas, not just spatial language (e.g. H. Clark, 1996). Just think about ‘here’ and ‘there’, perhaps the most basic spatial terms of all.
What about affordances? Affordances are relations between perceptions and actions or potential actions. Actions are frequently enactments of functions, so affordances can link spatial features to functions. To stretch the example that has worked rather well until now, if the bowl were hanging upside down in the air, an art installation perhaps, then it would not afford in. A suspended upside down bowl can no longer function as a container. Implicit in this analysis is the idea that spatial features and relations can affect functions so that the spatial and the functional are necessarily intertwined.
It seems the challenge has been answered, and the answer is ‘all of the above’. Does this mean an early end to this chapter? Not so fast. It provides an opportunity to explore the relations among these alternatives, especially the relations between perception and potential action, between appearance and function. These considerations will be discussed first for referring to things, then for referring to spatial relations among things. The reciprocal relations will also be examined, that is, the ways referring affects perception, including perceived function.
21.3 Referring to Things
21.3.1 How shall a thing be called? Basic level categories
Half a century ago, Roger Brown (1958) posed a language puzzle, the question that heads this section. Although things may be referred to by many labels, varying in generality, one level seems preferred over a range of contexts. That which we sit on, a chair. That which warms our legs, pants. That which we eat, a carrot or an apple. Not furniture. Not easy chair. Not clothing or khaki pants. Of course more general and more specific terms exist because they are useful in many contexts. Pippins are better for pies, fujis are better for eating. Dress pants are for job interviews and jeans for weekends. Furniture makes us comfortable inside and tools build or repair things, including furniture. Context and schema affect choice of terms referring to spatial entities. Yet one level is the neutral, default, frequent level, the general-purpose way of referring, the basic level.
Brown raised the question; one of his students, Eleanor Rosch, along with her collaborators, proposed an answer (Rosch, 1978). This level of reference, the basic level, is the level that best balances the information conveyed at a given level of abstraction against the number of category distinctions that have to be kept in mind. The more precise the category, the greater the information conveyed, but also the greater the number of contrast categories. Informativeness was indexed by features people generated to categories at three levels of abstraction. The features people generate to category names are not meant to be taken as an objective catalog of their features, but rather as the features salient to people in thinking about the categories. Notable are the countless features that are not mentioned but that are ‘true’, that is, that people would agree apply to the categories. Molecules, solid, cells, breathes are but a few examples of features rarely produced. What features are produced depends on many things: on aspects of the world our perceptual systems are tuned to, on aspects that context, notably implicit or explicit contrast categories, makes salient, on aspects that are readily named. The number of features generated increased considerably from the level of furniture and fruit, the superordinate level, to the level of chair or apple, the basic level, but hardly at all from the basic level to the subordinate level, the level of coffee table or fuji apple. The number of contrast categories, however, increased at both levels. So going from superordinate to basic increases informativeness as well as the number of categories that must be kept in mind, whereas going from basic to subordinate increases cost in the form of more category contrasts but offers little benefit of increased informativeness.
Other cognitive operations converge on the basic level. Some of these are about appearance. Notably, the basic level is the highest level for which a collective shape can be recognized or an image generated. Other operations are behavioral or functional: behaviors toward all bananas or all grapes are the same, but behaviors toward all fruit are not, so the basic level is the highest level where actions on objects are uniform. Still other operations are linguistic; the basic level is the first to enter a new language or a child’s vocabulary; it is also the level of preferred reference and fastest naming.
21.3.2 Parts characterize the basic level
The work of the Rosch group emphasized the quantitative differences among the levels, the number of features relative to category distinctions. Their analysis did not consider the qualitative differences in features at different levels of abstraction nor did it analyze the coherence of features within particular categories. Consideration of the features and further research reveals that these issues are related through a special kind of feature, parts (Tversky and Hemenway, 1984 primarily; also Tversky, 1985, 1989, 1990). That work is reviewed below.
At the superordinate level, the level of vehicle and vegetable, features informants regarded as characteristic were few in number and primarily functional. Vehicles are for transporting people and things. Vegetables provide nutrients. Tools are for fixing. The basic level elicited far more features, and these were primarily perceptual; the subordinate level also elicited perceptual features, barely more than the basic level. Within perceptual features there were differences between those elicited by the basic level and those elicited by the subordinate level. One type of perceptual feature was especially prevalent at the basic level: parts (Tversky and Hemenway, 1984). For tables, legs and tops. For bananas, peel and pulp. For pants, legs, zippers, pockets, and buttons. For trees, trunk, branches, and leaves. Other informants rated parts of objects and living things for goodness. Parts rated as good tend to be perceptually salient; typically, they are discontinuous with the overall shape of the object, that is, they extend from its contour, like arms and legs. Salient parts, in fact, seem to play an important role in object recognition (e.g. Biederman, 1987; Hoffman and Richards, 1984) and parts in their proper configuration determine the shapes of objects, also important in object recognition (Rosch, 1978).
21.3.3 Parts: Features of perception and of function
Parts are not only features of appearance, they are also features of function. It is this duality that seems to privilege parts (Tversky and Hemenway, 1984; see also, Tversky, 1985, 1990). Legs have a certain appearance, they typically come in pairs, they are vertically extended, they are grounded. But they also have a typical function: they support something resting on them. Likewise, a top has a typical appearance, a flat horizontal surface, but it is also at a height and of a strength and an extent to serve certain uses by humans. Part goodness ratings support this correspondence. Parts rated high in goodness tend to be functionally significant as well as perceptually salient. The heads and legs of animals, the trunk of a tree, the wheels of a car, the handle and head of a hammer are rated as relatively good parts. They have distinctive appearances as well as significant functions. Extensions of part names from one object to another illustrate the duality of part names. The head of a committee is the organizer, figuratively at the top, whereas the head of a pin is literally at the top. Both the foot of a bed and the foot of a class are at the bottom; the former correspondence is more spatial and the latter more functional. Different parts have different appearances and different functions. The peel of an apple or peach protects the pulp, the pulp provides nutrients, the seeds germinate and grow into new plants. The table legs support the top and the top supports books, coffee cups, and laptops (which canonically sit on the tops of laps).
The form and structure of many parts seem to suggest functions. A case can be made for affordances, as the term has come to be used, though that is probably not the intention Gibson had in mind. Here, the term refers to perceptual–functional units, where form suggests function, or at a minimum, action or interaction. Round things roll, square ones do not. Long, thin things are good for reaching. Squat solid vertically extended things are good for support. Bowl-like things suggest containment. Flat thin horizontally oriented things afford placing smaller objects. Manufactured things, of course, are designed to be functional, likewise their parts, so it is no accident that their parts suggest their functions.
21.3.5 Function for thing and function for user: natural kinds vs. artifacts
Note two different senses of function, one for living things, the other for artifacts. For living things, function means in the service of the thing, instrumental to its needs and in some cases wants. The leaves of a tree conduct photosynthesis, the roots bring water and nutrients; both combine to nourish the tree and keep it alive. Thus the separate parts subserve separate behaviors of the organism. Artifacts, by contrast, are created by humans for human use; their origins are to be functional for humans (see also Bloom, 1996, 1998). Their separate parts serve humans in different ways. A hammer’s handle is for grasping and its head is for pounding. The seat of a chair supports one part of the human anatomy, the back another part (not by coincidence referred to by the same names); the legs of a chair support the seat and back so they can support the human user. For artifacts, then, the separate parts subserve separate activities or behaviors of the user. People regard and in effect manufacture some living things like artifacts, so that a chicken’s legs serve people differently than they serve chickens, banana skins serve people differently than they serve bananas, and tree trunks serve people differently than they serve trees. For living things, function is what they do; for artifacts, it’s what we do with them. The Spanish poet, Antonio Machado, expressed the duality of parts elegantly, coming down in favor of function for thing: ‘The eye you see is an eye/not because you see it;/ it’s an eye because it sees you’ (Machado, 1982, p. 177).
Form and function, appearance and use, are correlated for both cases, living things and artifacts, for both senses, function for thing and function for user. Not only are form and function correlated, they can be causally related. Function can often be inferred from form, in part because form can determine function. Tilted surfaces are not as suitable for sitting or putting as horizontal ones. Spheres are not as good for reaching as cylinders. Half spheres with hollow side up allow containment; upside down ones do not. Cylinders of a certain size make good handles. Long things are better for reaching than round things. Small, jointed extended parts like fingers are good for fine manipulations; large, jointed extended parts like arms are good for coarse actions. Round things roll, square things do not. Rough surfaces create more friction than smooth ones, useful for some purposes, a hindrance for others. For these, perception and action are coupled, spatial properties afford, in the recent sense, functions.
21.3.6 Parts promote inferences from structure to function
The dual status of parts, as perceptual and functional features, means that they can serve as a link from appearance to use, that is, they can support inferences from form to function. Perceptual features are relatively immediate, they are apparent from static solo objects. Functional features are relatively less immediate. Some may be inferable from solo static objects in interaction with other objects. Support, containment, and attachment are among them. Of course these inferences may turn out to be mistaken, a table might be built into a wall rather than supported by legs, a bowl may have no bottom. Other functional features seem to be inferred from action, pushing, pulling, lifting, pounding. Still others seem to be knowledge-based, that seeds germinate and produce new plants, that apple peel is edible but banana peel is not, that both peels protect the pulp. Function also seems to require an end, a goal, a purpose, a use. Or that’s how we talk about it, rather than in terms of the spatial or action relations (Zacks, Tversky, and Iyer, 2001).
Functional attributes, because they are less immediate, should take longer to learn than attributes based on appearance. This may be the reason that children readily form basic level categories such as apples and cars at a young age, but form superordinate categories such as fruits and vehicles later (Rosch, Mervis, Gray, Johnson, and Boyes-Braem, 1976). Basic level categories can be formed on the basis of appearance, notably shape. Superordinate categories do not share perceptual features; rather, they share function. Consistent with this idea, children tend to form superordinate categories that share parts earlier than superordinates that do not share parts (Tversky, 1989). The shared parts increase perceptual similarity, facilitating grouping. At the same time, they encourage the realization of common function, facilitating the insight that function can underlie useful categorization.
Parts, then, seem to enable the convergence of the multitude of cognitive operations on the basic level. Because they are components of appearance, they account for the operations that depend on appearance. Because they are components of function, they account for the operations that depend on behavior. Breaks in language should follow the natural—and correlated—breaks in appearance and behavior. Categories themselves serve many functions in our lives. Primary among them are referring and inferring. Referring means identifying category members. Inferring means knowing what to do with them or what they do, that is, what properties they have.
21.3.7 Parts and category coherence: parts are the core of theories, knowledge, and causal reasoning
People do not regard categories as bundles of unrelated features. For one thing, in natural categories, features are correlated; things that have feathers fly and lay eggs, things that have fur bear their young alive (Malt and Smith, 1983; Rosch, 1978). Features are perceived to cohere, to be related in sensible ways. Several accounts for the perceived coherence of attributes for common categories have been proposed. One proposal is that people have theories about categories, albeit not the formal theories of physicists (e.g. Murphy and Medin, 1985); another proposal is ‘knowledge’ (e.g. Keil, 1989; Murphy, 2000); and a third is causal models (e.g. Ahn, 1998; Rehder, 1999). Significantly, the theories, knowledge, or causal reasoning proposed to support the claims are reasoning about parts, that is, assertions or hypotheses about the different functional roles of different parts, frequently accounting for part function from part form. Wings enable birds to fly, with feathers to augment their aerodynamic function. Wheels, because they are round, enable cars to move. Legs, solid vertical extensions, support tables and horses allowing them to remain upright.
Theory-making and causal reasoning begin by noticing regularities in the world. The regularities that people notice are many; proximity in space or time, and similarity, both observed and conceptual, are but some of them. Noticing regularities promotes a search for explanation, to predict and perhaps to control. Explanations begin by dividing the entities underlying the regularities into parts, in part to simplify reasoning, in part reflecting a belief that different parts have different functions and that the coordinated interaction of the parts yields coherent action. These beliefs or conjectures relating parts to function form the core of the causal theories or knowledge or causal reasoning proposed as underlying the coherence or cognitive glue of categories. They are just that, beliefs and conjectures, they may be correct, they may be erroneous. They serve as initial hypotheses, starting points for scientific investigation. They may later be substantiated or contradicted. Given the inferential power of parts, it is not surprising that parts are the dominant feature of basic level categories and that parts perceived as good are those having perceptual salience and functional significance.
21.3.8 Functional inferences from perception: expertise
If inference from appearance to function is knowledge-based, then just as it characterizes the difference between younger and older children, it should also characterize the difference between novices and experts. Nowhere has more thought been devoted to relations between form and function than in architecture and design. Indeed, form and function can be regarded as the major theme of twentieth-century architecture. Architects use their own sketches to develop and refine design ideas (e.g. Schon, 1983). But sketches are exactly that, schematic, so understanding them requires making inferences from them, about function as well as appearance. Novice and experienced architects differ in the inferences they make from their own sketches (Suwa and Tversky, 1997, 2001). Student and expert architects were asked to design a museum with certain specifications on a particular location. The sketching sessions were videotaped. After the design session, they viewed their own videos while describing the thoughts that motivated each stroke of the pencil. These protocols were analyzed in detail, yielding detailed portraits of the act of design. As hypothesized, experts made more functional inferences from their sketches than novices. Specifically, experts were able to anticipate patterns of circulation and of light from the sketched structure. Notably, these are two of the major concerns of architects and architectural theory. Space syntax, for example, is a research enterprise that predicts circulation patterns of people from grid structures of environments (e.g. Hillier, 1999, 2000). Similarly, formal analyses of building configurations account for the distribution of natural light (Steadman, 2001).
Other novice–expert differences support the idea that with knowledge, people come to make functional inferences from perceptual form. Chess experts can ‘see’ a history of moves in a mid-game display (Chase and Simon, 1973; De Groot, 1966). Expert sight readers can ‘hear’ a musical score. Most of us are expert readers; we readily derive sound and meaning from meaningful text, provided it is in an alphabet that we know. These visual symbol systems (see Goodman, 1968) are extreme cases where function or meaning is assigned to visual marks.
21.3.9 Other categories: scenes
The coupling of appearance and function occurs for other categories important in everyday life as well as for objects, notably for environments and events. Like objects, environments, or the settings that objects appear and are used in, have a basic level, the level of school, home, and store for indoor scenes and the level of beach, mountains, and cities for outdoor scenes (Tversky and Hemenway, 1983). Significantly, people regard the different basic level scenes as sharing specifiable features of appearance, notably, parts, as well as allowing specifiable activities. Schools have desks, chairs, and books as well as cafeterias and yards. They permit a variety of educational activities as well as eating and recreation. Stores have merchandise and aisles and cash registers to support purchasing and paying. Schools and stores differ in appearance and in activity, in form and in function.
21.3.10 Other categories: people
Even more than most categories, categories of people can be naturally classified in many ways. Race, gender, and stereotype have been studied by various researchers. Inspired by observations of Brown that the categories of people important for our daily lives are those based on relationships, Shayer and I (unpublished data) asked people to generate categories for two presumed superordinates, people you know and people you don’t know. For people you know, informants listed categories such as friend, parent, and cousin; that is, the categories were based on personal, social relations. For people you don’t know, informants listed categories such as lawyer, cashier, and sales clerk. These categories were based on services and impersonal social relations. Other informants generated subcategories for each of these. Subcategories for people you know were individuals, particular friends, parents, and other relatives. Subcategories for people you don’t know were genuine subcategories, corporate lawyer, supermarket cashier, department store clerk. Notice that these categories, except for the individuals, are nearly purely functional. The categories are based on the roles these people play in our lives, various personal relationships for the known, and various service or economic or other professional relationships for the unknown.
21.3.11 Other categories: events
For events, function plays a role parallel to its role in objects. Events can be regarded as a temporal analog of objects, with parts extended in time rather than space. Like objects, they can form taxonomic hierarchies of kinds, for example, going to entertainment or a health professional or shopping or school (e.g. Morris and Murphy, 1990; Rifkin, 1985; Rosch, 1978). Like objects, events can also form partonomies, hierarchies of part-of relations (e.g. Abbott, Black, and Smith, 1985; Bower, Black, and Turner, 1979; Schank and Abelson, 1977; Tversky and Hemenway, 1984; Zacks, Tversky, and Iyer, 2001; Zacks and Tversky, 2001). Following a procedure developed by Newtson (1973), participants watched videos of mundane events like making a bed or assembling a saxophone, pressing a button every time they thought one event unit ended and another began (Zacks et al., 2001). They did this twice, in counterbalanced order, once for the largest units that made sense and once for the smallest units that made sense. Half the participants described what happened for each segment.
Ninety-five percent of the descriptions of both coarse and fine units of events were actions on objects, for example, putting on the bottom sheet or putting on the mouthpiece. The remaining 5 percent described the entrance or exit of the actor. That is, all of the segments were regarded both perceptually, as specific actions, and functionally, as accomplishing a goal. The coarse level was segmented by higher level goals or functions; the fine level, by lower level goals or functions. The natural parts of events, like the natural parts of objects and scenes, are appearance–function units.
21.3.12 Abstract categories
What about abstract categories: disciplines, such as psychology or linguistics, and institutions, such as governments and corporations? These categories also have structure, analogous to appearance in objects, for example, the power and decision-making units and their interrelations in institutions. They have parts that can be identified by ‘appearance’ and that are differentiated in function. Think of the legislative, executive, and judicial branches of the government. Or perception, memory, and personality in psychology. Think of the uniforms, offices, tools of various branches of the military or an airline or a restaurant or a hospital. Think of the structures of control in corporations and governments. Theories about democracy, flexibility, conflict resolution, and more rest on analyses of the structures of the organizations, for example, hierarchical or distributed. As for objects, scenes, and events, in abstract categories as well, parts promote inferences from structure or appearance to function.
21.3.13 Categories: appearance, function, and affordance
What makes the basic level of reference special, Tversky and Hemenway proposed, is that at the basic level, knowledge about parts is most salient. And what is special about parts is that they connect appearance and function, providing category coherence. This connection accounts for the convergence of the perceptual, behavioral, and linguistic operations on the basic level. The perceptual and behavioral are directly linked to appearance and function. The linguistic operations seem to derive from the others. All other things equal, linguistic distinctions should follow salient perceptual and functional distinctions. The favored names to refer to objects, then, are those that pick out both perceptual and functional features. Basic level names are powerful, maximizing informativeness per distinction, relying on appearance for facile recognition, promoting inferences to function.
For a wide range of categories, living things, artifacts, scenes, and events, appearance and function are inextricably related. This is partly because they are often causally related: form can determine function. By evolution or by design, function can determine form. Together, appearance and function comprise affordances. Many post-Gibsonian examples of affordances are not, as he proposed, immediate. Rather, they depend on knowledge. Appearance, including spatial properties and relations, is more readily perceived than function, as appearance can be perceived from static objects whereas perceiving function typically entails observing or knowing about objects in action or interaction. With increasing experience, developing expertise in a domain, function can be inferred from appearance. Function is not simply activity, action. It is action in the service of a goal, a purpose. The same goal can be attained by different actions.
21.4 Naming Emphasizes Function
There are many routes to categories in the mind. Prominent among them are the perception of an object and the name of an object. Object names may bias different aspects of categories. In particular, there is provocative evidence that naming favors functional features of categories (Tversky, Morrison, and Zacks, 2001). This is apparent from our work on the categories of bodies and events.
21.4.1 Bodies
Bodies are a special kind of object; like objects, bodies are experienced from the outside, but unlike objects, bodies are experienced from the inside, known kinesthetically in addition to visually. Biological motion is perceived differently from mechanical motion (Reed and Farah, 1995; Chatterjee, Freyd, and Shiffrar, 1996). Like objects, bodies have parts that vary in size, salience, and function. Which of those factors determines speed of recognizing object parts? Morrison and I (1997) asked this question in two types of task. One task was a perceptual task in which participants viewed pairs of bodies in a variety of poses and positions with a part highlighted; their task was to indicate, as rapidly as possible, whether the parts highlighted were the same. The second task entailed naming. Participants read the name of a body part and viewed a body with a part highlighted; their task was to indicate, as rapidly as possible, whether the named and highlighted parts were the same. We chose the parts widely named across languages: head, feet, arm, hand, leg, chest, and back.
Three theories may account for relative accessibility of body parts. According to an imagery account, larger parts should be verified faster, as they are identified faster in imagery (Kosslyn, 1980). According to theories of object recognition based on parts (Biederman, 1987; Hoffman and Richards, 1984), salient parts are those more discontinuous with shape contour, so parts that have greater contour discontinuity should be verified faster. Finally, according to a part significance account (Tversky and Hemenway, 1984), parts that are more significant should be verified faster. Significant parts are those that are perceptually salient, that is, with greater contour discontinuity, and functionally important. Of the three theories of recognition time, size failed miserably, contrary to notions popular in imagery that large parts are identified faster than small ones. In fact, size of part correlated negatively with speed of recognition in both tasks. As the work on part goodness showed, part salience, that is, contour distinctiveness, and functional significance are positively correlated, for bodies just as for objects. In fact, both correlated with recognition speed for both the body–body and name–body tasks. One of the parts considered appears to have functional significance without contour distinctiveness, namely, chest. It is significant because it contains important internal parts; it is the front of the body, the side that engages the world perceptually and behaviorally; it also enjoys a relatively large space in the sensori-motor homunculus in the brain. Indeed, for the name–body task, chest is recognized relatively quickly, but for the body–body task, it is recognized relatively slowly. For the perceptual task, part salience accounted for recognition speed better than part significance, but for the naming task, functional significance accounted for recognition speed better than salience.
Naming seems to call attention to functional as well as perceptual properties, more than does the appearance of a body alone.
21.4.2 Events again
Earlier, we described research in which observers segmented mundane events such as making a bed at coarse and fine levels. Some observers described what happened during each segment; others merely segmented (Zacks et al., 2001). The vast majority of descriptions were actions on objects. Another question of interest is whether the segmentation was hierarchical, that is, did the high level boundaries coincide with the low level boundaries more often than chance? It could be that the higher level units were functionally defined, by goals, and the lower level units were perceptually defined, by large changes in activity, and that these might not coincide. In fact, segmentation was hierarchical, both within and across observers. What then is the effect of describing on hierarchical structure? Describing adds another task to segmenting; it could interfere with segmentation, making it more random, hence less hierarchical. On the other hand, describing brings to mind top-down functional information; focusing attention on function could yield more hierarchical segmentation. In fact, the degree of hierarchical coding was considerably greater with concomitant describing than without. It seems that events are segmented on the basis of both bottom-up perceptual information and top-down functional information. For events, language calls attention to top-down, functional processes, reflected in tighter hierarchical organization. Language not only directs attention to functional aspects of events, but semantic access also appears to be necessary for appropriate interaction with objects (Creem and Proffitt, 2001). In the presence of a semantic distractor task, but not in the presence of a visuo-spatial distractor task, participants grasped objects awkwardly, presumably because the distractor task interfered with their ability to access functional information about the object. For events as for bodies, language arouses functional aspects that are not readily in evidence from perception.
21.4.3 Parts, categories, and functions
Categories serve many functions. Primary among them are referring and inferring. Referring picks out objects in the world as category members. Inferring provides information about what the objects can do. Referring depends on identifying instances, on appearance. Inferring supplies functions. Categories are most informative at the basic level of abstraction, the level of chair and pants and car and tree and bird. Parts constitute the vast majority of features people provide for basic level categories. Parts are simultaneously elements of appearance and function. They afford inferences from structure to function, which form the core of the theories or causal reasoning that make categories cohere. Language further facilitates reliance on function as it elicits functional aspects of things in the world.
21.5 Referring to Spatial Relations
So far, we have concentrated on the roles of appearance, function, and affordance in spatial language, reasoning from research on objects, scenes, and events that these are related, in many cases, causally related. These causal relations serve us well, promoting inferences from appearance to function. Now we expand to influences of context and schemas on using spatial language, drawing on our own research, neglecting numerous other fine studies. We shall first show that use of some of the most basic spatial relation terms, for example, front and back, depends not just on spatial properties, but on function, context, and schema as well.
Many languages describe the locations of objects in space by using terms that refer to spaces projected from the sides of a body or other object, for example, front, back, left, right, above, below. From purely geometric considerations, these directions should be equally accessible, that is, times to comprehend all directions should be the same. They are not. Rather, their accessibility depends on properties of the body as well as properties of the world (both perceptual and functional properties) and the relation of the body to the world, that is, context. When an observer is described as upright, objects in the regions beyond head and feet are most accessible, followed by objects to front and back. Those to left and right are slowest. This pattern changes when an observer is described as reclining. In that case, objects to front and back are more accessible than those to head and feet. As in the upright case, objects to left and right are least accessible.
The Spatial Framework Theory offers an explanation for this pattern of accessibility (Franklin and Tversky, 1990). In the paradigm situation, participants study narratives describing you, the observer, as situated in an environment such as an opera house or museum or barn, surrounded by objects to head, feet, front, back, left, and right. After learning the situation, participants read that they are turned to face another object. Then they are probed for the objects currently at all sides of their body with the direction terms. According to the Spatial Framework Theory, in order to keep track of the locations of objects around the body, people construct a spatial mental model consisting of extensions of the three body axes, and associate objects to them. The axes of the body vary in accessibility, depending on appearance and function. For an upright observer, the head/foot (or above/below, it makes no difference) axis is fastest because it is an asymmetric axis of the body and because it is correlated with the only asymmetric axis of the world, gravity. In both cases, the asymmetries are perceptual as well as functional. Heads and feet look different and act differently. Gravity causes asymmetries in the way the world appears and in what can be done in the world.
Front/back is next fastest because it has important asymmetries, among them, separating the world that can be viewed and manipulated from the world that is not easily viewed or manipulated. Left/right has no salient asymmetries, and is slowest.
When the observer is described as reclining and turning from side to front to back to side, no axis of the body correlates with gravity, so accessibility depends only on the asymmetries of the body axes. In this case, front/back is faster than head/feet, presumably because its perceptual and functional asymmetries are more important than those of head/feet.
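The ordinal predictions just described for the upright and reclining observer can be written down compactly. What follows is only a minimal sketch in Python, not part of the original studies: the names `AXIS_ORDER`, `predicted_axis_order`, and `is_more_accessible` are hypothetical, and the sketch encodes only the rank order of the body axes stated in the text, not retrieval times or the mechanisms behind them.

```python
# Illustrative only: rank order of body axes, from most to least accessible,
# as stated in the text for each described posture. All names are hypothetical.
AXIS_ORDER = {
    # Upright: head/feet aligns with gravity and is asymmetric, so it is fastest.
    "upright": ("head/feet", "front/back", "left/right"),
    # Reclining: no body axis aligns with gravity; front/back is fastest.
    "reclining": ("front/back", "head/feet", "left/right"),
}


def predicted_axis_order(posture):
    """Return the body axes ordered from most to least accessible."""
    if posture not in AXIS_ORDER:
        raise ValueError(f"no prediction encoded for posture: {posture!r}")
    return AXIS_ORDER[posture]


def is_more_accessible(posture, axis_a, axis_b):
    """True if axis_a is predicted to be retrieved faster than axis_b."""
    order = predicted_axis_order(posture)
    return order.index(axis_a) < order.index(axis_b)
```

Under these assumptions, `is_more_accessible("reclining", "front/back", "head/feet")` holds, whereas the same comparison for `"upright"` does not; left/right is last in both cases.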
In the original paradigm, the spatial mental models were established as well as tested through language. The basic situation was a person surrounded by objects to the six sides of the body. This situation lends itself to variation, both in the spatial setting and in the medium conveying the information. Many variations have in fact been tested. The spatial array can be in front of the observer rather than surrounding the observer, the external case (Bryant, Tversky, and Franklin, 1992). In the external case, all objects are in front of the observer, including those described as ‘front’ and ‘back’. This contrasts with the internal situation where the ‘front’ object is in front of the observer, but the ‘back’ object is behind, in principle, hidden. For the case where the array is external to the observer, times for front and back are about equal, but for the case where the observer is internal to the array, times to front are faster than times to back. This is complemented by a finding of Franklin, Henckel, and Zangas (1995). They asked participants to indicate where ‘front’ is relative to themselves and relative to a doll. The area for self was much larger than for other. In another variant, whose effects are consistent with the spatial framework model, narratives described the environment as rotating around the observer, rather than the observer turning in the environment (Tversky, Kim, and Cohen, 1999). Although formally the same, the situations differ conceptually: the normal situation in the world is that observers move and environments are stationary.
Other variants have two characters, surrounded by the same or different objects, facing the same or different ways (Franklin, Tversky, and Coon, 1992). Here, too, the spatial context systematically affects retrieval times. When narratives describe the observer as above the environment, viewing both characters, participants’ body axes no longer correspond to axes of the characters in the scene, and all directions become equally accessible. When narratives describe each character’s viewpoint in turn, participants adopt their viewpoints and the canonical spatial framework pattern emerges. The spatial setting can be conveyed by diagrams or models (Bryant and Tversky, 1999) or by actual perception (Bryant, Tversky, and Lanca, 2001) as well as narrative. For flat diagrams, participants adopt an external viewpoint but for three-dimensional models, participants adopt an internal viewpoint. Instructions to adopt different viewpoints can override the effect of medium and reverse those patterns. Whether the situation is acquired by perception or by description does not lead to changes in the patterns of retrieval times as long as testing is from memory.
All in all, systematic variations in spatial arrangements and in mode of presenting the spatial arrangements yield systematic variations in the patterns of retrieval times for the same basic spatial terms. These variations can be accounted for by variants of the Spatial Framework Theory. They illustrate contextual or schematic effects on comprehension of spatial relation terms.
21.5.2 Production: describing complex environments
Clearly, context and schemas affect comprehension of spatial language. They also affect choice of spatial language. When people describe large environments, such as a museum or a town, they typically adopt one of two perspectives, or a combination of both. Perspectives involve reference frames, reference objects, and terms of reference. In a route perspective, descriptions take listeners on an imaginary tour of the environment, adopting a viewpoint from within and locating landmarks with respect to the traveler in terms of right, left, front, and back. In a survey perspective, descriptions take a viewpoint above the environment, locating landmarks with respect to each other in terms of north, south, east, west. Features of the environment—context—influence the perspective adopted. Route perspectives are relatively more common when the environment has a single path and landmarks on a single size scale (Taylor and Tversky, 1996). For such environments, it is relatively easy to describe all the landmarks by imagining traversing a single route.
21.5.3 Production: which potted palm hides the diamonds?
A simple paradigm developed by Schober (1995) required a speaker to specify one of two identical visible objects to an interlocutor. Speakers normally adopted listeners’ perspectives, perhaps from politeness, perhaps to ease cognitive load. We enriched that situation by providing landmarks and/or cardinal directions on some trials and by varying the relative amounts of information speakers and interlocutors had about the spatial situation (Mainwaring, Tversky, Ohgishi, and Schiano, 2003; Tversky, Lee, and Mainwaring, 1999). Participants were told that they were spies, and asked to transmit to their partners which of the identical objects contained the hidden microphone or letter by providing a brief, clear message in a secret tiny communicator. Easing shared cognitive load seemed to motivate most of the choices of spatial relation terms. That in turn depended on both the spatial arrangement of objects and participants and the spatial terms that could specify the target. When the situation allowed it, participants preferred terms like ‘near’ to projective terms that require computing a direction, like ‘right’ or ‘north’. Participants preferred terms like ‘front’, that are relatively easy to produce and comprehend because of body asymmetries, to terms like ‘left’ and ‘right’, that are relatively hard to produce because of lack of body asymmetries. For some spatial layouts, the target object could be disambiguated only by taking the perspective of one of the spies. When the speaker knew more about the situation, speakers tended to adopt the other’s perspective. Alleviating the cognitive load of the interlocutor would presumably increase overall accuracy of communication. When cognitive loads of the conversational parties were equal, there was no preference for others’ perspectives.
Although Japan is thought to be an especially polite culture, Japanese participants showed exactly the same pattern as Americans; that is, the Japanese did not show a greater tendency to adopt the perspective of the other.
Overall, minimizing joint cognitive load seemed to account for choice of reference frame and reference object. But the determinants of joint cognitive load depend on context and shared schemas and the difficulty of producing and comprehending terms to describe it, which in turn depend on spatial and functional features of the body and the world.
21.5.4 Spatial language and function
Spatial language is used in a variety of contexts, and use of spatial language depends on the context. Context includes not only purely spatial aspects of the setting but also functional aspects. These functional aspects affect both use and comprehension of spatial language, especially referring expressions.
21.6 Parting Words
We are back where we started, having shown some of the ways that terms that refer to entities and spatial relations among entities reflect appearance and function, thus affordances, as well as context and schema. The world we talk about is replete with appearances and functions and contexts and more. Talk necessarily schematizes the world we experience. How simple things would be if we could reduce spatial language to spatial templates. Simpler still if we could do it by merely adding function. Yet such simplicity would just serve simple-minded language and simple-minded theories. It would not account for language use or understanding. Indeed, despite all the nuances and subtleties that our unruly languages provide, we are often at a loss for words.
(*) This chapter is based in part on collaborations with Kathy Hemenway, Jeff Zacks, Masaki Suwa, Nancy Franklin, David Bryant, Julie Morrison, Holly Taylor, Scott Mainwaring, and Nancy Shayer. I have enjoyed and benefited from their thinking beyond the particular collaborations. Portions of the research and preparation of the manuscript were aided by Office of Naval Research Grants N000140PP-1-0649 and N000140110717 to Stanford University.