Understanding Events: From Perception to Action

Thomas F. Shipley and Jeffrey M. Zacks

Print publication date: 2008

Print ISBN-13: 9780195188370

Published to Oxford Scholarship Online: May 2008

DOI: 10.1093/acprof:oso/9780195188370.001.0001


An Invitation to an Event

(p.3) 1 An Invitation to an Event
Understanding Events

Thomas F. Shipley

Oxford University Press

Abstract and Keywords

This chapter begins with a discussion of the significance of events. It then focuses on four traditional object-perception issues and considers potential analogies to event perception. These are segmentation and grouping; what makes two events similar; representation; and feature binding. An event taxonomy is then presented. It is argued that events, like objects, have boundaries. These boundaries may reflect the statistical structure of events as we have experienced them, either individually or as a species.

Keywords: event perception, object-perception, segmentation, grouping, representation, feature binding

What is an event? An event may be miraculous, mysterious, seminal, even divine—and of course, to paraphrase Ecclesiastes, there is one event that happeneth to us all. Of what do we speak when we say event? The Oxford English Dictionary (OED) offers the following: “Anything that happens, or is contemplated as happening; an incident, occurrence . . . In mod. use chiefly restricted to occurrences of some importance . . . In sporting language: Something on the issue of which money is staked . . . That which follows upon a course of proceedings; the outcome, issue; that which proceeds from the operation of a cause; a consequence, result.” The term can also be used as a verb: “To come to pass . . . To expose to the air . . . To vent itself . . . To take part in horse trials.” There are also a few notable compound phrases that incorporate “event.” A compound event is “one that consists in the combined occurrence of two or more simple events.” An event horizon is “a notional surface from beyond which no matter or radiation can reach an external observer.” An event-particle is “one of the abstract minimal elements into which, according to A. N. Whitehead, space-time can be analysed.”

How does this help us? It illustrates the breadth of the field of inquiry. In this book we attempt to come to an understanding about how humans think about anything that happens. The hubris of such an (p.4) undertaking may be breathtaking, but this is not an attempt to provide a psychology of everything. The event set is not the set of all things. However, events are such a large portion of “everything” that it might be useful to consider what things are not events. Events are things that happen; events require a reference to a location in time (but not necessarily a point in time). Things that exist outside of such a temporal reference—let’s call them objects—are not events. Physical objects (e.g., apples, mountains, clouds) are clearly not events, and so too psychological objects (e.g., ideas, concepts, goals) are not events. An object in isolation is not an event; events occur when objects change or interact. An apple is not an event, but an apple falling is. Likewise, the idea of gravity is not an event, but having the idea was an event. This usage of “object” differs from the way many use the term; it has the advantage of avoiding some messy issues such as events being prototypically short-lived and insubstantial, whereas objects are enduring and substantial. Here the continued existence of an object is an event (albeit a rather dull one), as it requires reference to time. The changeless existence of an object may not be a prototypical event, but it appears to have some psychological importance, as some languages explicitly mark the mutability of an object (Clark, 2001); these languages distinguish between temporary and longer-lasting states (e.g., between being wet and being made of stone).

Events may be brief or long, spanning temporal scales from fractions of a second (e.g., playing a note on a piano) to millennia (e.g., forming the Grand Canyon) and beyond. Events at these different scales may fall into different functional classes. Those on the scale of seconds may be particularly relevant to issues of safety and general motor coordination, while longer events may be more relevant to goals and plans. Very long events, such as species evolution, while conceivable, may be so removed from familiar human event scales that they are difficult to understand or think about.1

The variety of things that can be described as events means that events also span the domains of psychology. If one looks at the history of psychology, one finds little theoretical or empirical work that is framed in terms of events. Notable exceptions include the theories of the Gestalt (p.5) psychologists (but these led to relatively little experimental work) and Gibson’s ecological optics. I think event avoidance by both experimentalists and theorists reflects the practical difficulties of working with events. Researchers have been searching for truth where they can see. The advent of computers allows us to control events with greater flexibility than ever before. It is time to seriously consider the appropriate place for events in our science. At the risk of being overly polemical, events appear to be a fundamental unit of experience, perhaps even the atoms of consciousness, and thus should be the natural unit of analysis for most psychological domains. In perception, wasn’t Gibson mostly right in saying that the really useful information about the world is provided over time (by motion of objects and observer through space, and by moving our eyes and head)? In cognition, to understand thinking, mustn’t we try to understand how humans and other animals store information about events, retrieve that information, and use it to make plans about future events? Let me put the argument another way: Suppose humans were unable to process events. Imagine how difficult life would be. One would have to live moment to moment, without plans or expectations. Action at each instant would require an evaluation of the current static situation. How could one move out of the way of moving vehicles, or remember to bring milk home, or even have a sense of whether one is going to work early or returning home from work late? It has been hypothesized that schizophrenics are compromised in their ability to segment events (Zalla, Verlut, Franck, Puzenat, & Sirigu, 2004). 
If so, is it surprising that schizophrenics find it hard to interpret the actions of others? Whether the segmentation deficit is causally related to the symptoms of schizophrenia is up in the air; nevertheless, it illustrates the point: to not be able to think about events would make functional behavior nigh impossible.

Each event, each happening, reflects some interaction among objects (remember that here “objects” includes both physical and mental objects). To understand, remember, talk about, or otherwise act with respect to the event, the interaction must be perceived. Many readers may object to the use of the word perception to characterize the processes involved in event processing, wishing instead that I had used cognition, for surely any process that spans time—with environmental input at one point in time influencing processing at a later point in time—must require memory and therefore must be cognitive. Personally, I do not find the distinction between perception and cognition very useful or easy (p.6) to make. Once one admits to perception of motion, where the location of an object at one time influences processing at a later time, one has opened the door, inviting processes that span time (and thus events) into the house of perception. Once opened, it is hard to close this door, to exclude processes that take too long and thus must be cognitive. We may each have our personal limits, when we start to get embarrassed and wish to close the door—perhaps after 200 milliseconds, 1 second, or 10 minutes; there simply is no clear break in time where perception ends and cognition begins. On the positive side, thinking of the processes as perceptual offers an interesting perspective, and potential lines of research may be developed by analogy to previous work in perception. I attempt to offer a few such examples in this chapter.

It may be useful to start by considering just what we are speaking of when we say “event.” Perceptual psychologists have often found it useful to distinguish between physical properties of the world and our experience of them by giving them different names. For example, “luminance” refers to the physical intensity of light and “brightness” to the psychological experience of an amount of light. In cases where the words for the physical and psychological aspects of the world are identical (e.g., “contrast”), particular care must be taken to be clear about which meaning, physical or psychological, is intended. The usage of “event” presents us with this problem: it is not always clear when “event” refers to an occurrence in the world and when it refers to our experience of things happening in the world.2

(p.7) What does it mean to perceive an event? In 1976, Ulric Neisser attempted a marriage of Gibson’s perception as information pickup and information processing where detection of invariants in the ambient optic array was guided by knowledge gained from past experience. The interplay of knowledge and information pickup was characterized as a perceptual cycle. The continuous nature of a cycle captures an important aspect of event processing (Hanson & Hanson, 1996). In this cycle, information is picked up from the environment, and this then leads to recall of similar objects or events from the past; this knowledge, in the form of a schema, directs attention and guides future information pickup (Neisser, 1976; see also Reynolds, Zacks, & Braver, 2004). If incoming information fails to agree with expectations (i.e., the schema has failed to predict the evolution of an event or the content of a scene), then attention will be allocated to picking up further details. A model of how this might be achieved, including candidate neural structures, has been proposed by Zacks, Speer, Swallow, Braver, and Reynolds (2007). As events unfold there will be points of decreased predictability during which the organism is particularly likely to search for information to allow more accurate future predictions. These points of decreased predictability will be points of segmentation (e.g., after a goal has been achieved) (Wilder, 1978a, b).
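The link between decreased predictability and segmentation can be illustrated with a toy computation. The following is a minimal sketch under stated assumptions, not the model proposed by Zacks et al. (2007): the "event" is a hypothetical one-dimensional position signal, the "schema" is a simple expectation that the object keeps its current velocity, and the threshold is an arbitrary illustrative value. A boundary is marked wherever the prediction fails by more than the threshold.

```python
def prediction_errors(positions):
    """Error of a constant-velocity prediction at each time step from index 2 on."""
    errors = []
    for t in range(2, len(positions)):
        # Assumed "schema": the object keeps moving as it did on the last step.
        predicted = positions[t - 1] + (positions[t - 1] - positions[t - 2])
        errors.append(abs(positions[t] - predicted))
    return errors


def segment_boundaries(positions, threshold):
    """Time indices where the prediction error exceeds the (hypothetical) threshold."""
    errors = prediction_errors(positions)
    return [t + 2 for t, err in enumerate(errors) if err > threshold]


# A ball that rolls right at constant speed, stops, then rolls back.
trajectory = [0, 1, 2, 3, 4, 4, 4, 3, 2, 1]
boundaries = segment_boundaries(trajectory, threshold=0.5)
print(boundaries)  # [5, 7]
```

In this toy case the two boundaries fall where the ball unexpectedly stops and where it unexpectedly starts moving again; the stretches of steady motion and steady rest in between produce no prediction error and so are treated as continuous.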

We perceive events in order to anticipate the future, and we use information available in the present to guide future action; in this way we attempt to maintain a perfect coordination of our actions with the world (from looking in the right direction when watching a tennis match to placing our feet so as to avoid injury to small children running around a playground). Breaks in events are, then, the inevitable failures of coordination, points where we are off balance, zigging when we should have zagged. A break in event processing is an opportunity to take in additional information about the world, to reevaluate our models of what is happening and is likely to happen. Persistent difficulty in anticipating the future may lead to use of temporally shorter event models, in effect causing an individual to worry more about the here and now than larger temporal events.3

(p.8) Coordination is not achieved by sight alone; similarly, event perception is not the exclusive domain of vision. Events are almost always multimodal. As vision can provide information about objects at great distances, it will also be able to provide information about events that evolve over space (e.g., approaching objects). Anyone who has watched a silent movie has had the experience of understanding events using exclusively visual information. What do we see during an event? When we look at objects in motion, we do not see motions; as the movie director and the psychologist know, we see actions and interactions—we see emotions and pratfalls. Visual media can communicate quite sophisticated stories. Studies of event perception highlight the level of complexity that is available to perceivers. The seminal studies in this area by Michotte (1946), Heider and Simmel (1944), and Johansson (1973), each cited over 100 times, used displays with simple geometric forms, thereby eliminating potential top-down sources of information about the past behavior of the object.

One of the simplest events—for many philosophers perhaps a prototypical event—is collision. When two objects collide, one object causes the other to move. Philosophers have found much ground to till in their consideration of the nature of causality. In a series of simple experiments, Michotte (1946) investigated the nature of the psychology of causality in simple two- and three-body interactions. He found that subjects reported the experience of causality in highly reduced visual displays, suggesting that causality was a psychological entity: certain spatiotemporal relations result in the perceptual experience of causation, just as certain combinations of wavelengths are seen as red or blue. When observers saw a small dot approach a second dot and stop upon reaching the second dot, and then saw the second dot start to move in the same direction as the first, they reported that one object caused the other to move; when spatial or temporal contiguity was broken, observers reported different sorts of causal interactions. For instance, when the first dot stopped at a distance from the second before making contact, and then the second began moving, observers reported seeing the first dot “launch” the second. When the first object did not stop but moved along with the second, subjects reported “entraining.” These latter forms of causal interaction may reflect sensitivity to basic causal interactions among animate objects. Michotte noted that often the causal experience included an experience of animacy; with entraining, subjects often reported that the (p.9) objects were animate—after a period of viewing, the objects appeared to move on their own. When there is action at a distance, the moving object will be seen as animate, as the interaction appears to reveal both perceptual abilities and an internal source of energy (because there was no spatially proximal cause for movement). 
Launching can occur when the launched object has perceptual abilities, allowing the anticipation of danger and avoidance of an approaching object. Similarly, entraining reflects a coordination of action with a distal object.

Like other perceptual phenomena, these experiences appear cognitively impenetrable to common sense or most world knowledge (Fodor, 1983). Subjects have experienced launching with impossible actors. For example, some subjects were shown a shadow or spot of light moving toward a marble; when it reached the marble, the marble began to move. The subjects reported the interaction as causal—their perception was that the shadow caused the marble to move.

The perception, detection, and recognition of animate biological interactions extend beyond simple following and avoiding to complex social interactions. Heider and Simmel (1944) showed observers films of a circle and triangles moving around a screen. When asked what they saw, almost none of the subjects reported just the temporal sequence of movements. Instead they described events as animate interactions (e.g., “He [triangle-two] attacks triangle-one rather vigorously [maybe the big bully said a bad word]” [p. 247]). These reports seem best characterized as reflecting perception of social causation. Perceivers interpret the pattern of motions, in particular the contingent motions (e.g., following, attacking, or defending), in terms of intents with goals and plans for the action. The perceptually identified goals in turn lead to using personality terms to describe the actors (e.g., “brave” for one triangle and “aggressive” for the other). Believing an object has a goal or intent means perceiving it as animate. Intentions and goals are generally conceived of as internal, invisible, even hypothetical objects. However, they can be seen in the interaction of an animate being with other objects. Thus, to perceive an action with a goal, such as catching a ball, requires picking up on the time-varying relationship between two or more objects. Experience with multiple instances of goal-directed actions may allow categorization, which in turn may subserve the linguistic representations (e.g., verbs and prepositions) of such actions (Hirsh-Pasek & Golinkoff, in press).

(p.10) Research on perception of animacy is relatively limited. Some recent work has focused on the perceptual classification of an object as animate (e.g., Tremoulet & Feldman, 2000). In an important paper, Bingham and colleagues (1995) attempted to characterize the spatiotemporal patterns associated with animate events and contrasted those patterns with the patterns present in spatially similar inanimate events. The majority of studies of the perceptual processing of events that involve animate actors have been conducted under the rubric of biological motion perception.

In 1973, Gunnar Johansson published a short article describing the perception of humans when the only visible elements of a scene were small lights attached to each of the major joints of the body (shoulder, elbow, wrist, hip, knee, and ankle). The experience of seeing a person moving around in these point-light displays frequently produces marvel. These displays are intuitively notable for even a naïve observer because there is a significant discrepancy between the experience of the static display, where the lights appear as a flat constellation of points that rarely resembles anything familiar, and the moving displays, where a three-dimensional moving human is readily recognized. The sense that the point-light displays are impoverished while the perceptual content is rich may be formally analyzed in terms of the spatial structure of the actor. Recovering the spatial structure of 12 points represents a significant achievement from the point of view of degrees of freedom. If each point has six degrees of freedom (location in 3-D [X, Y, and Z] and three degrees for rotation around each axis), then 12 points represents recovery of structure given 72 degrees of freedom. To give a sense of this accomplishment, pretend for a moment that each degree of freedom is (absurdly limited to be) binary (left–right, front–back, up–down, and rotated 0 or 180 degrees in the frontal, sagittal, and horizontal planes); in this case, seeing a person walking means the system arrived at the correct solution from among 4.7 × 10²¹ possibilities.
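The arithmetic behind this estimate can be checked directly. The sketch below assumes that the 12 points are the six listed joints counted on the left and right sides of the body (an inference from the text, not something it states), and it uses the text's deliberately crude simplification that every degree of freedom is binary, so the number of candidate configurations is 2 raised to the 72nd power.

```python
# Assumption: six listed joints, marked on both the left and right side.
JOINTS_PER_SIDE = 6                   # shoulder, elbow, wrist, hip, knee, ankle
POINTS = 2 * JOINTS_PER_SIDE          # 12 point-lights
DOF_PER_POINT = 6                     # 3 for position (X, Y, Z) + 3 for rotation

total_dof = POINTS * DOF_PER_POINT    # degrees of freedom to be recovered

# Simplification taken from the text: each degree of freedom has two values.
configurations = 2 ** total_dof

print(total_dof)                      # 72
print(configurations)                 # 4722366482869645213696
print(f"{configurations:.1e}")        # 4.7e+21
```

Even under this drastic simplification the search space is on the order of 10 to the 21st power, which is the sense in which recognizing a walker from 12 moving dots counts as a computational achievement.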

Beyond the recovery of structure from motion in biological motion displays, observers can also see the action, or what the object is doing. Actions provide basic class information about the objects performing them. For example, the local actions of one point (e.g., one dot that appears to move in a pattern typical of locomotion) allow the visual system to classify an actor as animate or not; such perceptual mechanisms may serve as a simple biological-actor detector (Troje & Westhoff, 2005). (p.11) Action also provides rich information about an object and its intentions. Actions may be globally classified, for example as walking or running, independent of whether the runner is Jesse Owens or someone else. The actions may be further classified in terms of the state or emotion of the actor (e.g., happy or sad) (Dittrich, Troscianko, Lea, & Morgan, 1996; Paterson & Pollick, 2003; Troje, 2002). The perception of action may also include subtle metric properties of an action, such as how far a ball was thrown or how much a lifted box weighed (Runeson & Frykholm, 1981, 1983). Such percepts are possible because there is an intimate relationship between the visible motions in an event and the forces present in an event. An actor may pretend to be lifting something heavy when the object being lifted is actually light. Observers can recognize the intent of the actor (to deceive) as well as the truth of the weight, because the forces needed to lift heavy versus light objects are quite different, and such forces are reflected in the acceleration and locations of the joints of the actors (Runeson & Frykholm, 1981, 1983). There are significant individual differences in actions; these reflect, in part, differences in the structure of the actors. We may recognize our friends based on the way they move (Koslowski & Cutting, 1977) or categorize unfamiliar actors as female or male based on how they walk or throw (Koslowski & Cutting, 1977; Pollick, Kay, Heim, & Stringer, 2005).

Objects and Events

In point-light walker displays, we can see that motion patterns dually specify both the event and the object. Visible motion patterns reflect the shape of an object and how it is moving. Despite the intimate relationship between agent and action, object and event, research has focused on only one of these areas. The field of object perception is mature, with many years of cumulative research and established overarching theories, whereas the field of action or event perception is in its infancy. Here, I take four traditional object-perception issues and, for each, consider potential analogies to event perception.

1. Segmentation and Grouping

The words used by the OED to define events reveal an important constraint on the way humans conceive of events: we think of events as (p.12) things. We use the same language to describe events and objects. The inclusion of the words “incident” and “occurrence” indicates the propensity for segmentation of events. Whether or not there might be some physical basis for segmentation is a topic of debate. For many events, the beginning and end points of the physical event are obscure. To illustrate: When does a wedding begin? When a date is set? When announcements are sent out? With the arrival of the first guest, the last guest, the wedding party, the bride, the groom? Or perhaps at the time set on the invitation? For these reasonable candidates, can the precise instant of initiation be determined? When, exactly, does someone arrive at a wedding? When he or she gets out of his or her car? When the individual first sets foot on the steps of the church? Perhaps we should borrow from horse racing and decree that arrival occurs when the nose breaks the plane of the doorway of the church. My point is not that the concept of a wedding event is incoherent; rather, it is that humans treat events as temporally well bounded, regardless of agreement about the details of the boundaries.

One might reasonably ask, “Why would events appear bounded if they are actually continuous?” The answer, I believe, is that the appearance of boundaries reflects event regularities. Within some classes of events (e.g., physical–causal interactions) the boundaries reflect physical regularities within the event; some changes in the world always occur with certain others. For example, falling precedes collision with a ground surface, which in turn precedes bouncing or breaking. The falling, colliding, and bouncing may be seen as a unit because they co-occur. When things become less predictable, a boundary will be seen. Predictable regularities influence visual processing, and event units are perceived. This idea is expanded upon in Chapter 16. Other regularities may be imperfect, perhaps learned by observation of statistical regularities among components of an event (Avrahami & Kareev, 1994). When components consistently co-occur, we come to experience them as a single event unit.
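One way to make the co-occurrence idea concrete is to estimate, from a stream of observed components, how reliably each component is followed by the next, and to place a boundary wherever that reliability drops. The sketch below is only an illustration under stated assumptions: the component labels, the stream, and the threshold are all hypothetical, and the transitional-probability measure is a common stand-in for statistical regularity rather than the procedure used by Avrahami and Kareev (1994).

```python
from collections import Counter


def transition_probabilities(sequence):
    """Estimate P(next | current) from adjacent pairs in an observed sequence."""
    pair_counts = Counter(zip(sequence, sequence[1:]))
    first_counts = Counter(sequence[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def unit_boundaries(sequence, threshold):
    """Positions i where the step from sequence[i-1] to sequence[i] is improbable."""
    probs = transition_probabilities(sequence)
    return [i for i in range(1, len(sequence))
            if probs[(sequence[i - 1], sequence[i])] < threshold]


# Three hypothetical event units whose components always occur together,
# experienced in a varying order.
falling = ["fall", "hit", "bounce"]
reaching = ["reach", "grasp", "lift"]
drinking = ["pour", "stir", "drink"]
stream = falling + reaching + drinking + falling + drinking + reaching + falling

print(unit_boundaries(stream, threshold=0.75))  # [3, 6, 9, 12, 15, 18]
```

Within each unit every component is followed by the same successor (probability 1.0), whereas the last component of a unit is followed by different things on different occasions (probability 0.5 here), so the low-probability steps fall exactly at the starts of new units.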

Events are experienced as units—units that are the building blocks of conscious experience. Our experience of the world reflects the way we link and keep separate the various events around us. The segmenting and grouping of events in turn reflect physical regularities that stem from the attributes of solid objects (e.g., solidity, opacity, and continuity over time). To begin a study of event unit formation, it is useful to look (p.13) at work on object perception, as the same physical regularities directly influence object and event perception.

In object perception, the basic fact of object opacity and its prevalence in natural scenes requires visual mechanisms that can interpret scenes where near objects partially obscure more distant objects and proximal parts of an object hide its distant parts. The recovery of object boundaries in cluttered scenes is one of the major challenges in computational vision. Perception must segment (identify as separate) optically adjacent pieces of different objects, and it must group (combine) optically separate pieces of a single object. Most accounts of how grouping is achieved rely on a perceptual filling-in process whereby occluded boundaries are completed based on the visible regions (for a review see Shipley & Kellman, 2001).

Occlusion, considered from the perspective of event processing, introduces some interesting processing challenges, and the potential solutions have implications for research on object perception. The opacity of surfaces in natural scenes will cause objects to temporarily go out of sight as they (or the observer) move and the object passes, or appears to pass, behind nearer objects. Accurate perception of a distal event requires recovering the changes occurring in the world despite fluctuations in visibility of the actors. Here, the visible portions of an event must be segmented from other changes and grouped together. Similarly, visual processes are needed to stitch together events fragmented by the movement of the body and eyes of the observer.

The need for segmentation may not be phenomenally obvious, as we tend to experience only one event at a time; nevertheless, many events overlap in a scene, and our inclination to attend to just one of these leads to the impression of sequences of events. Neisser and Becklen (1975) provided a nice demonstration of our ability to segment one event from a flux of events. Subjects were asked to attend to an event (e.g., two people playing a hand-slapping game) seen by one eye while a different event (e.g., basketball players throwing a ball) was shown to the other eye. Subjects accurately reported what was happening in the attended event and had little awareness of what was happening in the unattended event. This ability was undiminished when the two events were shown superimposed to both eyes; subjects could accurately report facts about one or the other event, but not both simultaneously. Selective attention to an ongoing event is analogous to the Gestalt notions of figure and ground in (p.14) object perception. Neisser’s work is a precursor to recent work on change blindness (see Chapter 19 in this volume); as conceptualized here, change blindness occurs because the visual system fails to register an event that is in the background. Is the figure-ground analogy merely descriptive, or does it offer any new predictions? The answer depends on how strong the analogy is—is it possible an event could be reversible, like the perceived image in the well-known face/vase figures? Is it possible that background events are completed with a default transformation, as background surfaces continue under figures?

Grouping event fragments may be thought of as a two-stage process: the first stage involves identifying changes that take place over time and indicate a change in visibility, and the second stage provides the linkage (to a later time or from an earlier time). The first stage distinguishes changes that indicate an object’s continued existence even when it is not visible from changes that indicate an object has gone out of (or come into) existence (e.g., an explosion). The optical changes associated with transitions from visibility to invisibility and vice versa are distinctive and thus can provide information about an object’s existence (Gibson, Kaplan, Reynolds, & Wheeler, 1969) and shape (Shipley & Kellman, 1993, 1994).

The second stage organizes the glimpses of different parts of an event. The perception of simple events will not be disrupted by occlusion. For example, the approach of a predator may be seen as it weaves its way through tall grasses. The linking together of the various fragments of the predator’s path reflects the achievement of basic object constancy—seeing “approach” requires seeing an object viewed at different points in time as the same object. Perception of longer or more spatially scattered events may require integration of multiple spatial and temporal relations. For example, the subevents of a good double play may include the pitched ball being hit, the hit ball being caught and thrown to an infielder, and the caught ball being thrown to another infielder. Although the path of the ball may help stitch together the subevents, observers need not keep all players in sight to appreciate the larger event. An additional complication arises as one moves up the temporal scale: on longer time scales, information about ongoing events may be interwoven. So, it is possible to follow the progress of a baseball game while purchasing a hot dog from a passing vendor. This, too, requires segmenting the unrelated event pieces and linking the spatially and temporally dispersed fragments.

(p.15) The perception of partially occluded objects entails a completion process (Kellman, Guttman, & Wickens, 2001). What little data there are suggest that there is an analogous completion process for dealing with the spatial and temporal fragmentation of events. Michotte, Thinès, and Crabbe (1964/1991) described a case of amodal completion over time. When a moving object disappears behind an occluder and then a moving object appears from behind the occluder, the perception is of a single object moving along a continuous path, if the first and second paths are spatially aligned and the time between disappearance and appearance falls within a certain range. Michotte et al. referred to this phenomenon as tunneling. Event completion is not limited to interpolation of translation. Hespos and Rochat (1997) found that infants accurately anticipate changes in a spinning object that is temporarily out of sight. Whether or not filling-in of event transformations occurs with longer, more complex events is an open and important question.

2. What Makes Two Events Similar?

Recognizing events such as Jesse Owens’s run or Neil Armstrong’s historic small step is a significant accomplishment, despite the collective familiarity of these events, and this accomplishment is not well understood. We may see two events as similar—indeed, some events may appear so similar that we identify them as the same and say we recognize a single event. However, as noted by Zacks and Tversky (2001), we never encounter the same event twice; every event occurs at a unique location in time. Thus, when we speak of event recognition and similarity, we are really in the domain of event concepts and must wrestle with two questions: how are tokens of an event type compared, and what is the nature of event concepts?

How are events compared? The unitary nature of events naturally leads to the consideration of potential parallels to object recognition and comparison. Historically, a central issue in object perception has been shape constancy—how does the system overcome the apparent loss of information when three-dimensional objects are projected onto a two-dimensional retina? The implicit, or at times explicit, assumption was that recognition could only be achieved based on matching an experience at one instant in time with an experience at a later instant in time. In other words, how can an observer recognize an object when, (p.16) inevitably, that object will not project exactly the same image when re-encountered?

Debate on this issue continues with arguments about whether or not object recognition is viewpoint specific (e.g., Biederman & Bar, 2000, versus Hayward & Tarr, 2000), but the debate may be waning (Hayward, 2003). An analogous debate should be raging on how we recognize repeated instances of an event category. Do we encode events from the point of view from which we saw them? Or are representations of event tokens viewpoint independent? Research on event classification by infants (described in detail in Chapter 7) suggests that the ability to appreciate similarities across events develops slowly; successively greater abstraction of event properties occurs as experience of event instances accumulates.

The apparent event-category learning that occurs as infants learn verbs and prepositions offers an exciting potential for illuminating the nature of event categories. In considering how we develop categories for events, one may again look to the central questions in the object-categorization literature. For example, are event categories best conceived of as Roschian prototype-based concepts; feature-based, just-in-time concepts (Barsalou, 1983); or a mixture of different mechanisms? Work in the area of recognition from movement, where individuals may be recognized by the way they move, hints at a prototype-based mechanism (Hill & Pollick, 2000). Subjects were shown point-light actions of several actors. With some training, subjects learned to differentiate between the actors based on their motions, and discrimination improved when displays were distorted to exaggerate differences in the velocities among the actors (e.g., how fast the elbow of actor A moved relative to that of actor B). These latter displays were essentially motion caricatures; the motion distortions are analogous to the shape distortions in facial caricatures, which allow better discrimination than the stimuli used for training because they emphasize the dimensions used to encode the original face as different from a central prototype. The utility of such an approach will be determined in part by success in identifying the dimensions underlying event transformations.
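The caricature logic can be made concrete with a small numerical sketch. It assumes that each actor's movement is summarized as a short list of feature values, that the prototype is simply the mean across actors, and that a caricature scales each actor's deviation from that prototype by a constant gain. The feature values, the gain, and the function names are illustrative assumptions and are not taken from Hill and Pollick (2000).

```python
def prototype(actors):
    """Mean feature vector across actors (the assumed central prototype)."""
    vectors = list(actors.values())
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]


def caricature(features, proto, gain):
    """Scale an actor's deviation from the prototype; gain > 1 exaggerates."""
    return [p + gain * (f - p) for f, p in zip(features, proto)]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Hypothetical peak speeds (arbitrary units) of the elbow and wrist.
actors = {"A": [1.0, 2.0], "B": [1.4, 1.6]}
proto = prototype(actors)
exaggerated = {name: caricature(f, proto, gain=2.0) for name, f in actors.items()}

print(distance(actors["A"], actors["B"]))
print(distance(exaggerated["A"], exaggerated["B"]))
```

Under these assumptions the exaggerated actors lie farther apart in feature space while the prototype itself is unchanged, which is one way to think about why caricatured displays could be easier to tell apart.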

James J. Gibson (1979) argued that any approach to perception based on static configurations was doomed to failure; a successful theory must be based on changes (events), which provide much richer information about the world. Much of the work on the traditionally central issues of perception and cognition reflects the assumption that perceptual (p.17) experience must be based on static information about the here and now, divorced from temporal context. For example, the concern with shape constancy arose from a wrong-headed conception of a system that needed to correct for distortions introduced by projecting an image onto the retina. No correction is needed if visual input is considered over time, Gibson hypothesized, because there would always be optical information available over time for the shape of an object. The task of the vision scientist is to discover what information is present and being used (e.g., what properties of an object are preserved over changes in viewing direction to allow shape recognition). Gibson referred to a property of the environment that was constant over a temporal transformation as a structural invariant. Structural invariants allow object recognition without having to “account” for the transformation. For example, humans can recognize individuals despite aging; to do so, humans must use some structural invariant, some property of the face that is untouched by the ravages of time. This property would remain unchanged despite the surface changes that occur as a human ages. Similarly, some invariant relation must also allow face identification despite the more transient elastic changes associated with changes of facial expressions.

Just as we may recognize a face despite a change in age or emotional state, so too may we recognize the elastic changes associated with each transformation—when we meet someone new, we have a sense of his or her age and emotional state. Robert Shaw (1974, 1975) had the critical insight that the same logic used for object perception could be applied to event perception; there is some property, a transformational invariant, that is present for certain transformations and allows us to identify the transformation even when it is applied to different, novel objects (Pittenger & Shaw, 1975; Shaw, McIntyre, & Mace, 1974).

Faces age in a characteristic manner; the shape changes that occur in skulls as animals grow are relatively consistent—all skulls show similar topological changes with growth (Pittenger & Shaw, 1975). Pittenger and Shaw (1975) identified two changes—shear and strain—in skulls and investigated observers’ use of these as transformational invariants for aging. Subjects were sensitive to the level of shear and strain applied to novel faces. Furthermore, when extreme values of the transformations were applied to faces, simulating ages that humans never achieve, the “super-aged” faces were perceived to be older than those in the normal range. The power of this approach is that it accounts for the (p.18) appearance of age in unusual objects, such as when cartoon animators imbue naturally inanimate objects with age (e.g., creating a cute baby hammer or an elderly toaster). As noted by Pittenger and Shaw, once the transformational invariant is identified, it is possible to apply the transformation to many other things.

A transformational invariant provides the viewer with access to the essence of change. It is the change, duration of life, or degree of smile, divorced from the objects interacting in a particular event. Such information effectively forms the basis for event concepts, making concept formation an essentially perceptual achievement—the discovery of the transformational invariant. Given the limits of attention, it is likely that perception of events is based on the transformation of a small number (fewer than four) of objects or properties of objects. However, expertise may broaden the potential transformations one can process. Just as a chess master may take into account and remember the relative locations of many pieces on the board, an experienced soccer player may pass a ball based on the locations, orientation, and movement of most of his own team and the opposition.

I suspect many readers may balk at the notion of a perceptual basis to complex social events like wars or elections. But complex event concepts may be built upon simpler concepts. These simple concepts may be the early event concepts, which have a perceptual basis and may serve as the building blocks for later ones. Baldwin and colleagues have made just such an argument for the development of attribution of intention (Baldwin, Baird, Saylor, & Clark, 2001).

Where will a transformational-invariant approach lead? Initially, perhaps to many independent research strands, each attempting to characterize the transformational invariants for a particular class of events. Whether these strands weave a structure that offers broader generalizations and understanding of event psychology or an impenetrable snarl of unrelated findings remains to be seen.

3. Representation

Events may be seen as units, but how hard can one push the analogy to objects? May we use research in the object perception literature to guide us down new avenues of research—can we bootstrap research on events using object perception? Consider the critical features of many (p.19) objects—their edges, surfaces, and shape. Do events have analogous properties? To answer, consider why we perceive those features.

Visual processes tend to focus on the edges of objects because objects are relatively homogeneous and the qualities at their edges tend to be predictive of their central regions (e.g., texture and color at the edge tend to match the texture and color of the interior of the object). Thus, edges efficiently provide information about the whole object. There is a class of illusions that occur because the visual system appears to use the value for a perceptual property found at the edge of the object (e.g., lightness) and apply it to the entire object. In the Craik-O’Brien-Cornsweet illusion, for example, a circle appears to be lighter than its surround, even though the central region of the circle matches the luminance of the surround (Cornsweet, 1970). At the edge, there is a luminance cusp: from the center, which has the same luminance as the surround, there is a gradual increase in luminance up to the edge, and then an abrupt drop in luminance at the edge to the common luminance value. The visual system registers the difference at the edge—greater luminance inside the circle—and does not account for the gradual shift to a common luminance value, so the perceptual experience is one of a circle that is lighter overall than its surround.
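The luminance cusp described above can be sketched as a one-dimensional profile running from the interior of the circle, across its edge, into the surround. The linear ramp, the specific luminance values, and the function name are assumptions made for illustration; published demonstrations of the illusion typically use smoother gradients.

```python
def cornsweet_profile(n=21, edge=10, base=0.5, peak=0.1, ramp=5):
    """Luminance at positions 0..n-1 along a line crossing the circle's edge.

    Positions x < edge are inside the circle; x >= edge are in the surround.
    Only the last `ramp` interior positions differ from the base luminance.
    """
    profile = []
    for x in range(n):
        steps_from_edge = edge - x  # 1 for the last interior position
        if 1 <= steps_from_edge <= ramp:
            # Gradual rise toward the edge, reaching base + peak at the edge.
            lum = base + peak * (1 - (steps_from_edge - 1) / ramp)
        else:
            # Deep interior and the whole surround share one luminance.
            lum = base
        profile.append(lum)
    return profile


profile = cornsweet_profile()
print(profile[0], profile[9], profile[10], profile[-1])
```

In this sketch the center of the circle and the surround have identical luminance, and the only physical difference is the ramp and abrupt drop at the edge, which is the arrangement the illusion depends on.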

Newtson and Engquist (1976) reported that event processing shows an analogous emphasis on the temporal regions near points of change. Subjects recall slide shows made of event breakpoints (places where one event ends and the next begins—the event analogue to objects’ edges) better than slide shows made of the “central regions” of events. Attention to the event edges may reflect something about the organization of events, such that the temporal regions near points of segmentation provide efficient information about the intervening times. Observers can use the boundary properties to interpolate the missing pieces of an event, whereas event centers do not provide useful information about how the events begin and end.

If event boundaries are particularly important in understanding a scene, then one may be able to generate event illusions by analogy to object illusions. For example, abrupt, transient changes in an object may be perceived to continue throughout an event segment. If a character’s face abruptly changes from neutral to angry and then gradually returns to neutral, the event may be perceived as someone getting angry and staying angry. As in the Craik-O’Brien-Cornsweet illusion, a gradual (p.20) decrease in anger may be noted, but it may not be experienced as a return to neutrality.

A second reason for attending to the boundaries of objects is that shape and function are related in most objects (Gibson, 1979). So, is there an event analogy to shape? The obvious analogy is to space-time paths. Just as the physics of material objects relates form and function, so too the physics of moving objects relates space-time form to function. For example, Niko Troje (Troje & Westhoff, 2005) argues that humans can recognize local features of animate locomotion based on the patterns associated with the motion of the limbs. Here we may recognize the dynamics of locomotion based on the space-time shape of the limb motion. Similarly, bouncing balls have a unique space-time trajectory that may allow recognition of the dynamics involved in bouncing, independent of the details (shape, size, color, etc.) of the objects involved (Muchisky & Bingham, 2002). Models of how we encode object shapes are being developed (Singh & Hoffman, 2001); these may serve as a starting point for a model of event “shape” recognition (see also Chapters 15, 16, and 17 for discussions of event segmentation).

We must be able to recognize the actions in an event in order to use language to describe them. Learning labels for actions, such as verbs and prepositions, requires abstracting the action—that is, being able to recognize instances of the action despite changes in the actor. This suggests that event recognition should be transpositionally invariant. If novel motion paths have the same space-time shape despite changes in the location, size, or identity of the moving object, it should be possible to recognize the path similarity across the events (Shipley, 2005). Whether an approach involving recognition of space-time shapes can be applied to other aspects of events, such as the manner of motion, remains to be seen.
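One way to picture transpositional invariance is to strip location and size from two sampled paths before comparing them. The sketch below assumes paths are lists of (x, y) positions sampled at the same equally spaced times, removes the centroid and the root-mean-square extent, and then measures the remaining difference. The normalization, the distance measure, and all names are illustrative assumptions, not the method of Shipley (2005).

```python
import math


def normalize(path):
    """Remove location and size from a path sampled at equal time steps."""
    n = len(path)
    cx = sum(x for x, _ in path) / n
    cy = sum(y for _, y in path) / n
    centered = [(x - cx, y - cy) for x, y in path]
    scale = math.sqrt(sum(x * x + y * y for x, y in centered) / n)
    if scale == 0:
        raise ValueError("path has no spatial extent")
    return [(x / scale, y / scale) for x, y in centered]


def shape_distance(p, q):
    """Root-mean-square difference between two normalized paths."""
    if len(p) != len(q):
        raise ValueError("paths must have the same number of samples")
    a, b = normalize(p), normalize(q)
    total = sum((ax - bx) ** 2 + (ay - by) ** 2
                for (ax, ay), (bx, by) in zip(a, b))
    return math.sqrt(total / len(a))


# Hypothetical paths: a curved path, the same path moved and enlarged,
# and a straight path sampled at the same times.
arc = [(t, t * t) for t in range(5)]
moved_scaled = [(10 + 3 * x, -4 + 3 * y) for x, y in arc]
line = [(t, t) for t in range(5)]

print(shape_distance(arc, moved_scaled))
print(shape_distance(arc, line))
```

Under these assumptions the moved and enlarged copy is treated as essentially the same space-time shape as the original, whereas the straight path is not. The sketch deliberately ignores rotation and changes in speed profile, which a fuller account would need to address.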

4. Feature Binding

Finally, any model of the mechanisms underlying object perception must describe how information about different aspects of an object is combined; the various features of an object may come from different neural processing streams, likely from different senses. How does this feature binding occur? How are the features of an object, such as shape, texture, color, and so forth, bound together into the phenomenologically unitary object? Treisman and Gelade (1980) have argued that the visual (p.21) system employs an object file, in an analogy to a paper filing system, where all of the features of an object are kept together in a single file folder. To the extent that separate aspects of an event are processed and combined, we must consider how event-feature binding is achieved. In an event, the space-time behavior of objects must be combined. In the simple case of hitting an object, the object properties of the “hitter” object must be combined with the motion of that object, as well as with the motion and object features of the “hittee” object. Significant failures in binding would result in an observer either not experiencing or misperceiving the event. Errors within an event might lead to reversing the roles of doer and done-to. Errors across events would result in the event analogy to an illusory conjunction, where some objects or actions of one event were combined with those of another event.

Some recent attention research has considered feature binding over time. One prominent example of temporal error is the flash-lag effect (Nijhawan, 2002), in which subjects misperceive the temporal relation between punctate and ongoing events. However, most of the research on temporal errors has focused on fairly short temporal scales. How binding occurs on longer scales, particularly those relevant for coordinated action, is an open question.

One may imagine that objects, their behavior, and their spatial relations with one another are kept in a sort of event file (Hommel, 2004). The ongoing experience of events ending and beginning would then correspond to the opening and closing of such files. This is not so much a new theory as a different perspective on event-processing models put forward by Jeff Zacks and colleagues (Reynolds, Zacks, & Braver, 2004). Perhaps the analogy to work on feature binding in object perception will suggest some new directions.

In conclusion, I think it worth stepping back from our study of perceptual processing of objects so that we may consider the temporal stream in which they swim. Perhaps object perception is just a special case of the more general perceptual processes that provide information about events.

Event Taxonomy

Events, as defined here, encompass a wide range of things. The perceptual and conceptual processes that apply to one type of event may (p.22) not apply to other types in the same way. For example, cyclical events may be processed differently from discrete events. In both cases, we may wish to coordinate our behavior with the world by taking advantage of event structures to help us predict the future and enable functional behavior. However, this goal may be accomplished in different ways. The recurring structure of cyclical events may be used to anticipate their future. Recurring structure in small-scale cycles may help us coordinate walking with a friend, while at longer scales recurring structures allow coordination of sleep cycles to the coming and going of daylight, and weight gains and losses to the changing of the seasons. Coordination with discrete events, in contrast, may require categorizing an event in such a way that we may take advantage of similarities in temporal structures across events. So, adaptive behavior requires recognizing an event as a member of a particular class.

There are many ways one might divide up the domain of events beyond “cyclical” or “not cyclical.” As students of event processing, we would like to have some way to divide up the domain in a way that highlights regularities in how events are processed by humans. In this last section, I offer some thoughts on event taxonomies.

Perhaps the most obvious place to start is with temporal categories, as time is a key component of our definition of an event. Is it useful to group events by their duration? At the extremes, time scale must matter: events that evolve over long time scales, such as erosion of the Grand Canyon, are surely processed differently than events that occur within a single human lifetime. Furthermore, one might expect that relatively instantaneous events, such as hitting or breaking an object, might be processed differently than minute-, hour-, or day-long events. To the extent that events on different time scales are used to guide different functional classes of behavior (e.g., danger avoidance or action coordination at short scales and more vegetative or goal-directed behaviors at longer scales), we would expect an examination of observers’ reactions to events of varying durations to bear fruit. In addition, the neural structures responsible for tracking longer-time-scale events may be distinct from those used for shorter scales (Gibbon, Malapani, Dale, & Gallistel, 1997). How do humans reason about events at very long time scales? I suspect by analogy (e.g., erosion of the Grand Canyon is like the erosion caused by a wave at the beach), so understanding the processing of short events may provide some basis for conceptually (p.23) understanding longer time events. But is time scale the best basis for dividing up events?

As the science of events develops, the need for a functional taxonomy will become paramount. One route to take would be to attempt to construct an exhaustive list of more or less non-overlapping types of events and then see if they form any sort of hierarchy. Gibson offered an initial taxonomy, shown in Table 1.1, based in part on function, in his musings about events in The Ecological Approach to Visual Perception (1979). This scheme puts many of the traditional areas of perception (e.g., perception of lightness, color, and motion) into what Gibson viewed as their appropriate context—information that specified some class of events. Color perception, for example, allows perception of changes in plant, animal, and terrestrial surfaces. Motion perception allows us to perceive complex changes in layouts. The research offered by Johansson (1973) and Michotte (1946) may be seen as specifying, to some extent, the details of what information is important in each category.

Table 1.1. James J. Gibson’s Event Taxonomy

Changes in Layout

  1. Rigid translation and rotations of an object (displacements [e.g., falling], turns [opening a door], combinations [rolling a ball])

  2. Collisions of an object (with rebound and without)

  3. Nonrigid deformations of an object (inanimate and animate)

  4. Surface deformations (waves, flow, elastic, or plastic changes)

  5. Surface disruptions (rupturing, disintegration, explosion)

Changes in Color and Texture

  1. Plant surfaces (greening, fading, ripening, flowering)

  2. Animal surfaces (coloration of skin, changes of plumage, changes of fur)

  3. Terrestrial surfaces (weathering of rock, blackening of wood, rust)

Changes in Surface Existence

  1. Melting, dissolving, evaporating, sublimating, precipitating

  2. Disintegration, biological decay, destruction, aggregation, biological growth

  3. Construction

Adapted from Gibson, J. J. (1979). The ecological approach to visual perception (p. 99). Boston: Houghton Mifflin.

A related route to developing a taxonomy has been offered by Bingham, Schmidt, and Rosenblum (1995). Building on the work of Runeson (p.24) and Frykholm (1983), which suggests that event dynamics (the forces involved in an event) are available for perception through kinematics (the visible motions of objects), Bingham et al. (1995) argued that if event dynamics are primarily responsible for event recognition, then an event taxonomy should map onto taxonomies in the underlying physics. Bingham et al. noted the fundamental division in the physical laws describing four types of events: rigid body dynamics, hydrodynamics (the physics of fluids), aerodynamics (the physics of gases), and biodynamics. Subjects were indeed sensitive to the patterns of motions present in each type of event when shown patch-light versions of events in each category, and a cluster analysis of descriptions of the events confirmed that the dynamics were more important in determining similarity than surface features of the kinematics.

How might one evaluate Gibson’s taxonomy? Taxonomies should both organize existing knowledge about a domain and guide research in the domain. Gibson’s categories may make sense from the perspective of physics. The information specifying the dynamics in each of the categories is likely to differ. However, it is not clear that there are any constraints on the number of categories (although Bingham et al.’s use of cluster analysis is a nice way to approach this problem). Additionally, it is hard to assess how well this taxonomy classifies the events humans care about.

This raises a question: what events do humans care about? As a rough-and-ready measure of the importance of various events, one may look to the usage frequency of English verbs. Table 1.2 lists the top hundred verbs from the British National Corpus according to frequency (Leech, 1992). The most frequent verbs (to be and to have) have achieved their standing in part by virtue of the roles they play in marking tense. As with these two, the frequency of each subsequent verb presumably reflects its relative importance. These are the events people talk about and presumably attend to most. Several things are notable about this list. The first is that many of the events described by these verbs are not obviously classifiable within the categories offered by Gibson. To be fair to Gibson, the taxonomy in Table 1.1 was not intended to be complete. Nevertheless, the nature of the unclassifiable verbs is revealing; for want of a more precise characterization, they might generally be classified as “social” verbs. Gibson seems to need a fourth class of transformations: changes in cognitive state (or, if you are a behaviorist, a subclass of changes in layout with very long-term temporal implications).


Table 1.2. The 100 Most Frequent English Verbs

 1. be         26. feel       51. believe    76. fall
 2. have       27. may        52. allow      77. speak
 3. do         28. ask        53. lead       78. open
 4. say        29. show       54. stand      79. buy
 5. go         30. try        55. live       80. stop
 6. get        31. call       56. happen     81. send
 7. make       32. keep       57. carry      82. decide
 8. see        33. provide    58. talk       83. win
 9. know       34. hold       59. sit        84. understand
10. take       35. follow     60. appear     85. develop
11. think      36. turn       61. continue   86. receive
12. come       37. bring      62. let        87. return
13. give       38. begin      63. produce    88. build
14. look       39. like       64. involve    89. spend
15. use        40. write      65. require    90. describe
16. find       41. start      66. suggest    91. agree
17. want       42. run        67. consider   92. increase
18. tell       43. set        68. read       93. learn
19. put        44. help       69. change     94. reach
20. work       45. play       70. offer      95. lie
21. become     46. move       71. lose       96. walk
22. mean       47. pay        72. add        97. die
23. leave      48. hear       73. expect     98. draw
24. seem       49. meet       74. remember   99. hope
25. need       50. include    75. remain    100. create

Many of the most common verbs describe mental events. This does not mean that a perception-based approach to event understanding should be relegated to remote regions of this list where physical-action verbs appear. Perception of mental events may appear to be an oxymoron, but recall that that is essentially what Heider and Simmel’s subjects reported. For example, one may perceive that an object can “see” based on its behavior. If another organism moves out of the path of an object, we take this as evidence of distal perceptual abilities—it saw and avoided an approaching object. So, I think a relatively atheoretical way to begin working on understanding events would be to try to understand how humans process the events that are represented in the high-frequency verbs.

The disconnect between frequently spoken verbs and frequently researched verbs is striking. The most extensively studied event (hitting) (p.26) does not make the top 100 (it is #218),4 and the most extensively researched biological motion (walking) barely made it into the top 100 (it is #96).

Verb frequency presumably reflects the importance of events as well as the breadth of the event category (the broadest categories will tend to appear more frequently, as they can be used to refer to more events in the world). Many of the most frequent verbs are very general; they include a broad array of events (e.g., do and know, which are the broadest physical and mental event descriptors). Other verbs in the top 10 also cover very broad classes of human action (go, get, make, take). A few of these verbs are more specific (say, see, know), referring to particular types of activity (verbal action, perception, and cognition); their frequency presumably reflects their importance in human interactions.

Beyond frequency of usage, language offers some intriguing hints about how humans represent events, and its study might be a (rocky) road to a hierarchical, inclusive taxonomy of events. Hierarchical organization may be discovered by consideration of cross-linguistic patterns of verb structure. Talmy (2000) offers some potential categories for types (or features) of motion events that appear to be represented in many languages, such as path of motion, manner of motion, and whether or not the motion was causal. Their cross-cultural prevalence suggests they may reflect some of the basic units of events (see also Chapters 7 and 8 in this volume).

Finally, rather than attempting to construct a taxonomy that would encompass all events and offer meaningful subdivisions, one might try to build the taxonomy from the bottom up. Here one might start with taxonomies that have been constructed for local domains and see if they can be combined or broadened to encompass more general categories of events. For example, one might begin with dance notation or music theory. In both cases, detailed taxonomies exist for the “molecular” events, such as musical notes and body movements, as well as the more “molar” events of a movement or a pas de deux. With both music and dance, psychologists and students of the disciplines have come to some understanding of how (p.27) humans process these large-temporal-scale events. Perhaps we may use what we know about the development of dance and music appreciation to understand how humans learn about long, complex events.


Events, like objects, are seen to have boundaries. These boundaries may reflect the statistical structure of events as we have experienced them, either individually or as a species. Similarities in the way we conceive of events and objects may reflect underlying similarities in the function of treating objects and events as bounded. Just as we can mentally complete the form of an object partially obscured from sight (because it is not the tail of the tiger that will kill you but the rest of the tiger hidden behind the bushes), so may we anticipate the future (because it is not the crouching of the tiger that kills you but what happens beyond it). So, if you have made it this far, dear reader, you must be game; I challenge you to use the rest of this book to develop new and fruitful research programs on events.


Bibliography references:

Avrahami, J., & Kareev, Y. (1994). The emergence of events. Cognition, 53, 239–261.

Baldwin, D. A., Baird, J. A., Saylor, M. M., & Clark, M. A. (2001). Infants parse dynamic action. Child Development, 72, 708–717.

Barsalou, L. W. (1983). Ad hoc categories. Memory and Cognition, 11(3), 211–227.

Biederman, I., & Bar, M. (2000). Views on views: Response to Hayward & Tarr (2000). Vision Research, 40(28), 3901–3905.

Bingham, G. P., Schmidt, R. C., & Rosenblum, L. D. (1995). Dynamics and the orientation of kinematic forms in visual event recognition. Journal of Experimental Psychology: Human Perception and Performance, 21(6), 1473–1493.

Clark, E. V. (2001). Emergent categories in first language acquisition. In M. Bowerman & S. C. Levinson (Eds.), Language acquisition and conceptual development (pp. 379–405). Cambridge: Cambridge University Press.

Cornsweet, T. N. (1970). Visual perception. New York: Academic Press.

Dittrich, W. H., Troscianko, T., Lea, S., & Morgan, D. (1996). Perception of emotion from dynamic point-light displays represented in dance. Perception, 25, 727–738.

Fodor, J. (1983). The modularity of mind: An essay on faculty psychology. Cambridge, MA: MIT Press.

(p.28) Gibbon, J., Malapani, C., Dale, C. L., & Gallistel, C. R. (1997). Toward a neurobiology of temporal cognition: Advances and challenges. Current Opinion in Neurobiology, 7(2), 170–184.

Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.

Gibson, J. J., Kaplan, G. A., Reynolds, H. N., & Wheeler, K. (1969). The change from visible to invisible: A study of optical transitions. Perception & Psychophysics, 5, 113–116.

Hanson, C., & Hanson, S. J. (1996). Development of schemata during event parsing: Neisser’s perceptual cycle as a recurrent connectionist network. Journal of Cognitive Neuroscience, 8, 119–134.

Hayward, W. G. (2003). After the viewpoint debate: Where next in object recognition? Trends in Cognitive Sciences, 7, 425–427.

Hayward, W. G., & Tarr, M. J. (2000). Differing views on views: Comments on Biederman and Bar (1999). Vision Research, 40, 3895–3899.

Heider, F., & Simmel, M. (1944). An experimental study of apparent behavior. American Journal of Psychology, 57, 243–259.

Hespos, S. J., & Rochat, P. (1997). Dynamic representation in infancy. Cognition, 64, 153–189.

Hill, H., & Pollick, F. E. (2000). Exaggerating temporal differences enhances recognition of individuals from point light displays. Psychological Science, 11, 223–228.

Hirsh-Pasek, K., & Golinkoff, R. (in press). Action meets words: How children learn verbs. New York: Oxford University Press.

Hommel, B. (2004). Event files: Feature binding in and across perception and action. Trends in Cognitive Sciences, 8(11), 494–500.

Johansson, G. (1973). Visual perception of biological motion and a model for its analysis. Perception & Psychophysics, 14, 201–211.

Kellman, P. J., Guttman, S. E., & Wickens, T. D. (2001). Geometric and neural models of object perception. In T. F. Shipley & P. J. Kellman (Eds.), From fragments to objects: Segmentation and grouping in vision (pp. 181–246). Amsterdam: Elsevier Science.

Klinkenborg, V. (2005, Aug. 23). Grasping the depth of time as a first step in understanding evolution. New York Times.

Kozlowski, L. T., & Cutting, J. E. (1977). Recognizing the sex of a walker from a dynamic point-light display. Perception & Psychophysics, 21, 575–580.

Leech, G. (1992). 100 million words of English: The British National Corpus. Language Research, 28(1), 1–13.

Michotte, A. (1946). La perception de la causalité. Louvain, France: Institut Supérieur de Philosophie.

Michotte, A., Thinès, G., & Crabbe, G. (1964/1991). Amodal completion of perceptual structures (E. Miles & T. R. Miles, trans.). In G. Thinès, A. Costall, & G. Butterworth (Eds.), Michotte’s experimental phenomenology of perception (pp. 140–167). Hillsdale, NJ: Erlbaum.

Muchisky, M., & Bingham, G. (2002). Trajectory forms as a source of information about events. Perception & Psychophysics, 64(1), 15–31.

Neisser, U. (1976). Cognition and reality: Principles and implications of cognitive psychology. San Francisco: W. H. Freeman.

(p.29) Neisser, U., & Becklen, R. (1975). Selective looking: Attending to visually specified events. Cognitive Psychology, 7, 480–494.

Newtson, D., & Engquist, G. (1976). The perceptual organization of ongoing behavior. Journal of Experimental Social Psychology, 12, 436–450.

Nijhawan, R. (2002). Neural delays, visual motion and the flash-lag effect. Trends in Cognitive Sciences, 6(9), 387–393.

Paterson, H. M., & Pollick, F. E. (2003). Perceptual consequences when combining form and biological motion. Journal of Vision, 3(9), 786a.

Pittenger, J. B., & Shaw, R. E. (1975). Aging faces as viscal-elastic events: Implications for a theory of nonrigid shape perception. Journal of Experimental Psychology: Human Perception and Performance, 1, 374–382.

Pollick, F. E., Kay, J., Heim, K., & Stringer, R. (2005). Gender recognition from point-light walkers. Journal of Experimental Psychology: Human Perception and Performance, 31, 1247–1265.

Reynolds, J. R., Zacks, J. M., & Braver, T. S. (2004). A computational model of the role of event structure in perception. Annual Meeting of the Cognitive Neuroscience Society.

Riccio, G. E., & Stoffregen, T. A. (1991). An ecological theory of motion sickness and postural instability. Ecological Psychology, 3, 195–240.

Runeson, S., & Frykholm, G. (1981). Visual perception of lifted weight. Journal of Experimental Psychology: Human Perception and Performance, 7(4), 733–740.

Runeson, S., & Frykholm, G. (1983). Kinematic specification of dynamics as an informational basis for person-and-action perception. Journal of Experimental Psychology: General, 112(4), 585–615.

Shaw, R. E., McIntyre, M., & Mace, W. M. (1974). The role of symmetry in event perception. In R. B. MacLeod & H. L. Pick (Eds.), Perception: Essays in honor of James J. Gibson (pp. 276–310). Ithaca: Cornell University Press.

Shipley, T. F. (2005). Event path perception: Recognition of transposed spatiotemporal curves. Abstracts of the Psychonomics Society, 10, 46.

Shipley, T. F., & Kellman, P. J. (1993). Optical tearing in spatiotemporal boundary formation: When do local element motions produce boundaries, form and global motion? Spatial Vision, 7(4), 323–339.

Shipley, T. F., & Kellman, P. J. (1994). Spatiotemporal boundary formation: Boundary, form, and motion perception from transformations of surface elements. Journal of Experimental Psychology: General, 123(1), 3–20.

Shipley, T. F., & Kellman, P. J. (Eds.) (2001). From fragments to objects: Segmentation and grouping in vision. Amsterdam: Elsevier Science.

Singh, M., & Hoffman, D. D. (2001). Parts-based representations of visual shape and implications for visual cognition. In T. F. Shipley & P. J. Kellman (Eds.), From fragments to objects: Segmentation and grouping in vision (pp. 401–459). Amsterdam: Elsevier Science.

Talmy, L. (2000). Toward a cognitive semantics: Language, speech, and communication. Cambridge, MA: MIT Press.

Treisman, A. M., & Gelade, G. (1980). A feature-integration theory of attention. Cognitive Psychology, 12(1), 97–136.

Tremoulet, P., & Feldman, J. (2000). Perception of animacy from the motion of a single object. Perception, 29, 943–951.

Troje, N. (2002). The little difference: Fourier-based synthesis of gender specific biological motion. In R. P. Würtz & M. Lappe (Eds.), Dynamic perception (pp. 115–120). Berlin: AKA Press.

Troje, N. F., & Westhoff, C. (2005). Detection of direction in scrambled motion: A simple “life detector”? Journal of Vision, 5(8), 1058a.

Wilder, D. A. (1978a). Effect of predictability on units of perception and attribution. Personality and Social Psychology Bulletin, 4, 281–284.

Wilder, D. A. (1978b). Predictability of behaviors, goals, and unit of perception. Personality and Social Psychology Bulletin, 4, 604–607.

Zacks, J. M., Speer, N. K., Swallow, K. M., Braver, T. S., & Reynolds, J. R. (2007). Event perception: A mind/brain perspective. Psychological Bulletin, 133(2), 273–293.

Zacks, J. M., & Tversky, B. (2001). Event structure in perception and conception. Psychological Bulletin, 127(1), 3–21.

Zalla, T., Verlut, I., Franck, N., Puzenat, D., & Sirigu, A. (2004). Perception of dynamic action in patients with schizophrenia. Psychiatry Research, 128(1), 39–51.


(1) The attraction of some creationist arguments may lie in their shortening time scales to a more familiar and cognitively manageable range (Klinkenborg, 2005).

(2) The lack of terminology may reflect the youth of the field. Arguably, this field does not even have a name. When I try to describe the scope of this book to colleagues, I say, “It is about the psychology of events.” This gets me everything from blank looks to very narrow conceptions of the content. Perhaps the name should be taken from the field’s rightful precursor, event perception, which I understand can be traced to Gunnar Johansson’s 1950 dissertation, and, to be fair, researchers in that area have been making significant contributions for a while. The problem with that name is that it evokes either a narrow conception or an unfair dismissal as a “flaky field” due to its association with the ecological approach to perception. Finding a name for a domain is hard. Event representation also seems too narrow. A wag proposed flurb, which has the distinct advantage of being better than the more obvious eventology, but I am not sure it meets the more stringent criterion of being better than nothing. More seriously, the field may profit from a few terms of art. It would be useful to distinguish between an event as something that occurs in the environment and an event unit as the corresponding psychological representation of a part of a physical event.

(3) Our normally exquisite ability to coordinate action makes persistent failures of coordination meaningful. Such failures have been hypothesized to be a source of information for impaired sensory systems and used as evidence of poisoning, thus leading to motion sickness (Riccio & Stoffregen, 1991).

(4) I suspect the verb hitting is most commonly used in normal speech to refer to an animate actor hitting something, not the inanimate collisions studied by psychologists in event perception. The most frequent verb for inanimate collisions would be bouncing, which comes in at #1137.