The Case for Mental Imagery

Stephen M. Kosslyn, William L. Thompson, and Giorgio Ganis

Print publication date: 2006

Print ISBN-13: 9780195179088

Published to Oxford Scholarship Online: April 2010

DOI: 10.1093/acprof:oso/9780195179088.001.0001


(p.185) Appendix Neuroimaging Studies of Visual Mental Imagery

What follows is a review of the neuroimaging studies of visual mental imagery that were included in the meta-analysis summarized in chapter 4. The inclusion criteria for this review were as follows: (a) Only studies of visual mental imagery were considered, and thus studies of auditory and motor imagery were excluded;1 (b) only positron emission tomography (PET), functional magnetic resonance imaging (fMRI), or single photon emission computed tomography (SPECT) studies were included (the spatial resolution of electroencephalography is too low for present purposes; similarly, findings from studies of brain-damaged patients were not treated as primary data because the lesions typically are too large and diffuse to localize precisely their effects in the brain); (c) each study had to include at least one condition in which participants were specifically instructed to use visual mental imagery or were compelled to do so by the nature of the task; (d) studies had to include a baseline condition, whether rest, an “off condition,” or an appropriately matched control condition, and the analysis technique contrasted the imagery condition with at least one baseline; (e) studies that did not require generating an imagined pattern on the basis of stored information were excluded; this criterion led to the exclusion of studies of mental rotation in which stimuli were presented purely perceptually (for a review of such studies, see Thompson & Kosslyn, 2000); and (f) only studies that were reported in enough detail to allow further analysis were included. Thus, the following review includes only findings that appeared in full reports, not those from abstracts or short descriptions that appeared as part of a larger review. Studies were located by searching PsycINFO and MEDLINE and reviewing all appropriate journals from 1987 through the end of 2002.2

(p.186) When we coded the studies, each reported imagery condition was compared with one baseline condition. If only a resting baseline was reported, it was coded accordingly. If, in addition, one or more other baseline conditions were reported, the comparison between resting baseline and the imagery condition was not considered because Kosslyn, Thompson, Kim, et al. (1995) previously showed that this baseline leads to hypermetabolism in the early visual cortex; instead, the baseline condition judged most appropriate for isolating activation due to imagery per se was coded. If more than one imagery condition was reported in a given study, each condition was considered separately in the analysis. However, the same data (i.e., the same participants performing the same imagery condition) were not coded more than once.

The review is divided into two major parts. We first consider studies that reported activation of the early visual cortex during visual mental imagery; following this, we consider studies that did not report such activation. Early visual cortex is defined as areas 17 and 18, which were taken to include primary visual cortex, secondary visual cortex, and medial occipital cortex. In many studies, the Talairach coordinates were provided, which allowed verification of the location of the areas; when these coordinates were not provided, the authors’ classification of the location of the activation was respected. In some cases in which the authors’ classification was vague, the data were coded in accordance with figures (if they were provided), descriptions of the activated regions, or specifications of the regions of interest (typically provided in SPECT studies).

Within each of the two major parts of this appendix, we review first those studies that relied on PET (initially the most common neuroimaging method used to study imagery), then fMRI, and then SPECT. Within each of these subsections, we present the studies in chronological order. In what follows, we do not report lateralized results because the lateralized differences were typically very small in these medial brain structures (often well within the spatial resolution of the techniques); however, such information is provided in table A.1 for other reported regions of activation.
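To make the coding scheme concrete, the sketch below (ours, not part of the original coding materials) represents each coded imagery-versus-baseline comparison as a single record whose fields mirror the columns of table A.1; the field names, types, and the example values are illustrative assumptions drawn from the table itself.

```python
# Illustrative sketch only: field names and types are our own choices,
# mirroring the columns of table A.1 rather than any materials from the studies.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class CodedComparison:
    study: str                    # study and comparison, e.g., "Chen et al. (1998), hometown walking task"
    talairach: Optional[str]      # area 17/18 coordinates, when reported; None otherwise
    imaging_technique: str        # "P" (PET), "S" (SPECT), or "1M"/"3M"/"4M" (fMRI field strength)
    n: int                        # number of participants (N)
    high_resolution: bool         # HR: high-resolution details required
    nonspatial: bool              # NS: nonspatial images
    exemplar: bool                # EI: exemplar images
    nonresting_baseline: bool     # NR: baseline other than rest
    other_regions: List[str] = field(default_factory=list)  # region codes from the legend below table A.1
    early_visual_activated: bool = False  # whether area 17/18 activation was reported

# One row of table A.1, transcribed as an example:
chen_hometown = CodedComparison(
    study="Chen et al. (1998), hometown walking task",
    talairach=None, imaging_technique="4M", n=9,
    high_resolution=True, nonspatial=False, exemplar=True,
    nonresting_baseline=True, other_regions=["18", "39"],
    early_visual_activated=True,
)
```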

Studies Reporting Activation of the Early Visual Cortex

The following studies reported activation of the early visual cortex during visual mental imagery.

PET Studies

For the most part the PET studies were analyzed using parametric statistics, averaging over participants. The early visual cortex was treated as having been activated if there was statistically greater activation in this area during imagery than during a baseline condition.

(p.187)

Table A.1 Summary of Key Variables and Results of the Studies

Study/Comparison | Talairach coordinates | IT | N | HR | NS | EI | NR | Other activated regions(a)
Charlot et al. (1992) | – | S | 11 | N | N | Y | N | –
Chen et al. (1998), hometown walking task | – | 4M | 9 | Y | N | Y | Y | 18, 39
Chen et al. (1998), flashing light task | – | 4M | 12 | Y | Y | Y | Y | –
D’Esposito et al. (1997) | – | 1M | 7 | N | Y | N | Y | L-3d, L-10d, L-38b
Formisano et al. (2002) | – | 1M | 6 | N | N | N | Y | B-5, B-9, B-23, B-34a, B-37c, 46b
Ghaem et al. (1997), VIL Task | – | P | 5 | N | Y | Y | N | B-12b, L-12c, L-15d, L-22b, R-33b, L-35a
Goebel et al. (1998), contours task | – | 1M | 5 | N | Y | Y | Y | B-14b, 24, B-44d, 50(b)
Goebel et al. (1998), dots task | – | 1M | 5 | Y | Y | Y | Y | B-14b, 24, B-44d, 50(b)
Goebel et al. (1998), stripes task | – | 1M | 5 | Y | Y | Y | Y | B-14b, 24, B-44d, 50(b)
Goldenberg, Podreka, Steiner, & Willmes (1987) | – | S | 11 | N | Y | N | Y | L-21a, B-42a, L-45a
Goldenberg, Podreka, Steiner, et al. (1989) experiment 1, high-imagery sentences | – | S | 14 | Y | Y | N | Y | B-49a
Goldenberg, Podreka, Steiner, et al. (1989) experiment 2, corner counting task | – | S | 18 | Y | N | Y | Y | –
Goldenberg, Podreka, Uhl, et al. (1989), colors | – | S | 10 | N | Y | Y | N | –
Goldenberg, Podreka, Uhl, et al. (1989), faces | – | S | 10 | N | Y | Y | N | L-12a, R-15a
Goldenberg, Podreka, Uhl, et al. (1989), map | – | S | 10 | N | N | Y | N | –
Goldenberg, Podreka, Steiner, Franzen, & Deecke (1991) | – | S | 14 | Y | Y | Y | Y | L-49a
Goldenberg, Steiner, et al. (1992), high-imagery sentences | – | S | 10 | Y | Y | N | Y | –
Gulyás (2001), alphabet task | – | P | 10 | N | Y | N | N | B-17c, L-20a, L-25c, L-35c, R-42e
Gulyás (2001), anthem task | – | P | 10 | N | Y | N | N | R-8, L-13d, B-17c, L-20a
Handy et al. (2004), nouns | – | 1M | 14 | N | Y | N | N | L-10d, R-13e, L-13f, L-14h, L-20a, L-21f, L-35c, L-36c
Handy et al. (2004), pictures | – | 1M | 14 | Y | Y | Y | N | L-10d, L-13c, L-14h, L-16d, L-21f, R-21g, L-21h, L-36c, R-42f
Ishai, Ungerleider, & Haxby (2000) | – | 1M | 9 | N | Y | Y | Y | B-3f, B-7a, B-10e, B-10f, B-10g, B-13a, B-15d, B-17e, B-21d, B-25b, B-36a, B-42d, B-42h
Ishai, Haxby, & Ungerleider (2002), LTM | 12, 61, 5 | 3M | 9 | Y | Y | N | Y | R-1, L-10c, B-13a, B-17a, B-36a, B-45d(c)
Ishai, Haxby, & Ungerleider (2002), LTM + attention | 12, 61, 5 | 3M | 9 | Y | Y | N | Y | R-1, L-10c, B-13a, B-17a, L-36a, B-45d(c)
Ishai, Haxby, & Ungerleider (2002), STM | 12, 61, 5 | 3M | 9 | Y | Y | Y | Y | R-1, L-10c, B-12a, B-13a, B-17a, B-36a, B-45d(c)
Ishai, Haxby, & Ungerleider (2002), STM + attention | 12, 61, 5 | 3M | 9 | Y | Y | Y | Y | B-1, L-10c, B-13a, B-17a, B-36a, B-45d(c)
Klein, Paradis, et al. (2000), abstract event 1 | – | 3M | 8 | N | Y | N | Y | –
Klein, Paradis, et al. (2000), concrete event 1 | – | 3M | 8 | Y | Y | N | Y | –
Klein, Paradis, et al. (2000), abstract event 2 | – | 3M | 8 | N | Y | N | Y | –
Klein, Paradis, et al. (2000), concrete event 2 | – | 3M | 8 | Y | Y | N | Y | –
Knauff et al. (2000) | – | 1M | 10 | Y | N | Y | Y | L-14e, B-25d, B-44i
Kosslyn, Alpert, Thompson, Maljkovic, et al. (1993), experiment 1 | −1, −65, 12 | P | 7 | Y | N | Y | Y | M-3a, M-33a, B-37c, L-39
Kosslyn, Alpert, Thompson, Maljkovic, et al. (1993), experiment 2 | 24, −63, 8; −18, −63, 28 | P | 5 | Y | N | Y | Y | B-2a, R-4a, M-9, L-10a, R-12a, L-14a, L-15a, B-22b, M-36a, B-37c, B-44a, R-49a
Kosslyn, Alpert, Thompson, Maljkovic, et al. (1993), experiment 3 | 15, −89, 0; 8, −69, 8 | P | 16 | Y | Y | N | N | L-22a, L-35a, R-37c, L-45a
Kosslyn, Thompson, Kim, et al. (1995) | −13, −102, 8; 15, −102, −4; 6, −88, −8; 8, −79, 4; 2, −83, 0; −4, −83, −8; −2, −60, 16 | P | 12 | Y | Y | Y | Y | –
Kosslyn, Shin, et al. (1996) | 15, −74, 20; 22, −88, 8; −2, −76, −4 | P | 7 | Y | Y | Y | Y | B-16e, L-21j
Kosslyn, Pascual-Leone, et al. (1999) | −2, −88, −12 | P | 8 | Y | Y | Y | Y | L-4b
Lambert et al. (2002) | 0, −92, 4; 0, −72, −4; 0, −84, −10; −12, −84, 0; −4, −102, 10; −14, −96, −10; 12, −76, 10; 18, −82, 8; 2, −84, 18 | 1M | 6 | Y | Y | N | Y | B-7a, L-14d, L-14h, L-43c, B-44f, R-44h, R-45c
Le Bihan et al. (1993) | – | 1M | 7 | N | Y | Y | Y | –
Mazard et al. (2002), no fMRI noise | – | P | 6 | N | N | Y | Y | B-7b, M-7d, R-13h, L-13i, L-14c, L-15e, R-22c, L-35f, B-36e, B-42h, L-42i, R-44e, M-49a
Mazard et al. (2002), with fMRI noise | – | P | 6 | N | N | Y | Y | R-3b, R-3e, L-7b, RM-7d, R-13h, L-13i, L-15e, B-16b, R-20c, R-21d, R-22c, M-30, L-35a, L-35f, B-36e, B-42h, M-49a
Mellet, Tzourio, Denis, & Mazoyer (1995) | – | P | 8 | Y | N | Y | N | 7d, R-43b, 46a
Mellet, Tzourio, Crivello, et al. (1996) | – | P | 9 | N | N | Y | Y | R-13a, R-15d, R-36a, B-38c, B-43b, M-46a, R-47b
Mellet, Tzourio, Denis, & Mazoyer (1998) | – | P | 8 | N | Y | N | Y | L-10c, L-14f, B-15c, L-15d, L-35a, L-35e, L-38a
Mellet, Bricogne, et al. (2000), mental map | – | P | 6 | N | N | Y | N | R-12a, L-16b, L-17a, B-17d, B-19, R-21d, R-22d, B-35h, LM-42c, R-42h, B-45b, R-47b
Mellet, Bricogne, et al. (2000), mental navigation | – | P | 5 | N | N | Y | N | L-17d, R-21d, B-26, B-27, R-33b, B-35h, R-42c
Mellet, Tzourio-Mazoyer, et al. (2000), verbal encoding | – | P | 7 | N | N | Y | Y | B-7c, R-7d, R-13a, R-15d, L-16b, B-17a, R-21d, R-21k, RM-31, L-35e, B-41, R-42h, B-43b, L-49b
Mellet, Tzourio-Mazoyer, et al. (2000), visual encoding | – | P | 7 | N | N | Y | Y | R-3b, B-7c, L-7d, R-13a, R-15d, L-16c, B-17a, R-21k, M-31, R-42d, B-43b, L-49b
O’Craven & Kanwisher (2000), places | 9, −48, 6; −21, −60, 18 | 1M | 8 | N | Y | Y | Y | –
Roland, Eriksson, et al. (1987) | – | P | 10 | Y | N | Y | N | B-6a, B-6b, L-7a, B-9, R-13b, B-13g, B-15f, L-16a, L-17b, B-17f, R-21b, R-21l, L-21m, B-37b, B-37d, R-37e, B-37f, B-37g, R-40, B-43a, B-44b, B-44c, R-45a, B-47a, B-49a
Roland & Gulyás (1995) | – | P | 11 | N | Y | Y | N | L-2c, R-3c, B-13h, L-15b, L-16b, B-21d, R-36b, R-42b, R-42h, L-42j, R-43a, R-44j, R-45e, L-48
Sabbah et al. (1995) | – | 1M | 10 | N | Y | Y | Y | –
Sack et al. (2002) | – | 1M | 6 | N | N | N | Y | B-11a, L-14i, M-20b, B-21i, L-32, R-35b, L-35d, L-35g, BM-36d, L-43d, B-44h
Shin et al. (1999) | −8, −68, −12; −16, −100, 0 | P | 8 | Y | Y | Y | Y | L-7a, R-10c, R-14f, R-21e, L-35a, B-44g
Suchan et al. (2002) | – | P | 10 | N | N | Y | Y | M-8, L-14h, M-42f
Thompson et al. (2001) | −2, −90, −12 | P | 8 | Y | Y | Y | Y | L-2b, L-7a
Trojano et al. (2000), experiment 1 | – | 1M | 7 | N | N | N | Y | L-14g, B-29, B-34b
Trojano et al. (2000), experiment 2 | – | 1M | 4 | N | N | N | Y | B-11b, L-14g, B-29, B-34b, L-35b, R-37a, R-42g
Wheeler et al. (2000) | – | 1M | 18 | N | Y | Y | Y | L-10b, B-21c, B-25a, L-28a, R-28b, B-36d

(a) Other activated brain regions are reported when they were reported in the original study. Each number corresponds to a brain region reported in at least one study. We were faithful to the terminology the authors used to report the regions, which are listed alphabetically with their corresponding numbers. Brain areas are organized hierarchically, with a broad category defined and then subregions within that category listed under the general heading. L, R, or M before the region number indicates the laterality of the reported region (left, right, or midline, respectively). B indicates bilateral activation. If a number is presented without an accompanying letter, the laterality of the brain region was not reported. Regions more than five millimeters from the midline were considered to be lateralized. Reported brain regions and corresponding numbers are shown below.

(b) The comparison of all three motion imagery conditions against fixation also revealed the following regions to be activated: dorsolateral prefrontal cortex (area 9/46), precentral sulcus/superior frontal gyrus (frontal eye field, area 6), anterior cingulate gyrus, and insular gyrus.

(c) Estimates of significance of other reported regions were based on effect size and error bars depicted in figures 3, 4, 5, and 6 of Ishai, Haxby, & Ungerleider (2002).

Notes. A dash indicates that data were not reported. The Talairach coordinates specify only the location in area 17 or 18 when such activation occurred and coordinates were reported. IT = imaging technique; HR = high-resolution details; NS = nonspatial images; EI = exemplar images; NR = nonresting baseline; S = SPECT (single photon emission computed tomography); N = no; Y = yes; 4M = 4-Tesla (T) fMRI (functional magnetic resonance imaging); 1M = 1.5T fMRI; VIL = visual imagery of landmarks; P = PET (positron emission tomography); LTM = long-term memory; 3M = 3T fMRI; STM = short-term memory.

Reported brain regions and corresponding numbers:

1. Amygdala

2a. Angular gyrus

2b. Angular gyrus (occipito-temporo-parietal junction; areas 19/39/7)

2c. Anterior angular gyrus

3a. Anterior cingulate

3b. Anterior cingulate cortex

3c. Anterior cingulate gyrus

3d. Anterior cingulate gyrus (area 24)

3e. Anterior/median cingulate cortex

3f. Caudal anterior cingulate

4a. Area 19

4b. Area 19/18

5. Auditory cortex

6a. Caudate/putamen

6b. Head of caudate

7a. Cerebellum

7b. Cerebellar cortex

7c. Cerebellar hemisphere

7d. Cerebellar vermis

8. Cingulate gyrus (area 24)

9. Frontal eye field

10a. Fusiform

10b. Fusiform (area 19)

10c. Fusiform gyrus

10d. Fusiform gyrus (area 37)

10e. Lateral fusiform gyrus

10f. Medial fusiform gyrus

10g. Posterior fusiform gyrus

11a. Heschl’s gyrus (area 41)

11b. Heschl’s gyrus (area 41/42)

12a. Hippocampus

12b. Middle hippocampal regions

12c. Posterior hippocampal regions

13a. Inferior frontal gyrus

13b. Anterior inferior frontal

13c. Inferior frontal gyrus (area 11)

13d. Inferior frontal gyrus (area 44)

13e. Inferior frontal gyrus (area 45)

13f. Inferior frontal gyrus (area 47)

13g. Inferior frontal pole

13h. Inferior frontal sulcus

13i. Inferior frontal sulcus/precentral sulcus

14a. Inferior parietal

14b. Inferior parietal cortex

14c. Inferior parietal gyrus

14d. Inferior parietal gyrus (area 40)

14e. Inferior parietal lobe (area 40)

14f. Inferior parietal lobule

14g. Inferior parietal lobule (area 39/40)

14h. Inferior parietal lobule (area 40)

14i. Inferior parietal lobule (area 7)

15a. Inferior temporal

15b. Inferior posterior temporal gyrus

15c. Inferior temporal/fusiform gyrus

15d. Inferior temporal gyrus

15e. Inferior temporal gyrus (posterior part)

15f. Posterior inferior temporal

16a. Insula

16b. Anterior insula

16c. Anterior insula/inferior frontal gyrus

16d. Insula (area 13)

16e. Insular cortex

17a. Intraparietal sulcus

17b. Anterior intraparietal

17c. Intraparietal sulcus, banks (area 40)

17d. Intraparietal sulcus/precuneus

17e. Intraparietal sulcus/superior parietal

17f. Posterior intraparietal

18. Lateral geniculate nucleus

19. Lenticular nucleus

20a. Medial frontal gyrus/medial frontal gyrus (area 6)

20b. Medial frontal gyrus (supplementary motor area; area 6)

20c. Median frontal gyrus

21a. Middle frontal

21b. Anterior midfrontal

21c. Middle frontal (area 6)

21d. Middle frontal gyrus

21e. Middle frontal gyrus (area 10)

21f. Middle frontal gyrus (area 46)

21g. Middle frontal gyrus (area 46/9)

21h. Middle frontal gyrus (area 47)

21i. Middle frontal gyrus (area 9)

21j. Middle frontal gyrus (area 9/8)

21k. Middle frontal sulcus

21l. Middle midfrontal

21m. Posterior midfrontal

22a. Middle temporal

22b. Middle temporal gyrus

22c. Middle temporal/middle occipital gyrus

22d. Middle temporal sulcus

23. Motor cortex

24. Middle temporal/medial superior temporal visual motion area

25a. Occipital gyrus (area 19)/middle occipital (area 19)

25b. Dorsal occipital

25c. Lateral occipital gyrus (area 19)

25d. Medial occipital gyrus/inferior temporal gyrus (area 19)

26. Occipitoparietal sulcus

27. Parahippocampal gyrus

28a. Parietal (area 7)

28b. Parietal (area 7/40)

29. Perisylvian cortex (area 45/insula)

30. Pons

31. Pontomesencephalic tegmentum

32. Postcentral gyrus (area 2)

33a. Posterior cingulate

33b. Posterior cingulate gyrus

34a. Posterior parietal cortex

34b. Posterior parietal cortex (area 7)

35a. Precentral gyrus

35b. Precentral gyrus (area 4)

35c. Precentral gyrus (area 6)

35d. Precentral gyrus (frontal eye field, area 6)

35e. Precentral/middle frontal sulcus

35f. Precentral sulcus

35g. Precentral sulcus (area 4)

35h. Precentral/superior frontal sulcus

36a. Precuneus

36b. Posterior precuneus

36c. Precuneus (area 19)

36d. Precuneus (area 7)

36e. Precuneus/parietooccipital sulcus

37a. Prefrontal cortex

37b. Anterior intermedial prefrontal

37c. Dorsolateral prefrontal

37d. Posterior intermedial prefrontal

37e. Superior anterior prefrontal

37f. Superior middle prefrontal

37g. Superior posterior prefrontal

38a. Premotor

38b. Premotor area (area 6)

38c. Premotor cortex

39. Pulvinar

40. Putamen/pallidum

41. Rectal gyrus

42a. Superior frontal

42b. Lateral superior frontal gyrus

42c. Median superior frontal gyrus

42d. Superior frontal gyrus

42e. Superior frontal gyrus (area 10)

42f. Superior frontal gyrus (area 6)

42g. Superior frontal gyrus (area 6/8)

42h. Superior frontal sulcus

42i. Superior frontal sulcus (anterior part)

42j. Superior medial frontal gyrus

43a. Superior occipital

43b. Superior occipital gyrus

43c. Superior occipital gyrus (area 19)

43d. Superior occipital gyrus/superior parietal lobule (areas 19/7)

44a. Superior parietal

44b. Posterior lateral superior parietal

44c. Posterior medial superior parietal

44d. Superior parietal cortex

44e. Superior parietal gyrus

44f. Superior parietal gyrus (area 19)

44g. Superior parietal lobule

44h. Superior parietal lobule (area 7)

44i. Superior parietal lobule/precuneus (area 7)

44j. Superior posterior parietal lobule

45a. Superior temporal

45b. Superior temporal gyrus

45c. Superior temporal gyrus (area 39)

45d. Superior temporal sulcus

45e. Superior temporal sulcus/posterior angular gyrus

46a. SMA

46b. Anterior SMA

47a. Supramarginal

47b. Supramarginal gyrus

48. Temporal pole

49a. Thalamus

49b. Medial thalamus

50. V3


(p.195) Kosslyn, Alpert, Thompson, Maljkovic, Weise, Chabris, et al. (1993, experiment 1). This article reported three experiments, each of which was analyzed separately in this review. In experiment 1, two groups of seven participants were tested. One group completed an imagery condition and an analogous perception condition; the other group completed a sensorimotor control condition and the perception condition. The participants in the imagery condition visualized previously learned block letters as they had appeared in grids, and decided whether an X would fall on the letter if it were in fact present in the grid. The stimuli subtended 2.6 degrees horizontally by 3.4 degrees vertically, and hence relatively high resolution was required to resolve the X probe (which was one-quarter as wide and one-fifth as high as the overall stimulus). In the perception condition, the participants decided whether an X fell on a letter that was present in a grid. In the sensorimotor control condition (which also controlled for the effects of attention per se), participants simply alternated responses, making a response as soon as an X mark was removed from the grid. When activation during imagery was compared with that during perception, area 17 was found to be activated as well as a portion of the cuneus that is part of area 18; no such finding occurred in the sensorimotor control condition. The authors noted that the portion of area 17 that was activated was unexpectedly anterior; use of the perception baseline may have removed some of the activation in area 17. Patterns of response times validated that imagery was used.

Kosslyn et al. (1993, experiment 2). In experiment 2, the task was the same as in experiment 1, but the perceptual stimuli and X probe were visually degraded. Moreover, all stimuli were presented for only 200 milliseconds. The baseline condition required the five participants to view and respond to (alternating responses) the appearance of grid stimuli from the imagery condition; participants performed this baseline before learning the letters and without instructions to image. Distinctive patterns of error rates validated the use of imagery. For the imagery versus baseline comparison, area 17 and area 18 were activated.

Kosslyn et al. (1993, experiment 3). In experiment 3, sixteen participants were asked to visualize the appearance of all twenty-six letters of the alphabet in a standard block font. In one condition, the participants were to visualize the letters as small as possible while still being able to distinguish their parts. In the other condition, they were to visualize the letters as large as possible, without overflowing the imaginal visual field. Participants were told to visualize and retain the image until they heard a cue, which they received four seconds after hearing the name of the letter. The cues required the participants to judge the shape of the letters (e.g., whether any curved lines were present, whether a vertical line was on the left, or whether an enclosed space was present), and response times and errors were recorded. As found in previous imagery studies (e.g., Kosslyn, 1975), the participants required more time to judge letters imaged at a small size; indeed, when data from three participants who did not show this expected response time pattern were removed, a stronger pattern was evident in the PET blood flow data. Each condition (large and small imagery) served as a control for the other. When activation in the small-image condition was subtracted from that in the large-image (p.196) condition, there was evidence of activation at a very posterior location in area 17; the reverse subtraction produced evidence of activation at a relatively anterior location in area 17. Thus, analogous to what has been found in perception (e.g., Fox et al., 1986), visualizing objects at larger sizes activated more anterior regions of early cortex.

Kosslyn, Thompson, Kim, and Alpert (1995). Twelve participants memorized the appearance of line drawings of objects and later were asked to visualize them at three different sizes, very small (subtending 0.25 degrees of visual angle), medium (4 degrees), or large (16 degrees). Each size was visualized during a separate condition, and both the order of conditions and assignment of stimuli to conditions were counterbalanced over participants. During neuroimaging, participants closed their eyes and heard the names of objects, one at a time. They were asked to form an image of the corresponding object at the appropriate size, and hold the image for four seconds after the cue was read. At this point they heard the name of a cue, which required the participants to evaluate the shapes of the particular drawings they had memorized (e.g., to determine whether the left side of the object in the drawing was higher than the right side). Two baselines were included: listening and resting. In the listening baseline, participants listened to the names of objects and cue words and alternated responses to the cue words. This baseline was administered before the participants knew the meaning of the cues; these stimuli had the same form as those in the imagery conditions. In the resting baseline, participants were simply told to rest and “have it black in front of your mind’s eye” (p. 496). Response times and accuracy rates were recorded in the imagery conditions. As was found in experiment 3 of Kosslyn, Alpert, Thompson, Maljkovic, et al. (1993), larger mental images evoked activation at increasingly anterior locations along the calcarine sulcus (which defines area 17). In sharp contrast, no effects of imagery were observed when the resting baseline was used. Indeed, when regional cerebral blood flow (rCBF) in the two baselines was directly compared, it was clear that there was significantly more blood flow in area 17 in the resting baseline than in the listening baseline—and using the data from the resting baseline as the comparison cancelled out activation due to imagery. Note that the size-specific results eliminate the possibility that the imagery results are simply an artifact of hypometabolism in the listening baseline.

Kosslyn, Shin, Thompson, McNally, Rauch, Pitman, and Alpert (1996). After studying a set of neutral or aversive photographs, seven participants viewed or visualized the pictures (with each type of stimulus presented in a separate block of trials). After each stimulus was presented, the participants heard a statement (delivered by the computer) and determined whether it correctly described the stimulus. The statements described subtle visual or spatial aspects of the stimulus, of the sort that previous research has demonstrated typically are recalled through imagery (e.g., Kosslyn & Jolicoeur, 1980). Neutral pictures were used as the baseline for the aversive pictures. Participants had their eyes closed while they visualized the pictures. Response times and error rates were recorded. Visualizing aversive stimuli, relative to visualizing neutral stimuli, enhanced rCBF in areas 17 and 18.

(p.197) Kosslyn, Pascual-Leone, Felician, Camposano, Keenan, Thompson, et al. (1999). Eight participants memorized the appearance of a pattern containing four quadrants, each of which was labeled by a number. Each quadrant contained a set of stripes, and the stripes varied in their length, width, spacing, and tilt. In the imagery condition, participants heard the numbers naming two of the quadrants followed by a cue word; the cue word directed them to compare the two sets of stripes along one of the four dimensions in which they differed. If the stripes in the quadrant named first had more of the property than those in the quadrant named second, the participants pressed one button; if the stripes in the quadrant named second had more of the property, they pressed another button. The participants maintained the image for at least three seconds. The baseline consisted of listening to similar words while alternating responses from side to side (and not visualizing). Areas 17 and 18/19 were activated. An important feature of this study was the demonstration that repetitive transcranial magnetic stimulation (rTMS) delivered to the medial occipital lobe prior to the task (performed outside the scanner) disrupted subsequent performance, which is evidence that activation in early visual cortex played a functional role in this task.

Shin, McNally, Kosslyn, Thompson, Rauch, Alpert, et al. (1999). In one condition, eight participants recalled a traumatic event from their past and imagined it vividly for the entire duration of the scan; in the other condition, they recalled a neutral event from their past and imagined it for the entire scan. The order of conditions was counterbalanced. Activation was compared between the two conditions. Areas 17 and 18 were activated during imagery for the neutral event compared with imagery of the traumatic event. Imagery vividness ratings were obtained and participants tended to have the highest ratings of visual imagery in the neutral condition, which is consistent with the brain activation results. In contrast, the participants reported that their imagery during the traumatic condition was “most prominent” in the tactile modality.

Thompson, Kosslyn, Sukel, and Alpert (2001). Thompson et al. designed a study to examine the effects of using high resolution to visualize a shape (not a spatial relationship such as relative heights, as examined by Mellet, Tzourio-Mazoyer, et al., 2000). Eight participants visualized sets of four stripes, just as did the participants in the study reported by Kosslyn, Pascual-Leone, et al. (1999). As in the earlier study, the participants were asked to compare the stripes in two named quadrants according to different dimensions, such as the spacing between the stripes or their length (both of which required focusing on only part of the overall pattern). In this study, different amounts of resolution were required to distinguish the sets of stripes; one set was composed of thin (high-spatial-frequency) stripes, whereas another was composed of relatively thick (low-spatial-frequency) stripes. The baseline consisted of listening to words that were similar to those used as cues in the experimental conditions and alternating yes and no responses. As expected, area 17 was activated in the imagery task. However, there was no difference in activation between the two types of striped patterns. The response times and error rates indicated that—although the stimuli differed in the resolution (p.198) required to resolve the stripes—the discriminations needed for the two sets of stripes required comparable resolution.

fMRI Studies

In most of the fMRI studies, the data from individual participants were analyzed separately. In the analysis, area 17 or 18 was treated as activated if at least half of the participants showed such activation. It was not possible to compute the precise probability that the voxels in early visual cortex would be activated due to chance; this probability depends on many parameters that affect analyses, such as the total number of voxels, spatial normalization procedures, motion-correction algorithms, and corrections for multiple comparisons. However, by any measure the probability that chance alone could account for activation in this region in at least half the participants is very small, and thus it makes sense to try to discern the factors that led to such activation. The issue was whether the task could produce activation; viewed from the other perspective, if at least half the participants did have activation in early visual cortex, it would not be reasonable to classify that study as showing no activation in this region.
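To illustrate why that probability is small, the following back-of-the-envelope calculation (ours; it assumes independence across participants and a nominal per-participant false-positive rate, neither of which the original analyses could pin down precisely) computes the binomial probability that at least half of the participants would show spurious activation in the region.

```python
from math import comb

def prob_at_least_half_by_chance(n_participants: int, p_false_positive: float) -> float:
    """Probability that at least half of n independent participants show a
    spurious activation, if each does so with probability p_false_positive."""
    k = (n_participants + 1) // 2  # smallest integer count that is at least half
    return sum(
        comb(n_participants, j)
        * p_false_positive ** j
        * (1 - p_false_positive) ** (n_participants - j)
        for j in range(k, n_participants + 1)
    )

# For example, with 8 participants and an assumed 5% per-participant false-positive rate:
print(prob_at_least_half_by_chance(8, 0.05))  # roughly 0.0004
```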

Le Bihan, Turner, Zeffiro, Cuénod, Jezzard, and Bonnerot (1993; 1.5T fMRI). Seven participants were asked to visualize red flashing lights that they had seen previously in a perceptual condition. The lights were diodes geometrically arranged as two square patterns, flashing at a rate of 16 Hz. The lights were visible for twenty-four seconds, followed by an equal period of darkness. During some “off” blocks, the participants were asked to recall the lights as they had seen them. During the baseline condition, the participants completed an off block but were not asked to form images. No behavior was measured. All participants had bilateral activation in area 17 in the perceptual condition; five of the seven participants also had area 17 activation in the imagery condition (this activation also extended to area 18).

Sabbah, Simond, Levrier, Habib, Trabaud, Murayama, Mazoyer, Briant, Raybaud, and Salamon (1995; 1.5T fMRI). Ten participants were asked to view or visualize a flashing white light. The light flashed at a rate of 8 Hz for twenty-eight seconds, followed by an equal period of darkness. During some of these periods of darkness, the participants were asked to visualize the light. During other periods of darkness, the participants were not asked to form images. No behavior was recorded. Area 17 was activated during imagery.

Goebel, Khorram-Sefat, Muckli, Hacker, and Singer (1998; 1.5T fMRI). Three conditions were administered. In one, the five participants were asked to visualize a rotating striped wheel. The diameter subtended 16.7 degrees of visual angle, and the wheel rotated at a rate of 170 degrees per second. The wheel was covered with thirty alternating black and white stripes, each subtending about 0.5 degrees of visual angle. In another condition, the participants were asked to visualize a field of rotating dots. In a third condition, the participants visualized shifting square contours that were created by opening notches in the sides of four filled circles (thus generating Pac-Man–type stimuli). To create the illusion of appearing and (p.199) disappearing squares, the participants were to visualize the notches in the four circles as if they were opened or closed. Moreover, the stimulus alternated from right to left of the fixation point. In each condition, the images were formed for twenty-four seconds, and the baseline condition was fixation on a cross. No behavior was measured. In all three conditions, area 18 was activated bilaterally, but area 17 was not activated.

Chen, Kato, Zhu, Ogawa, Tank, and Ugurbil (1998; 4T fMRI). There were two tasks: First, nine participants were asked to imagine walking through their hometowns; they were instructed to focus on objects that they would see along the route, not on the walking component. The task was similar to that of Roland, Eriksson, Stone-Elander, & Widen (1987), except that no starting or ending point was given. The baseline was an “off” condition. All participants had robust activation in area 17. Indeed, eight participants also showed activation in the lateral geniculate nucleus (a subcortical structure that receives input directly from the eyes). Second, twelve participants were asked to visualize a flashing pattern of lights (that they previously had seen). Seven of these participants had activation in area 17.

Klein, Paradis, Poline, Kosslyn, and Le Bihan (2000; 3T fMRI). On each trial, participants heard the name of an animal, which cued them to visualize that animal; fourteen seconds later they heard the name of a characteristic, which could either be concrete (e.g., “has pointy ears”) or abstract (e.g., “is affectionate”). When they heard the characteristic, participants decided, as quickly and accurately as possible, whether the named animal has the characteristic. Concrete and abstract characteristics were presented in separate blocks of trials, and participants were told in advance what sorts of characteristics would be queried. This was an event-related fMRI study, with the blood oxygen level–dependent response being monitored relative to the initial cue to form the image and, separately, relative to presentation of the name of a characteristic. For all eight participants, area 17 was activated whenever an image was formed, regardless of whether it was in anticipation of a concrete or an abstract characteristic. Moreover, area 17 was activated when both sorts of characteristics were evaluated. Most participants (seven of eight) had greater activation when they were evaluating a characteristic than when they initially were generating the image.

Handy, Miller, Schott, Shroff, Janata, Van Horn, et al. (2004; 1.5T fMRI). In a separate perceptual condition, the fourteen participants heard the name of each of a set of common objects and studied the corresponding pictures. They were warned that they would soon be asked to recall these objects. Participants took part in two imagery conditions. In the pictures imagery condition, the participants were asked to visualize the pictures that they studied during the perceptual condition; in contrast, in the nouns condition, they were asked to visualize a general version of the object, on the basis of their semantic knowledge, instead of visualizing a specific instance. During neuroimaging, the names of objects were presented every 3.5 seconds; participants passively listened to abstract words during the baseline condition. Although a group analysis did not reveal activation in early visual cortex in either imagery condition, an analysis of the data from individual participants revealed that in the pictures condition, nine of the fourteen participants had (p.200) activation in the early visual cortex, whereas in the nouns condition ten of the participants had such activation. These results underscore the importance of examining data from individual participants, and they illustrate how a group analysis may mask significant activation, in part because of small differences in the location of activation from one participant to another.

O’Craven and Kanwisher (2000; 1.5T fMRI). Only experiment 1 is discussed here (the other two experiments did not examine activation in the early visual cortex). In the perception condition, eight participants viewed faces of famous people or buildings. In the imagery condition, the participants were asked to form vivid, detailed visual mental images of both faces and places. Participants were given two seconds between stimuli, and an “off” condition (lasting twelve seconds) was interleaved with the experimental conditions. The authors reported activation in anterior calcarine cortex (part of area 17) and area 18 when the participants visualized places compared with activation when they visualized faces (the activation overlapped in imagery and perception). The authors suggested that this activation may occur in peripheral retinotopic cortex when it processes larger images (because the place scenes were larger than the faces in this study). The comparisons between the face and place imagery conditions and the off baseline were not reported.

Ishai, Haxby, and Ungerleider (2002; 3T fMRI). In this study, nine participants either perceived or visualized famous faces. The study was conducted with a block design. In the perception condition, famous faces were presented at a rate of one every four seconds. The perception control consisted of viewing images of scrambled faces that were presented at this same rate. The study was designed to examine the differences in imagery processing when images are retrieved from short-term memory (STM) versus when they are retrieved from long-term memory (LTM). Participants were instructed to visualize vivid images of famous faces in all of the following four imagery conditions: (1) imagery from STM, in which participants were asked to visualize specific pictures of famous faces that they had seen and memorized a short time before; (2) imagery from LTM, in which participants were asked to visualize famous faces without having seen them previously during the experiment; (3) imagery from STM plus attention, which was the same as in the STM condition except that the participants were also asked to focus on a facial feature and answer a question about the face (e.g., “small nose?”); and (4) imagery from LTM plus attention, which was the same as the LTM condition except that the participants were asked to focus on and answer a question about a facial feature. In the imagery control condition, the participants passively viewed letter strings at the same rate that the images were to be visualized during the imagery conditions (0.5 seconds to see the stimulus, followed by 3.5 seconds during which the screen was black). In the imagery plus attention conditions, behavioral data (response times and error rates) were collected. Participants achieved 96 percent accuracy, which indicates that they were in fact performing the tasks. Imagery of faces generated from STM generated more activation than imagery from LTM. Compared with the imagery control condition, activation of area 17 during imagery was found in all four imagery conditions. This activation was generally bilateral. However, there was (p.201) no additional activation in this area when participants were asked to answer a question about a particular facial feature (which suggests that imagery-based activation cannot be ascribed solely to the effects of attention).

Lambert, Sampaio, Scheiber, and Mauss (2002; 1.5 T fMRI). Six participants listened to names of familiar animals and were asked to visualize each one in color, in its environment, and in a dynamic situation. In the baseline condition, the participants simply listened to abstract words. The participants were given two seconds between cue words, and the entire set of stimuli was presented twice. Four of the six participants reported during debriefing that they could not form elaborate mental images within the allotted time, but that their images were more detailed during the second set of trials. After testing, the participants were asked to recall the stimulus words, and they did so far better than chance (which is indirect evidence that the participants did in fact perform the task). Activation of calcarine cortex was found during mental imagery for five out of six participants.

Finally, case studies that examined a single participant were not included in this analysis, simply because it seemed unreasonable to allow them to have the same weight as group studies. Nevertheless, it is worth noting two published case studies briefly. Pütz et al. (1996; 1.5T fMRI) tested an expert user of a soroban (Japanese abacus) while she visualized the device to make calculations. The researchers found bilateral activation of area 17 near the region that registers the center of the visual field (where the participant was asked to visualize the soroban). In addition, Tootell, Hadjikani, et al. (1998; 3T fMRI) essentially conducted a variant of experiment 3 of Kosslyn, Alpert, Thompson, Maljkovic, et al. (1993), asking a participant to visualize letters at the smallest “visible” size or visualize a field of letters (sparing a small central region). A cortical unfolding algorithm was used to show definitively that areas 17 and 18 were activated during both conditions, although—as reported by Kosslyn, Alpert, Thompson, Maljkovic, et al., (1993) and Kosslyn, Thompson, et al. (1995)—there was greater activation in the foveal region when images were formed at a small size than when they covered the field.

SPECT Studies

SPECT has lower spatial and temporal resolution than either PET or fMRI, but Goldenberg and colleagues have reported that its spatial resolution is 12 millimeters full width at half maximum in the axial plane. Although this resolution may not be sufficient to distinguish activation in area 17 from activation in area 18, it should be sufficient to document activation in one of the two areas as distinct from that in other nearby regions.

Goldenberg, Podreka, Steiner, Willmes, Suess, and Deecke (1989, experiment 1). In experiment 1, twenty-eight participants were divided into two equal-sized groups. One group performed two tasks. In one task, they verified statements that required imagery to evaluate, such as “Elephants have little eyes with many wrinkles around them,” flashing a light each time they disagreed. In the other task, these participants evaluated statements requiring motor imagery, such as “One can touch the left ear (p.202) with the right index and the nose with the right thumb at the same time.” The other group also performed two tasks. In one, they verified statements that did not require imagery to evaluate, and in the other they simply pressed a button when they heard “no.” The results from the different conditions were compared with each other. No behavior was reported. Inferior area 17 or 18 was activated during the visual imagery condition (compared with activation engendered when participants evaluated low-imagery sentences, the baseline of interest here).

Goldenberg, Steiner, Podreka, and Deecke (1992). Ten participants were asked to evaluate the truth of twenty-five statements about shape that required imagery to verify, such as “The ears of a bear are pointed.” Ten additional participants evaluated twenty-five sentences about color that required imagery to verify, such as “The interior of a watermelon is violet,” and both groups also evaluated twenty-five sentences that did not require imagery, such as “Leap years have 366 days.” Statements were presented auditorily every fifteen seconds. For a baseline, rCBF for high-imagery statements was compared with that for low-imagery statements. Participants had to evaluate the statements, and comparable numbers of errors were committed in the two imagery conditions. The SPECT results were comparable in the shape and color imagery tasks: area 17 or 18 had higher rCBF in the high-imagery conditions than in the low-imagery condition.

Studies Not Reporting Activation of the Early Visual Cortex

No evidence of area 17 or 18 activation during visual mental imagery was reported in the following studies, again organized by technique and chronologically within each technique. Note that various other areas were activated in these studies, as summarized in table A.1.3

PET Studies

The following studies were analyzed using parametric statistics. Early visual cortex was assumed not to have been activated if the comparison between the imagery condition and the baseline condition did not reach statistical significance.

Roland, Eriksson, Stone-Elander, and Widen (1987). Ten participants were asked to close their eyes and imagine walking through their hometowns. They were instructed to imagine going out the door and taking the first left turn, then alternating right and left turns while paying attention to their surroundings. The participants were to imagine their surroundings vividly and in full color, and they were not to pay attention to their own movements. Imagery was monitored continuously for 180–200 seconds. The baseline was rest; participants were instructed to avoid thinking about anything in particular and especially to avoid mental images. All participants reported that during rest, it had been “dark in their mind’s eye” (p. 2376). After participants finished the task, they indicated the location where they had arrived, (p.203) and this location was looked up on a map. The authors did not report whether the participants made the appropriate left and right alternations, but did claim that the participants were never lost and always able to recall images of their surroundings. There was no activation of area 17 or 18, but there was activation of bilateral superior occipital cortex (probably precuneus or area 19) as well as other areas.

(p.204) Mellet, Tzourio, Denis, and Mazoyer (1995). Eight participants were selected for good imagery abilities, as determined by spatial abilities tests. These participants were asked to scan continuously over an image of a previously memorized map of a fictional island, shifting their focus from one object to another in a clockwise, then counterclockwise, direction around the island. The task was performed in total darkness. The baseline was rest. After the study, they were to point to landmarks on a visualized map to ensure that they knew where each of the landmarks had been. Although neither area 17 nor area 18 was activated, there was increased activation in the superior occipital lobe in the imagery versus rest comparison (as well as in other areas) and a moderate but insignificant increase in activation in the precuneus during imagery. The participants also performed a perceptual task in which they explored the same map visually, as seen through a mirror.

Roland and Gulyás (1995). Eleven participants first memorized the appearance of a series of ten colored, geometric patterns. Following this, they were scanned as they recalled the patterns in sequence, visualizing each until it began to fade and then moving on to the next pattern. The participants were scanned while performing this task after studying the patterns twice, and were scanned again after studying them twenty to twenty-four times (the number of exposures varied for different participants). Participants were told not to use verbal strategies to memorize the patterns. Judging from the example provided in the article, the figures included complex patterns and small dots and subtended 33 degrees (the entire visual field of participants in the scanner). In the recognition phase, the memorized patterns were randomly intermixed with new patterns. The patterns were exposed for 100 milliseconds, followed by 900 milliseconds of darkness, during which the participants indicated recognizing a pattern by slight extension of the right thumb. All participants were able to learn and recall the patterns satisfactorily. The baseline was rest; participants were blindfolded and asked to “have it black in front of your mind’s eye” (p. 80). PET results from each participant were examined to determine whether they were truly visualizing; the brain patterns were similar across participants, leading the authors to infer that the participants were engaged in the same process (whatever it may have been). The recall (imagery) condition was compared with the rest condition and also with the learning and recognition conditions. The left inferior posterior temporal lobe was more strongly activated during recall (imagery) than during rest. No occipital regions were activated more during recall (imagery) than during learning or recognition.

Roland, Gulyás, Seitz, Bohm, and Stone-Elander (1990) published a preliminary report of these data, using a different statistical analysis. For present purposes, the results from that report did not differ from those reported here, and in any case we cannot justify treating an earlier analysis of the same data as if it were a separate study.

Mellet, Tzourio, Crivello, Joliot, Denis, and Mazoyer (1996). Nine participants constructed images of three-dimensional objects. Each figure was composed of twelve cubes, and participants mentally built up the figures as they heard eleven directional words. These words were presented at the rate of one every two seconds; twenty-two seconds were required to complete the description of an object, and another five seconds were provided for the participants to visualize the completed object. Each participant mentally constructed four objects in the imagery condition. In one of the baseline conditions, the participants listened to four lists of abstract words, presented at the rate of one word every two seconds; these words were phonetically matched to the direction words used in the test condition. There was also a resting baseline. After scanning, the participants were presented with the four objects they had just mentally built and were asked to identify the correct order of construction. They chose the order in which all the cubes were out of sequence in only 5.5 percent of cases (this was taken as evidence that the participants did pay attention to the task); the authors did not report the percentage of time that the participants were actually correct. There was no area 17 or 18 activation; however, the imagery versus rest comparison revealed activation in a large portion of bilateral occipitoparietal cortex, with local maxima in the inferior parietal lobule. The imagery versus listening baseline comparison revealed bilateral activation of the superior occipital cortex (the border of precuneus and area 19), as well as other areas.

Ghaëm, Mellet, Crivello, Tzourio, Mazoyer, Berthoz, and Denis (1997). Five participants were taken to an unfamiliar suburban area and asked to walk an 800-meter route and memorize what they saw, including seven landmarks pointed out by the investigator. The participants walked the route three times. The following day, a few hours prior to PET scanning, the participants were trained to perform a task requiring visual imagery of landmarks. In this task, the participants were to visualize the named landmark and hold the image for ten seconds until the name of another landmark was presented, which then was to be visualized. The study also included a resting baseline. Both the imagery and baseline conditions were replicated, but no behavior was required.

Participants also performed a mental simulation of routes task, which was interspersed in a random order with the landmarks visual imagery task and the resting baseline. That task and its results were reported in a subsequent article, Mellet, Bricogne, et al. (2000), as the “mental navigation” task, summarized below.

Mellet, Tzourio, Denis, and Mazoyer (1998). Eight participants listened to dictionary definitions of concrete versus abstract words. Each word and its definition were presented for a total of six seconds, followed by two seconds of silence before the next word and definition were presented. In the concrete condition, participants heard common words that referred to objects or animals that were easy to visualize, such as a bottle, lion, or guitar. In this condition, the participants were explicitly encouraged to form a mental image and perfect it according to the definition. In the abstract condition, the words referred to concepts that were unlikely to produce mental imagery; moreover, participants were instructed not to produce any mental images during this condition. After the PET scanning session was over, (p.205) the participants were able to recall more concrete words than abstract words and could retrieve more concrete words from their definitions. There was also a resting baseline.

Mellet, Bricogne, Tzourio-Mazoyer, Ghaëm, Petit, Zago, et al. (2000). Two groups were tested. The five participants in one group learned the spatial layout of a park with various landmarks by physically navigating through it; the six participants in the other group learned a spatial layout by studying a map. Both groups were then scanned while they performed two different tasks: In the “mental navigation task,” the participants first were trained on the task a few hours before the PET session. They heard the names of two landmarks, and they were asked to visualize the path between them. In the PET session, they pressed a key when the second landmark was reached. A total of five segments were presented. In the mental scanning task the participants first learned a map of the park, using slides projected onto a blank screen. The map was presented for study for a total of three minutes at the beginning and end of the training sessions; in the middle, each of seven landmarks was presented individually on its own map for five seconds. During the training session, participants learned the task, which was to visualize a laser dot moving between two landmarks (colored dots), the names of which they heard presented through earphones. Participants pressed a button when they reached the second landmark. The time to press the button increased with distance traversed, which provided evidence that participants did in fact perform the task. There was a resting baseline. Area 18/19 was activated in common in the two tasks (using a “conjunction analysis”). In the mental navigation task, another part of area 18/19 was activated more than in mental scanning; however, given the Talairach coordinates provided in the article, these regions are most likely within area 19, not medial, and not part of early visual cortex.

Results from the mental navigation task were also reported in Ghaëm et al. (1997), referred to there as the “mental simulation of routes” task. Ghaëm et al. (1997) used an alternative method of data analysis in that report. The results were essentially unchanged; again, as in the 1997 report, there was evidence of occipital activation on the border of areas 18 and 19. Mellet et al. did not consider this activation to fall within the scope of the early visual cortex (E. Mellet, personal communication, October 8, 2001). The region was not medial, and in this later report the analyses indicated to these authors that it is clearly within area 19, rather than area 18.

Mellet, Tzourio-Mazoyer, Bricogne, Mazoyer, Kosslyn, and Denis (2000). Seven participants took part in two conditions. In one, they studied and learned a series of simple colored geometric forms arranged in order by seeing the scenes (visual encoding condition); in the other, they listened to a series of verbal descriptions that specified how the geometric forms were arranged (verbal encoding condition). In the verbal encoding condition they were asked to form a visual mental image of the scenes that had been described. In both conditions, geometric forms were visualized on a bar that was divided into regular units, each labeled by a letter. Two scenes were memorized prior to scanning in each condition; the scenes (p.206) were named by numbers. During scanning, participants formed an image of a named scene and then decided whether the portion of the scene over a named letter was higher than the portion over another named letter. During the baseline, the participants heard similar cues and simply alternated between the two responses. Although at this point they had gone through the training phase, participants reported being able to refrain from forming visual mental images during the baseline task. Response times and error rates were recorded.

Gulyás (2001). This report described results from two visual imagery tasks, as compared with a resting baseline. In one task, ten participants visualized capital letters of the Hungarian alphabet. In this self-paced task, the participants were asked to inspect each imaged letter for straight or curved lines. If they reached the end of the alphabet, they were to start over at the beginning and continue the task until the end of the PET scan. After the session was completed, participants were asked how many times they had gone through the alphabet and at which letter they had stopped. These reports were used to calculate the number of letters they imaged during the scan. The participants were not told to visualize the letters at any particular size, and thus probably visualized them at the large "default size" assessed by Kosslyn (1978). In the second imagery task, the same participants visualized capital letters from the Hungarian national anthem at their own pace; they were instructed to visualize each letter of the anthem in sequence until they reached its end, inspecting each imaged letter for straight or curved lines. At the end of the task, the participants again indicated where they had stopped, and the author computed the rate of visualizing the letters on the basis of these data.
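As a rough illustration of how a visualization rate can be recovered from such post-scan reports, the following sketch converts a report of complete passes plus stopping position into letters per second. The 44-letter alphabet length and all numeric values are assumptions made for illustration, not figures from the report.

```python
# Illustrative only: recover an approximate letter-visualization rate from a
# participant's post-scan self-report. The alphabet length and all numbers
# are assumptions, not data from Gulyás (2001).

ALPHABET_LENGTH = 44          # assumed length of the Hungarian alphabet

def letters_per_second(full_passes, stop_position, scan_seconds):
    """Letters imaged per second, given the number of complete passes
    through the alphabet, the position of the last letter reached, and
    the scan duration in seconds."""
    letters_imaged = full_passes * ALPHABET_LENGTH + stop_position
    return letters_imaged / scan_seconds

# Hypothetical report: two full passes, stopped at the 17th letter,
# during a 60-second scan.
print(f"{letters_per_second(2, 17, 60):.2f} letters/s")
```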

Mazard, Mazoyer, Etard, Tzourio-Mazoyer, Kosslyn, and Mellet (2002). Prior to scanning, six participants memorized the appearance of a set of simple geometric stimuli as well as two orders in which the stimuli were lined up on a surface. During scanning, the participants were cued to visualize the objects in one order and then to compare the shapes in terms of their relative height at specific points along the surface. During imagery, the participants either did or did not hear a recording of fMRI scanner-like noise. The participants had four seconds to visualize a cued arrangement. Two baselines were included; in one, the participants heard the probes and alternated responses without visualizing, whereas in the other they simply rested. Response times and error rates were collected. The participants performed far better than chance, but had greater difficulty forming a clear and vivid image when auditory noise was presented—which was also reflected in the higher error rates in the noise imagery condition.

Suchan, Yágüez, Wunderlich, Canavan, Herzog, Tellman, et al. (2002). Ten participants visualized lines connecting six visible circles, each of which contained a number. The numbers ranged from 1 to 6, and the participants visualized the lines connecting the numbered circles in ascending order (i.e., a line going from circle 1 to circle 2 to circle 3, and so on). The participants then decided whether three or more lines crossed; if so, they were to press one button, and if not, the other. Response times and error rates were recorded. During the baseline task, the participants fixated their gaze on the circled numbers in ascending order until they reached the (p.207) final number. At this point, they were to press a button and to fixate on the circles in descending order until they reached the lowest number, indicating this with a button press, and so on. The participants probably made more eye movements during this baseline task than during the experimental task, but no measures were taken.
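The judgment required of participants, whether three or more of the imagined connecting lines cross, reduces to counting pairwise intersections among the segments joining circles 1 through 6. The sketch below illustrates that computation with hypothetical circle coordinates; nothing in it reflects the actual displays used in the study.

```python
# A sketch of the geometric decision in the task: connect the numbered
# circle positions in ascending order and ask whether three or more of the
# resulting line segments cross. Circle coordinates are hypothetical.

def orient(p, q, r):
    """Signed area; >0 counter-clockwise, <0 clockwise, 0 collinear."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def segments_cross(a, b, c, d):
    """Proper intersection test for segments ab and cd (ignores touching)."""
    return (orient(a, b, c) * orient(a, b, d) < 0 and
            orient(c, d, a) * orient(c, d, b) < 0)

def count_crossings(points):
    segs = list(zip(points, points[1:]))        # segments 1-2, 2-3, ..., 5-6
    crossings = 0
    for i in range(len(segs)):
        for j in range(i + 2, len(segs)):       # skip adjacent segments
            if segments_cross(*segs[i], *segs[j]):
                crossings += 1
    return crossings

# Hypothetical layout of circles 1..6 on the display
circles = [(0, 0), (4, 3), (1, 4), (5, 0), (0, 2), (5, 2)]
print(count_crossings(circles) >= 3)   # 4 crossings here -> True
```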

fMRI Studies

The fMRI studies were sometimes analyzed by group and sometimes in terms of individual participants. For group designs, if the comparison to baseline (typically an “off” state) did not reach statistical significance, early visual cortex was assumed not to have been activated. For individual designs, if fewer than half the participants exhibited activation in the early visual cortex, relative to baseline, the early visual cortex was assumed not to have been reliably activated.
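The following minimal sketch simply restates this coding rule as a function; the record fields and example values are hypothetical and are not drawn from any particular study.

```python
# Restatement of the coding rule for fMRI studies: group designs require a
# significant contrast with baseline; individual designs require
# early-visual-cortex activation in at least half of the participants.
# The dictionaries below are hypothetical study records.

def early_visual_coded_as_activated(study):
    if study["design"] == "group":
        return study["significant_vs_baseline"]
    if study["design"] == "individual":
        return study["n_activated"] >= study["n_participants"] / 2
    raise ValueError("unknown design type")

print(early_visual_coded_as_activated(
    {"design": "group", "significant_vs_baseline": False}))            # False
print(early_visual_coded_as_activated(
    {"design": "individual", "n_activated": 4, "n_participants": 7}))  # True
```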

D’Esposito, Detre, Aguirre, Stallcup, Alsop, Tippet, et al. (1997; 1.5T fMRI). Seven participants were asked to listen to words in two conditions. The words in the imagery condition were judged by the authors to be concrete, whereas the words in the baseline condition were judged to be abstract. A new word was presented every second. In the imagery condition, the participants were instructed to visualize the appearance of each named object; the appearance of each object necessarily had to be recalled from LTM. In the abstract word condition, the participants were instructed to listen passively to the words. Data analyses were performed for individual participants. No behavior was required. The authors noted that if they had used a more complex task requiring a behavioral response, this might have increased the number of neural systems involved and made the data more difficult to interpret. They also noted that results from their previous event-related potential studies were correlated with individual differences in imagery, which suggests that participants were in fact performing the tasks.

Ishai, Ungerleider, and Haxby (2000; 1.5 T fMRI). Nine participants visualized specific examples of houses, faces, or chairs that they recalled from LTM. At the same time, they viewed a gray square. In a corresponding perception condition, stimuli were presented every second; the timing was not provided for the imagery condition, but it seems safe to assume that it was the same. The control condition for the perception task consisted of viewing scrambled pictures. In the baseline condition for the imagery task, the participants simply viewed a gray square. No behavior was required. One of the major goals of this study was to demonstrate that the network of brain areas activated during visual mental imagery differs depending on whether one visualizes chairs, houses or faces. Here, the focus is on the comparison between imagery (the combination of the houses, faces and chairs imagery conditions) and the imagery control condition, as reported in the text and in tables 1, 3, and 4 of Ishai, Ungerleider, & Haxby (2000). Although there was no evidence that the early visual cortex was activated during imagery, regions in the ventral temporal cortex responded differently depending on whether participants visualized houses, faces, or chairs.

(p.208) Knauff, Kassubek, Mulack, and Greenlee (2000; 1.5T fMRI). In the perception condition, ten participants, who were first trained on a computer outside the scanner, looked at a grid and decided whether a highlighted cell fell on or off an abstract figure filling half the cells of the grid. In the imagery condition, the participants visualized the abstract figure in the grid and decided whether a highlighted cell would have fallen on the figure if it were actually present. In both conditions, the highlighted cell fell on the figure half the time, and half the time it fell off the figure. In the baseline condition, participants saw blank grids with one highlighted cell and pressed a button when they saw it. Finally, the investigators included a standard fMRI off condition. Stimuli were presented once every 4.1 seconds, and each condition was repeated twice for each participant. Response times and error rates were measured. Participants achieved an accuracy rate of 85 percent inside the scanner.

Trojano, Grossi, Linden, Formisano, Hacker, Zanella, et al. (2000; 1.5T fMRI). Two visual imagery experiments were reported in this article. For present purposes, they differed primarily in the baseline condition; the imagery task was the same in both experiments. In experiment 1, the researchers asked the seven participants to listen to pairs of times (e.g., 5:30 and 8:00) and to visualize the corresponding clock faces; the participants then decided on which clock face the angle between the two hands would be larger. In the control task for experiment 1, the participants decided which of two times was numerically greater. In experiment 2, a perception condition was also included, as well as a control condition that consisted of an off period between imagery sessions. The four participants were also asked, in another task that the authors used for comparison, to count the number of syllables in each pair of times as it was presented verbally and to decide whether that number was odd or even. In experiment 1, participants closed their eyes, whereas in experiment 2, they kept them open and fixated on a point in order to avoid eye movements. In all tasks, the participants responded by pressing a key under the left or right hand, as appropriate to indicate their judgments. Responses were collected during fMRI scanning and were analyzed for accuracy. The accuracy rate was greater than 90 percent in both experiments 1 and 2.
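The angle judgment at the heart of this task is ordinary clock arithmetic: the minute hand advances 6 degrees per minute and the hour hand 0.5 degrees per minute. A minimal sketch, using the example times mentioned above, follows; it is offered only to make the task explicit, not as a description of the authors' analysis.

```python
# Sketch of the clock-face judgment: compute the angle between the hour and
# minute hands for each time and report which is larger. The 0.5-degree-per-
# minute rule for the hour hand is ordinary clock arithmetic, not a detail
# taken from the study.

def hand_angle(hours, minutes):
    hour_deg = (hours % 12) * 30 + minutes * 0.5   # hour hand position
    minute_deg = minutes * 6                        # minute hand position
    diff = abs(hour_deg - minute_deg)
    return min(diff, 360 - diff)                    # the smaller of the two angles

pair = ((5, 30), (8, 0))                            # example times from the text
angles = [hand_angle(h, m) for h, m in pair]
larger = pair[angles.index(max(angles))]
print(f"angles: {angles}, larger angle at {larger[0]}:{larger[1]:02d}")
```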

Wheeler, Petersen, and Buckner (2000; 1.5T fMRI). Eighteen participants included in the final analysis studied pictures and sounds over two days. Each of the stimuli was paired with an appropriate label. Participants were scanned on the third day. During the imagery (recall) task, the participants were presented with visual labels of the previously studied stimuli and were asked to retrieve each stimulus from LTM. After they had completely retrieved the stimulus, the participants were to indicate with a button press whether “their memory was of a picture or sound” (Wheeler et al., 2000, p. 11126). The participants achieved an accuracy rate of almost 99 percent. This event-related fMRI study featured two sets of perception trials and two sets of imagery trials. The visual, auditory, and baseline (fixation) trials were randomly intermixed. The participants were given the perception task during the first two sets of trials, and the recall (imagery) trials during the next two. The results from the picture and sound imagery tasks were contrasted with each other.

(p.209) Formisano, Linden, Di Salle, Trojano, Esposito, Sack, et al. (2002; 1.5T fMRI). In this event-related fMRI study, twelve participants heard pairs of words, each of which named a time of day, and visualized the corresponding analog clock faces; they then decided on which face the hands would form a larger angle. All stimuli specified full hours or half-hours (e.g., 8:00 or 9:30). The imaged clock hands were in either the right or the left hemifield, and stimuli were balanced for this factor. The group analysis was based on results from a first group of participants (n = 6). A second group of six participants, who performed twenty-three trials at a higher sampling rate (higher temporal resolution) but with fewer brain slices, essentially confirmed the finding of sequential activation in the left, then right, posterior parietal cortex. Data from this second group also served in an fMRI and TMS study of mental imagery, reported by Sack et al. (2002; see below). Here, only data from the first group of six participants are considered. The participants completed a total of sixty-four trials, presented in four fMRI blocks of 256 seconds.

Sack, Sperling, Prvulovic, Formisano, Goebel, Di Salle, et al. (2002; 1.5T fMRI). In this event-related fMRI study, six participants performed the same task used by Formisano et al. (2002). They were asked to form a mental image of two analog clock faces, each cued by an auditory probe, and to report on which clock face the hands formed a larger angle. The temporal resolution was higher in Sack et al.’s study than in Formisano et al.’s study, although fewer slices of the brain were imaged. The results revealed a bilateral posterior parietal activation in imagery and no activation of early visual cortex. An additional group of sixty participants received rTMS prior to the task. The rTMS results suggest that the right parietal lobule may play a critical functional role in this spatially based task.

SPECT Studies

The SPECT studies were all analyzed using parametric statistics in group designs. In all cases, activation in early visual cortex was not statistically greater than activation in a baseline condition.

Goldenberg, Podreka, Steiner, and Willmes (1987). Participants were asked to listen to and memorize a series of words, one every five seconds; thirty seconds later they decided whether a probe word was in the memorized set. There were three different types of words, with different participants receiving each type: meaningless (seven participants), concrete (eighteen participants), and abstract (eight participants); eighteen additional participants performed a resting baseline. The participants assigned to learn concrete words, and only these participants, were divided into two groups—eleven were instructed to use imagery to remember the words whereas seven were given no explicit strategy (participants in this group were asked to memorize a list of words during a pilot test and were questioned about which technique they used to memorize them—one participant was excluded because of having used imagery). After the task, the participants were questioned again, and one participant who had switched to an imagery strategy after the first list was (p.210) reassigned to the imagery instruction group. The participants were scanned as they memorized the words. Although no behavior was recorded during scanning, afterwards the participants recognized 84.4 percent of the meaningless words, 73.1 percent of the abstract words, 73.1 percent of the concrete words when imagery was not used, and 96.9 percent of the concrete words when imagery was used. Although no area 17 or 18 activation was observed, some participants who used imagery to learn concrete words had strong activation in inferior occipital regions. From the diagrams of the regions of interest, it appears that this region is perhaps within twenty-five millimeters of midline.

Goldenberg, Podreka, Uhl, Steiner, Willmes, and Deecke (1989). Ten participants visualized a previously studied color, another ten visualized a previously studied face, and another ten visualized a previously studied set of geometric figures arranged to symbolize a map. In the first two conditions, the participants were to form a vivid image and retain it until the name of the next stimulus was presented. The investigators did not tell the participants to visualize the stimuli exactly as they had actually appeared, but instead told them to modify each face or color to make it easier to visualize. In the face condition, participants were told not to visualize the actual living person’s face, but rather a static, achromatic rendition. In the map condition, participants visualized a path connecting two points on the map. Participants were blindfolded, with eyes closed, and received an auditory cue as to which color, face, or route to visualize; they visualized a stimulus for fifteen seconds before the next one was presented. The baseline condition was rest, but twelve of thirty participants reported spontaneously forming visual images during the resting state (seven in the color group, two in the face group, and three in the map group). No behavior was recorded.

Goldenberg, Podreka, Steiner, Willmes, Suess, and Deecke (1989, experiment 2). In experiment 2, eighteen participants memorized the appearance of twenty-three letters in Helvetica font. Following this, they were scanned while they were auditorily cued to visualize each letter and count its corners, flashing a light for each corner. (Two participants later reported that they had experienced either no imagery or only vague imagery during this task.) For each letter, participants held the image for eight to fifteen seconds, depending on the number of corners. Participants were 93.1 percent accurate in the corner-counting task. In the baseline condition, participants were given the names of two letters and counted the number of letters in the alphabet between those two. Eight participants reported having had visual images of the letters during this baseline task, but all denied using imagery as a strategy to perform it. No brain area was activated more strongly in the imagery condition than in the control condition. It is possible that the control condition not only involved imagery (and hence removed it in the comparison) but also was more challenging than the imagery task.

Goldenberg, Podreka, Steiner, Franzen, and Deecke (1991). Fourteen participants first memorized color photos of five different objects. They subsequently heard the names of these objects in pseudo-random order, one every fifteen seconds, while their brains were being scanned. Participants were asked to form vivid (p.211) mental images of each object. In the baseline condition, participants listened to five low-imagery words, which were presented at the same rate as the words in the imagery condition. No behavior was recorded. There was no significant increase in activation in the occipital cortex during the visual imagery task; the authors attributed this failure to find significant activation to inter-participant variability.

Charlot, Tzourio, Zilbovicius, Mazoyer, and Denis (1992). Eleven participants were assigned to a high-imagery group, and another eleven were assigned to a low-imagery group. Scores on the Minnesota Paper Form Board and the Mental Rotations Test (both of which assess spatial imagery) were used to determine group membership; the highest-scoring third were considered to be high imagers (the group of interest here), whereas the lowest-scoring third were considered low imagers. Participants were asked to imagine landmarks on a fictional island and then to explore the visualized island from point to point. Participants continuously explored the island for 4.5 minutes, which is the time it takes to obtain an image with SPECT, but they started from a new landmark every forty-five seconds. The baseline was rest, and no behavior was assessed. (Although activation in area 17 or 18 was found in low imagers, this activation is likely to be an artifact of general whole-brain increases in that group.) There was very little increase in any location for high imagers, and no hint of an increase in activation in early visual cortex.
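The group assignment amounts to a tercile split on the spatial-imagery test scores. A minimal sketch of such a split is given below; the scores, participant labels, and the way the two tests are combined are all invented for illustration and are not taken from Charlot et al. (1992).

```python
# Illustration of a tercile split: the top third of scorers on the spatial-
# imagery tests are treated as high imagers and the bottom third as low
# imagers. Scores and the combination rule are hypothetical.

def tercile_split(scores):
    ranked = sorted(scores, key=scores.get, reverse=True)
    cut = len(ranked) // 3
    return ranked[:cut], ranked[-cut:]        # (high imagers, low imagers)

scores = {f"P{i:02d}": s for i, s in enumerate(
    [34, 12, 27, 41, 19, 25, 38, 15, 30, 22, 36, 18,
     29, 40, 13, 24, 33, 17, 28, 39, 21, 26, 35, 14], start=1)}

high, low = tercile_split(scores)
print("high imagers:", high)
print("low imagers:", low)
```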

At first glance, this review might seem to paint a very muddy picture, which makes it all the more impressive that the analyses described in chapter 4 brought order to this seemingly complex set of results.

Notes:

(1.) Studies in which area 17 or 18 was not monitored were also excluded. For example, Roland and Friberg (1985) used Xe-133 in a route-finding task, but the recording sites would not allow them to detect area 17 or 18 activation. In addition, studies with special populations were excluded. For example, Rauch et al. (1996) used a script-driven imagery task with PTSD patients and found area 17 or 18 activation when participants visualized traumatic scripts compared to neutral ones. Kosslyn and Thompson also examined but did not include in the analysis nine additional studies that did not meet the inclusion criteria. In three of the studies (Decety, Kawashima, Gulyás, & Roland, 1992; Goldberg, Berman, Randolph, Gold, & Weinberger, 1996; Kawashima, Roland, & O’Sullivan, 1995), imagery instructions were not used, and we had no firm grounds for inferring that the tasks actually required the use of imagery; in two of the studies (Menon et al., 1993) the lack of details in a short report prevented us from coding the results; two of the papers reported case studies (Pütz et al., 1996; Tootell, Hadjikani, et al., 1998); and two of the reports were one-page abstracts (Damasio et al., 1993; Kuo, Chen, Humg, Tzeng, & Hsieh, 2000). Of these, seven studies reported activation in early visual cortex and two did not.

(2.) The journals were American Journal of Psychiatry, Behavioural Brain Research, Brain, Brain and Cognition, Brain Research, Brain Research Bulletin, Cerebral Cortex, (p.212) Cortex, European Journal of Cognitive Psychology, European Journal of Neuroscience, European Neurology, Human Brain Mapping, Journal of Cognitive Neuroscience, Journal of Neuroscience, Nature, Nature Neuroscience, NeuroImage, Neuron, Neuropsychologia, Neuropsychology, NeuroReport, Proceedings of the National Academy of Sciences (USA), Proceedings of the Royal Society, Science, Trends in Cognitive Sciences, Trends in Neuroscience.

(3.) For three of the results coded as not showing activation in the early visual cortex, the researchers actually found more activation in the baseline than in the imagery condition: Goldenberg, Podreka, Uhl, et al. (1989), when color imagery was compared to rest, and Mazard, Mazoyer, et al. (2002), for both imagery conditions (noise and no noise).