Scientific Understanding of Consciousness
Consciousness as an Emergent Property of Thalamocortical Activity

Perceptual Categorization -- Recent Research



Nature 452, 352-355 (20 March 2008)

Identifying natural images from human brain activity

Kendrick N. Kay, Thomas Naselaris, Ryan J. Prenger & Jack L. Gallant

Department of Psychology, University of California, Berkeley, California 94720, USA

Helen Wills Neuroscience Institute, University of California, Berkeley, California 94720, USA

Department of Physics, University of California, Berkeley, California 94720, USA


A challenging goal in neuroscience is to be able to read out, or decode, mental content from brain activity. Recent functional magnetic resonance imaging (fMRI) studies have decoded orientation, position and object category from activity in visual cortex. However, these studies typically used relatively simple stimuli (for example, gratings) or images drawn from fixed categories (for example, faces, houses), and decoding was based on previous measurements of brain activity evoked by those same stimuli or categories. To overcome these limitations, here we develop a decoding method based on quantitative receptive-field models that characterize the relationship between visual stimuli and fMRI activity in early visual areas. These models describe the tuning of individual voxels for space, orientation and spatial frequency, and are estimated directly from responses evoked by natural images. We show that these receptive-field models make it possible to identify, from a large set of completely novel natural images, which specific image was seen by an observer. Identification is not a mere consequence of the retinotopic organization of visual areas; simpler receptive-field models that describe only spatial tuning yield much poorer identification performance. Our results suggest that it may soon be possible to reconstruct a picture of a person's visual experience from measurements of brain activity alone.

(end of paraphrase)
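The identification scheme described above (fit a receptive-field model per voxel, then pick the novel image whose predicted activity pattern best matches the measured pattern) can be illustrated with a toy numpy sketch. This is an invented simulation, not the study's actual Gabor-wavelet model or data: the "features", sizes, and noise level are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes; the random "features" stand in for the Gabor-wavelet features
# used in the actual study, and all numbers here are invented.
n_voxels, n_features = 50, 64
n_train, n_test = 300, 20

true_W = rng.normal(size=(n_voxels, n_features))   # ground-truth voxel tuning

train_imgs = rng.normal(size=(n_train, n_features))
train_resp = train_imgs @ true_W.T + 0.5 * rng.normal(size=(n_train, n_voxels))

# Step 1: estimate one receptive-field model per voxel by least squares.
W_hat, *_ = np.linalg.lstsq(train_imgs, train_resp, rcond=None)

# Step 2: identification. For the measured activity evoked by a novel image,
# choose the candidate image whose *predicted* pattern correlates best.
test_imgs = rng.normal(size=(n_test, n_features))
test_resp = test_imgs @ true_W.T + 0.5 * rng.normal(size=(n_test, n_voxels))
pred_resp = test_imgs @ W_hat

def identify(measured, predictions):
    r = [np.corrcoef(measured, p)[0, 1] for p in predictions]
    return int(np.argmax(r))

correct = sum(identify(test_resp[i], pred_resp) == i for i in range(n_test))
accuracy = correct / n_test
```

With a reasonable signal-to-noise ratio the correct image is identified from the set of novel images on nearly every trial, which is the essence of the reported result.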



Perceptual Categorization based on nonlinear interpolation among stored orthographic views

(paraphrase of Logothetis, Object Recognition in Primates, 148ff)

Viewer-centered representations can explain human recognition performance and may accomplish viewpoint invariance by relying on a small number of two-dimensional views. For example, under conditions of orthographic projection, all possible views of an object can be expressed simply as a linear combination of as few as three distinct two-dimensional views, given that the same features remain visible in all three views. The model of linear combinations of views, however, relies only on geometrical features, and fails to predict human behavior when recognizing objects at the subordinate level.

Generalization could be accomplished by nonlinear interpolation among stored orthographic or perspective views, which can be determined on the basis of geometric features or material properties of the object. A simple network can achieve viewpoint invariance by interpolating among a small number of stored views. Computationally, such a network uses a small set of sparse data, corresponding to an object's training views, to synthesize an approximation to a multivariate function representing the object.

(end of paraphrase)
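The linear-combination-of-views result mentioned above can be demonstrated with a short numpy sketch. This is a toy construction (invented points and rotation angles, not from the source): under orthographic projection, every coordinate row of a novel view of a rigid object lies in the low-dimensional span of the coordinate rows of a few stored views.

```python
import numpy as np

rng = np.random.default_rng(1)
P = rng.normal(size=(3, 12))     # 12 feature points of a rigid 3D object

def ortho_view(P, ax, ay):
    """Orthographic projection: rotate in 3D, then drop the depth coordinate."""
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(ax), -np.sin(ax)],
                   [0, np.sin(ax),  np.cos(ax)]])
    Ry = np.array([[ np.cos(ay), 0, np.sin(ay)],
                   [ 0,          1, 0         ],
                   [-np.sin(ay), 0, np.cos(ay)]])
    return (Ry @ Rx @ P)[:2]     # 2 x 12 image coordinates

# Two stored views contribute four coordinate rows; generically these span
# the 3-dimensional row space in which every orthographic view must lie.
V1 = ortho_view(P, 0.0, 0.0)
V2 = ortho_view(P, 0.4, -0.3)
basis = np.vstack([V1, V2])      # 4 x 12

# A completely novel view of the same object, same visible features:
novel = ortho_view(P, -0.7, 0.5)

# Each coordinate row of the novel view is a linear combination of the
# stored rows; solve for the combination coefficients by least squares.
coeffs, *_ = np.linalg.lstsq(basis.T, novel.T, rcond=None)
reconstructed = (basis.T @ coeffs).T
max_err = np.max(np.abs(reconstructed - novel))
```

The reconstruction error is at machine precision, confirming that the novel view is exactly a linear combination of the stored views, as the theorem states. Recognition by nonlinear interpolation replaces this purely linear fit with a radial-basis-style interpolation among the stored views.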


Science 3 July 2009: Vol. 325. no. 5936, pp. 87 - 89

Role of Layer 6 of V2 Visual Cortex in Object-Recognition Memory

Manuel F. López-Aranda,1,2,4 Juan F. López-Téllez,1,2,4 Irene Navarro-Lobato,1 Mariam Masmudi-Martín,1 Antonia Gutiérrez,3,4 Zafar U. Khan1,2,4

1 Laboratory of Neurobiology, Centro de Investigaciones Médico-Sanitarias, University of Malaga, Campus Teatinos s/n, 29071 Malaga, Spain.
2 Department of Medicine, Faculty of Medicine, University of Malaga, Campus Teatinos s/n, 29071 Malaga, Spain.
3 Department of Cell Biology, Faculty of Science, University of Malaga, Campus Teatinos s/n, 29071 Malaga, Spain.
4 Centro de Investigación Biomédica en Red sobre Enfermedades Neurodegenerativas (CIBERNED), Institute of Health Carlos III, Madrid, Spain.


Cellular responses in the V2 secondary visual cortex to simple as well as complex visual stimuli have been well studied. However, the role of area V2 in visual memory remains unexplored. We found that layer 6 neurons of V2 are crucial for the processing of object-recognition memory (ORM). Using the protein regulator of G protein signaling–14 (RGS-14) as a tool, we found that expression of this protein in layer 6 neurons of rat-brain area V2 converted short-term ORM, which normally lasts for 45 minutes, into long-term memory detectable even after many months. Furthermore, elimination of the same layer 6 neurons by injection of a selective cytotoxin resulted in the complete loss of both normal and protein-enhanced ORM.

Recognition memory, one of the most studied examples of declarative memory, is generally considered to consist of two components, recollection and familiarity, and depends on the medial temporal lobe (MTL), a structure composed of the hippocampus and adjacent perirhinal, entorhinal, and parahippocampal cortices. It is argued that the entire ventral visual-to-hippocampal stream is important for visual memory. This theory predicts that object-recognition memory (ORM) alterations could result from manipulations of V2, an area that is highly interconnected within the ventral stream of visual cortices. In the monkey brain, this area receives strong feedforward connections from the primary visual cortex (V1) and sends strong projections to other secondary visual cortices (V3, V4, and V5). Most of the neurons of this area are tuned to simple visual characteristics such as orientation, spatial frequency, size, color, and shape. V2 cells also respond to various complex shape characteristics, such as the orientation of illusory contours and whether a stimulus is part of the figure or the ground. Anatomical studies implicate layer 3 of area V2 in visual-information processing. In contrast to layer 3, layer 6 of the visual cortex is composed of many types of neurons, and their responses to visual stimuli are more complex. But the importance of layer 6 in visual-information processing remains an enigma.

Our results show that layer 6 of area V2 is implicated in ORM formation but not in its storage. After passing through area V2, visual information continues ventrally through other visual areas to the MTL, the domain where ORM is thought to be processed. Our finding that layer 6 neurons contribute to the formation of both normal (short-term) and long-term ORM emphasizes the importance of V2, an area localized outside the MTL. We propose that layer 6 neurons of area V2 modulate the flow of visual information through direct or indirect intrinsic connections from layer 6 to other layers within this area. In sum, layer 6 of area V2, an area thought to be involved in perception and perceptual learning, plays a critical role in the formation of short- and long-term visual memory; this also supports the view that the entire ventral visual-to-hippocampal stream, and not the MTL alone, is important for visual memory processing.

(end of paraphrase)



Nature 460, 94-97 (2 July 2009)

Neural mechanisms of rapid natural scene categorization in human visual cortex

Marius V. Peelen, Li Fei-Fei & Sabine Kastner

Department of Psychology, Princeton University, Princeton, New Jersey 08540, USA

Center for the Study of Brain, Mind, and Behavior, Princeton University, Princeton, New Jersey 08540, USA

Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey 08540, USA

Department of Computer Science, Princeton University, Princeton, New Jersey 08540, USA


The visual system has an extraordinary capability to extract categorical information from complex natural scenes. For example, subjects are able to rapidly detect the presence of object categories such as animals or vehicles in new scenes that are presented very briefly. This is even true when subjects do not pay attention to the scenes and simultaneously perform an unrelated attentionally demanding task, a stark contrast to the capacity limitations predicted by most theories of visual attention. Here we show a neural basis for rapid natural scene categorization in the visual cortex, using functional magnetic resonance imaging and an object categorization task in which subjects detected the presence of people or cars in briefly presented natural scenes. The multi-voxel pattern of neural activity in the object-selective cortex evoked by the natural scenes contained information about the presence of the target category, even when the scenes were task-irrelevant and presented outside the focus of spatial attention. These findings indicate that the rapid detection of categorical information in natural scenes is mediated by a category-specific biasing mechanism in object-selective cortex that operates in parallel across the visual field, and biases information processing in favour of objects belonging to the target object category.

In daily life we often look for particular object categories in our environment that are relevant for ongoing behaviour. For example, before crossing a street we look to see whether cars are near, perhaps not noticing other objects present in the visual scene at the same time, such as people walking on the other side of the street. Behavioural experiments have shown that such detection of familiar object categories in natural scenes is extremely rapid, and can be done even without focal attention. These results indicate the existence of selection mechanisms for familiar object categories that operate independently of spatial attention.

(end of paraphrase)
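The multi-voxel pattern analysis described above can be illustrated with a minimal numpy sketch. This is an invented simulation, not the study's data or pipeline: the pattern sizes, noise level, and the correlation-based nearest-centroid decoder (a standard MVPA technique) are all assumptions introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

n_voxels, n_per_class = 120, 40

# Simulated multi-voxel patterns from object-selective cortex: each category
# ("people" vs "cars" in the study) evokes a distinct but noisy mean pattern.
mu_people = rng.normal(size=n_voxels)
mu_cars = rng.normal(size=n_voxels)
people = mu_people + 1.5 * rng.normal(size=(n_per_class, n_voxels))
cars = mu_cars + 1.5 * rng.normal(size=(n_per_class, n_voxels))

# Split into train/test and build class-mean ("centroid") patterns.
train_p, test_p = people[:30], people[30:]
train_c, test_c = cars[:30], cars[30:]
cen_p, cen_c = train_p.mean(axis=0), train_c.mean(axis=0)

def classify(x):
    # Correlation-based nearest-centroid rule, a common MVPA decoder:
    # assign the pattern to the class whose centroid it correlates with most.
    rp = np.corrcoef(x, cen_p)[0, 1]
    rc = np.corrcoef(x, cen_c)[0, 1]
    return "people" if rp > rc else "cars"

hits = sum(classify(x) == "people" for x in test_p) + \
       sum(classify(x) == "cars" for x in test_c)
accuracy = hits / (len(test_p) + len(test_c))
```

Above-chance decoding accuracy on held-out patterns is the signature that the voxel patterns "contain information about" the target category, which is the logic of the study's central claim.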



Nature, Vol 443, 7 September 2006, p.85.

Experience-dependent representation of visual categories in parietal cortex

David J. Freedman and John A. Assad

Department of Neurobiology, Harvard Medical School, 220 Longwood Avenue, Boston, Massachusetts 02115, USA


Categorization is a process by which the brain assigns meaning to sensory stimuli. Through experience, we learn to group stimuli into categories, such as 'chair', 'table' and 'vehicle', which are critical for rapidly and appropriately selecting behavioural responses. Although much is known about the neural representation of simple visual stimulus features (for example, orientation, direction and colour), relatively little is known about how the brain learns and encodes the meaning of stimuli. We trained monkeys to classify 360° of visual motion directions into two discrete categories, and compared neuronal activity in the lateral intraparietal (LIP) and middle temporal (MT) areas, two interconnected brain regions known to be involved in visual motion processing. Here we show that neurons in LIP—an area known to be centrally involved in visuo-spatial attention, motor planning and decision-making—robustly reflect the category of motion direction as a result of learning.

(end of paraphrase)




Science 12 September 2008: Vol. 321. no. 5895, pp. 1502 - 1507

Unsupervised Natural Experience Rapidly Alters Invariant Object Representation in Visual Cortex

Nuo Li and James J. DiCarlo

McGovern Institute for Brain Research and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.


Neurons in the most complex area of the brain's visual cortex can respond to a particular object in any orientation by rapidly learning to associate multiple views of that object.

When presented with a visual image, primates can rapidly (<200 ms) recognize objects despite large variations in object position, scale, and pose. This ability likely derives from the responses of neurons at high levels of the primate ventral visual stream. But how are these powerful "invariant" neuronal object representations built by the visual system? On the basis of theoretical and behavioral work, one possibility is that tolerance ("invariance") is learned from the temporal contiguity of object features during natural visual experience, potentially in an unsupervised manner. Specifically, during natural visual experience, objects tend to remain present for seconds or longer, while object motion or viewer motion (e.g., eye movements) tends to cause rapid changes in the retinal image cast by each object over shorter time intervals (hundreds of ms). The ventral visual stream could construct a tolerant object representation by taking advantage of this natural tendency for temporally contiguous retinal images to belong to the same object. If this hypothesis is correct, it might be possible to uncover a neuronal signature of the underlying learning by using targeted alteration of those spatiotemporal statistics.

Object recognition is challenging because each object produces myriad retinal images. Neuronal responses in the inferior temporal cortex (IT) are selective for different objects, yet tolerant ("invariant") to changes in object position, scale, and pose. How does the brain construct this neuronal tolerance? We report a form of neuronal learning that suggests the underlying solution. Targeted alteration of the natural temporal contiguity of visual experience caused specific changes in IT position tolerance. This unsupervised temporal slowness learning (UTL) was substantial, increased with experience, and was significant in single IT neurons after just 1 hour. Together with previous theoretical work and human object perception experiments, we speculate that UTL may reflect the mechanism by which the visual stream builds and maintains tolerant object representations.

We term this effect "unsupervised temporal slowness learning" (UTL), because the selectivity changes depend on the temporal contiguity of object images on the retina and are consistent with the hypothesis that the natural stability (slowness) of object identity instructs the learning without external supervision. The brain's saccade-generation mechanisms or the associated attentional mechanisms may also be needed. Indeed, eye movement signals are present in the ventral stream.

The time course and task independence of UTL are consistent with synaptic plasticity, but our data do not constrain the locus of plasticity, and changes at multiple levels of the ventral visual stream are likely. Hebbian-like learning rules are consistent with spike-timing–dependent plasticity.

(end of paraphrase)
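The temporal-contiguity idea behind UTL can be sketched with a deliberately simple numpy example. This is a toy construction, not the paper's model: the orthogonal "view" patterns, the two-object world, and the one-step Hebbian temporal-association rule are all invented assumptions, chosen only to show how temporally adjacent retinal images of the same object become associated without any supervision.

```python
import numpy as np

# Six orthogonal "retinal image" patterns: three positions (views) of
# object A and three of object B. Orthogonality is a toy simplification.
views = np.eye(6)
A, B = views[:3], views[3:]

# During natural viewing, object identity is temporally stable while the
# retinal image jumps between positions (e.g., across eye movements), so
# successive images in a sequence almost always belong to the same object.
sequences = [[A[0], A[1], A[2], A[1], A[0]],
             [B[0], B[1], B[2], B[1], B[0]]]

# Hebbian temporal-association rule: strengthen the connection from the
# pattern active at time t-1 to the pattern active at time t.
W = np.zeros((6, 6))
for seq in sequences:
    for prev, cur in zip(seq, seq[1:]):
        W += np.outer(cur, prev)

# After learning, a single view recalls only views of the SAME object,
# yielding a position-tolerant association structure.
recall_from_A0 = W @ A[0]
recall_from_B0 = W @ B[0]
```

In this toy world, the learned associations never cross object boundaries because identity changes slowly relative to position. The paper's "swap" manipulation corresponds to deliberately inserting cross-object transitions into such sequences, which this rule would then wrongly bind together, matching the reported alteration of position tolerance.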


