Perceptual Categorization -- Recent Research

Scientific Understanding of Consciousness
Consciousness as an Emergent Property of Thalamocortical Activity

Nature 452, 352-355 (20 March 2008)

Identifying natural images from human brain activity

Kendrick N. Kay, Thomas Naselaris, Ryan J. Prenger & Jack L. Gallant

Department of Psychology, University of California, Berkeley, California 94720, USA

Helen Wills Neuroscience Institute, University of California, Berkeley, California 94720, USA

Department of Physics, University of California, Berkeley, California 94720, USA

(paraphrase)

A challenging goal in neuroscience is to be able to read out, or decode, mental content from brain activity. Recent functional magnetic resonance imaging (fMRI) studies have decoded orientation, position and object category from activity in visual cortex. However, these studies typically used relatively simple stimuli (for example, gratings) or images drawn from fixed categories (for example, faces, houses), and decoding was based on previous measurements of brain activity evoked by those same stimuli or categories. To overcome these limitations, here we develop a decoding method based on quantitative receptive-field models that characterize the relationship between visual stimuli and fMRI activity in early visual areas. These models describe the tuning of individual voxels for space, orientation and spatial frequency, and are estimated directly from responses evoked by natural images. We show that these receptive-field models make it possible to identify, from a large set of completely novel natural images, which specific image was seen by an observer. Identification is not a mere consequence of the retinotopic organization of visual areas; simpler receptive-field models that describe only spatial tuning yield much poorer identification performance. Our results suggest that it may soon be possible to reconstruct a picture of a person's visual experience from measurements of brain activity alone.

(end of paraphrase)
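
The identification step described in the paraphrase above can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the authors' method or code: each voxel is given a hypothetical linear model over made-up image features (standing in for the receptive-field models), and the matching score is assumed to be the Pearson correlation between predicted and measured voxel patterns. The names `predict_pattern`, `identify` and `demo` are invented for this example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def predict_pattern(weights, features):
    """Predicted response of every voxel to one image.

    weights: one weight list per voxel (hypothetical linear voxel model).
    features: feature vector of the image (hypothetical).
    """
    return [sum(w * f for w, f in zip(voxel_w, features)) for voxel_w in weights]


def identify(measured, weights, candidates):
    """Index of the candidate image whose predicted pattern best matches
    the measured pattern (highest correlation)."""
    scores = [pearson(measured, predict_pattern(weights, feats))
              for feats in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


def demo(seed=0, n_voxels=200, n_features=16, n_images=20, noise=0.5):
    """Simulate one trial: a noisy measurement of one novel image."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0, 1) for _ in range(n_features)]
               for _ in range(n_voxels)]
    candidates = [[rng.gauss(0, 1) for _ in range(n_features)]
                  for _ in range(n_images)]
    true_index = 7  # the image the simulated observer "saw"
    clean = predict_pattern(weights, candidates[true_index])
    measured = [r + rng.gauss(0, noise) for r in clean]
    return true_index, identify(measured, weights, candidates)


if __name__ == "__main__":
    true_index, guess = demo()
    print("seen image:", true_index, "identified image:", guess)
```

The candidate images never have to be shown to the observer beforehand: only the voxel models are fitted in advance, and predictions for novel images are computed from those models. That is the property the paraphrase emphasizes.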

Perceptual Categorization based on nonlinear interpolation among stored orthographic views

(paraphrase of Logothetis, Object Recognition in Primates, 148ff)

Viewer-centered representations can explain human recognition performance and may accomplish viewpoint invariance relying on a small number of two-dimensional views. For example, under conditions of orthographic projection, all possible views of an object can be expressed simply as the linear combination of as few as three distinct two-dimensional views, given that the same features remain visible in all three views. The model of linear combinations of views, however, relies only on geometrical features, and fails to predict human behavior for recognizing objects at the subordinate levels.

Generalization could be accomplished by nonlinear interpolation among stored orthographic or perspective views that can be determined on the basis of geometric features or material properties of the object. A simple network can achieve viewpoint invariance by interpolating among a small number of stored views. Computationally, such a network uses a small set of sparse data, corresponding to an object's training views, to synthesize an approximation to a multivariate function representing the object.

(end of paraphrase)
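
The idea of interpolating among a few stored views can be sketched with a small radial-basis-function network in Python. Everything specific here is an assumption chosen for illustration and not taken from the source: the four-point "wire" object, rotation about the vertical axis only, Gaussian units of width `sigma = 1.0`, and the function names. The network stores three orthographic views and is trained to output 1 for each of them. It is then probed with an unseen intermediate view and with a different object.

```python
import math


def project_view(points, angle):
    """Orthographic projection of 3-D points after rotating the object
    about the y-axis by `angle` (radians). Returns a flat feature vector."""
    c, s = math.cos(angle), math.sin(angle)
    feats = []
    for x, y, z in points:
        feats.extend([c * x + s * z, y])
    return feats


def gaussian(u, v, sigma):
    """Gaussian radial basis function of the distance between two views."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2 * sigma ** 2))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def train(stored_views, sigma):
    """Weights that make the network output exactly 1 at every stored view."""
    g = [[gaussian(u, v, sigma) for v in stored_views] for u in stored_views]
    return solve(g, [1.0] * len(stored_views))


def respond(view, stored_views, weights, sigma):
    """Network output: weighted sum of one Gaussian unit per stored view."""
    return sum(w * gaussian(view, t, sigma)
               for w, t in zip(weights, stored_views))


def demo(sigma=1.0):
    target = [(1, 0, 0), (0, 1, 0.5), (-0.5, 0.5, 1), (0.3, -0.8, -0.6)]
    other = [(0.2, 0.9, -0.7), (-1, -0.3, 0.4), (0.6, 0.1, 0.9), (-0.4, 0.7, -0.2)]
    stored = [project_view(target, math.radians(a)) for a in (-30, 0, 30)]
    weights = train(stored, sigma)
    at_stored = [respond(v, stored, weights, sigma) for v in stored]
    novel = respond(project_view(target, math.radians(15)), stored, weights, sigma)
    distractor = respond(project_view(other, math.radians(15)), stored, weights, sigma)
    return at_stored, novel, distractor


if __name__ == "__main__":
    at_stored, novel, distractor = demo()
    print("stored views:", at_stored)
    print("novel view of same object:", novel)
    print("different object:", distractor)
```

In this toy setting the output for the unseen 15-degree view stays close to 1, while the different object yields a much smaller value. That is the sense in which a handful of stored views can support viewpoint-tolerant recognition.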

Science 3 July 2009: Vol. 325. no. 5936, pp. 87 - 89

Role of Layer 6 of V2 Visual Cortex in Object-Recognition Memory

Manuel F. López-Aranda,¹,²,⁴ Juan F. López-Téllez,¹,²,⁴ Irene Navarro-Lobato,¹ Mariam Masmudi-Martín,¹ Antonia Gutiérrez,³,⁴ Zafar U. Khan¹,²,⁴

¹ Laboratory of Neurobiology, Centro de Investigaciones Médico-Sanitarias, University of Malaga, Campus Teatinos s/n, 29071 Malaga, Spain.
² Department of Medicine, Faculty of Medicine, University of Malaga, Campus Teatinos s/n, 29071 Malaga, Spain.
³ Department of Cell Biology, Faculty of Science, University of Malaga, Campus Teatinos s/n, 29071 Malaga, Spain.
⁴ Centro de Investigación Biomédica en Red sobre Enfermedades Neurodegenerativas (CIBERNED), Institute of Health Carlos III, Madrid, Spain.

(paraphrase)

Cellular responses in the V2 secondary visual cortex to simple as well as complex visual stimuli have been well studied. However, the role of area V2 in visual memory remains unexplored. We found that layer 6 neurons of V2 are crucial for the processing of object-recognition memory (ORM). Using the protein regulator of G protein signaling–14 (RGS-14) as a tool, we found that the expression of this protein into layer 6 neurons of rat-brain area V2 promoted the conversion of a normal short-term ORM that normally lasts for 45 minutes into long-term memory detectable even after many months. Furthermore, elimination of the same-layer neurons by means of injection of a selective cytotoxin resulted in the complete loss of normal as well as protein-mediated enhanced ORM.

Recognition memory, one of the most studied examples of declarative memory, is generally considered to consist of two components, recollection and familiarity, and depends on the medial temporal lobe (MTL), a structure composed of the hippocampus and adjacent perirhinal, entorhinal, and parahippocampal cortices. It is argued that the entire ventral visual-to-hippocampal stream is important for visual memory. This theory predicts that object-recognition memory (ORM) alterations could result from the manipulation in V2, an area that is highly interconnected within the ventral stream of visual cortices. In the monkey brain, this area receives strong feedforward connections from the primary visual cortex (V1) and sends strong projections to other secondary visual cortices (V3, V4, and V5). Most of the neurons of this area are tuned to simple visual characteristics such as orientation, spatial frequency, size, color, and shape. V2 cells also respond to various complex shape characteristics, such as the orientation of illusory contours and whether the stimulus is part of the figure or the ground. Anatomical studies implicate layer 3 of area V2 in visual-information processing. In contrast to layer 3, layer 6 of the visual cortex is composed of many types of neurons, and their response to visual stimuli is more complex. But the importance of layer 6 in visual-information processing remains an enigma.

Our results show that layer 6 of area V2 is implicated in ORM formation but not in its storage. After passing through area V2, visual information continues ventrally through other visual areas to the MTL, a domain where ORM is thought to be processed. Our findings of the role of layer 6 neurons in the formation of both normal (short-term) and long-term ORM emphasize the importance of V2, an area localized outside of MTL. It is proposed that layer 6 neurons of area V2 modulate the processing of visual information flow by either direct or indirect intrinsic connections within this area from layer 6 to other layers. Our results show that layer 6 of area V2, an area thought to be involved in perception and perceptual learning, not only plays a critical role in the formation of short- and long-term visual memory but also supports the view that the entire stream of ventral visual-to-hippocampus, and not the MTL alone, is important for visual memory processing.

(end of paraphrase)

Nature 460, 94-97 (2 July 2009)

Neural mechanisms of rapid natural scene categorization in human visual cortex

Marius V. Peelen, Li Fei-Fei & Sabine Kastner

Department of Psychology, Princeton University, Princeton, New Jersey 08540, USA

Center for the Study of Brain, Mind, and Behavior, Princeton University, Princeton, New Jersey 08540, USA

Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey 08540, USA

Department of Computer Science, Princeton University, Princeton, New Jersey 08540, USA

(paraphrase)

The visual system has an extraordinary capability to extract categorical information from complex natural scenes. For example, subjects are able to rapidly detect the presence of object categories such as animals or vehicles in new scenes that are presented very briefly. This is even true when subjects do not pay attention to the scenes and simultaneously perform an unrelated attentionally demanding task, a stark contrast to the capacity limitations predicted by most theories of visual attention. Here we show a neural basis for rapid natural scene categorization in the visual cortex, using functional magnetic resonance imaging and an object categorization task in which subjects detected the presence of people or cars in briefly presented natural scenes. The multi-voxel pattern of neural activity in the object-selective cortex evoked by the natural scenes contained information about the presence of the target category, even when the scenes were task-irrelevant and presented outside the focus of spatial attention. These findings indicate that the rapid detection of categorical information in natural scenes is mediated by a category-specific biasing mechanism in object-selective cortex that operates in parallel across the visual field, and biases information processing in favour of objects belonging to the target object category.

In daily life we often look for particular object categories in our environment that are relevant for ongoing behaviour. For example, before crossing a street we look whether cars are near, perhaps not noticing other objects in the visual scene present at the same time, such as people walking on the other side of the street. Behavioural experiments have shown that such detection of familiar object categories in natural scenes is extremely rapid, and can be done even without focal attention. These results indicate the existence of selection mechanisms for familiar object categories that operate independently of spatial attention.

(end of paraphrase)

Nature, Vol 443, 7 September 2006, p.85.

Experience-dependent representation of visual categories in parietal cortex

David J. Freedman and John A. Assad

Department of Neurobiology, Harvard Medical School, 220 Longwood Avenue, Boston, Massachusetts 02115, USA

(paraphrase)

Categorization is a process by which the brain assigns meaning to sensory stimuli. Through experience, we learn to group stimuli into categories, such as 'chair', 'table' and 'vehicle', which are critical for rapidly and appropriately selecting behavioural responses. Although much is known about the neural representation of simple visual stimulus features (for example, orientation, direction and colour), relatively little is known about how the brain learns and encodes the meaning of stimuli. We trained monkeys to classify 360° of visual motion directions into two discrete categories, and compared neuronal activity in the lateral intraparietal (LIP) and middle temporal (MT) areas, two interconnected brain regions known to be involved in visual motion processing. Here we show that neurons in LIP—an area known to be centrally involved in visuo-spatial attention, motor planning and decision-making—robustly reflect the category of motion direction as a result of learning.

(end of paraphrase)
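
The task in the paraphrase above, sorting 360° of motion directions into two categories, can be made concrete with a short Python sketch. All specifics are assumptions for illustration and are not taken from the source: the 45° boundary, the twelve directions spaced 30° apart, the two idealized model neurons, and the simple index comparing rate differences between and within categories for neighbouring directions.

```python
import math


def category(direction_deg, boundary_deg=45.0):
    """Category (0 or 1) of a motion direction, given a boundary line that
    runs through boundary_deg and boundary_deg + 180 (hypothetical values)."""
    rel = (direction_deg - boundary_deg) % 360.0
    return 0 if rel < 180.0 else 1


def category_index(rates, directions, boundary_deg=45.0):
    """(between - within) / (between + within), where 'between' and 'within'
    are mean absolute rate differences for neighbouring directions that fall
    in different categories or in the same category."""
    within, between = [], []
    n = len(directions)
    for i in range(n):
        j = (i + 1) % n  # neighbouring direction, wrapping around the circle
        diff = abs(rates[i] - rates[j])
        same = (category(directions[i], boundary_deg)
                == category(directions[j], boundary_deg))
        (within if same else between).append(diff)
    w = sum(within) / len(within)
    b = sum(between) / len(between)
    return (b - w) / (b + w) if (b + w) > 0 else 0.0


def demo():
    directions = [(60 + 30 * k) % 360 for k in range(12)]
    # Idealized category-selective neuron: rate depends only on category.
    cat_rates = [20.0 if category(d) == 0 else 5.0 for d in directions]
    # Idealized direction-tuned neuron: cosine tuning, preferred direction 120.
    dir_rates = [10.0 + 10.0 * math.cos(math.radians(d - 120.0))
                 for d in directions]
    return (directions,
            category_index(cat_rates, directions),
            category_index(dir_rates, directions))


if __name__ == "__main__":
    directions, cat_idx, dir_idx = demo()
    print("category-selective neuron index:", cat_idx)
    print("direction-tuned neuron index:", dir_idx)
```

The index is 1 when firing depends only on category membership. For a smoothly direction-tuned neuron it is smaller, and it varies with the preferred direction relative to the boundary. A comparison of this general kind is one way to ask whether a neuron reflects the learned category rather than the direction itself.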

Science 12 September 2008: Vol. 321. no. 5895, pp. 1502 - 1507

Unsupervised Natural Experience Rapidly Alters Invariant Object Representation in Visual Cortex

Nuo Li and James J. DiCarlo

McGovern Institute for Brain Research and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

(paraphrase)

Neurons in the most complex area of the brain's visual cortex can respond to a particular object in any orientation by rapidly learning to associate multiple views of that object.

When presented with a visual image, primates can rapidly (<200 ms) recognize objects despite large variations in object position, scale, and pose. This ability likely derives from the responses of neurons at high levels of the primate ventral visual stream. But how are these powerful "invariant" neuronal object representations built by the visual system? On the basis of theoretical and behavioral work, one possibility is that tolerance ("invariance") is learned from the temporal contiguity of object features during natural visual experience, potentially in an unsupervised manner. Specifically, during natural visual experience, objects tend to remain present for seconds or longer, while object motion or viewer motion (e.g., eye movements) tends to cause rapid changes in the retinal image cast by each object over shorter time intervals (hundreds of ms). The ventral visual stream could construct a tolerant object representation by taking advantage of this natural tendency for temporally contiguous retinal images to belong to the same object. If this hypothesis is correct, it might be possible to uncover a neuronal signature of the underlying learning by using targeted alteration of those spatiotemporal statistics.

Object recognition is challenging because each object produces myriad retinal images. Responses of neurons from the inferior temporal cortex (IT) are selective to different objects, yet tolerant ("invariant") to changes in object position, scale, and pose. How does the brain construct this neuronal tolerance? We report a form of neuronal learning that suggests the underlying solution. Targeted alteration of the natural temporal contiguity of visual experience caused specific changes in IT position tolerance. This unsupervised temporal slowness learning (UTL) was substantial, increased with experience, and was significant in single IT neurons after just 1 hour. Together with previous theoretical work and human object perception experiments, we speculate that UTL may reflect the mechanism by which the visual stream builds and maintains tolerant object representations.

We term this effect "unsupervised temporal slowness learning" (UTL), because the selectivity changes depend on the temporal contiguity of object images on the retina and are consistent with the hypothesis that the natural stability (slowness) of object identity instructs the learning without external supervision. The brain's saccade-generation mechanisms or the associated attentional mechanisms may also be needed. Indeed, eye movement signals are present in the ventral stream.

The time course and task independence of UTL are consistent with synaptic plasticity, but our data do not constrain the locus of plasticity, and changes at multiple levels of the ventral visual stream are likely. Hebbian-like learning rules are consistent with spike-timing–dependent plasticity.

(end of paraphrase)
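
The logic of learning from temporal contiguity can be illustrated with a deliberately simple Python sketch. It is not the authors' model or analysis. The update rule (the response to the earlier image is nudged toward the response to the image that follows it), the fixed responses at the central position, the learning rate, and all numbers are assumptions made for this example. Object P is the model neuron's preferred object and N a non-preferred one. At the "swap" position each object is always followed by the other object at the centre, and at the "control" position by the same object.

```python
def expose(responses, pairs, rate=0.1, repeats=20):
    """Apply a toy slowness rule: for each (first, second) image pair, move
    the response to `first` a fraction `rate` toward the response to `second`.
    Returns a new dict; the input dict is left unchanged."""
    r = dict(responses)
    for _ in range(repeats):
        for first, second in pairs:
            r[first] += rate * (r[second] - r[first])
    return r


def selectivity(responses, position):
    """Response difference between objects P and N at one retinal position."""
    return responses[("P", position)] - responses[("N", position)]


# Hypothetical initial responses of one model neuron (arbitrary units).
INITIAL = {
    ("P", "center"): 1.0, ("N", "center"): 0.2,
    ("P", "swap"): 0.8, ("N", "swap"): 0.3,
    ("P", "control"): 0.8, ("N", "control"): 0.3,
}

# Temporal pairings experienced during exposure: (earlier image, later image).
PAIRS = [
    (("P", "swap"), ("N", "center")),      # identity swapped across the shift
    (("N", "swap"), ("P", "center")),
    (("P", "control"), ("P", "center")),   # identity preserved
    (("N", "control"), ("N", "center")),
]


def demo():
    after = expose(INITIAL, PAIRS)
    return {
        "swap_before": selectivity(INITIAL, "swap"),
        "swap_after": selectivity(after, "swap"),
        "control_before": selectivity(INITIAL, "control"),
        "control_after": selectivity(after, "control"),
    }


if __name__ == "__main__":
    for name, value in demo().items():
        print(name, round(value, 3))
```

Under these assumptions, selectivity for P over N falls at the swap position and is preserved at the control position, with no teaching signal other than the order in which images follow one another. The sketch shows only the direction of such an effect and does not speak to its size or time course in real neurons.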
