Scientific Understanding of Consciousness
Functional Magnetic Resonance Imaging (fMRI)
Nature 453, 869-878 (12 June 2008)
What we can do and what we cannot do with fMRI
Nikos K. Logothetis
Max Planck Institute for Biological Cybernetics, 72076 Tuebingen, Germany, and Imaging Science and Biomedical Engineering, University of Manchester, Manchester M13 9PL, UK
Functional magnetic resonance imaging (fMRI) is currently the mainstay of neuroimaging in cognitive neuroscience. Advances in scanner technology, image acquisition protocols, experimental design, and analysis methods promise to push forward fMRI from mere cartography to the true study of brain organization. However, fundamental questions concerning the interpretation of fMRI data abound, as the conclusions drawn often ignore the actual limitations of the methodology. Here I give an overview of the current state of fMRI, and draw on neuroimaging and physiological data to present the current understanding of the haemodynamic signals and the constraints they impose on neuroimaging data interpretation.
Magnetic resonance imaging (MRI) is the most important imaging advance since the introduction of X-rays by Conrad Röntgen in 1895. Since its introduction in the clinic in the 1980s, it has assumed a role of unparalleled importance in diagnostic medicine and more recently in basic research. In medicine, MRI is primarily used to produce structural images of organs, including the central nervous system, but it can also provide information on the physico-chemical state of tissues, their vascularization, and perfusion. Although all of these capacities have long been widely appreciated, it was the emergence of functional MRI (fMRI) — a technique for measuring haemodynamic changes after enhanced neural activity — in the early 1990s that had a real impact on basic cognitive neuroscience research.
The principal advantages of fMRI lie in its noninvasive nature, ever-increasing availability, relatively high spatiotemporal resolution, and its capacity to demonstrate the entire network of brain areas engaged when subjects undertake particular tasks. One disadvantage is that, like all haemodynamic-based modalities, it measures a surrogate signal whose spatial specificity and temporal response are subject to both physical and biological constraints. A more important shortcoming is that this surrogate signal reflects neuronal mass activity. Although this fact is acknowledged by the vast majority of investigators, its implications for drawing judicious conclusions from fMRI data are most frequently ignored. The aim of this review is first to describe briefly the fMRI technology used in cognitive neuroscience, and then discuss its neurobiological principles that very often limit data interpretation. I hope to point out that the ultimate limitations of fMRI are mainly due to the very fact that it reflects mass action, and much less to limitations imposed by the existing hardware or the acquisition methods.
The beautiful graphics MRI and fMRI produce, and the excitement about what they imply, often mask the immense complexity of the physical, biophysical and engineering procedures generating them. The actual details of MRI can only be correctly described via quantum mechanics, but a glimpse of the method's foundation can also be afforded with the tools of classical physics using a few simple equations. Here I offer a brief overview that permits an understandable definition of the terms and parameters commonly used in magnetic resonance imaging. Functional activation of the brain can be detected with MRI via direct measurements of tissue perfusion, blood-volume changes, or changes in the concentration of oxygen. The blood-oxygen-level-dependent (BOLD) contrast mechanism is currently the mainstay of human neuroimaging.
Critical factors determining the utility of fMRI for drawing conclusions in brain research are signal specificity and spatial and temporal resolution. Signal specificity ensures that the generated maps reflect actual neural changes, whereas spatial and temporal resolution determine our ability to discern the elementary units of the activated networks and the time course of various neural events, respectively. The interpretability of BOLD fMRI data also depends critically on the experimental design used.
Spatiotemporal properties of BOLD fMRI
The spatiotemporal properties of fMRI can be highlighted briefly. Spatial specificity increases with increasing magnetic field strength and for a given magnetic field can be optimized by using pulse sequences that are less sensitive to signals from within and around large vessels. Spatiotemporal resolution is likely to increase with the optimization of pulse sequences, the improvement of resonators, the application of high magnetic fields, and the invention of intelligent strategies such as parallel imaging.
Research shows that the subcortical input to cortex is weak, the feedback is massive, the local connectivity reveals strong excitatory and inhibitory recurrence, and the output reflects changes in the balance between excitation and inhibition, rather than simple feedforward integration of subcortical inputs. The properties of these excitation–inhibition networks (EIN) deserve special attention and are discussed in the following sections.
Feedforward and feedback cortical processing
Brain connectivity is mostly bidirectional. To the extent that different brain regions can be thought of as hierarchically organized processing steps, connections are often described as feedforward and feedback, forward and backward, ascending and descending, or bottom-up and top-down. Although all terms agree on processing direction, endowing backward connections with a role of engineering-type or functional 'feedback' might occasionally be misleading, as under a theoretical generative model perspective on brain function, it is the backward connections that generate predictions and the forward connections that convey the traditional feedback, in terms of mismatch or prediction error signals.
In the sensory systems, patterns of long-range cortical connectivity to some extent define feedforward and feedback pathways. The primary thalamic input goes mainly to the middle layers, whereas second-order thalamic afferents and the nonspecific diffuse afferents from basal forebrain and brain-stem are, respectively, distributed diffusely regionally or over many cortical areas, making synapses mainly in superficial and/or deep layers. Cortical output has thalamic and other subcortical projections originating in layers VI and V, respectively, and corticocortical projections mostly from supragranular layers. The primary thalamic input innervates both excitatory and inhibitory neurons, and communication between all cell types includes horizontal and vertical connections within and between cortical layers. Such connections are divergent and convergent, so that the final response of each neuron is determined by all feedforward, feedback and modulatory synapses.
Very few of the pyramid synapses are thalamocortical (less than 10–20% in the input layers of cortex, and less than 5% across its entire depth; in the primary visual cortex the numbers are even lower, with the thalamocortical synapses on stellate cells being about 5%), with the rest originating from other cortical pyramidal cells. Pyramidal axon collateral branches ascend back to and synapse in superficial layers, whereas others distribute excitation in the horizontal plane, forming a strongly recurrent excitatory network.
The strong amplification of the input signal caused by this kind of positive feedback loop is set under tight control by an inhibitory network interposed among pyramidal cells and consisting of a variety of GABAergic interneurons. These can receive both excitatory and inhibitory synapses on to their somata, and have only local connections. About 85% of them in turn innervate the local pyramidal cells. Different GABAergic cells target different subdomains of neurons. Some (for example, basket cells) target somata and proximal dendrites, and are excellent candidates for the role of gain adjustment of the integrated synaptic response; others (for example, chandelier cells) target directly the axons of nearby pyramidal neurons, and appear to have a context-dependent role — they can facilitate spiking during low activity periods, or act like gatekeepers that shunt most complex somatodendritic integrative processes during high activity periods. Such nonlinearities might generate substantial dissociations between subthreshold population activity and its concomitant metabolic demand and the spiking of pyramidal cells.
Modules and their microcircuits
A large number of structural, immunochemical and physiological studies, in all cortical areas examined so far, suggested that the functional characteristics of a cortical module are instantiated in a simple basic EIN, referred to as a canonical microcircuit. Activation of a microcircuit sets in motion a sequence of excitation and inhibition in every neuron of the module, rather than initiating a sequential activation of separate neurons at different hypothetical processing stages. Re-excitation is tightly controlled by local inhibition, and the time evolution of excitation–inhibition is far longer than the synaptic delays of the circuits involved. This means the magnitude and timing of any local mass activation arise as properties of the microcircuits.
Computational modelling suggested that EIN microcircuits, containing such a precisely balanced excitation and inhibition, can account for a large variety of observations of cortical activity, including amplification of sensory input, noise reduction, gain control, stochastic properties of discharge rates, modulation of excitability with attention, or even generation of persisting activity during the delay periods of working memory tasks.
The principle of excitation–inhibition balance implies that microcircuits are capable of large changes in activity while maintaining proportionality in their excitatory and inhibitory synaptic conductances. This hypothesis has been tested directly in experiments examining conductance changes during periods of high (up) and low (down) cortical activity. Alternating up states and down states can be readily observed in cerebral cortex during natural sleep or anaesthesia, but they can also be induced in vitro by manipulating the ionic concentrations in a preparation so that they match those found in situ. Research showed that the up state is characterized by persisting synaptically mediated depolarization of the cell membranes owing to strong barrages of synaptic potentials, and a concomitant increase in spiking rate, whereas the down state is marked by membrane hyperpolarization and reduction or cessation of firing. Most importantly, the excitation–inhibition conductances indeed changed proportionally throughout the duration of the up state despite large changes in membrane conductance.
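The idea of recurrent excitation held in check by inhibition, with excitation and inhibition scaling together, can be illustrated with a toy model. The sketch below is not taken from the article: it is a minimal linear two-population rate model, and the coupling weights (w_ee, w_ei, w_ie), the time constant, and the function name simulate are all hypothetical choices made for illustration only.

```python
def simulate(h, w_ee=1.5, w_ei=1.5, w_ie=2.0, tau_ms=10.0, dt_ms=0.1, steps=5000):
    """Euler-integrate a linear excitatory/inhibitory rate model.

    h     : constant external drive to the excitatory population
    w_ee  : recurrent excitation (hypothetical value, > 1 so E alone is unstable)
    w_ei  : strength of inhibition onto the excitatory population
    w_ie  : strength of excitation onto the inhibitory population
    Returns the final (E, I) activity after steps * dt_ms milliseconds.
    """
    e, i = 0.0, 0.0
    for _ in range(steps):
        de = (-e + w_ee * e - w_ei * i + h) / tau_ms
        di = (-i + w_ie * e) / tau_ms
        e += dt_ms * de
        i += dt_ms * di
    return e, i


if __name__ == "__main__":
    # With inhibition: activity settles, and I/E stays the same for every drive.
    for drive in (1.0, 2.0, 4.0):
        e, i = simulate(drive)
        print(f"drive={drive:.1f}  E={e:.3f}  I={i:.3f}  I/E={i / e:.3f}")

    # Without inhibition onto E (w_ei = 0): recurrent excitation runs away.
    e_unchecked, _ = simulate(1.0, w_ei=0.0)
    print(f"no inhibition: E={e_unchecked:.3e}")
```

In this toy setting the steady-state excitatory activity is h / (1 - w_ee + w_ei * w_ie), and inhibitory activity is w_ie times that, so the two change proportionally as the drive changes. Real microcircuits are nonlinear and far richer; the sketch only shows why a balanced loop can change its activity level by a large factor while keeping the ratio fixed.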
Microcircuits therefore have the following distinct features: (1) the final response of each neuron is determined by all feedforward, feedback and modulatory synapses; (2) transient excitatory responses may result from leading excitation, for example, due to small synaptic delays or differences in signal propagation speed, whereupon inhibition is rapidly engaged, followed by balanced activity; (3) net excitation or inhibition might occur when the afferents drive the overall excitation–inhibition balance in opposite directions; and (4) responses to large sustained input changes may occur while maintaining a well balanced excitation–inhibition. Microcircuits—depending on their mode of operation—can, in principle, act either as drivers, faithfully transmitting stimulus-related information, or as modulators, adjusting the overall sensitivity and context-specificity of the responses.
This important driver/modulator distinction was initially drawn in the thalamus, in which the afferents in the major sensory thalamic relays were assigned to one of two major classes on the basis of the morphological characteristics of the axon terminals, the synaptic relationships and the type of activated receptors, the degree of input convergence, and the activity patterns of postsynaptic neurons. The same concept also broadly applies to the afferents of the cerebral cortex, wherein the thalamic or corticocortical axons terminating in layer IV can be envisaged as drivers, and other feedback afferents terminating in the superficial layers as modulators. It can also be applied to the cortical output, whereby the projections of layer VI back to the primary relays of the thalamus are modulatory, whereas the cortico-thalamo-cortical paths originating in layer V of cortex, reaching higher-order thalamic nuclei (for example, pulvinar), and then re-entering cortex via layer IV, are drivers.
The initial information reaching a cortical region is elaborated and evaluated in a context-dependent manner, under the influence of strong intra- and cross-regional cortical interactions. The cortical output reflects ascending input but also cortico-thalamo-cortical pathways, whereas its responsiveness and signal-to-noise ratio (SNR) reflect the activity of feedback, and likely input from the ascending diffuse systems of the brain-stem. The neuromodulation afforded by these systems, which is thought to underlie the altered states of cognitive capacities, such as motivation, attention, learning and memory, is likely to affect large masses of cells, and potentially induce larger changes in the fMRI signal than the sensory signals themselves.
Predicting neural activity from the fMRI signals
In humans, there are about 90,000–100,000 neurons under 1 mm² of cortical surface. This number is relatively constant for all structurally and functionally distinct areas, including the somatosensory, temporal, parietal, frontal and motor cortical areas. An exception is the primary visual cortex of certain primates, including monkey and human, which has approximately twice as many neurons. The number of cortical neurons under unitary cortical surface is also similar across many species, including mouse, rat, cat, monkey and human. Its small variability is the result of a trade-off between cortical thickness and neural density. The former varies from area to area and from species to species (for example, from mouse to human the cortex becomes approximately three times thicker). Neural density varies inversely to cortical thickness. On average, density is 20,000 to 30,000 neurons per mm³; it peaks in the primary visual cortex by a factor of 4, and it is minimal in the motor cortex. Synaptic density ranges from 0.4 to 1 × 10⁹ per mm³. Depending on the thickness of the cortex (2–4 mm), the number of synapses beneath 1 mm² surface is around 10⁹ (0.8–4 × 10⁹). Although the number of synapses and the axonal length per neuron increases with increasing cortical thickness, the overall length of neuronal processes remains relatively constant, with axonal length being approximately 4 km mm⁻³ and dendrite length 0.4 km mm⁻³. Overall, synaptic density and the ratio of excitatory to inhibitory synapses also remain constant.
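The synapse count beneath a patch of cortical surface follows from multiplying the volumetric density by the cortical thickness. The short sketch below reproduces that arithmetic using only the ranges quoted above; the constant and function names are illustrative, not from the article.

```python
# Ranges quoted in the text.
SYNAPSE_DENSITY_PER_MM3 = (0.4e9, 1.0e9)   # synapses per mm^3 of cortex
CORTICAL_THICKNESS_MM = (2.0, 4.0)         # cortical thickness in mm


def synapses_under_surface(area_mm2=1.0):
    """Return (low, high) synapse counts beneath a patch of cortical surface.

    The low bound pairs the lowest density with the thinnest cortex,
    the high bound pairs the highest density with the thickest cortex.
    """
    low = SYNAPSE_DENSITY_PER_MM3[0] * CORTICAL_THICKNESS_MM[0] * area_mm2
    high = SYNAPSE_DENSITY_PER_MM3[1] * CORTICAL_THICKNESS_MM[1] * area_mm2
    return low, high


if __name__ == "__main__":
    low, high = synapses_under_surface(1.0)
    print(f"synapses beneath 1 mm^2: {low:.1e} to {high:.1e}")
```

The two bounds bracket the "around 10⁹" figure given in the text.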
Given these neuro-statistical data, what are the actual contents of a neuroimaging voxel? An examination of the 300 top-cited cognitive fMRI studies suggests that the commonly used in-plane resolution is 9–16 mm², for slice thicknesses of 5–7 mm. The average voxel size before any pre-processing of the data is thus 55 µl (or 55 mm³). Often the effective size is 2–3 times larger due to the spatial filtering that most investigators apply to improve the functional SNR. Less than 3% of this volume is occupied by vessels and the rest by neural elements. A typical unfiltered fMRI voxel of 55 µl in size thus contains 5.5 million neurons, 2.2–5.5 × 10¹⁰ synapses, 22 km of dendrites and 220 km of axons.
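The voxel contents can be checked by scaling the per-mm³ figures quoted earlier to a 55 mm³ voxel. The sketch below does this for synapses, dendrites and axons, and also reports the per-mm³ neuron density implied by the stated 5.5 million neurons. That implied value works out to about 100,000 per mm³, which is numerically the per-mm² surface figure rather than the 20,000–30,000 per mm³ volumetric density; the code only reports the number and does not try to resolve the difference. All names are illustrative.

```python
# Figures quoted in the text.
VOXEL_VOLUME_MM3 = 55.0
SYNAPSE_DENSITY_PER_MM3 = (0.4e9, 1.0e9)
DENDRITE_KM_PER_MM3 = 0.4
AXON_KM_PER_MM3 = 4.0
STATED_NEURONS_PER_VOXEL = 5.5e6


def voxel_contents(volume_mm3=VOXEL_VOLUME_MM3):
    """Scale the quoted per-mm^3 densities to a voxel of the given volume."""
    return {
        "synapses_low": SYNAPSE_DENSITY_PER_MM3[0] * volume_mm3,
        "synapses_high": SYNAPSE_DENSITY_PER_MM3[1] * volume_mm3,
        "dendrite_km": DENDRITE_KM_PER_MM3 * volume_mm3,
        "axon_km": AXON_KM_PER_MM3 * volume_mm3,
        # Density implied by the stated neuron count for the default 55 mm^3 voxel.
        "implied_neurons_per_mm3": STATED_NEURONS_PER_VOXEL / VOXEL_VOLUME_MM3,
    }


if __name__ == "__main__":
    for name, value in voxel_contents().items():
        print(f"{name}: {value:.3g}")
```

Because many investigators smooth their data, the same function can be called with a volume 2–3 times larger (for example, voxel_contents(2.5 * 55.0)) to estimate the contents of an effective, filtered voxel.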
Conclusions and perspectives
The limitations of fMRI are not related to physics or poor engineering, and are unlikely to be resolved by increasing the sophistication and power of the scanners; they are instead due to the circuitry and functional organization of the brain, as well as to inappropriate experimental protocols that ignore this organization. The fMRI signal cannot easily differentiate between function-specific processing and neuromodulation, between bottom-up and top-down signals, and it may potentially confuse excitation and inhibition. The magnitude of the fMRI signal cannot be quantified to reflect accurately differences between brain regions, or between tasks within the same region. The origin of the latter problem is not due to our current inability to estimate accurately cerebral metabolic rate of oxygen (CMRO2) from the BOLD signal, but to the fact that haemodynamic responses are sensitive to the size of the activated population, which may change as the sparsity of neural representations varies spatially and temporally. In cortical regions in which stimulus- or task-related perceptual or cognitive capacities are sparsely represented (for example, instantiated in the activity of a very small number of neurons), volume transmission — which probably underlies the altered states of motivation, attention, learning and memory — may dominate haemodynamic responses and make it impossible to deduce the exact role of the area in the task at hand. Neuromodulation is also likely to affect the ultimate spatiotemporal resolution of the signal.
This having been said, and despite its shortcomings, fMRI is currently the best tool we have for gaining insights into brain function and formulating interesting and eventually testable hypotheses, even though the plausibility of these hypotheses critically depends on the magnetic resonance technology used, the experimental protocol, statistical analysis and insightful modelling. Theories on the brain's functional organization (not just modelling of data) will probably be the best strategy for optimizing all of the above. Hypotheses formulated on the basis of fMRI experiments are unlikely to be analytically tested with fMRI itself in terms of neural mechanisms, and this is unlikely to change any time in the near future.
Of course, fMRI is not the only methodology that has clear and serious limitations. Electrical measurements of brain activity, including invasive techniques with single or multiple electrodes, also fall short of affording real answers about network activity. Single-unit recordings and firing rates are better suited to the study of cellular properties than of neuronal assemblies, and field potentials share much of the ambiguity discussed in the context of the fMRI signal. None of the above techniques is a substitute for the others. Today, a multimodal approach is more necessary than ever for the study of the brain's function and dysfunction. Such an approach must include further improvements to MRI technology and its combination with other non-invasive techniques that directly assess the brain's electrical activity, but it also requires a profound understanding of the neural basis of haemodynamic responses and a tight coupling of human and animal experimentation that will allow us to fathom the homologies between humans and other primates that are amenable to invasive electrophysiological and pharmacological testing. Claims that computational methods and non-invasive neuroimaging (that is, excluding animal experimentation) should be sufficient to understand brain function and disorders are, in my opinion, naive and utterly incorrect. If we really wish to understand how our brain functions, we cannot afford to discard any relevant methodology, much less one providing direct information from the actual neural elements that underlie all our cognitive capacities.
(end of paraphrase)