All Publications


  • Increasing neural network robustness improves match to macaque V1 eigenspectrum, spatial frequency preference and predictivity. PLoS computational biology Kong, N. C., Margalit, E., Gardner, J. L., Norcia, A. M. 2022; 18 (1): e1009739

    Abstract

    Task-optimized convolutional neural networks (CNNs) show striking similarities to the ventral visual stream. However, human-imperceptible image perturbations can cause a CNN to make incorrect predictions. Here we provide insight into this brittleness by investigating the representations of models that are either robust or not robust to image perturbations. Theory suggests that the robustness of a system to these perturbations could be related to the power law exponent of the eigenspectrum of its set of neural responses, where power law exponents closer to and larger than one would indicate a system that is less susceptible to input perturbations. We show that neural responses in mouse and macaque primary visual cortex (V1) obey the predictions of this theory, where their eigenspectra have power law exponents of at least one. We also find that the eigenspectra of model representations decay slowly relative to those observed in neurophysiology and that robust models have eigenspectra that decay slightly faster and have higher power law exponents than those of non-robust models. The slow decay of the eigenspectra suggests that substantial variance in the model responses is related to the encoding of fine stimulus features. We therefore investigated the spatial frequency tuning of artificial neurons and found that a large proportion of them preferred high spatial frequencies and that robust models had preferred spatial frequency distributions more aligned with the measured spatial frequency distribution of macaque V1 cells. Furthermore, robust models were quantitatively better models of V1 than non-robust models. Our results are consistent with other findings that there is a misalignment between human and machine perception. They also suggest that it may be useful to penalize slow-decaying eigenspectra or to bias models to extract features of lower spatial frequencies during task-optimization in order to improve robustness and V1 neural response predictivity.

    View details for DOI 10.1371/journal.pcbi.1009739

    View details for PubMedID 34995280
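
The abstract above refers to the power law exponent of an eigenspectrum. As a rough illustration of that quantity only, the sketch below computes the eigenvalues of a response covariance matrix and fits a line to them in log-log coordinates. It is not the authors' analysis code: the function names `eigenspectrum` and `power_law_exponent`, the plain least-squares fit, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def eigenspectrum(responses):
    """Eigenvalues of the response covariance, sorted in descending order.

    responses: array of shape (n_stimuli, n_units).
    """
    centered = responses - responses.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (responses.shape[0] - 1)
    eigvals = np.linalg.eigvalsh(cov)[::-1]  # eigvalsh returns ascending order
    return np.clip(eigvals, 0.0, None)       # remove tiny negative round-off


def power_law_exponent(eigvals, first=1, last=None):
    """Estimate alpha in eigval ~ rank**(-alpha) over ranks first..last.

    Uses an ordinary least-squares line fit in log-log coordinates
    (an assumption of this sketch). The chosen range must not contain
    zero eigenvalues.
    """
    last = len(eigvals) if last is None else last
    ranks = np.arange(first, last + 1)
    slope, _ = np.polyfit(np.log(ranks), np.log(eigvals[first - 1:last]), 1)
    return -slope


# Hypothetical usage: synthetic responses whose variances fall off as 1/rank.
rng = np.random.default_rng(0)
variances = np.arange(1, 51, dtype=float) ** -1.0
synthetic = rng.standard_normal((5000, 50)) * np.sqrt(variances)
alpha = power_law_exponent(eigenspectrum(synthetic))
print(f"estimated exponent: {alpha:.2f}")
```

With finite data the estimate depends on the range of ranks used for the fit, so in practice the range would need to be chosen and reported explicitly.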

  • Ultra-high-resolution fMRI of human ventral temporal cortex reveals differential representation of categories and domains. The Journal of neuroscience : the official journal of the Society for Neuroscience Margalit, E., Jamison, K. W., Weiner, K. S., Vizioli, L., Zhang, R., Kay, K. N., Grill-Spector, K. 2020

    Abstract

    Human ventral temporal cortex (VTC) is critical for visual recognition. It is thought that this ability is supported by large-scale patterns of activity across VTC that contain information about visual categories. However, it is unknown how category representations in VTC are organized at the sub-millimeter scale and across cortical depths. To fill this gap in knowledge, we measured BOLD responses in medial and lateral VTC to images spanning ten categories from five domains (written characters, bodies, faces, places, and objects) at an ultra-high spatial resolution of 0.8 mm using 7 Tesla functional magnetic resonance imaging (fMRI) in both male and female participants. Representations in lateral VTC were organized most strongly at the general level of domains (e.g., places), whereas medial VTC was also organized at the level of specific categories (e.g., corridors and houses within the domain of places). In both lateral and medial VTC, domain-level and category-level structure decreased with cortical depth, and downsampling our data to standard resolution (2.4 mm) did not reverse differences in representations between lateral and medial VTC. The functional diversity of representations across VTC partitions may allow downstream regions to read out information in a flexible manner according to task demands. These results bridge an important gap between electrophysiological recordings in single neurons at the micron scale in nonhuman primates and standard-resolution fMRI in humans by elucidating distributed responses at the submillimeter scale with ultra-high-resolution fMRI in humans.

    SIGNIFICANCE STATEMENT: Visual recognition is a fundamental ability supported by human ventral temporal cortex (VTC). However, the nature of fine-scale, sub-millimeter distributed representations in VTC is unknown. Using ultra-high-resolution fMRI of human VTC, we found differential distributed visual representations across lateral and medial VTC. Domain representations (e.g., faces, bodies, places, characters) were most salient in lateral VTC whereas category representations (e.g., corridors/houses within the domain of places) were equally salient in medial VTC. These results bridge an important gap between electrophysiological recordings in single neurons at a micron scale and fMRI measurements at a millimeter scale.

    View details for DOI 10.1523/JNEUROSCI.2106-19.2020

    View details for PubMedID 32094202

  • A critical assessment of data quality and venous effects in sub-millimeter fMRI. NeuroImage Kay, K., Jamison, K. W., Vizioli, L., Zhang, R., Margalit, E., Ugurbil, K. 2019; 189: 847–69

  • Visual noise consisting of X-junctions has only a minimal adverse effect on object recognition. Attention, perception & psychophysics Margalit, E., Herald, S. B., Meschke, E. X., Irawan, I., Maarek, R., Biederman, I. 2019

    Abstract

    In 1968, Guzman showed that the myriad of surfaces composing a highly complex and novel assemblage of volumes can readily be assigned to their appropriate volumes in terms of the constraints offered by the vertices of coterminating edges. Of particular importance was the L-vertex, produced by the cotermination of two contours, which provides strong evidence for the termination of a 2-D surface. An X-junction, formed by the crossing of two contours without a change of direction at the crossing, played no role in the segmentation of a scene. If the potency of noise elements to affect recognition performance reflects their relevancy to the segmentation of scenes, as was suggested by Guzman, gaps in an object's contours bounded by irrelevant X-junctions would be expected to have little or no adverse effect on shape-based object recognition, whereas gaps bounded by L-junctions would be expected to have a strong deleterious effect when they disrupt the smooth continuation of contours. Guzman's roles for the various vertices and junctions have never been put to systematic test with respect to human object recognition. By adding identical noise contours to line drawings of objects that produced either L-vertices or X-junctions, these shape features could be compared with respect to their disruption of object recognition. Guzman's insights that irrelevant L-vertices should be highly disruptive and irrelevant X-vertices would have only a minimal deleterious effect were confirmed.

    View details for DOI 10.3758/s13414-019-01840-2

    View details for PubMedID 31728925