Honors & Awards


  • Schmidt Science Polymath Award, Schmidt Futures (2023)
  • Outstanding Paper Award, Neural Information Processing Systems Foundation (2022)
  • NSF CAREER Award, National Science Foundation (2019)
  • Investigator Award in Mathematical Modeling of Living Systems, Simons Foundation (2016)
  • McKnight Scholar Award, McKnight Endowment Fund for Neuroscience (2015)
  • Scholar Award in Human Cognition, James S. McDonnell Foundation (2014)
  • Outstanding Paper Award, Neural Information Processing Systems Foundation (2014)
  • Sloan Research Fellowship, Alfred P. Sloan Foundation (2013)
  • Terman Award, Stanford University (2012)
  • Career Award at the Scientific Interface, Burroughs Wellcome Fund (2009)
  • Swartz Fellow in Computational Neuroscience, Swartz Foundation (2004)

Professional Education


  • Ph.D., UC Berkeley, Theoretical Physics (2004)
  • M.A., UC Berkeley, Physics (2000)
  • M.A., UC Berkeley, Mathematics (2004)
  • M.Eng., MIT, Electrical Engineering and Computer Science (1998)
  • B.S., MIT, Physics (1998)
  • B.S., MIT, Mathematics (1998)
  • B.S., MIT, Electrical Engineering and Computer Science (1998)

Current Research and Scholarly Interests


Theoretical / computational neuroscience

All Publications


  • A tale of two algorithms: Structured slots explain prefrontal sequence memory and are unified with hippocampal cognitive maps. Neuron. Whittington, J. C., Dorrell, W., Behrens, T. E., Ganguli, S., El-Gaby, M. 2024

    Abstract

    Remembering events is crucial to intelligent behavior. Flexible memory retrieval requires a cognitive map and is supported by two key brain systems: hippocampal episodic memory (EM) and prefrontal working memory (WM). Although an understanding of EM is emerging, little is understood of WM beyond simple memory retrieval. We develop a mathematical theory relating the algorithms and representations of EM and WM by unveiling a duality between storing memories in synapses versus neural activity. This results in a formalism of prefrontal WM as structured, controllable neural subspaces (activity slots) representing dynamic cognitive maps without synaptic plasticity. Using neural networks, we elucidate differences, similarities, and trade-offs between the hippocampal and prefrontal algorithms. Lastly, we show that prefrontal representations in tasks from list learning to cue-dependent recall are unified as controllable activity slots. Our results unify frontal and temporal representations of memory and offer a new understanding for dynamic prefrontal representations of WM.

    View details for DOI 10.1016/j.neuron.2024.10.017

    View details for PubMedID 39577417

  • Stochastic collapse: how gradient noise attracts SGD dynamics towards simpler subnetworks. Journal of Statistical Mechanics: Theory and Experiment. Chen, F., Kunin, D., Yamamura, A., Ganguli, S. 2024; 2024 (10)
  • One-shot entorhinal maps enable flexible navigation in novel environments. Nature. Wen, J. H., Sorscher, B., Aery Jones, E. A., Ganguli, S., Giocomo, L. M. 2024

    Abstract

    Animals must navigate changing environments to find food, shelter or mates. In mammals, grid cells in the medial entorhinal cortex construct a neural spatial map of the external environment1-5. However, how grid cell firing patterns rapidly adapt to novel or changing environmental features on a timescale relevant to behaviour remains unknown. Here, by recording over 15,000 grid cells in mice navigating virtual environments, we tracked the real-time state of the grid cell network. This allowed us to observe and predict how altering environmental features influenced grid cell firing patterns on a nearly instantaneous timescale. We found evidence that visual landmarks provide inputs to fixed points in the grid cell network. This resulted in stable grid cell firing patterns in novel and altered environments after a single exposure. Fixed visual landmark inputs also influenced the grid cell network such that altering landmarks induced distortions in grid cell firing patterns. Such distortions could be predicted by a computational model with a fixed landmark to grid cell network architecture. Finally, a medial entorhinal cortex-dependent task revealed that although grid cell firing patterns are distorted by landmark changes, behaviour can adapt via a downstream region implementing behavioural timescale synaptic plasticity6. Overall, our findings reveal how the navigational system of the brain constructs spatial maps that balance rapidity and accuracy. Fixed connections between landmarks and grid cells enable the brain to quickly generate stable spatial maps, essential for navigation in novel or changing environments. Conversely, plasticity in regions downstream from grid cells allows the spatial maps of the brain to more accurately mirror the external spatial environment. More generally, these findings raise the possibility of a broader neural principle: by allocating fixed and plastic connectivity across different networks, the brain can solve problems requiring both rapidity and representational accuracy.

    View details for DOI 10.1038/s41586-024-08034-3

    View details for PubMedID 39385034

    View details for PubMedCentralID 3007674

  • Geometric Landscape Annealing as an Optimization Principle Underlying the Coherent Ising Machine. Physical Review X. Yamamura, A., Mabuchi, H., Ganguli, S. 2024; 14 (3)
  • Adaptation of retinal discriminability to natural scenes. bioRxiv. Ding, X., Lee, D., Melander, J. B., Ganguli, S., Baccus, S. A. 2024

    Abstract

    Sensory systems discriminate stimuli to direct behavioral choices, a process governed by two distinct properties - neural sensitivity to specific stimuli, and stochastic properties that importantly include neural correlations. Two questions that have received extensive investigation and debate are whether visual systems are optimized for natural scenes, and whether noise correlations contribute to this optimization. However, the lack of sufficient computational models has made these questions inaccessible in the context of the normal function of the visual system, which is to discriminate between natural stimuli. Here we take a direct approach to analyze discriminability under natural scenes for a population of salamander retinal ganglion cells using a model of the retinal neural code that captures both sensitivity and stochasticity. Using methods of information geometry and generative machine learning, we analyzed the manifolds of natural stimuli and neural responses, finding that discriminability in the ganglion cell population adapts to enhance information transmission about natural scenes, in particular about localized motion. Contrary to previous proposals, noise correlations reduce information transmission and arise simply as a natural consequence of the shared circuitry that generates changing spatiotemporal visual sensitivity. These results address a long-standing debate as to the role of retinal correlations in the encoding of natural stimuli and reveal how the highly nonlinear receptive fields of the retina adapt dynamically to increase information transmission under natural scenes by performing the important ethological function of local motion discrimination.

    View details for DOI 10.1101/2024.09.26.615305

    View details for PubMedID 39386466

  • Entanglement and Replica Symmetry Breaking in a Driven-Dissipative Quantum Spin Glass. Physical Review X. Marsh, B. P., Kroeze, R. M., Ganguli, S., Gopalakrishnan, S., Keeling, J., Lev, B. L. 2024; 14 (1)
  • The Limiting Dynamics of SGD: Modified Loss, Phase-Space Oscillations, and Anomalous Diffusion. Neural Computation. Kunin, D., Sagastuy-Brena, J., Gillespie, L., Margalit, E., Tanaka, H., Ganguli, S., Yamins, D. L. 2023: 1-25

    Abstract

    In this work, we explore the limiting dynamics of deep neural networks trained with stochastic gradient descent (SGD). As observed previously, long after performance has converged, networks continue to move through parameter space by a process of anomalous diffusion in which distance traveled grows as a power law in the number of gradient updates with a nontrivial exponent. We reveal an intricate interaction among the hyperparameters of optimization, the structure in the gradient noise, and the Hessian matrix at the end of training that explains this anomalous diffusion. To build this understanding, we first derive a continuous-time model for SGD with finite learning rates and batch sizes as an underdamped Langevin equation. We study this equation in the setting of linear regression, where we can derive exact, analytic expressions for the phase-space dynamics of the parameters and their instantaneous velocities from initialization to stationarity. Using the Fokker-Planck equation, we show that the key ingredient driving these dynamics is not the original training loss but rather the combination of a modified loss, which implicitly regularizes the velocity, and probability currents that cause oscillations in phase space. We identify qualitative and quantitative predictions of this theory in the dynamics of a ResNet-18 model trained on ImageNet. Through the lens of statistical physics, we uncover a mechanistic origin for the anomalous limiting dynamics of deep neural networks trained with SGD. Understanding the limiting dynamics of SGD, and its dependence on various important hyperparameters like batch size, learning rate, and momentum, can serve as a basis for future work that can turn these insights into algorithmic gains.

    View details for DOI 10.1162/neco_a_01626

    View details for PubMedID 38052080
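
    As a hedged aside (a schematic sketch, not the paper's exact derivation): the continuous-time model referenced in this abstract takes the generic form of an underdamped Langevin equation,

        d\theta_t = v_t \, dt, \qquad m \, dv_t = -\nabla L(\theta_t) \, dt - \gamma v_t \, dt + \sqrt{2 \gamma T} \, dW_t,

    where the effective mass m, damping \gamma, and temperature T are set by the learning rate, momentum, and batch size, and where the paper's analysis shows the stationary dynamics are governed by a modified loss rather than by L itself.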

  • Singular vectors of sums of rectangular random matrices and optimal estimation of high-rank signals: The extensive spike model. Physical Review E. Landau, I. D., Mel, G. C., Ganguli, S. 2023; 108 (5-1): 054129

    Abstract

    Across many disciplines spanning from neuroscience and genomics to machine learning, atmospheric science, and finance, the problems of denoising large data matrices to recover hidden signals obscured by noise, and of estimating the structure of these signals, is of fundamental importance. A key to solving these problems lies in understanding how the singular value structure of a signal is deformed by noise. This question has been thoroughly studied in the well-known spiked matrix model, in which data matrices originate from low-rank signal matrices perturbed by additive noise matrices, in an asymptotic limit where matrix size tends to infinity but the signal rank remains finite. We first show, strikingly, that the singular value structure of large finite matrices (of size ∼1000) with even moderate-rank signals, as low as 10, is not accurately predicted by the finite-rank theory, thereby limiting the application of this theory to real data. To address these deficiencies, we analytically compute how the singular values and vectors of an arbitrary high-rank signal matrix are deformed by additive noise. We focus on an asymptotic limit corresponding to an extensive spike model, in which both the signal rank and the size of the data matrix tend to infinity at a constant ratio. We map out the phase diagram of the singular value structure of the extensive spike model as a joint function of signal strength and rank. We further exploit these analytics to derive optimal rotationally invariant denoisers to recover the hidden high-rank signal from the data, as well as optimal invariant estimators of the signal covariance structure. Our extensive-rank results yield several conceptual differences compared to the finite-rank case: (1) as signal strength increases, the singular value spectrum does not directly transition from a unimodal bulk phase to a disconnected phase, but instead there is a bimodal connected regime separating them; (2) the signal singular vectors can be partially estimated even in the unimodal bulk regime, and thus the transitions in the data singular value spectrum do not coincide with a detectability threshold for the signal singular vectors, unlike in the finite-rank theory; (3) signal singular values interact nontrivially to generate data singular values in the extensive-rank model, whereas they are noninteracting in the finite-rank theory; and (4) as a result, the more sophisticated data denoisers and signal covariance estimators we derive, which take into account these nontrivial extensive-rank interactions, significantly outperform their simpler, noninteracting, finite-rank counterparts, even on data matrices of only moderate rank. Overall, our results provide fundamental theory governing how high-dimensional signals are deformed by additive noise, together with practical formulas for optimal denoising and covariance estimation.

    View details for DOI 10.1103/PhysRevE.108.054129

    View details for PubMedID 38115511
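
    In notation suggested by the abstract (a sketch of the setup, not the paper's exact conventions), the data matrix is modeled as a high-rank signal plus additive noise,

        \hat{S} = S + \sigma W, \qquad S \in \mathbb{R}^{N \times M}, \quad \mathrm{rank}(S) = R,

    with i.i.d. noise W, analyzed in the extensive-spike limit N, M, R \to \infty at fixed ratios, in contrast to the finite-rank spiked model, where R stays fixed as N grows.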

  • Interpreting the retinal neural code for natural scenes: From computations to neurons. Neuron. Maheswaranathan, N., McIntosh, L. T., Tanaka, H., Grant, S., Kastner, D. B., Melander, J. B., Nayebi, A., Brezovec, L. E., Wang, J. H., Ganguli, S., Baccus, S. A. 2023

    Abstract

    Understanding the circuit mechanisms of the visual code for natural scenes is a central goal of sensory neuroscience. We show that a three-layer network model predicts retinal natural scene responses with an accuracy nearing experimental limits. The model's internal structure is interpretable, as interneurons recorded separately and not modeled directly are highly correlated with model interneurons. Models fitted only to natural scenes reproduce a diverse set of phenomena related to motion encoding, adaptation, and predictive coding, establishing their ethological relevance to natural visual computation. A new approach decomposes the computations of model ganglion cells into the contributions of model interneurons, allowing automatic generation of new hypotheses for how interneurons with different spatiotemporal responses are combined to generate retinal computations, including predictive phenomena currently lacking an explanation. Our results demonstrate a unified and general approach to study the circuit mechanisms of ethological retinal computations under natural visual scenes.

    View details for DOI 10.1016/j.neuron.2023.06.007

    View details for PubMedID 37451264

  • Universal energy-accuracy tradeoffs in nonequilibrium cellular sensing. Physical Review E. Harvey, S. E., Lahiri, S., Ganguli, S. 2023; 108 (1-1): 014403

    Abstract

    We combine stochastic thermodynamics, large deviation theory, and information theory to derive fundamental limits on the accuracy with which single cell receptors can estimate external concentrations. As expected, if the estimation is performed by an ideal observer of the entire trajectory of receptor states, then no energy consuming nonequilibrium receptor that can be divided into bound and unbound states can outperform an equilibrium two-state receptor. However, when the estimation is performed by a simple observer that measures the fraction of time the receptor is bound, we derive a fundamental limit on the accuracy of general nonequilibrium receptors as a function of energy consumption. We further derive and exploit explicit formulas to numerically estimate a Pareto-optimal tradeoff between accuracy and energy. We find this tradeoff can be achieved by nonuniform ring receptors with a number of states that necessarily increases with energy. Our results yield a thermodynamic uncertainty relation for the time a physical system spends in a pool of states and generalize the classic Berg-Purcell limit [H. C. Berg and E. M. Purcell, Biophys. J. 20, 193 (1977); DOI 10.1016/S0006-3495(77)85544-6] on cellular sensing along multiple dimensions.

    View details for DOI 10.1103/PhysRevE.108.014403

    View details for PubMedID 37583173
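
    For orientation, the Berg-Purcell limit cited above can be stated schematically (up to O(1) factors, and not the paper's generalized bound): a receptor of linear size a, integrating for a time \tau in a ligand field of mean concentration \bar{c} and diffusivity D, cannot estimate concentration with fractional error better than

        \left( \frac{\delta c}{\bar{c}} \right)^2 \gtrsim \frac{1}{D a \bar{c} \tau}.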

  • Catalyzing next-generation Artificial Intelligence through NeuroAI. Nature Communications. Zador, A., Escola, S., Richards, B., Olveczky, B., Bengio, Y., Boahen, K., Botvinick, M., Chklovskii, D., Churchland, A., Clopath, C., DiCarlo, J., Ganguli, S., Hawkins, J., Kording, K., Koulakov, A., LeCun, Y., Lillicrap, T., Marblestone, A., Olshausen, B., Pouget, A., Savin, C., Sejnowski, T., Simoncelli, E., Solla, S., Sussillo, D., Tolias, A. S., Tsao, D. 2023; 14 (1): 1597

    Abstract

    Neuroscience has long been an essential driver of progress in artificial intelligence (AI). We propose that to accelerate progress in AI, we must invest in fundamental research in NeuroAI. A core component of this is the embodied Turing test, which challenges AI animal models to interact with the sensorimotor world at skill levels akin to their living counterparts. The embodied Turing test shifts the focus from those capabilities like game playing and language that are especially well-developed or uniquely human to those capabilities - inherited from over 500 million years of evolution - that are shared with all animals. Building models that can pass the embodied Turing test will provide a roadmap for the next generation of AI.

    View details for DOI 10.1038/s41467-023-37180-x

    View details for PubMedID 36949048

  • An approximate line attractor in the hypothalamus encodes an aggressive state. Cell. Nair, A., Karigo, T., Yang, B., Ganguli, S., Schnitzer, M. J., Linderman, S. W., Anderson, D. J., Kennedy, A. 2023; 186 (1): 178

    Abstract

    The hypothalamus regulates innate social behaviors, including mating and aggression. These behaviors can be evoked by optogenetic stimulation of specific neuronal subpopulations within the medial preoptic area (MPOA) and the ventrolateral subdivision of the ventromedial hypothalamus (VMHvl), respectively. Here, we perform dynamical systems modeling of population neuronal activity in these nuclei during social behaviors. In VMHvl, unsupervised analysis identified a dominant dimension of neural activity with a large time constant (>50 s), generating an approximate line attractor in neural state space. Progression of the neural trajectory along this attractor was correlated with an escalation of agonistic behavior, suggesting that it may encode a scalable state of aggressiveness. Consistent with this, individual differences in the magnitude of the integration dimension time constant were strongly correlated with differences in aggressiveness. In contrast, approximate line attractors were not observed in MPOA during mating; instead, neurons with fast dynamics were tuned to specific actions. Thus, different hypothalamic nuclei employ distinct neural population codes to represent similar social behaviors.

    View details for DOI 10.1016/j.cell.2022.11.027

    View details for PubMedID 36608653

  • Pretraining task diversity and the emergence of non-Bayesian in-context learning for regression. Advances in Neural Information Processing Systems (NeurIPS). Raventos, A., Paul, M., Chen, F., Ganguli, S. Edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. 2023
  • Stochastic Collapse: How Gradient Noise Attracts SGD Dynamics Towards Simpler Subnetworks. Advances in Neural Information Processing Systems (NeurIPS). Chen, F., Kunin, D., Yamamura, A., Ganguli, S. Edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. 2023
  • Information Geometry of the Retinal Representation Manifold. Advances in Neural Information Processing Systems (NeurIPS). Ding, X., Lee, D., Melander, J. B., Sivulka, G., Ganguli, S., Baccus, S. A. Edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. 2023
  • A unified theory for the computational and mechanistic origins of grid cells. Neuron. Sorscher, B., Mel, G. C., Ocko, S. A., Giocomo, L. M., Ganguli, S. 2022

    Abstract

    The discovery of entorhinal grid cells has generated considerable interest in how and why hexagonal firing fields might emerge in a generic manner from neural circuits, and what their computational significance might be. Here, we forge a link between the problem of path integration and the existence of hexagonal grids, by demonstrating that such grids arise in neural networks trained to path integrate under simple biologically plausible constraints. Moreover, we develop a unifying theory for why hexagonal grids are ubiquitous in path-integrator circuits. Such trained networks also yield powerful mechanistic hypotheses, exhibiting realistic levels of biological variability not captured by hand-designed models. We furthermore develop methods to analyze the connectome and activity maps of our networks to elucidate fundamental mechanisms underlying path integration. These methods provide a road map to go from connectomic and physiological measurements to conceptual understanding in a manner that could generalize to other settings.

    View details for DOI 10.1016/j.neuron.2022.10.003

    View details for PubMedID 36306779

  • Neural representational geometry underlies few-shot concept learning. Proceedings of the National Academy of Sciences of the United States of America. Sorscher, B., Ganguli, S., Sompolinsky, H. 2022; 119 (43): e2200800119

    Abstract

    Understanding the neural basis of the remarkable human cognitive capacity to learn novel concepts from just one or a few sensory experiences constitutes a fundamental problem. We propose a simple, biologically plausible, mathematically tractable, and computationally powerful neural mechanism for few-shot learning of naturalistic concepts. We posit that the concepts that can be learned from few examples are defined by tightly circumscribed manifolds in the neural firing-rate space of higher-order sensory areas. We further posit that a single plastic downstream readout neuron learns to discriminate new concepts based on few examples using a simple plasticity rule. We demonstrate the computational power of our proposal by showing that it can achieve high few-shot learning accuracy on natural visual concepts using both macaque inferotemporal cortex representations and deep neural network (DNN) models of these representations and can even learn novel visual concepts specified only through linguistic descriptors. Moreover, we develop a mathematical theory of few-shot learning that links neurophysiology to predictions about behavioral outcomes by delineating several fundamental and measurable geometric properties of neural representations that can accurately predict the few-shot learning performance of naturalistic concepts across all our numerical simulations. This theory reveals, for instance, that high-dimensional manifolds enhance the ability to learn new concepts from few examples. Intriguingly, we observe striking mismatches between the geometry of manifolds in the primate visual pathway and in trained DNNs. We discuss testable predictions of our theory for psychophysics and neurophysiological experiments.

    View details for DOI 10.1073/pnas.2200800119

    View details for PubMedID 36251997
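
    The readout proposed in this abstract is in the family of prototype-based few-shot learning: average the representations of the few examples of each concept, then classify new inputs by comparing against the prototypes. The sketch below illustrates that generic scheme on synthetic Gaussian "manifolds"; the dimensions, noise scale, and nearest-prototype rule are illustrative assumptions, not the paper's exact plasticity rule or data.

        import numpy as np

        rng = np.random.default_rng(1)
        d, k_shot, n_test = 200, 5, 100  # representation dim, examples per concept, test points

        # Two hypothetical concept manifolds: Gaussian clouds around random centers.
        centers = rng.normal(size=(2, d))
        noise = 0.5

        def sample(concept, n):
            return centers[concept] + noise * rng.normal(size=(n, d))

        # Few-shot "learning": one prototype per concept, the mean of k_shot examples.
        prototypes = np.stack([sample(c, k_shot).mean(axis=0) for c in (0, 1)])

        # Classify held-out points by nearest prototype.
        correct = 0
        for c in (0, 1):
            x = sample(c, n_test)
            dists = np.linalg.norm(x[:, None, :] - prototypes[None, :, :], axis=2)
            correct += (dists.argmin(axis=1) == c).sum()
        print("few-shot accuracy:", correct / (2 * n_test))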

  • Measuring the dimensionality of behavior. Proceedings of the National Academy of Sciences of the United States of America. Ganguli, S. 2022; 119 (43): e2205791119

    View details for DOI 10.1073/pnas.2205791119

    View details for PubMedID 36264834

  • Optimal noise level for coding with tightly balanced networks of spiking neurons in the presence of transmission delays. PLoS Computational Biology. Timcheck, J., Kadmon, J., Boahen, K., Ganguli, S. 2022; 18 (10): e1010593

    Abstract

    Neural circuits consist of many noisy, slow components, with individual neurons subject to ion channel noise, axonal propagation delays, and unreliable and slow synaptic transmission. This raises a fundamental question: how can reliable computation emerge from such unreliable components? A classic strategy is to simply average over a population of N weakly-coupled neurons to achieve errors that scale as 1/√N. But more interestingly, recent work has introduced networks of leaky integrate-and-fire (LIF) neurons that achieve coding errors that scale superclassically as 1/N by combining the principles of predictive coding and fast and tight inhibitory-excitatory balance. However, spike transmission delays preclude such fast inhibition, and computational studies have observed that such delays can cause pathological synchronization that in turn destroys superclassical coding performance. Intriguingly, it has also been observed in simulations that noise can actually improve coding performance, and that there exists some optimal level of noise that minimizes coding error. However, we lack a quantitative theory that describes this fascinating interplay between delays, noise and neural coding performance in spiking networks. In this work, we elucidate the mechanisms underpinning this beneficial role of noise by deriving analytical expressions for coding error as a function of spike propagation delay and noise levels in predictive coding tight-balance networks of LIF neurons. Furthermore, we compute the minimal coding error and the associated optimal noise level, finding that they grow as power-laws with the delay. Our analysis reveals quantitatively how optimal levels of noise can rescue neural coding performance in spiking neural networks with delays by preventing the buildup of pathological synchrony without overwhelming the overall spiking dynamics. This analysis can serve as a foundation for the further study of precise computation in the presence of noise and delays in efficient spiking neural circuits.

    View details for DOI 10.1371/journal.pcbi.1010593

    View details for PubMedID 36251693
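
    Compactly, the two coding regimes contrasted above scale with population size N as (schematic, with prefactors and delay dependence derived in the paper)

        \varepsilon_{\mathrm{classical}} \sim N^{-1/2}, \qquad \varepsilon_{\mathrm{tight\ balance}} \sim N^{-1},

    and the paper further shows that the minimal coding error and the associated optimal noise level both grow as power laws of the spike-propagation delay.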

  • Synaptic balancing: A biologically plausible local learning rule that provably increases neural network noise robustness without sacrificing task performance. PLoS Computational Biology. Stock, C. H., Harvey, S. E., Ocko, S. A., Ganguli, S. 2022; 18 (9): e1010418

    Abstract

    We introduce a novel, biologically plausible local learning rule that provably increases the robustness of neural dynamics to noise in nonlinear recurrent neural networks with homogeneous nonlinearities. Our learning rule achieves higher noise robustness without sacrificing performance on the task and without requiring any knowledge of the particular task. The plasticity dynamics (an integrable dynamical system operating on the weights of the network) maintains a multiplicity of conserved quantities, most notably the network's entire temporal map of input to output trajectories. The outcome of our learning rule is a synaptic balancing between the incoming and outgoing synapses of every neuron. This synaptic balancing rule is consistent with many known aspects of experimentally observed heterosynaptic plasticity, and moreover makes new experimentally testable predictions relating plasticity at the incoming and outgoing synapses of individual neurons. Overall, this work provides a novel, practical local learning rule that exactly preserves overall network function and, in doing so, provides new conceptual bridges between the disparate worlds of the neurobiology of heterosynaptic plasticity, the engineering of regularized noise-robust networks, and the mathematics of integrable Lax dynamical systems.

    View details for DOI 10.1371/journal.pcbi.1010418

    View details for PubMedID 36121844

  • Noise correlations in neural ensemble activity limit the accuracy of hippocampal spatial representations. Nature Communications. Hazon, O., Minces, V. H., Tomas, D. P., Ganguli, S., Schnitzer, M. J., Jercog, P. E. 2022; 13 (1): 4276

    Abstract

    Neurons in the CA1 area of the mouse hippocampus encode the position of the animal in an environment. However, given the variability in individual neurons' responses, the accuracy of this code is still poorly understood. It was proposed that downstream areas could achieve high spatial accuracy by integrating the activity of thousands of neurons, but theoretical studies point to shared fluctuations in the firing rate as a potential limitation. Using high-throughput calcium imaging in freely moving mice, we demonstrated the limiting factors in the accuracy of the CA1 spatial code. We found that noise correlations in the hippocampus bound the estimation error of spatial coding to ~10 cm (the size of a mouse). Maximal accuracy was obtained using approximately 300-1,400 neurons, depending on the animal. These findings reveal intrinsic limits in the brain's representations of space and suggest that single neurons downstream of the hippocampus can extract maximal spatial information from several hundred inputs.

    View details for DOI 10.1038/s41467-022-31254-y

    View details for PubMedID 35879320
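
    A standard way to formalize how noise correlations bound decoding accuracy, consistent with the framing above though not necessarily the paper's exact estimator, is linear Fisher information: for tuning-curve derivative f'(s) and noise covariance \Sigma(s),

        I(s) = f'(s)^\top \Sigma(s)^{-1} f'(s), \qquad \mathrm{Var}(\hat{s}) \ge \frac{1}{I(s)},

    so shared fluctuations that overlap with f'(s) can cap the attainable accuracy even as more neurons are added.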

  • Recurrent Connections in the Primate Ventral Visual Stream Mediate a Trade-Off between Task Performance and Network Size during Core Object Recognition. Neural Computation. Nayebi, A., Sagastuy-Brena, J., Bear, D. M., Kar, K., Kubilius, J., Ganguli, S., Sussillo, D., DiCarlo, J. J., Yamins, D. L. 2022: 1-25

    Abstract

    The ventral visual stream enables humans and nonhuman primates to effortlessly recognize objects across a multitude of viewing conditions, yet the computational role of its abundant feedback connections remains unclear. Prior studies have augmented feedforward convolutional neural networks (CNNs) with recurrent connections to study their role in visual processing; however, often these recurrent networks are optimized directly on neural data or the comparative metrics used are undefined for standard feedforward networks that lack these connections. In this work, we develop task-optimized convolutional recurrent (ConvRNN) network models that more faithfully mimic the timing and gross neuroanatomy of the ventral pathway. Properly chosen intermediate-depth ConvRNN circuit architectures, which incorporate mechanisms of feedforward bypassing and recurrent gating, can achieve high performance on a core recognition task, comparable to that of much deeper feedforward networks. We then develop methods that allow us to compare both CNNs and ConvRNNs to finely grained measurements of primate categorization behavior and neural response trajectories across thousands of stimuli. We find that high-performing ConvRNNs provide a better match to these data than feedforward networks of any depth, predicting the precise timings at which each stimulus is behaviorally decoded from neural activation patterns. Moreover, these ConvRNN circuits consistently produce quantitatively accurate predictions of neural dynamics from V4 and IT across the entire stimulus presentation. In fact, we find that the highest-performing ConvRNNs, which best match neural and behavioral data, also achieve a strong Pareto trade-off between task performance and overall network size. Taken together, our results suggest the functional purpose of recurrence in the ventral pathway is to fit a high-performing network in cortex, attaining computational power through temporal rather than spatial complexity.

    View details for DOI 10.1162/neco_a_01506

    View details for PubMedID 35798321

  • Emergent reliability in sensory cortical coding and inter-area communication. Nature. Ebrahimi, S., Lecoq, J., Rumyantsev, O., Tasci, T., Zhang, Y., Irimia, C., Li, J., Ganguli, S., Schnitzer, M. J. 2022

    Abstract

    Reliable sensory discrimination must arise from high-fidelity neural representations and communication between brain areas. However, how neocortical sensory processing overcomes the substantial variability of neuronal sensory responses remains undetermined1-6. Here we imaged neuronal activity in eight neocortical areas concurrently and over five days in mice performing a visual discrimination task, yielding longitudinal recordings of more than 21,000 neurons. Analyses revealed a sequence of events across the neocortex starting from a resting state, to early stages of perception, and through the formation of a task response. At rest, the neocortex had one pattern of functional connections, identified through sets of areas that shared activity cofluctuations7,8. Within about 200 ms after the onset of the sensory stimulus, such connections rearranged, with different areas sharing cofluctuations and task-related information. During this short-lived state (approximately 300 ms duration), both inter-area sensory data transmission and the redundancy of sensory encoding peaked, reflecting a transient increase in correlated fluctuations among task-related neurons. By around 0.5 s after stimulus onset, the visual representation reached a more stable form, the structure of which was robust to the prominent, day-to-day variations in the responses of individual cells. About 1 s into stimulus presentation, a global fluctuation mode conveyed the upcoming response of the mouse to every area examined and was orthogonal to modes carrying sensory data. Overall, the neocortex supports sensory performance through brief elevations in sensory coding redundancy near the start of perception, neural population codes that are robust to cellular variability, and widespread inter-area fluctuation modes that transmit sensory data and task responses in non-interfering channels.

    View details for DOI 10.1038/s41586-022-04724-y

    View details for PubMedID 35589841

  • Distinct in vivo dynamics of excitatory synapses onto cortical pyramidal neurons and parvalbumin-positive interneurons. Cell Reports. Melander, J. B., Nayebi, A., Jongbloets, B. C., Fortin, D. A., Qin, M., Ganguli, S., Mao, T., Zhong, H. 2021; 37 (6): 109972

    Abstract

    Cortical function relies on the balanced activation of excitatory and inhibitory neurons. However, little is known about the organization and dynamics of shaft excitatory synapses onto cortical inhibitory interneurons. Here, we use the excitatory postsynaptic marker PSD-95, fluorescently labeled at endogenous levels, as a proxy for excitatory synapses onto layer 2/3 pyramidal neurons and parvalbumin-positive (PV+) interneurons in the barrel cortex of adult mice. Longitudinal in vivo imaging under baseline conditions reveals that, although synaptic weights in both neuronal types are log-normally distributed, synapses onto PV+ neurons are less heterogeneous and more stable. Markov model analyses suggest that the synaptic weight distribution is set intrinsically by ongoing cell-type-specific dynamics, and substantial changes are due to accumulated gradual changes. Synaptic weight dynamics are multiplicative, i.e., changes scale with weights, although PV+ synapses also exhibit an additive component. These results reveal that cell-type-specific processes govern cortical synaptic strengths and dynamics.

    View details for DOI 10.1016/j.celrep.2021.109972

    View details for PubMedID 34758304

  • Embodied intelligence via learning and evolution. Nature Communications. Gupta, A., Savarese, S., Ganguli, S., Fei-Fei, L. 2021; 12 (1): 5721

    Abstract

    The intertwined processes of learning and evolution in complex environmental niches have resulted in a remarkable diversity of morphological forms. Moreover, many aspects of animal intelligence are deeply embodied in these evolved morphologies. However, the principles governing relations between environmental complexity, evolved morphology, and the learnability of intelligent control remain elusive, because performing large-scale in silico experiments on evolution and learning is challenging. Here, we introduce Deep Evolutionary Reinforcement Learning (DERL): a computational framework which can evolve diverse agent morphologies to learn challenging locomotion and manipulation tasks in complex environments. Leveraging DERL we demonstrate several relations between environmental complexity, morphological intelligence and the learnability of control. First, environmental complexity fosters the evolution of morphological intelligence as quantified by the ability of a morphology to facilitate the learning of novel tasks. Second, we demonstrate a morphological Baldwin effect: in our simulations, evolution rapidly selects morphologies that learn faster, thereby enabling behaviors learned late in the lifetime of early ancestors to be expressed early in the descendants' lifetime. Third, we suggest a mechanistic basis for the above relationships through the evolution of morphologies that are more physically stable and energy efficient, and can therefore facilitate learning and control.

    View details for DOI 10.1038/s41467-021-25874-z

    View details for PubMedID 34615862

  • A neural circuit state change underlying skilled movements. Cell. Wagner, M. J., Savall, J., Hernandez, O., Mel, G., Inan, H., Rumyantsev, O., Lecoq, J., Kim, T. H., Li, J. Z., Ramakrishnan, C., Deisseroth, K., Luo, L., Ganguli, S., Schnitzer, M. J. 2021

    Abstract

    In motor neuroscience, state changes are hypothesized to time-lock neural assemblies coordinating complex movements, but evidence for this remains slender. We tested whether a discrete change from more autonomous to coherent spiking underlies skilled movement by imaging cerebellar Purkinje neuron complex spikes in mice making targeted forelimb reaches. As mice learned the task, millimeter-scale spatiotemporally coherent spiking emerged ipsilateral to the reaching forelimb, and consistent neural synchronization became predictive of kinematic stereotypy. Before reach onset, spiking switched from more disordered to internally time-locked concerted spiking and silence. Optogenetic manipulations of cerebellar feedback to the inferior olive bi-directionally modulated neural synchronization and reaching direction. A simple model explained the reorganization of spiking during reaching as reflecting a discrete bifurcation in olivary network dynamics. These findings argue that to prepare learned movements, olivo-cerebellar circuits enter a self-regulated, synchronized state promoting motor coordination. State changes facilitating behavioral transitions may generalize across neural systems.

    View details for DOI 10.1016/j.cell.2021.06.001

    View details for PubMedID 34214470

  • Enhancing Associative Memory Recall and Storage Capacity Using Confocal Cavity QED. Physical Review X. Marsh, B. P., Guo, Y., Kroeze, R. M., Gopalakrishnan, S., Ganguli, S., Keeling, J., Lev, B. L. 2021; 11 (2)
  • Coupling of activity, metabolism and behaviour across the Drosophila brain. Nature. Mann, K., Deny, S., Ganguli, S., Clandinin, T. R. 2021

    Abstract

    Coordinated activity across networks of neurons is a hallmark of both resting and active behavioural states in many species1-5. These global patterns alter energy metabolism over seconds to hours, which underpins the widespread use of oxygen consumption and glucose uptake as proxies of neural activity6,7. However, whether changes in neural activity are causally related to metabolic flux in intact circuits on the timescales associated with behaviour is unclear. Here we combine two-photon microscopy of the fly brain with sensors that enable the simultaneous measurement of neural activity and metabolic flux, across both resting and active behavioural states. We demonstrate that neural activity drives changes in metabolic flux, creating a tight coupling between these signals that can be measured across brain networks. Using local optogenetic perturbation, we demonstrate that even transient increases in neural activity result in rapid and persistent increases in cytosolic ATP, which suggests that neuronal metabolism predictively allocates resources to anticipate the energy demands of future activity. Finally, our studies reveal that the initiation of even minimal behavioural movements causes large-scale changes in the pattern of neural activity and energy metabolism, which reveals a widespread engagement of the brain. As the relationship between neural activity and energy metabolism is probably evolutionarily ancient and highly conserved, our studies provide a critical foundation for using metabolic proxies to capture changes in neural activity.

    View details for DOI 10.1038/s41586-021-03497-0

    View details for PubMedID 33911283

  • Distance-tuned neurons drive specialized path integration calculations in medial entorhinal cortex. Cell Reports. Campbell, M. G., Attinger, A., Ocko, S. A., Ganguli, S., Giocomo, L. M. 2021; 36 (10): 109669

    Abstract

    During navigation, animals estimate their position using path integration and landmarks, engaging many brain areas. Whether these areas follow specialized or universal cue integration principles remains incompletely understood. We combine electrophysiology with virtual reality to quantify cue integration across thousands of neurons in three navigation-relevant areas: primary visual cortex (V1), retrosplenial cortex (RSC), and medial entorhinal cortex (MEC). Compared with V1 and RSC, path integration influences position estimates more in MEC, and conflicts between path integration and landmarks trigger remapping more readily. Whereas MEC codes position prospectively, V1 codes position retrospectively, and RSC is intermediate between the two. Lowered visual contrast increases the influence of path integration on position estimates only in MEC. These properties are most pronounced in a population of MEC neurons, overlapping with grid cells, tuned to distance run in darkness. These results demonstrate the specialized role that path integration plays in MEC compared with other navigation-relevant cortical areas.

    View details for DOI 10.1016/j.celrep.2021.109669

    View details for PubMedID 34496249

  • Understanding Self-Supervised Learning Dynamics without Contrastive Pairs. International Conference on Machine Learning (ICML), PMLR. Tian, Y., Chen, X., Ganguli, S. Edited by Meila, M., Zhang, T. 2021: 7279-7289
  • A Theory of High Dimensional Regression with Arbitrary Correlations between Input Features and Target Functions: Sample Complexity, Multiple Descent Curves and a Hierarchy of Phase Transitions. International Conference on Machine Learning (ICML), PMLR. Mel, G. C., Ganguli, S. Edited by Meila, M., Zhang, T. 2021
  • Fundamental bounds on the fidelity of sensory cortical coding. Nature. Rumyantsev, O. I., Lecoq, J. A., Hernandez, O., Zhang, Y., Savall, J., Chrapkiewicz, R., Li, J., Zeng, H., Ganguli, S., Schnitzer, M. J. 2020; 580 (7801): 100-105

    Abstract

    How the brain processes information accurately despite stochastic neural activity is a longstanding question1. For instance, perception is fundamentally limited by the information that the brain can extract from the noisy dynamics of sensory neurons. Seminal experiments2,3 suggest that correlated noise in sensory cortical neural ensembles is what limits their coding accuracy4-6, although how correlated noise affects neural codes remains debated7-11. Recent theoretical work proposes that how a neural ensemble's sensory tuning properties relate statistically to its correlated noise patterns is a greater determinant of coding accuracy than is absolute noise strength12-14. However, without simultaneous recordings from thousands of cortical neurons with shared sensory inputs, it is unknown whether correlated noise limits coding fidelity. Here we present a 16-beam, two-photon microscope to monitor activity across the mouse primary visual cortex, along with analyses to quantify the information conveyed by large neural ensembles. We found that, in the visual cortex, correlated noise constrained signalling for ensembles with 800-1,300 neurons. Several noise components of the ensemble dynamics grew proportionally to the ensemble size and the encoded visual signals, revealing the predicted information-limiting correlations12-14. Notably, visual signals were perpendicular to the largest noise mode, which therefore did not limit coding fidelity. The information-limiting noise modes were approximately ten times smaller and concordant with mouse visual acuity15. Therefore, cortical design principles appear to enhance coding accuracy by restricting around 90% of noise fluctuations to modes that do not limit signalling fidelity, whereas much weaker correlated noise modes inherently bound sensory discrimination.

    View details for DOI 10.1038/s41586-020-2130-2

    View details for PubMedID 32238928

  • Statistical Mechanics of Deep Learning. Annual Review of Condensed Matter Physics. Bahri, Y., Kadmon, J., Pennington, J., Schoenholz, S. S., Sohl-Dickstein, J., Ganguli, S. Edited by Marchetti, M. C., Mackenzie, A. P. 2020; 11: 501–528
  • Two Routes to Scalable Credit Assignment without Weight Symmetry. International Conference on Machine Learning (ICML), PMLR. Kunin, D., Nayebi, A., Sagastuy-Brena, J., Ganguli, S., Bloom, J. M., Yamins, D. L. K. Edited by Daume, H., Singh, A. 2020
  • GluD2- and Cbln1-mediated competitive interactions shape the dendritic arbors of cerebellar Purkinje cells. Neuron. Takeo, Y. H., Shuster, S. A., Jiang, L., Hu, M. C., Luginbuhl, D. J., Rülicke, T., Contreras, X., Hippenmeyer, S., Wagner, M. J., Ganguli, S., Luo, L. 2020

    Abstract

    The synaptotrophic hypothesis posits that synapse formation stabilizes dendritic branches, but this hypothesis has not been causally tested in vivo in the mammalian brain. The presynaptic ligand cerebellin-1 (Cbln1) and postsynaptic receptor GluD2 mediate synaptogenesis between granule cells and Purkinje cells in the molecular layer of the cerebellar cortex. Here we show that sparse but not global knockout of GluD2 causes under-elaboration of Purkinje cell dendrites in the deep molecular layer and over-elaboration in the superficial molecular layer. Developmental, overexpression, structure-function, and genetic epistasis analyses indicate that these dendrite morphogenesis defects result from a deficit in Cbln1/GluD2-dependent competitive interactions. A generative model of dendrite growth based on competitive synaptogenesis largely recapitulates GluD2 sparse and global knockout phenotypes. Our results support the synaptotrophic hypothesis at initial stages of dendrite development, suggest a second mode in which cumulative synapse formation inhibits further dendrite growth, and highlight the importance of competition in dendrite morphogenesis.

    View details for DOI 10.1016/j.neuron.2020.11.028

    View details for PubMedID 33352118

  • Statistical mechanics of low-rank tensor decomposition. Journal of Statistical Mechanics: Theory and Experiment. Kadmon, J., Ganguli, S. 2019; 2019 (12)
  • From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction. Advances in Neural Information Processing Systems (NeurIPS). Tanaka, H., Nayebi, A., Maheswaranathan, N., McIntosh, L., Baccus, S. A., Ganguli, S. 2019; 32: 8537-8547

    Abstract

    Recently, deep feedforward neural networks have achieved considerable success in modeling biological sensory processing, in terms of reproducing the input-output map of sensory neurons. However, such models raise profound questions about the very nature of explanation in neuroscience. Are we simply replacing one complex system (a biological circuit) with another (a deep network), without understanding either? Moreover, beyond neural representations, are the deep network's computational mechanisms for generating neural responses the same as those in the brain? Without a systematic approach to extracting and understanding computational mechanisms from deep neural network models, it can be difficult both to assess the degree of utility of deep learning approaches in neuroscience, and to extract experimentally testable hypotheses from deep networks. We develop such a systematic approach by combining dimensionality reduction and modern attribution methods for determining the relative importance of interneurons for specific visual computations. We apply this approach to deep network models of the retina, revealing a conceptual understanding of how the retina acts as a predictive feature extractor that signals deviations from expectations for diverse spatiotemporal stimuli. For each stimulus, our extracted computational mechanisms are consistent with prior scientific literature, and in one case yield a new mechanistic hypothesis. Thus overall, this work not only yields insights into the computational mechanisms underlying the striking predictive capabilities of the retina, but also places the framework of deep networks as neuroscientific models on firmer theoretical foundations, by providing a new roadmap to go beyond comparing neural representations to extracting and understanding computational mechanisms.

    View details for PubMedID 35283616

    View details for PubMedCentralID PMC8916592

  • Discovering Precise Temporal Patterns in Large-Scale Neural Recordings through Robust and Interpretable Time Warping. Neuron. Williams, A. H., Poole, B., Maheswaranathan, N., Dhawale, A. K., Fisher, T., Wilson, C. D., Brann, D. H., Trautmann, E. M., Ryu, S., Shusterman, R., Rinberg, D., Olveczky, B. P., Shenoy, K. V., Ganguli, S. 2019

    Abstract

    Though the temporal precision of neural computation has been studied intensively, a data-driven determination of this precision remains a fundamental challenge. Reproducible spike patterns may be obscured on single trials by uncontrolled temporal variability in behavior and cognition and may not be time locked to measurable signatures in behavior or local field potentials (LFP). To overcome these challenges, we describe a general-purpose time warping framework that reveals precise spike-time patterns in an unsupervised manner, even when these patterns are decoupled from behavior or are temporally stretched across single trials. We demonstrate this method across diverse systems: cued reaching in nonhuman primates, motor sequence production in rats, and olfaction in mice. This approach flexibly uncovers diverse dynamical firing patterns, including pulsatile responses to behavioral events, LFP-aligned oscillatory spiking, and even unanticipated patterns, such as 7 Hz oscillations in rat motor cortex that are not time locked to measured behaviors or LFP.

    View details for DOI 10.1016/j.neuron.2019.10.020

    View details for PubMedID 31786013

  • A deep learning framework for neuroscience. Nature Neuroscience. Richards, B. A., Lillicrap, T. P., Beaudoin, P., Bengio, Y., Bogacz, R., Christensen, A., Clopath, C., Costa, R. P., de Berker, A., Ganguli, S., Gillon, C. J., Hafner, D., Kepecs, A., Kriegeskorte, N., Latham, P., Lindsay, G. W., Miller, K. D., Naud, R., Pack, C. C., Poirazi, P., Roelfsema, P., Sacramento, J., Saxe, A., Scellier, B., Schapiro, A. C., Senn, W., Wayne, G., Yamins, D., Zenke, F., Zylberberg, J., Therien, D., Kording, K. P. 2019; 22 (11): 1761–70

    Abstract

    Systems neuroscience seeks explanations for how the brain implements a wide variety of perceptual, cognitive and motor tasks. Conversely, artificial intelligence attempts to design computational systems based on the tasks they will have to solve. In artificial neural networks, the three components specified by design are the objective functions, the learning rules and the architectures. With the growing success of deep learning, which utilizes brain-inspired architectures, these three designed components have increasingly become central to how we model, engineer and optimize complex artificial learning systems. Here we argue that a greater focus on these components would also benefit systems neuroscience. We give examples of how this optimization-based framework can drive theoretical and experimental progress in neuroscience. We contend that this principled perspective on systems neuroscience will help to generate more rapid progress.

    View details for DOI 10.1038/s41593-019-0520-2

    View details for PubMedID 31659335

  • A mathematical theory of semantic development in deep neural networks. Proceedings of the National Academy of Sciences of the United States of America. Saxe, A. M., McClelland, J. L., Ganguli, S. 2019

    Abstract

    An extensive body of empirical research has revealed remarkable regularities in the acquisition, organization, deployment, and neural representation of human semantic knowledge, thereby raising a fundamental conceptual question: What are the theoretical principles governing the ability of neural networks to acquire, organize, and deploy abstract knowledge by integrating across many individual experiences? We address this question by mathematically analyzing the nonlinear dynamics of learning in deep linear networks. We find exact solutions to this learning dynamics that yield a conceptual explanation for the prevalence of many disparate phenomena in semantic cognition, including the hierarchical differentiation of concepts through rapid developmental transitions, the ubiquity of semantic illusions between such transitions, the emergence of item typicality and category coherence as factors controlling the speed of semantic processing, changing patterns of inductive projection over development, and the conservation of semantic similarity in neural representations across species. Thus, surprisingly, our simple neural model qualitatively recapitulates many diverse regularities underlying semantic development, while providing analytic insight into how the statistical structure of an environment can interact with nonlinear deep-learning dynamics to give rise to these regularities.

    View details for DOI 10.1073/pnas.1820226116

    View details for PubMedID 31101713
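
    For context, the exact solutions mentioned above build on the authors' earlier analysis of deep linear networks, in which each input-output mode with singular value s of the input-output correlation matrix has a strength a(t) obeying (schematically, for the balanced two-layer case) logistic dynamics,

        \tau \frac{da}{dt} = 2a(s - a), \qquad a(t) = \frac{s\, e^{2st/\tau}}{e^{2st/\tau} - 1 + s/a_0},

    whose sigmoidal trajectories produce the rapid, stage-like developmental transitions described in the abstract.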

  • Shared Cortex-Cerebellum Dynamics in the Execution and Learning of a Motor Task. Cell. Wagner, M. J., Kim, T., Kadmon, J., Nguyen, N. D., Ganguli, S., Schnitzer, M. J., Luo, L. 2019; 177 (3): 669-+
  • Cortical layer-specific critical dynamics triggering perception. Science. Marshel, J. H., Kim, Y. S., Machado, T. A., Quirin, S., Benson, B., Kadmon, J., Raja, C., Chibukhchyan, A., Ramakrishnan, C., Inoue, M., Shane, J. C., McKnight, D. J., Yoshizawa, S., Kato, H. E., Ganguli, S., Deisseroth, K. 2019

    Abstract

    Perceptual experiences may arise from neuronal activity patterns in mammalian neocortex. We probed mouse neocortex during visual discrimination using a red-shifted channelrhodopsin (ChRmine, discovered through structure-guided genome mining) alongside multiplexed multiphoton-holography (MultiSLM), achieving control of individually-specified neurons spanning large cortical volumes with millisecond precision. Stimulating a critical number of stimulus-orientation-selective neurons drove widespread recruitment of functionally-related neurons, a process enhanced by (but not requiring) orientation-discrimination task learning. Optogenetic targeting of orientation-selective ensembles elicited correct behavioral discrimination. Cortical layer-specific dynamics were apparent, as emergent neuronal activity asymmetrically propagated from layer-2/3 to layer-5, and smaller layer-5 ensembles were as effective as larger layer-2/3 ensembles in eliciting orientation discrimination behavior. Population dynamics emerging after optogenetic stimulation both correctly predicted behavior and resembled natural neural representations of visual stimuli.

    View details for DOI 10.1126/science.aaw5202

    View details for PubMedID 31320556

  • Universality and individuality in neural dynamics across large populations of recurrent networks. Advances in Neural Information Processing Systems (NeurIPS). Maheswaranathan, N., Williams, A. H., Golub, M. D., Ganguli, S., Sussillo, D. 2019; 2019: 15629–41

    Abstract

    Task-based modeling with recurrent neural networks (RNNs) has emerged as a popular way to infer the computational function of different brain regions. These models are quantitatively assessed by comparing the low-dimensional neural representations of the model with the brain, for example using canonical correlation analysis (CCA). However, the nature of the detailed neurobiological inferences one can draw from such efforts remains elusive. For example, to what extent does training neural networks to solve common tasks uniquely determine the network dynamics, independent of modeling architectural choices? Or alternatively, are the learned dynamics highly sensitive to different model choices? Knowing the answer to these questions has strong implications for whether and how we should use task-based RNN modeling to understand brain dynamics. To address these foundational questions, we study populations of thousands of networks, with commonly used RNN architectures, trained to solve neuroscientifically motivated tasks and characterize their nonlinear dynamics. We find the geometry of the RNN representations can be highly sensitive to different network architectures, yielding a cautionary tale for measures of similarity that rely on representational geometry, such as CCA. Moreover, we find that while the geometry of neural dynamics can vary greatly across architectures, the underlying computational scaffold (the topological structure of fixed points, transitions between them, limit cycles, and linearized dynamics) often appears universal across all architectures.

    View details for PubMedID 32782422

    View details for PubMedCentralID PMC7416639
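
    An editorial aside, not the authors' code: a minimal numpy sketch of the fixed-point analysis this abstract describes, assuming a vanilla tanh RNN h_{t+1} = tanh(W h_t + b). Approximate fixed points are found by gradient descent on q(h) = ||F(h) - h||^2 / 2; the network, sizes, and learning rate are illustrative.

        import numpy as np

        def find_fixed_point(W, b, h0, lr=0.05, n_steps=5000):
            # minimize q(h) = 0.5 * ||F(h) - h||^2 for F(h) = tanh(W h + b)
            h = h0.copy()
            for _ in range(n_steps):
                f = np.tanh(W @ h + b)
                J = (1 - f**2)[:, None] * W          # Jacobian dF/dh at h
                h -= lr * (J - np.eye(len(h))).T @ (f - h)
            return h, np.linalg.norm(np.tanh(W @ h + b) - h)

        rng = np.random.default_rng(1)
        n = 64                                       # illustrative network size
        W = 1.2 * rng.normal(size=(n, n)) / np.sqrt(n)
        b = 0.1 * rng.normal(size=n)
        h_star, residual = find_fixed_point(W, b, rng.normal(size=n))
        print(f"fixed-point residual ||F(h*) - h*|| = {residual:.2e}")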

  • Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamics. Advances in neural information processing systems Maheswaranathan, N., Williams, A. H., Golub, M. D., Ganguli, S., Sussillo, D. 2019; 32: 15696–705

    Abstract

    Recurrent neural networks (RNNs) are a widely used tool for modeling sequential data, yet they are often treated as inscrutable black boxes. Given a trained recurrent network, we would like to reverse engineer it-to obtain a quantitative, interpretable description of how it solves a particular task. Even for simple tasks, a detailed understanding of how recurrent networks work, or a prescription for how to develop such an understanding, remains elusive. In this work, we use tools from dynamical systems analysis to reverse engineer recurrent networks trained to perform sentiment classification, a foundational natural language processing task. Given a trained network, we find fixed points of the recurrent dynamics and linearize the nonlinear system around these fixed points. Despite their theoretical capacity to implement complex, high-dimensional computations, we find that trained networks converge to highly interpretable, low-dimensional representations. In particular, the topological structure of the fixed points and corresponding linearized dynamics reveal an approximate line attractor within the RNN, which we can use to quantitatively understand how the RNN solves the sentiment analysis task. Finally, we find this mechanism present across RNN architectures (including LSTMs, GRUs, and vanilla RNNs) trained on multiple datasets, suggesting that our findings are not unique to a particular architecture or dataset. Overall, these results demonstrate that surprisingly universal and human interpretable computations can arise across a range of recurrent networks.

    View details for PubMedID 32782423

    View details for PubMedCentralID PMC7416638
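
    Continuing the dynamical-systems recipe above, a minimal sketch (ours, with illustrative sizes) of the linearization step: numerically estimate the Jacobian of the RNN update at a fixed point and inspect its eigenvalues, where an eigenvalue of magnitude near 1 flags a slow direction such as a line attractor.

        import numpy as np

        def rnn_step(h, W, b):
            return np.tanh(W @ h + b)

        def numerical_jacobian(h_star, W, b, eps=1e-6):
            # finite-difference estimate of dF/dh at the fixed point h_star
            n = h_star.size
            J = np.zeros((n, n))
            f0 = rnn_step(h_star, W, b)
            for i in range(n):
                dh = np.zeros(n)
                dh[i] = eps
                J[:, i] = (rnn_step(h_star + dh, W, b) - f0) / eps
            return J

        n = 32
        rng = np.random.default_rng(0)
        W = 0.9 * rng.normal(size=(n, n)) / np.sqrt(n)
        h_star = np.zeros(n)                  # h = 0 is a fixed point when b = 0
        eigvals = np.linalg.eigvals(numerical_jacobian(h_star, W, np.zeros(n)))
        print("largest |eigenvalue|:", np.abs(eigvals).max())  # ~1 would indicate a slow mode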

  • A unified theory for the origin of grid cells through the lens of pattern formation Sorscher, B., Mel, G. C., Ganguli, S., Ocko, S. A. edited by Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2019
  • From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction Tanaka, H., Nayebi, A., Maheswaranathan, N., McIntosh, L., Baccus, S. A., Ganguli, S. edited by Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2019
  • A Unified Theory of Early Visual Representations from Retina to Cortex through Anatomically Constrained Deep CNNs International Conference on Learning Representations (ICLR) Lindsey, J., Ocko, S., Ganguli, S., Deny, S. 2019
  • An analytic theory of generalization dynamics and transfer learning in deep linear networks International Conference on Learning Representations (ICLR) Lampinen, A., Ganguli, S. 2019
  • Accurate Estimation of Neural Population Dynamics without Spike Sorting. Neuron Trautmann, E. M., Stavisky, S. D., Lahiri, S., Ames, K. C., Kaufman, M. T., O'Shea, D. J., Vyas, S., Sun, X., Ryu, S. I., Ganguli, S., Shenoy, K. V. 2019

    Abstract

    A central goal of systems neuroscience is to relate an organism's neural activity to behavior. Neural population analyses often reduce the data dimensionality to focus on relevant activity patterns. A major hurdle to data analysis is spike sorting, and this problem is growing as the number of recorded neurons increases. Here, we investigate whether spike sorting is necessary to estimate neural population dynamics. The theory of random projections suggests that we can accurately estimate the geometry of low-dimensional manifolds from a small number of linear projections of the data. We recorded data using Neuropixels probes in motor cortex of nonhuman primates and reanalyzed data from three previous studies and found that neural dynamics and scientific conclusions are quite similar using multiunit threshold crossings rather than sorted neurons. This finding unlocks existing data for new analyses and informs the design and use of new electrode arrays for laboratory and clinical use.

    View details for DOI 10.1016/j.neuron.2019.05.003

    View details for PubMedID 31171448
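
    A toy numpy illustration (ours, not from the paper) of the random-projection argument invoked above: pairwise distances among points on a low-dimensional manifold embedded in a high-dimensional firing-rate space survive a modest number of random linear readouts, which is the intuition for why unsorted multiunit channels can stand in for sorted neurons. All sizes are invented for the demo.

        import numpy as np

        rng = np.random.default_rng(0)
        n_neurons, n_proj, n_points = 1000, 40, 200

        # latent 3-D trajectory embedded linearly in a 1000-D "firing rate" space
        latent = rng.normal(size=(n_points, 3))
        embed = rng.normal(size=(3, n_neurons))
        X = latent @ embed

        P = rng.normal(size=(n_neurons, n_proj)) / np.sqrt(n_proj)  # random projections
        Y = X @ P

        def pdist(Z):
            # all pairwise Euclidean distances (upper triangle)
            d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
            return d[np.triu_indices(len(Z), k=1)]

        r = np.corrcoef(pdist(X), pdist(Y))[0, 1]
        print(f"correlation of pairwise distances before/after projection: {r:.3f}")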

  • Emergent elasticity in the neural code for space. Proceedings of the National Academy of Sciences of the United States of America Ocko, S. A., Hardcastle, K., Giocomo, L. M., Ganguli, S. 2018

    Abstract

    Upon encountering a novel environment, an animal must construct a consistent environmental map, as well as an internal estimate of its position within that map, by combining information from two distinct sources: self-motion cues and sensory landmark cues. How do known aspects of neural circuit dynamics and synaptic plasticity conspire to accomplish this feat? Here we show analytically how a neural attractor model that combines path integration of self-motion cues with Hebbian plasticity in synaptic weights from landmark cells can self-organize a consistent map of space as the animal explores an environment. Intriguingly, the emergence of this map can be understood as an elastic relaxation process between landmark cells mediated by the attractor network. Moreover, our model makes several experimentally testable predictions, including (i) systematic path-dependent shifts in the firing fields of grid cells toward the most recently encountered landmark, even in a fully learned environment; (ii) systematic deformations in the firing fields of grid cells in irregular environments, akin to elastic deformations of solids forced into irregular containers; and (iii) the creation of topological defects in grid cell firing patterns through specific environmental manipulations. Taken together, our results conceptually link known aspects of neurons and synapses to an emergent solution of a fundamental computational problem in navigation, while providing a unified account of disparate experimental observations.

    View details for PubMedID 30482856

  • Inferring hidden structure in multilayered neural circuits. PLoS computational biology Maheswaranathan, N., Kastner, D. B., Baccus, S. A., Ganguli, S. 2018; 14 (8): e1006291

    Abstract

    A central challenge in sensory neuroscience involves understanding how neural circuits shape computations across cascaded cell layers. Here we attempt to reconstruct the response properties of experimentally unobserved neurons in the interior of a multilayered neural circuit, using cascaded linear-nonlinear (LN-LN) models. We combine non-smooth regularization with proximal consensus algorithms to overcome difficulties in fitting such models that arise from the high dimensionality of their parameter space. We apply this framework to retinal ganglion cell processing, learning LN-LN models of retinal circuitry consisting of thousands of parameters, using 40 minutes of responses to white noise. Our models demonstrate a 53% improvement in predicting ganglion cell spikes over classical linear-nonlinear (LN) models. Internal nonlinear subunits of the model match properties of retinal bipolar cells in both receptive field structure and number. Subunits have consistently high thresholds, suppressing all but a small fraction of inputs, leading to sparse activity patterns in which only one subunit drives ganglion cell spiking at any time. From the model's parameters, we predict that the removal of visual redundancies through stimulus decorrelation across space, a central tenet of efficient coding theory, originates primarily from bipolar cell synapses. Furthermore, the composite nonlinear computation performed by retinal circuitry corresponds to a Boolean OR function applied to bipolar cell feature detectors. Our methods are statistically and computationally efficient, enabling us to rapidly learn hierarchical non-linear models as well as efficiently compute widely used descriptive statistics such as the spike triggered average (STA) and covariance (STC) for high dimensional stimuli. This general computational framework may aid in extracting principles of nonlinear hierarchical sensory processing across diverse modalities from limited data.

    View details for PubMedID 30138312
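
    For orientation, a minimal forward pass for the cascaded LN-LN architecture described above: stimulus, subunit linear filters, thresholded subunit nonlinearities, weighted pooling, then an output nonlinearity driving a Poisson spike count. This is an editorial sketch; the shapes and the choice of rectifying nonlinearity are illustrative, not the fitted model.

        import numpy as np

        def ln_ln_rate(stimulus, subunit_filters, thresholds, weights):
            # stimulus: (dim,); subunit_filters: (n_subunits, dim)
            drive = subunit_filters @ stimulus                  # linear stage 1
            subunit_out = np.maximum(drive - thresholds, 0.0)   # thresholded subunit nonlinearity
            g = weights @ subunit_out                           # pooling across subunits
            return np.log1p(np.exp(g))                          # softplus output nonlinearity

        rng = np.random.default_rng(0)
        dim, n_sub = 400, 8
        rate = ln_ln_rate(rng.normal(size=dim),
                          rng.normal(size=(n_sub, dim)) / np.sqrt(dim),
                          thresholds=np.full(n_sub, 1.5),       # high thresholds -> sparse subunit activity
                          weights=np.ones(n_sub))
        spikes = rng.poisson(rate)                              # Poisson spiking output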

  • Principles governing the integration of landmark and self-motion cues in entorhinal cortical codes for navigation. Nature neuroscience Campbell, M. G., Ocko, S. A., Mallory, C. S., Low, I. I., Ganguli, S., Giocomo, L. M. 2018

    Abstract

    To guide navigation, the nervous system integrates multisensory self-motion and landmark information. We dissected how these inputs generate spatial representations by recording entorhinal grid, border and speed cells in mice navigating virtual environments. Manipulating the gain between the animal's locomotion and the visual scene revealed that border cells responded to landmark cues while grid and speed cells responded to combinations of locomotion, optic flow and landmark cues in a context-dependent manner, with optic flow becoming more influential when it was faster than expected. A network model explained these results by revealing a phase transition between two regimes in which grid cells remain coherent with or break away from the landmark reference frame. Moreover, during path-integration-based navigation, mice estimated their position following principles predicted by our recordings. Together, these results provide a theoretical framework for understanding how landmark and self-motion cues combine during navigation to generate spatial representations and guide behavior.

    View details for PubMedID 30038279

  • Unsupervised Discovery of Demixed, Low-Dimensional Neural Dynamics across Multiple Timescales through Tensor Component Analysis. Neuron Williams, A. H., Kim, T. H., Wang, F., Vyas, S., Ryu, S. I., Shenoy, K. V., Schnitzer, M., Kolda, T. G., Ganguli, S. 2018

    Abstract

    Perceptions, thoughts, and actions unfold over millisecond timescales, while learned behaviors can require many days to mature. While recent experimental advances enable large-scale and long-term neural recordings with high temporal fidelity, it remains a formidable challenge to extract unbiased and interpretable descriptions of how rapid single-trial circuit dynamics change slowly over many trials to mediate learning. We demonstrate that a simple tensor component analysis (TCA) can meet this challenge by extracting three interconnected, low-dimensional descriptions of neural data: neuron factors, reflecting cell assemblies; temporal factors, reflecting rapid circuit dynamics mediating perceptions, thoughts, and actions within each trial; and trial factors, describing both long-term learning and trial-to-trial changes in cognitive state. We demonstrate the broad applicability of TCA by revealing insights into diverse datasets derived from artificial neural networks, large-scale calcium imaging of rodent prefrontal cortex during maze navigation, and multielectrode recordings of macaque motor cortex during brain machine interface learning.

    View details for PubMedID 29887338
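
    The core computation behind TCA is a CP tensor decomposition. A minimal alternating-least-squares sketch (ours, with illustrative conventions): a neurons x time x trials tensor X is factored into neuron, temporal, and trial factor matrices.

        import numpy as np

        def tca(X, rank, n_iter=200):
            # X: neurons x time x trials data tensor; model X_ijk = sum_r A_ir B_jr C_kr
            I, J, K = X.shape
            rng = np.random.default_rng(0)
            A, B, C = (rng.normal(size=(d, rank)) for d in (I, J, K))
            X1 = X.reshape(I, J * K)                       # mode-1 unfolding
            X2 = X.transpose(1, 0, 2).reshape(J, I * K)    # mode-2 unfolding
            X3 = X.transpose(2, 0, 1).reshape(K, I * J)    # mode-3 unfolding
            for _ in range(n_iter):
                M = np.einsum('jr,kr->jkr', B, C).reshape(J * K, rank)
                A = X1 @ M @ np.linalg.pinv(M.T @ M)       # least-squares update of A
                M = np.einsum('ir,kr->ikr', A, C).reshape(I * K, rank)
                B = X2 @ M @ np.linalg.pinv(M.T @ M)       # least-squares update of B
                M = np.einsum('ir,jr->ijr', A, B).reshape(I * J, rank)
                C = X3 @ M @ np.linalg.pinv(M.T @ M)       # least-squares update of C
            return A, B, C                                 # neuron, temporal, and trial factors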

  • Task-Driven Convolutional Recurrent Models of the Visual System Nayebi, A., Bear, D., Kubilius, J., Kar, K., Ganguli, S., Sussillo, D., DiCarlo, J. J., Yamins, D. L. K. edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
  • The emergence of multiple retinal cell types through efficient coding of natural movies Ocko, S. A., Lindsey, J., Ganguli, S., Deny, S. edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
  • Statistical mechanics of low-rank tensor decomposition Kadmon, J., Ganguli, S. edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
  • The emergence of spectral universality in deep networks Artificial Intelligence and Statistics (AISTATS) Pennington, J., Schoenholz, S., Ganguli, S. 2018
  • SuperSpike: Supervised Learning in Multilayer Spiking Neural Networks. Neural computation Zenke, F., Ganguli, S. 2018: 1–28

    Abstract

    A vast majority of computation in the brain is performed by spiking neural networks. Despite the ubiquity of such spiking, we currently lack an understanding of how biological spiking neural circuits learn and compute in vivo, as well as how we can instantiate such capabilities in artificial spiking circuits in silico. Here we revisit the problem of supervised learning in temporally coding multilayer spiking neural networks. First, by using a surrogate gradient approach, we derive SuperSpike, a nonlinear voltage-based three-factor learning rule capable of training multilayer networks of deterministic integrate-and-fire neurons to perform nonlinear computations on spatiotemporal spike patterns. Second, inspired by recent results on feedback alignment, we compare the performance of our learning rule under different credit assignment strategies for propagating output errors to hidden units. Specifically, we test uniform, symmetric, and random feedback, finding that simpler tasks can be solved with any type of feedback, while more complex tasks require symmetric feedback. In summary, our results open the door to obtaining a better scientific understanding of learning and computation in spiking neural networks by advancing our ability to train them to solve nonlinear problems involving transformations between different spatiotemporal spike time patterns.

    View details for PubMedID 29652587
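
    A minimal sketch (ours, with invented constants) of the surrogate-gradient idea behind SuperSpike: for learning only, the non-differentiable spike is replaced by a smooth fast-sigmoid derivative of the membrane potential, and the weight update is a three-factor product of an error signal, that surrogate derivative, and a filtered presynaptic trace.

        import numpy as np

        def surrogate_deriv(u, beta=5.0, threshold=1.0):
            # fast-sigmoid surrogate derivative, peaked at the firing threshold
            return 1.0 / (1.0 + beta * np.abs(u - threshold)) ** 2

        def three_factor_update(w, error, u_post, pre_trace, lr=1e-3):
            # error: per-output error signal; u_post: postsynaptic membrane potentials;
            # pre_trace: low-pass-filtered presynaptic spike trains
            eligibility = np.outer(surrogate_deriv(u_post), pre_trace)
            return w + lr * error[:, None] * eligibility

        rng = np.random.default_rng(0)
        w = 0.1 * rng.normal(size=(10, 50))
        w = three_factor_update(w, error=rng.normal(size=10),
                                u_post=rng.uniform(0.0, 1.5, size=10),
                                pre_trace=rng.uniform(0.0, 1.0, size=50))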

  • An International Laboratory for Systems and Computational Neuroscience NEURON Abbott, L. F., Angelaki, D. E., Carandini, M., Churchland, A. K., Dan, Y., Dayan, P., Denève, S., Fiete, I., Ganguli, S., Harris, K. D., Häusser, M., Hofer, S., Latham, P. E., Mainen, Z. F., Mrsic-Flogel, T., Paninski, L., Pillow, J. W., Pouget, A., Svoboda, K., Witten, I. B., Zador, A. M., Intl Brain Lab 2017; 96 (6): 1213–18

    Abstract

    The neural basis of decision-making has been elusive and involves the coordinated activity of multiple brain structures. This NeuroView, by the International Brain Laboratory (IBL), discusses their efforts to develop a standardized mouse decision-making behavior, to make coordinated measurements of neural activity across the mouse brain, and to use theory and analyses to uncover the neural computations that support decision-making.

    View details for DOI 10.1016/j.neuron.2017.12.013

    View details for Web of Science ID 000418900200005

    View details for PubMedID 29268092

    View details for PubMedCentralID PMC5752703

  • Cell types for our sense of location: where we are and where we are going NATURE NEUROSCIENCE Hardcastle, K., Ganguli, S., Giocomo, L. M. 2017; 20 (11): 1474–82

    Abstract

    Technological advances in profiling cells along genetic, anatomical and physiological axes have fomented interest in identifying all neuronal cell types. This goal nears completion in specialized circuits such as the retina, while remaining more elusive in higher order cortical regions. We propose that this differential success of cell type identification may not simply reflect technological gaps in co-registering genetic, anatomical and physiological features in the cortex. Rather, we hypothesize it reflects evolutionarily driven differences in the computational principles governing specialized circuits versus more general-purpose learning machines. In this framework, we consider the question of cell types in medial entorhinal cortex (MEC), a region likely to be involved in memory and navigation. While MEC contains subsets of identifiable functionally defined cell types, recent work employing unbiased statistical methods and more diverse tasks reveals unsuspected heterogeneity and adaptivity in MEC firing patterns. This suggests MEC may operate more as a generalist circuit, obeying computational design principles resembling those governing other higher cortical regions.

    View details for PubMedID 29073649

  • A Multiplexed, Heterogeneous, and Adaptive Code for Navigation in Medial Entorhinal Cortex NEURON Hardcastle, K., Maheswaranathan, N., Ganguli, S., Giocomo, L. M. 2017; 94 (2): 375-?

    Abstract

    Medial entorhinal grid cells display strikingly symmetric spatial firing patterns. The clarity of these patterns motivated the use of specific activity pattern shapes to classify entorhinal cell types. While this approach successfully revealed cells that encode boundaries, head direction, and running speed, it left a majority of cells unclassified, and its pre-defined nature may have missed unconventional, yet important coding properties. Here, we apply an unbiased statistical approach to search for cells that encode navigationally relevant variables. This approach successfully classifies the majority of entorhinal cells and reveals unsuspected entorhinal coding principles. First, we find a high degree of mixed selectivity and heterogeneity in superficial entorhinal neurons. Second, we discover a dynamic and remarkably adaptive code for space that enables entorhinal cells to rapidly encode navigational information accurately at high running speeds. Combined, these observations advance our current understanding of the mechanistic origins and functional implications of the entorhinal code for navigation. VIDEO ABSTRACT.

    View details for DOI 10.1016/j.neuron.2017.03.025

    View details for Web of Science ID 000399451400020

    View details for PubMedID 28392071

  • The temporal paradox of Hebbian learning and homeostatic plasticity. Current opinion in neurobiology Zenke, F., Gerstner, W., Ganguli, S. 2017; 43: 166-176

    Abstract

    Hebbian plasticity, a synaptic mechanism which detects and amplifies co-activity between neurons, is considered a key ingredient underlying learning and memory in the brain. However, Hebbian plasticity alone is unstable, leading to runaway neuronal activity, and therefore requires stabilization by additional compensatory processes. Traditionally, a diversity of homeostatic plasticity phenomena found in neural circuits is thought to play this role. However, recent modelling work suggests that the slow evolution of homeostatic plasticity, as observed in experiments, is insufficient to prevent instabilities originating from Hebbian plasticity. To remedy this situation, we suggest that homeostatic plasticity is complemented by additional rapid compensatory processes, which rapidly stabilize neuronal activity on short timescales.

    View details for DOI 10.1016/j.conb.2017.03.015

    View details for PubMedID 28431369

  • A saturation hypothesis to explain both enhanced and impaired learning with enhanced plasticity. eLife Nguyen-Vu, T. B., Zhao, G. Q., Lahiri, S., Kimpo, R. R., Lee, H., Ganguli, S., Shatz, C. J., Raymond, J. L. 2017; 6

    Abstract

    Across many studies, animals with enhanced synaptic plasticity exhibit either enhanced or impaired learning, raising a conceptual puzzle: how can enhanced plasticity yield opposite learning outcomes? Here we show that recent history of experience can determine whether mice with enhanced plasticity exhibit enhanced or impaired learning in response to the same training. Mice with enhanced cerebellar LTD, due to double knockout (DKO) of MHCI H2-K(b)/H2-D(b) (K(b)D(b-/-)), exhibited oculomotor learning deficits. However, the same mice exhibited enhanced learning after appropriate pre-training. Theoretical analysis revealed that synapses with history-dependent learning rules could recapitulate the data, and suggested that saturation may be a key factor limiting the ability of enhanced plasticity to enhance learning. Moreover, optogenetic stimulation designed to saturate LTD produced the same impairment in WT as observed in DKO mice. Overall, our results suggest that recent history of activity and the threshold for synaptic plasticity conspire to effect divergent learning outcomes.

    View details for DOI 10.7554/eLife.20147

    View details for PubMedID 28234229

  • On the expressive power of deep neural networks International Conference on Machine Learning (ICML) Raghu, M., Poole, B., Kleinberg, J., Ganguli, S., Sohl-Dickstein, J. 2017
  • Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice Neural Information Processing Systems (NIPS) Pennington, J., Schoenholz, S., Ganguli, S. 2017
  • Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net Neural Information Processing Systems (NIPS) Ke, R., Goyal, A., Ganguli, S., Bengio, Y. 2017
  • Continual Learning with Intelligent Synapses International Conference on Machine Learning (ICML) Zenke, F., Poole, B., Ganguli, S. 2017
  • Deep information propagation International Conference on Learning Representations (ICLR) Schoenholz, S., Gilmer, J., Ganguli, S., Sohl-Dickstein, J. 2017
  • Social Control of Hypothalamus-Mediated Male Aggression. Neuron Yang, T., Yang, C. F., Chizari, M. D., Maheswaranathan, N., Burke, K. J., Borius, M., Inoue, S., Chiang, M. C., Bender, K. J., Ganguli, S., Shah, N. M. 2017; 95 (4): 955–70.e4

    Abstract

    How environmental and physiological signals interact to influence neural circuits underlying developmentally programmed social interactions such as male territorial aggression is poorly understood. We have tested the influence of sensory cues, social context, and sex hormones on progesterone receptor (PR)-expressing neurons in the ventromedial hypothalamus (VMH) that are critical for male territorial aggression. We find that these neurons can drive aggressive displays in solitary males independent of pheromonal input, gonadal hormones, opponents, or social context. By contrast, these neurons cannot elicit aggression in socially housed males that intrude in another male's territory unless their pheromone-sensing is disabled. This modulation of aggression cannot be accounted for by linear integration of environmental and physiological signals. Together, our studies suggest that fundamentally non-linear computations enable social context to exert a dominant influence on developmentally hard-wired hypothalamus-mediated male territorial aggression.

    View details for PubMedID 28757304

    View details for PubMedCentralID PMC5648542

  • Statistical Mechanics of Optimal Convex Inference in High Dimensions PHYSICAL REVIEW X Advani, M., Ganguli, S. 2016; 6 (3)
  • Direction Selectivity in Drosophila Emerges from Preferred-Direction Enhancement and Null-Direction Suppression. Journal of Neuroscience Leong, J. C., Esch, J. J., Poole, B., Ganguli, S., Clandinin, T. R. 2016; 36 (31): 8078-8092

    Abstract

    Across animal phyla, motion vision relies on neurons that respond preferentially to stimuli moving in one, preferred direction over the opposite, null direction. In the elementary motion detector of Drosophila, direction selectivity emerges in two neuron types, T4 and T5, but the computational algorithm underlying this selectivity remains unknown. We find that the receptive fields of both T4 and T5 exhibit spatiotemporally offset light-preferring and dark-preferring subfields, each obliquely oriented in spacetime. In a linear-nonlinear modeling framework, the spatiotemporal organization of the T5 receptive field predicts the activity of T5 in response to motion stimuli. These findings demonstrate that direction selectivity emerges from the enhancement of responses to motion in the preferred direction, as well as the suppression of responses to motion in the null direction. Thus, remarkably, T5 incorporates the essential algorithmic strategies used by the Hassenstein-Reichardt correlator and the Barlow-Levick detector. Our model for T5 also provides an algorithmic explanation for the selectivity of T5 for moving dark edges: our model captures all two- and three-point spacetime correlations relevant to motion in this stimulus class. More broadly, our findings reveal the contribution of input pathway visual processing, specifically center-surround, temporally biphasic receptive fields, to the generation of direction selectivity in T5. As the spatiotemporal receptive field of T5 in Drosophila is common to the simple cell in vertebrate visual cortex, our stimulus-response model of T5 will inform efforts in an experimentally tractable context to identify more detailed, mechanistic models of a prevalent computation. Feature selective neurons respond preferentially to astonishingly specific stimuli, providing the neurobiological basis for perception. Direction selectivity serves as a paradigmatic model of feature selectivity that has been examined in many species. While insect elementary motion detectors have served as premiere experimental models of direction selectivity for 60 years, the central question of their underlying algorithm remains unanswered. Using in vivo two-photon imaging of intracellular calcium signals, we measure the receptive fields of the first direction-selective cells in the Drosophila visual system, and define the algorithm used to compute the direction of motion. Computational modeling of these receptive fields predicts responses to motion and reveals how this circuit efficiently captures many useful correlations intrinsic to moving dark edges.

    View details for DOI 10.1523/JNEUROSCI.1272-16.2016

    View details for PubMedID 27488629

    View details for PubMedCentralID PMC4971360

  • An equivalence between high dimensional Bayes optimal inference and M-estimation Neural Information Processing Systems (NIPS) Advani, M., Ganguli, S. 2016
  • Deep Learning Models of the Retinal Response to Natural Scenes. Advances in neural information processing systems McIntosh, L. T., Maheswaranathan, N., Nayebi, A., Ganguli, S., Baccus, S. A. 2016; 29: 1369–77

    Abstract

    A central challenge in sensory neuroscience is to understand neural computations and circuit mechanisms that underlie the encoding of ethologically relevant, natural stimuli. In multilayered neural circuits, nonlinear processes such as synaptic transmission and spiking dynamics present a significant obstacle to the creation of accurate computational models of responses to natural stimuli. Here we demonstrate that deep convolutional neural networks (CNNs) capture retinal responses to natural scenes nearly to within the variability of a cell's response, and are markedly more accurate than linear-nonlinear (LN) models and Generalized Linear Models (GLMs). Moreover, we find two additional surprising properties of CNNs: they are less susceptible to overfitting than their LN counterparts when trained on small amounts of data, and generalize better when tested on stimuli drawn from a different distribution (e.g. between natural scenes and white noise). An examination of the learned CNNs reveals several properties. First, a richer set of feature maps is necessary for predicting the responses to natural scenes compared to white noise. Second, temporally precise responses to slowly varying inputs originate from feedforward inhibition, similar to known retinal mechanisms. Third, the injection of latent noise sources in intermediate layers enables our model to capture the sub-Poisson spiking variability observed in retinal ganglion cells. Fourth, augmenting our CNNs with recurrent lateral connections enables them to capture contrast adaptation as an emergent property of accurately describing retinal responses to natural scenes. These methods can be readily generalized to other sensory modalities and stimulus ensembles. Overall, this work demonstrates that CNNs not only accurately capture sensory circuit responses to natural scenes, but also can yield information about the circuit's internal structure and function.

    View details for PubMedID 28729779
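
    A minimal PyTorch sketch (ours, not the published architecture) of a CNN of the kind described above for predicting retinal ganglion cell responses: stacked convolutions with a dense readout and a softplus output so predicted firing rates stay nonnegative. Filter counts, kernel sizes, and the 50x50 stimulus crop are illustrative assumptions.

        import torch
        import torch.nn as nn

        class RetinaCNN(nn.Module):
            def __init__(self, n_cells=5, history=40):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv2d(history, 8, kernel_size=15),  # stimulus history as input channels
                    nn.Softplus(),
                    nn.Conv2d(8, 16, kernel_size=9),
                    nn.Softplus(),
                    nn.Flatten(),
                    nn.Linear(16 * 28 * 28, n_cells),
                    nn.Softplus(),                          # nonnegative firing rates
                )

            def forward(self, stimulus):                    # stimulus: (batch, history, 50, 50)
                return self.net(stimulus)

        rates = RetinaCNN()(torch.randn(2, 40, 50, 50))     # predicted rates, shape (2, 5)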

  • Exponential expressivity in deep neural networks through transient chaos Neural Information Processing Systems (NIPS) Poole, B., Lahiri, S., Raghu, M., Sohl-Dickstein, J., Ganguli, S. 2016: 3360–3368
  • Role of the site of synaptic competition and the balance of learning forces for Hebbian encoding of probabilistic Markov sequences FRONTIERS IN COMPUTATIONAL NEUROSCIENCE Bouchard, K. E., Ganguli, S., Brainard, M. S. 2015; 9

    View details for DOI 10.3389/fncom.2015.00092

    View details for Web of Science ID 000360179700001

    View details for PubMedID 26257637

  • On simplicity and complexity in the brave new world of large-scale neuroscience CURRENT OPINION IN NEUROBIOLOGY Gao, P., Ganguli, S. 2015; 32: 148-155
  • Environmental Boundaries as an Error Correction Mechanism for Grid Cells NEURON Hardcastle, K., Ganguli, S., Giocomo, L. M. 2015; 86 (3): 827-839

    Abstract

    Medial entorhinal grid cells fire in periodic, hexagonally patterned locations and are proposed to support path-integration-based navigation. The recursive nature of path integration results in accumulating error and, without a corrective mechanism, a breakdown in the calculation of location. The observed long-term stability of grid patterns necessitates that the system either performs highly precise internal path integration or implements an external landmark-based error correction mechanism. To distinguish these possibilities, we examined grid cells in behaving rodents as they made long trajectories across an open arena. We found that error accumulates relative to time and distance traveled since the animal last encountered a boundary. This error reflects coherent drift in the grid pattern. Further, interactions with boundaries yield direction-dependent error correction, suggesting that border cells serve as a neural substrate for error correction. These observations, combined with simulations of an attractor network grid cell model, demonstrate that landmarks are crucial to grid stability.

    View details for DOI 10.1016/j.neuron.2015.03.039

    View details for Web of Science ID 000354069800021

    View details for PubMedID 25892299

  • Evidence for a causal inverse model in an avian cortico-basal ganglia circuit PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Giret, N., Kornfeld, J., Ganguli, S., Hahnloser, R. H. 2014; 111 (16): 6063-6068

    Abstract

    Learning by imitation is fundamental to both communication and social behavior and requires the conversion of complex, nonlinear sensory codes for perception into similarly complex motor codes for generating action. To understand the neural substrates underlying this conversion, we study sensorimotor transformations in songbird cortical output neurons of a basal-ganglia pathway involved in song learning. Despite the complexity of sensory and motor codes, we find a simple, temporally specific, causal correspondence between them. Sensory neural responses to song playback mirror motor-related activity recorded during singing, with a temporal offset of roughly 40 ms, in agreement with short feedback loop delays estimated using electrical and auditory stimulation. Such matching of mirroring offsets and loop delays is consistent with a recent Hebbian theory of motor learning and suggests that cortico-basal ganglia pathways could support motor control via causal inverse models that can invert the rich correspondence between motor exploration and sensory feedback.

    View details for DOI 10.1073/pnas.1317087111

    View details for Web of Science ID 000334694000074

    View details for PubMedID 24711417

  • Fast large scale optimization by unifying stochastic gradient and quasi-Newton methods International Conference on Machine Learning (ICML) Sohl-Dickstein, J., Poole, B., Ganguli, S. 2014
  • Exact solutions to the nonlinear dynamics of learning in deep neural networks International Conference on Learning Representations (ICLR) Saxe, A., McClelland, J., Ganguli, S. 2014
  • Identifying and attacking the saddle point problem in high-dimensional non-convex optimization Neural Information Processing Systems (NIPS) Dauphin, Y., Pascanu, R., Gulcehre, C., Cho, K., Ganguli, S., Bengio, Y. 2014
  • Investigating the role of firing-rate normalization and dimensionality reduction in brain-machine interface robustness. Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) Kao, J. C., Nuyujukian, P., Stavisky, S., Ryu, S. I., Ganguli, S., Shenoy, K. V. 2013; 2013: 293-298

    Abstract

    The intraday robustness of brain-machine interfaces (BMIs) is important to their clinical viability. In particular, BMIs must be robust to intraday perturbations in neuron firing rates, which may arise from several factors including recording loss and external noise. Using a state-of-the-art decode algorithm, the Recalibrated Feedback Intention Trained Kalman filter (ReFIT-KF) [1], we introduce two novel modifications: (1) a normalization of the firing rates, and (2) a reduction of the dimensionality of the data via principal component analysis (PCA). We demonstrate in online studies that a ReFIT-KF equipped with normalization and PCA (NPC-ReFIT-KF) (1) achieves comparable performance to a standard ReFIT-KF when at least 60% of the neural variance is captured, and (2) is more robust to the undetected loss of channels. We present intuition as to how both modifications may increase the robustness of BMIs, and investigate the contribution of each modification to robustness. These advances, which lead to a decoder achieving state-of-the-art performance with improved robustness, are important for the clinical viability of BMI systems.

    View details for DOI 10.1109/EMBC.2013.6609495

    View details for PubMedID 24109682
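
    A minimal numpy sketch (ours) of the two preprocessing steps described above: per-channel normalization of binned firing rates, followed by PCA keeping just enough components to capture at least 60% of the neural variance before the Kalman decoding stage.

        import numpy as np

        def normalize_and_project(rates, var_frac=0.6):
            # rates: (n_timebins, n_channels) binned firing rates
            z = (rates - rates.mean(0)) / (rates.std(0) + 1e-8)   # per-channel normalization
            U, s, Vt = np.linalg.svd(z, full_matrices=False)
            var = s**2 / np.sum(s**2)                             # variance per component
            k = int(np.searchsorted(np.cumsum(var), var_frac)) + 1
            return z @ Vt[:k].T, Vt[:k]                           # low-dim scores and loadings

        rng = np.random.default_rng(0)
        scores, loadings = normalize_and_project(rng.poisson(5.0, size=(1000, 96)).astype(float))
        print(scores.shape)   # (1000, k): the input to the Kalman decoder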

  • A Hebbian learning rule gives rise to mirror neurons and links them to control theoretic inverse models FRONTIERS IN NEURAL CIRCUITS Hanuschkin, A., Ganguli, S., Hahnloser, R. H. 2013; 7

    Abstract

    Mirror neurons are neurons whose responses to the observation of a motor act resemble responses measured during production of that act. Computationally, mirror neurons have been viewed as evidence for the existence of internal inverse models. Such models, rooted within control theory, map desired sensory targets onto the motor commands required to generate those targets. To jointly explore both the formation of mirrored responses and their functional contribution to inverse models, we develop a correlation-based theory of interactions between a sensory and a motor area. We show that a simple eligibility-weighted Hebbian learning rule, operating within a sensorimotor loop during motor explorations and stabilized by heterosynaptic competition, naturally gives rise to mirror neurons as well as control theoretic inverse models encoded in the synaptic weights from sensory to motor neurons. Crucially, we find that the correlational structure or stereotypy of the neural code underlying motor explorations determines the nature of the learned inverse model: random motor codes lead to causal inverses that map sensory activity patterns to their motor causes; such inverses are maximally useful, by allowing the imitation of arbitrary sensory target sequences. By contrast, stereotyped motor codes lead to less useful predictive inverses that map sensory activity to future motor actions. Our theory generalizes previous work on inverse models by showing that such models can be learned in a simple Hebbian framework without the need for error signals or backpropagation, and it makes new conceptual connections between the causal nature of inverse models, the statistical structure of motor variability, and the time-lag between sensory and motor responses of mirror neurons. Applied to bird song learning, our theory can account for puzzling aspects of the song system, including the necessity of sensorimotor gating and the selectivity of auditory responses to bird's own song (BOS) stimuli.

    View details for DOI 10.3389/fncir.2013.00106

    View details for Web of Science ID 000320922000001

    View details for PubMedID 23801941
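
    A schematic rendering (ours, not the paper's model code) of the eligibility-weighted Hebbian rule with heterosynaptic competition described above, in a toy sensorimotor loop with delay tau: sensory feedback s(t) is paired with its motor cause m(t - tau), so with random motor exploration the sensory-to-motor weights align with a causal inverse of the motor-to-sensory map.

        import numpy as np

        def hebbian_step(W, m_cause, s_now, lr=1e-2):
            W = W + lr * np.outer(m_cause, s_now)        # eligibility-weighted Hebbian term
            # heterosynaptic competition: normalize total input onto each motor neuron
            return W / (np.abs(W).sum(axis=1, keepdims=True) + 1e-8)

        rng = np.random.default_rng(0)
        n_motor, n_sensory, tau = 20, 30, 3
        F = rng.normal(size=(n_sensory, n_motor)) / np.sqrt(n_motor)  # motor -> sensory loop
        W = np.zeros((n_motor, n_sensory))
        m_hist = [rng.normal(size=n_motor) for _ in range(tau)]
        for _ in range(2000):
            s_now = F @ m_hist[-tau]                     # delayed sensory feedback
            W = hebbian_step(W, m_hist[-tau], s_now)     # credit the command from tau steps back
            m_hist.append(rng.normal(size=n_motor))      # random motor exploration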

  • Statistical mechanics of complex neural systems and high dimensional data JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT Advani, M., Lahiri, S., Ganguli, S. 2013
  • Vocal learning with inverse models Principles of Neural Coding Hahnloser, R., Ganguli, S. CRC Press. 2013
  • Learning hierarchical category structure in deep neural networks Proceedings of the Cognitive Science Society Saxe, A., McClelland, J., Ganguli, S. 2013: 1271–1276
  • A memory frontier for complex synapses Neural Information Processing Systems (NIPS) Lahiri, S., Ganguli, S. 2013
  • Spatial Information Outflow from the Hippocampal Circuit: Distributed Spatial Coding and Phase Precession in the Subiculum JOURNAL OF NEUROSCIENCE Kim, S. M., Ganguli, S., Frank, L. M. 2012; 32 (34): 11539-11558

    Abstract

    Hippocampal place cells convey spatial information through a combination of spatially selective firing and theta phase precession. The way in which this information influences regions like the subiculum that receive input from the hippocampus remains unclear. The subiculum receives direct inputs from area CA1 of the hippocampus and sends divergent output projections to many other parts of the brain, so we examined the firing patterns of rat subicular neurons. We found a substantial transformation in the subicular code for space from sparse to dense firing rate representations along a proximal-distal anatomical gradient: neurons in the proximal subiculum are more similar to canonical, sparsely firing hippocampal place cells, whereas neurons in the distal subiculum have higher firing rates and more distributed spatial firing patterns. Using information theory, we found that the more distributed spatial representation in the subiculum carries, on average, more information about spatial location and context than the sparse spatial representation in CA1. Remarkably, despite the disparate firing rate properties of subicular neurons, we found that neurons at all proximal-distal locations exhibit robust theta phase precession, with similar spiking oscillation frequencies as neurons in area CA1. Our findings suggest that the subiculum is specialized to compress sparse hippocampal spatial codes into highly informative distributed codes suitable for efficient communication to other brain regions. Moreover, despite this substantial compression, the subiculum maintains finer scale temporal properties that may allow it to participate in oscillatory phase coding and spike timing-dependent plasticity in coordination with other regions of the hippocampal circuit.

    View details for DOI 10.1523/JNEUROSCI.5942-11.2012

    View details for Web of Science ID 000308140500004

    View details for PubMedID 22915100

  • Compressed Sensing, Sparsity, and Dimensionality in Neuronal Information Processing and Data Analysis ANNUAL REVIEW OF NEUROSCIENCE Ganguli, S., Sompolinsky, H. 2012; 35: 485-508

    Abstract

    The curse of dimensionality poses severe challenges to both technical and conceptual progress in neuroscience. In particular, it plagues our ability to acquire, process, and model high-dimensional data sets. Moreover, neural systems must cope with the challenge of processing data in high dimensions to learn and operate successfully within a complex world. We review recent mathematical advances that provide ways to combat dimensionality in specific situations. These advances shed light on two dual questions in neuroscience. First, how can we as neuroscientists rapidly acquire high-dimensional data from the brain and subsequently extract meaningful models from limited amounts of these data? And second, how do brains themselves process information in their intrinsically high-dimensional patterns of neural activity as well as learn meaningful, generalizable models of the external world from limited experience?

    View details for DOI 10.1146/annurev-neuro-062111-150410

    View details for Web of Science ID 000307960400024

    View details for PubMedID 22483042
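
    For concreteness, a minimal sketch (ours) of one workhorse algorithm from the compressed sensing literature this review surveys: recovering a sparse signal from a small number of random projections via iterative soft-thresholding (ISTA). Dimensions and the sparsity level are illustrative.

        import numpy as np

        def ista(A, y, lam=0.05, n_iter=500):
            # minimize 0.5 * ||Ax - y||^2 + lam * ||x||_1 by proximal gradient descent
            L = np.linalg.norm(A, 2) ** 2                 # Lipschitz constant of the gradient
            x = np.zeros(A.shape[1])
            for _ in range(n_iter):
                g = x - (A.T @ (A @ x - y)) / L           # gradient step on the quadratic term
                x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
            return x

        rng = np.random.default_rng(0)
        n, m, k = 200, 60, 5                              # ambient dim, measurements, sparsity
        x_true = np.zeros(n)
        x_true[rng.choice(n, k, replace=False)] = rng.normal(size=k)
        A = rng.normal(size=(m, n)) / np.sqrt(m)          # random measurement matrix
        x_hat = ista(A, A @ x_true)
        print(f"recovery error: {np.linalg.norm(x_hat - x_true):.3f}")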

  • Short-term memory in neuronal networks through dynamical compressed sensing Neural Information Processing Systems (NIPS) Ganguli, S., Sompolinsky, H. 2010
  • Feedforward to the Past: The Relation between Neuronal Connectivity, Amplification, and Short-Term Memory NEURON Ganguli, S., Latham, P. 2009; 61 (4): 499-501

    Abstract

    Two studies in this issue of Neuron challenge widely held assumptions about the role of positive feedback in recurrent neuronal networks. Goldman shows that such feedback is not necessary for memory maintenance in a neural integrator, and Murphy and Miller show that it is not necessary for amplification of orientation patterns in V1. Both suggest that seemingly recurrent networks can be feedforward in disguise.

    View details for DOI 10.1016/j.neuron.2009.02.006

    View details for Web of Science ID 000263816300004

    View details for PubMedID 19249270

  • Memory traces in dynamical systems PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Ganguli, S., Huh, D., Sompolinsky, H. 2008; 105 (48): 18970-18975

    Abstract

    To perform nontrivial, real-time computations on a sensory input stream, biological systems must retain a short-term memory trace of their recent inputs. It has been proposed that generic high-dimensional dynamical systems could retain a memory trace for past inputs in their current state. This raises important questions about the fundamental limits of such memory traces and the properties required of dynamical systems to achieve these limits. We address these issues by applying Fisher information theory to dynamical systems driven by time-dependent signals corrupted by noise. We introduce the Fisher Memory Curve (FMC) as a measure of the signal-to-noise ratio (SNR) embedded in the dynamical state relative to the input SNR. The integrated FMC indicates the total memory capacity. We apply this theory to linear neuronal networks and show that the capacity of networks with normal connectivity matrices is exactly 1 and that of any network of N neurons is, at most, N. A nonnormal network achieving this bound is subject to stringent design constraints: It must have a hidden feedforward architecture that superlinearly amplifies its input for a time of order N, and the input connectivity must optimally match this architecture. The memory capacity of networks subject to saturating nonlinearities is further limited, and cannot exceed √N. This limit can be realized by feedforward structures with divergent fan out that distributes the signal across neurons, thereby avoiding saturation. We illustrate the generality of the theory by showing that memory in fluid systems can be sustained by transient nonnormal amplification due to convective instability or the onset of turbulence.

    View details for DOI 10.1073/pnas.0804451105

    View details for Web of Science ID 000261489100065

    View details for PubMedID 19020074
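
    A minimal numpy/scipy sketch (ours) of the Fisher Memory Curve for a linear network x(t+1) = A x(t) + v s(t) + noise, following the definition above: J(k) = (A^k v)^T C^{-1} (A^k v), with C the stationary noise covariance. Comparing a normal matrix with a nonnormal feedforward chain illustrates the paper's capacity gap; the sizes and the 0.95 gain are illustrative.

        import numpy as np
        from scipy.linalg import solve_discrete_lyapunov

        def fisher_memory_curve(A, v, k_max=30):
            N = A.shape[0]
            C = solve_discrete_lyapunov(A, np.eye(N))     # C = sum_k A^k (A^k)^T
            Cinv, J, u = np.linalg.inv(C), [], v.copy()
            for _ in range(k_max):
                J.append(u @ Cinv @ u)                    # J(k) = (A^k v)^T C^{-1} (A^k v)
                u = A @ u
            return np.array(J)

        N = 20
        v = np.zeros(N)
        v[0] = 1.0                                        # signal enters at the first neuron
        chain = 0.95 * np.eye(N, k=-1)                    # nonnormal feedforward delay line
        normal = 0.95 * np.eye(N)                         # a normal connectivity matrix
        print("total capacity, chain: ", fisher_memory_curve(chain, v).sum())
        print("total capacity, normal:", fisher_memory_curve(normal, v).sum())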

  • One-dimensional dynamics of attention and decision making in LIP NEURON Ganguli, S., Bisley, J. W., Roitman, J. D., Shadlen, M. N., Goldberg, M. E., Miller, K. D. 2008; 58 (1): 15-25

    Abstract

    Where we allocate our visual spatial attention depends upon a continual competition between internally generated goals and external distractions. Recently it was shown that single neurons in the macaque lateral intraparietal area (LIP) can predict the amount of time a distractor can shift the locus of spatial attention away from a goal. We propose that this remarkable dynamical correspondence between single neurons and attention can be explained by a network model in which generically high-dimensional firing-rate vectors rapidly decay to a single mode. We find direct experimental evidence for this model, not only in the original attentional task, but also in a very different task involving perceptual decision making. These results confirm a theoretical prediction that slowly varying activity patterns are proportional to spontaneous activity, pose constraints on models of persistent activity, and suggest a network mechanism for the emergence of robust behavioral timing from heterogeneous neuronal populations.

    View details for DOI 10.1016/j.neuron.2008.01.038

    View details for Web of Science ID 000254946200006

    View details for PubMedID 18400159

  • Function constrains network architecture and dynamics: A case study on the yeast cell cycle Boolean network PHYSICAL REVIEW E Lau, K., Ganguli, S., Tang, C. 2007; 75 (5)

    Abstract

    We develop a general method to explore how the function performed by a biological network can constrain both its structural and dynamical network properties. This approach is orthogonal to prior studies which examine the functional consequences of a given structural feature, for example, a scale-free architecture. A key step is to construct an algorithm that allows us to efficiently sample from a maximum entropy distribution on the space of Boolean dynamical networks constrained to perform a specific function, or cascade of gene expression. Such a distribution can act as a "functional null model" to test the significance of any given network feature, and can aid in revealing underlying evolutionary selection pressures on various network properties. Although our methods are general, we illustrate them in an analysis of the yeast cell cycle cascade. This analysis uncovers strong constraints on the architecture of the cell cycle regulatory network as well as significant selection pressures on this network to maintain ordered and convergent dynamics, possibly at the expense of sacrificing robustness to structural perturbations.

    View details for DOI 10.1103/PhysRevE.75.051907

    View details for Web of Science ID 000246890100094

    View details for PubMedID 17677098
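
    A toy rendering (ours, far simpler than the paper's sampler) of the "functional null model" idea described above: rejection-sample random threshold Boolean networks and keep only those whose dynamics reproduce a prescribed cascade of expression states. Gene count, cascade, and weight alphabet are invented for the demo.

        import numpy as np

        rng = np.random.default_rng(0)
        n_genes, n_samples = 4, 20000

        # a toy target cascade of Boolean expression states (rows = time steps)
        cascade = np.array([[1, 0, 0, 0],
                            [0, 1, 0, 0],
                            [0, 0, 1, 0],
                            [0, 0, 0, 1]])

        def step(state, W):
            return (W @ state > 0).astype(int)   # threshold Boolean update rule

        functional = []
        for _ in range(n_samples):
            W = rng.choice([-1, 0, 1], size=(n_genes, n_genes))  # random regulatory weights
            if all((step(cascade[t], W) == cascade[t + 1]).all()
                   for t in range(len(cascade) - 1)):
                functional.append(W)             # network realizes the prescribed cascade

        print(f"{len(functional)} of {n_samples} random networks realize the cascade")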

  • E10 Orbifolds Journal of High Energy Physics Brown, J., Ganguli, S., Ganor, O., Helfgott, C. 2005; 06 (057)
  • Twisted six dimensional gauge theories on tori, matrix models, and integrable systems JOURNAL OF HIGH ENERGY PHYSICS Ganguli, S., Ganor, O. J., Gill, J. 2004
  • Holographic protection of chronology in universes of the Gödel type PHYSICAL REVIEW D Boyda, E. K., Ganguli, S., Horava, P., Varadarajan, U. 2003; 67 (10)