All Publications


  • Deep learning models map rapid plant species changes from citizen science and remote sensing data. Gillespie, L. E., Ruffley, M., Exposito-Alonso, M. Proceedings of the National Academy of Sciences of the United States of America. 2024; 121(37): e2318296121

    Abstract

    Anthropogenic habitat destruction and climate change are reshaping the geographic distribution of plants worldwide. However, we are still unable to map species shifts at high spatial, temporal, and taxonomic resolution. Here, we develop a deep learning model trained using remote sensing images from California paired with half a million citizen science observations that can map the distribution of over 2,000 plant species. Our model, Deepbiosphere, not only outperforms many common species distribution modeling approaches (AUC 0.95 vs. 0.88) but can map species at up to a few meters resolution and finely delineate plant communities with high accuracy, including the pristine and clear-cut forests of Redwood National Park. These fine-scale predictions can further be used to map the intensity of habitat fragmentation and sharp ecosystem transitions across human-altered landscapes. In addition, from frequent collections of remote sensing data, Deepbiosphere can detect the rapid effects of severe wildfire on plant community composition across a 2-y time period. These findings demonstrate that integrating public earth observations and citizen science with deep learning can pave the way toward automated systems for monitoring biodiversity change in real time worldwide.

    DOI: 10.1073/pnas.2318296121

    PubMed ID: 39236239
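
    The abstract above frames species mapping as predicting many species at once from an aerial image patch paired with citizen science observations. As a rough, hedged illustration of that framing only, the sketch below sets up a small multi-label convolutional classifier in PyTorch; the network, patch size, layer widths, species count, and random stand-in data are assumptions for illustration and do not reproduce the published Deepbiosphere model or its training data.

      # Hypothetical sketch: a tiny CNN that scores per-species presence from an
      # RGB remote-sensing patch. All sizes and data here are made up; this is
      # not the published Deepbiosphere architecture.
      import torch
      import torch.nn as nn

      NUM_SPECIES = 2000  # assumption: one output per mapped species

      class SpeciesMapper(nn.Module):
          def __init__(self, num_species: int = NUM_SPECIES):
              super().__init__()
              self.features = nn.Sequential(
                  nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
                  nn.MaxPool2d(2),
                  nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
                  nn.MaxPool2d(2),
                  nn.AdaptiveAvgPool2d(1),
              )
              self.classifier = nn.Linear(64, num_species)

          def forward(self, patch: torch.Tensor) -> torch.Tensor:
              x = self.features(patch).flatten(1)
              return self.classifier(x)  # logits; sigmoid gives per-species presence

      model = SpeciesMapper()
      # Multi-label targets: 1 where a citizen-science record places the species
      # inside the patch footprint, 0 otherwise (random stand-ins here).
      loss_fn = nn.BCEWithLogitsLoss()
      patches = torch.randn(8, 3, 64, 64)
      targets = torch.randint(0, 2, (8, NUM_SPECIES)).float()
      loss_fn(model(patches), targets).backward()

    In a setup like this, sliding per-patch predictions across imagery would yield species scores at the resolution of the patches, in the spirit of the meter-scale maps the abstract describes.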

  • The Limiting Dynamics of SGD: Modified Loss, Phase-Space Oscillations, and Anomalous Diffusion. Kunin, D., Sagastuy-Brena, J., Gillespie, L., Margalit, E., Tanaka, H., Ganguli, S., Yamins, D. L. Neural Computation. 2023: 1-25

    Abstract

    In this work, we explore the limiting dynamics of deep neural networks trained with stochastic gradient descent (SGD). As observed previously, long after performance has converged, networks continue to move through parameter space by a process of anomalous diffusion in which distance traveled grows as a power law in the number of gradient updates with a nontrivial exponent. We reveal an intricate interaction among the hyperparameters of optimization, the structure in the gradient noise, and the Hessian matrix at the end of training that explains this anomalous diffusion. To build this understanding, we first derive a continuous-time model for SGD with finite learning rates and batch sizes as an underdamped Langevin equation. We study this equation in the setting of linear regression, where we can derive exact, analytic expressions for the phase-space dynamics of the parameters and their instantaneous velocities from initialization to stationarity. Using the Fokker-Planck equation, we show that the key ingredient driving these dynamics is not the original training loss but rather the combination of a modified loss, which implicitly regularizes the velocity, and probability currents that cause oscillations in phase space. We identify qualitative and quantitative predictions of this theory in the dynamics of a ResNet-18 model trained on ImageNet. Through the lens of statistical physics, we uncover a mechanistic origin for the anomalous limiting dynamics of deep neural networks trained with SGD. Understanding the limiting dynamics of SGD, and its dependence on various important hyperparameters like batch size, learning rate, and momentum, can serve as a basis for future work that can turn these insights into algorithmic gains.

    DOI: 10.1162/neco_a_01626

    PubMed ID: 38052080
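
    The central empirical observation above is that, long after the loss has converged, the distance traveled by the parameters keeps growing as a power law in the number of updates, with an exponent that need not be 1/2. The sketch below is a hedged toy illustration of how such an exponent can be measured: it runs SGD with momentum on a small synthetic linear regression and fits the displacement power law on a log-log scale. The problem, noise level, and hyperparameters are illustrative assumptions, not the paper's ResNet-18/ImageNet experiments or its analytic derivation.

      # Toy illustration (not the paper's setup): measure how far SGD iterates
      # drift after the loss has converged and fit distance ~ t**c.
      import numpy as np

      rng = np.random.default_rng(0)
      n, d = 512, 20
      X = rng.normal(size=(n, d))
      y = X @ rng.normal(size=d) + 0.5 * rng.normal(size=n)  # noisy labels

      lr, momentum, batch = 0.05, 0.9, 32
      w, v = np.zeros(d), np.zeros(d)

      def sgd_step(w, v):
          idx = rng.choice(n, size=batch, replace=False)
          grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch
          v = momentum * v - lr * grad
          return w + v, v

      for _ in range(5000):          # burn-in: loss has effectively converged
          w, v = sgd_step(w, v)

      w0 = w.copy()                  # reference point for displacement
      checkpoints = np.unique(np.logspace(1, 4, 30).astype(int))
      dists, t = [], 0
      for target in checkpoints:
          while t < target:
              w, v = sgd_step(w, v)
              t += 1
          dists.append(np.linalg.norm(w - w0))

      # Slope of log-distance vs. log-updates; c != 0.5 signals anomalous diffusion.
      c, _ = np.polyfit(np.log(checkpoints), np.log(dists), 1)
      print(f"estimated diffusion exponent c = {c:.2f}")

    In the paper's framing, this exponent is governed by the interplay of learning rate, batch size, momentum, the structure of the gradient noise, and the Hessian at the end of training.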

  • Genetic diversity loss in the Anthropocene. Exposito-Alonso, M., Booker, T. R., Czech, L., Gillespie, L., Hateley, S., Kyriazis, C. C., Lang, P. L., Leventhal, L., Nogues-Bravo, D., Pagowski, V., Ruffley, M., Spence, J. P., Toro Arana, S. E., Weiß, C. L., Zess, E. Science. 2022; 377(6613): 1431-1435

    Abstract

    Anthropogenic habitat loss and climate change are reducing species' geographic ranges, increasing extinction risk and losses of species' genetic diversity. Although preserving genetic diversity is key to maintaining species' adaptability, we lack predictive tools and global estimates of genetic diversity loss across ecosystems. We introduce a mathematical framework that bridges biodiversity theory and population genetics to understand the loss of naturally occurring DNA mutations with decreasing habitat. By analyzing genomic variation of 10,095 georeferenced individuals from 20 plant and animal species, we show that genome-wide diversity follows a mutations-area relationship power law with geographic area, which can predict genetic diversity loss from local population extinctions. We estimate that more than 10% of genetic diversity may already be lost for many threatened and nonthreatened species, surpassing the United Nations' post-2020 targets for genetic preservation.

    DOI: 10.1126/science.abn5642

    PubMed ID: 36137047
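
    The key quantitative object in the abstract above is a power law relating genome-wide diversity to geographic area, which can then be extrapolated to predict diversity loss as habitat area shrinks. The sketch below shows that calculation in a minimal, hedged form: the paired area/diversity values are invented for illustration, and the simple log-log fit stands in for the paper's full estimation pipeline across 20 species.

      # Hedged sketch of a mutations-area-relationship (MAR) style calculation:
      # fit diversity ~ area**z and project loss when a fraction of area is lost.
      import numpy as np

      # Hypothetical inputs: nested survey areas (km^2) and the genetic
      # diversity observed within each (e.g., number of segregating mutations).
      area = np.array([10.0, 50.0, 200.0, 1000.0, 5000.0])
      diversity = np.array([1200, 2100, 3400, 5600, 9100])

      # Fit log(diversity) = log(c) + z * log(area).
      z, log_c = np.polyfit(np.log(area), np.log(diversity), 1)

      def fraction_diversity_lost(area_lost_fraction, z):
          """Predicted fraction of diversity lost if diversity scales as area**z."""
          return 1.0 - (1.0 - area_lost_fraction) ** z

      print(f"fitted MAR exponent z = {z:.2f}")
      print(f"predicted loss for 30% habitat loss: {fraction_diversity_lost(0.30, z):.1%}")

    The extrapolation step mirrors how species-area relationships are used in biodiversity theory, which is the bridge to population genetics that the abstract describes.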