Blair Kaneshiro is Director of Research & Development (Academic Staff - Research) in the Graduate School of Education and Adjunct Professor of Music. She completed the BA in Music; MA in Music, Science, and Technology; MS in Electrical Engineering; and PhD in Computer-Based Music Theory and Acoustics, all from Stanford. She has worked as a Postdoctoral Scholar at Stanford (Music), Research Scientist at Stanford School of Medicine (Otolaryngology/Head & Neck Surgery), and has held industry roles at Shazam and Smule. Research interests: Multivariate EEG decoding, ecologically valid stimuli, musical engagement, user studies, music information retrieval.

Academic Appointments

  • Social Science Research Scholar, Initiative Centers & Program

All Publications

  • Natural music evokes correlated EEG responses reflecting temporal structure and beat. NeuroImage Kaneshiro, B., Nguyen, D. T., Norcia, A. M., Dmochowski, J. P., Berger, J. 2020: 116559


    The brain activity of multiple subjects has been shown to synchronize during salient moments of natural stimuli, suggesting that correlation of neural responses indexes a brain state operationally termed 'engagement'. While past electroencephalography (EEG) studies have considered both auditory and visual stimuli, the extent to which these results generalize to music, a temporally structured stimulus for which the brain has evolved specialized circuitry, is less understood. Here we investigated neural correlation during natural music listening by recording EEG responses from N=48 adult listeners as they heard real-world musical works, some of which were temporally disrupted through shuffling of short-term segments (measures), reversal, or randomization of phase spectra. We measured correlation between multiple neural responses (inter-subject correlation) and between neural responses and stimulus envelope fluctuations (stimulus-response correlation) in the time and frequency domains. Stimuli retaining basic musical features, such as rhythm and melody, elicited significantly higher behavioral ratings and neural correlation than did phase-scrambled controls. However, while unedited songs were self-reported as most pleasant, time-domain correlations were highest during measure-shuffled versions. Frequency-domain measures of correlation (coherence) peaked at frequencies related to the musical beat, although the magnitudes of these spectral peaks did not explain the observed temporal correlations. Our findings show that natural music evokes significant inter-subject and stimulus-response correlations, and suggest that the neural correlates of musical 'engagement' may be distinct from those of enjoyment.

    View details for DOI 10.1016/j.neuroimage.2020.116559

    View details for PubMedID 31978543

  • Characterizing Listener Engagement with Popular Songs Using Large-Scale Music Discovery Data Frontiers in Psychology Kaneshiro, B., Ruan, F., Baker, C. W., Berger, J. 2017; 8


    Music discovery in everyday situations has been facilitated in recent years by audio content recognition services such as Shazam. The widespread use of such services has produced a wealth of user data, specifying where and when a global audience takes action to learn more about music playing around them. Here, we analyze a large collection of Shazam queries of popular songs to study the relationship between the timing of queries and corresponding musical content. Our results reveal that the distribution of queries varies over the course of a song, and that salient musical events drive an increase in queries during a song. Furthermore, we find that the distribution of queries at the time of a song's release differs from the distribution following a song's peak and subsequent decline in popularity, possibly reflecting an evolution of user intent over the "life cycle" of a song. Finally, we derive insights into the data size needed to achieve consistent query distributions for individual songs. The combined findings of this study suggest that music discovery behavior, and other facets of the human experience of music, can be studied quantitatively using large-scale industrial data.

    View details for DOI 10.3389/fpsyg.2017.00416

    View details for Web of Science ID 000397317600001

    View details for PubMedID 28386241

  • A Representational Similarity Analysis of the Dynamics of Object Processing Using Single-Trial EEG Classification. PloS one Kaneshiro, B., Perreau Guimaraes, M., Kim, H., Norcia, A. M., Suppes, P. 2015; 10 (8)


    The recognition of object categories is effortlessly accomplished in everyday life, yet its neural underpinnings remain not fully understood. In this electroencephalography (EEG) study, we used single-trial classification to perform a Representational Similarity Analysis (RSA) of categorical representation of objects in human visual cortex. Brain responses were recorded while participants viewed a set of 72 photographs of objects with a planned category structure. The Representational Dissimilarity Matrix (RDM) used for RSA was derived from confusions of a linear classifier operating on single EEG trials. In contrast to past studies, which used pairwise correlation or classification to derive the RDM, we used confusion matrices from multi-class classifications, which provided novel self-similarity measures that were used to derive the overall size of the representational space. We additionally performed classifications on subsets of the brain response in order to identify spatial and temporal EEG components that best discriminated object categories and exemplars. Results from category-level classifications revealed that brain responses to images of human faces formed the most distinct category, while responses to images from the two inanimate categories formed a single category cluster. Exemplar-level classifications produced a broadly similar category structure, as well as sub-clusters corresponding to natural language categories. Spatiotemporal components of the brain response that differentiated exemplars within a category were found to differ from those implicated in differentiating between categories. Our results show that a classification approach can be successfully applied to single-trial scalp-recorded EEG to recover fine-grained object category structure, as well as to identify interpretable spatiotemporal components underlying object processing. Finally, object category can be decoded from purely temporal information recorded at single electrodes.

    View details for DOI 10.1371/journal.pone.0135697

    View details for PubMedID 26295970

  • Lexical and sublexical cortical tuning for print revealed by Steady-State Visual Evoked Potentials (SSVEPs) in early readers. Developmental science Wang, F., Nguyen, Q. T., Kaneshiro, B., Hasak, L., Wang, A. M., Toomarian, E. Y., Norcia, A. M., McCandliss, B. D. 2022: e13352


    There are multiple levels of processing relevant to reading that vary in their visual, sublexical and lexical orthographic processing demands. Segregating distinct cortical sources for each of these levels has been challenging in EEG studies of early readers. To address this challenge, we applied recent advances in analyzing high-density EEG using Steady-State Visual Evoked Potentials (SSVEPs) via data-driven Reliable Components Analysis (RCA) in a group of early readers spanning from kindergarten to second grade. Three controlled stimulus contrasts (familiar words versus unfamiliar pseudofonts, familiar words versus pseudowords, and pseudowords versus nonwords) were used to isolate coarse print tuning, lexical processing, and sublexical orthography-related processing, respectively. First, three overlapping yet distinct neural sources (left vOT, dorsal parietal, and primary visual cortex) were revealed underlying coarse print tuning. Second, we segregated distinct cortical sources for the other two levels of processing: lexical fine tuning over occipito-temporal/parietal regions; sublexical orthographic fine tuning over left occipital regions. Finally, exploratory group analyses based on children's reading fluency suggested that coarse print tuning emerges early even in children with limited reading knowledge, while sublexical and higher-level lexical processing emerge only in children with sufficient reading knowledge. Cognitive processes underlying coarse print tuning, sublexical, and lexical fine tuning were examined in beginning readers. Three overlapping yet distinct neural sources (left ventral occipito-temporal (vOT), left temporo-parietal, and primary visual cortex) were revealed underlying coarse print tuning. Responses to sublexical orthographic fine tuning were found over left occipital regions, while responses to higher-level linguistic fine tuning were found over occipito-temporal/parietal regions. Exploratory group analyses suggested that coarse print tuning emerges in children with limited reading knowledge, while sublexical and higher-level linguistic fine tuning effects emerge in children with sufficient reading knowledge.

    View details for DOI 10.1111/desc.13352

    View details for PubMedID 36413170

  • Hitting Pause: How User Perceptions of Collaborative Playlists Evolved in the United States During the COVID-19 Pandemic Park, S., Redmond, E., Berger, J., Kaneshiro, B. ACM (Association for Computing Machinery). 2022

  • Distinct neural sources underlying visual word form processing as revealed by steady state visual evoked potentials (SSVEP). Scientific reports Wang, F., Kaneshiro, B., Strauber, C. B., Hasak, L., Nguyen, Q. T., Yakovleva, A., Vildavski, V. Y., Norcia, A. M., McCandliss, B. D. 2021; 11 (1): 18229


    EEG has been central to investigations of the time course of various neural functions underpinning visual word recognition. Recently the steady-state visual evoked potential (SSVEP) paradigm has been increasingly adopted for word recognition studies due to its high signal-to-noise ratio. Such studies, however, have been typically framed around a single source in the left ventral occipitotemporal cortex (vOT). Here, we combine SSVEP recorded from 16 adult native English speakers with a data-driven spatial filtering approach, Reliable Components Analysis (RCA), to elucidate distinct functional sources with overlapping yet separable time courses and topographies that emerge when contrasting words with pseudofont visual controls. The first component topography was maximal over left vOT regions with a shorter latency (approximately 180 ms). A second component was maximal over more dorsal parietal regions with a longer latency (approximately 260 ms). Both components consistently emerged across a range of parameter manipulations including changes in the spatial overlap between successive stimuli, and changes in both base and deviation frequency. We then contrasted word-in-nonword and word-in-pseudoword to test the hierarchical processing mechanisms underlying visual word recognition. Results suggest that these hierarchical contrasts fail to evoke a unitary component that might be reasonably associated with lexical access.

    View details for DOI 10.1038/s41598-021-95627-x

    View details for PubMedID 34521874

  • Inter-subject Correlation While Listening to Minimalist Music: A Study of Electrophysiological and Behavioral Responses to Steve Reich's Piano Phase. Frontiers in neuroscience Dauer, T., Nguyen, D. T., Gang, N., Dmochowski, J. P., Berger, J., Kaneshiro, B. 2021; 15: 702067


    Musical minimalism utilizes the temporal manipulation of restricted collections of rhythmic, melodic, and/or harmonic materials. One example, Steve Reich's Piano Phase, offers listeners readily audible formal structure with unpredictable events at the local level. For example, pattern recurrences may generate strong expectations which are violated by small temporal and pitch deviations. A hyper-detailed listening strategy prompted by these minute deviations stands in contrast to the type of listening engagement typically cultivated around functional tonal Western music. Recent research has suggested that the inter-subject correlation (ISC) of electroencephalographic (EEG) responses to natural audio-visual stimuli objectively indexes a state of "engagement," demonstrating the potential of this approach for analyzing music listening. But can ISCs capture engagement with minimalist music, which features less obvious expectation formation and has historically received a wide range of reactions? To approach this question, we collected EEG and continuous behavioral (CB) data while 30 adults listened to an excerpt from Steve Reich's Piano Phase, as well as three controlled manipulations and a popular-music remix of the work. Our analyses reveal that EEG and CB ISC are highest for the remix stimulus and lowest for our most repetitive manipulation, no statistical differences in overall EEG ISC between our most musically meaningful manipulations and Reich's original piece, and evidence that compositional features drove engagement in time-resolved ISC analyses. We also found that aesthetic evaluations corresponded well with overall EEG ISC. Finally, we highlight co-occurrences between stimulus events and time-resolved EEG and CB ISC. We offer the CB paradigm as a useful analysis measure and note the value of minimalist compositions as a limit case for the neuroscientific study of music listening. Overall, our participants' neural, continuous behavioral, and question responses showed strong similarities that may help refine our understanding of the type of engagement indexed by ISC for musical stimuli.

    View details for DOI 10.3389/fnins.2021.702067

    View details for PubMedID 34955706

  • Armed in ARMY: A Case Study of How BTS Fans Successfully Collaborated to #MatchAMillion for Black Lives Matter Proceedings of the 39th Annual ACM Conference on Human Factors in Computing Systems (CHI) Park, S., Santero, N., Kaneshiro, B., Lee, J. 2021

    View details for DOI 10.1145/3411764.3445353

  • Social Music Curation That Works: Insights from Successful Collaborative Playlists Proceedings of the 2021 ACM Conference on Computer-Supported Cooperative Work & Social Computing (CSCW) Park, S., Kaneshiro, B. 2021; 5 (117)

    View details for DOI 10.1145/3449191

  • Time-resolved correspondences between deep neural network layers and EEG measurements in object processing. Vision research Kong, N. C., Kaneshiro, B., Yamins, D. L., Norcia, A. M. 2020; 172: 27–45


    The ventral visual stream is known to be organized hierarchically, where early visual areas processing simplistic features feed into higher visual areas processing more complex features. Hierarchical convolutional neural networks (CNNs) were largely inspired by this type of brain organization and have been successfully used to model neural responses in different areas of the visual system. In this work, we aim to understand how an instance of these models corresponds to temporal dynamics of human object processing. Using representational similarity analysis (RSA) and various similarity metrics, we compare the model representations with two electroencephalography (EEG) data sets containing responses to a shared set of 72 images. We find that there is a hierarchical relationship between the depth of a layer and the time at which peak correlation with the brain response occurs for certain similarity metrics in both data sets. However, when comparing across layers in the neural network, the correlation onset time did not appear in a strictly hierarchical fashion. We present two additional methods that improve upon the achieved correlations by optimally weighting features from the CNN and show that depending on the similarity metric, deeper layers of the CNN provide a better correspondence than shallow layers to later time points in the EEG responses. However, we do not find that shallow layers provide better correspondences than those of deeper layers to early time points, an observation that violates the hierarchy and is in agreement with the finding from the onset-time analysis. This work makes a first comparison of various response features (including multiple similarity metrics and data sets) with respect to a neural network.

    View details for DOI 10.1016/j.visres.2020.04.005

    View details for PubMedID 32388211

  • Corrigendum to "Neural dynamics underlying coherent motion perception in children and adults" [Dev. Cogn. Neurosci. 38 (August) (2019) 100670]. Developmental cognitive neuroscience Manning, C., Kaneshiro, B., Kohler, P. J., Duta, M., Scerif, G., Norcia, A. M. 2020; 41: 100748

    View details for DOI 10.1016/j.dcn.2019.100748

    View details for PubMedID 31999566

  • Factors influencing classification of frequency following responses to speech and music stimuli. Hearing research Losorelli, S., Kaneshiro, B., Musacchia, G. A., Blevins, N. H., Fitzgerald, M. B. 2020; 398: 108101


    Successful mapping of meaningful labels to sound input requires accurate representation of that sound's acoustic variances in time and spectrum. For some individuals, such as children or those with hearing loss, having an objective measure of the integrity of this representation could be useful. Classification is a promising machine learning approach which can be used to objectively predict a stimulus label from the brain response. This approach has been previously used with auditory evoked potentials (AEP) such as the frequency following response (FFR), but a number of key issues remain unresolved before classification can be translated into clinical practice. Specifically, past efforts at FFR classification have used data from a given subject for both training and testing the classifier. It is also unclear which components of the FFR elicit optimal classification accuracy. To address these issues, we recorded FFRs from 13 adults with normal hearing in response to speech and music stimuli. We compared labeling accuracy of two cross-validation classification approaches using FFR data: (1) a more traditional method combining subject data in both the training and testing set, and (2) a "leave-one-out" approach, in which subject data is classified based on a model built exclusively from the data of other individuals. We also examined classification accuracy on decomposed and time-segmented FFRs. Our results indicate that the accuracy of leave-one-subject-out cross validation approaches that obtained in the more conventional cross-validation classifications while allowing a subject's results to be analysed with respect to normative data pooled from a separate population. In addition, we demonstrate that classification accuracy is highest when the entire FFR is used to train the classifier. Taken together, these efforts contribute key steps toward translation of classification-based machine learning approaches into clinical practice.

    View details for DOI 10.1016/j.heares.2020.108101

    View details for PubMedID 33142106

  • Neural dynamics underlying coherent motion perception in children and adults. Developmental cognitive neuroscience Manning, C., Kaneshiro, B., Kohler, P. J., Duta, M., Scerif, G., Norcia, A. M. 2019; 38: 100670


    Motion sensitivity increases during childhood, but little is known about the neural correlates. Most studies investigating children's evoked responses have not dissociated direction-specific and non-direction-specific responses. To isolate direction-specific responses, we presented coherently moving dot stimuli preceded by incoherent motion, to 6- to 7-year-olds (n = 34), 8- to 10-year-olds (n = 34), 10- to 12-year-olds (n = 34) and adults (n = 20). Participants reported the coherent motion direction while high-density EEG was recorded. Using a data-driven approach, we identified two stimulus-locked EEG components with distinct topographies: an early component with an occipital topography likely reflecting sensory encoding and a later, sustained positive component over centro-parietal electrodes that we attribute to decision-related processes. The component waveforms showed clear age-related differences. In the early, occipital component, all groups showed a negativity peaking at ~300 ms, like the previously reported coherent-motion N2. However, the children, unlike adults, showed an additional positive peak at ~200 ms, suggesting differential stimulus encoding. The later positivity in the centro-parietal component rose more steeply for adults than for the youngest children, likely reflecting age-related speeding of decision-making. We conclude that children's protracted development of coherent motion sensitivity is associated with maturation of both early sensory and later decision-related processes.

    View details for DOI 10.1016/j.dcn.2019.100670

    View details for PubMedID 31228678

  • Tunes Together: Perception and Experience of Collaborative Playlists Proceedings of the 20th International Society for Music Information Retrieval Conference (ISMIR) Park, S., Laplante, A., Lee, J., Kaneshiro, B. 2019

    View details for DOI 10.5281/zenodo.3527912

  • AudExpCreator: A GUI-based Matlab tool for designing and creating auditory experiments with the Psychophysics Toolbox SoftwareX Nguyen, D. T., Kaneshiro, B. 2018; 7: 328–34

  • Analyzing and Classifying Guitarists from Rock Guitar Solo Tablature Proceedings of the Sound and Music Computing Conference (SMC) Das, O., Kaneshiro, B., Collins, T. 2018

    View details for DOI 10.5281/zenodo.1422569

  • Decoding Neurally Relevant Musical Features Using Canonical Correlation Analysis Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR) Gang, N., Kaneshiro, B., Berger, J., Dmochowski, J. P. 2017

    View details for DOI 10.5281/zenodo.1417137

  • NMED-T: A Tempo-Focused Dataset of Cortical and Behavioral Responses to Naturalistic Music Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR) Losorelli, S., Nguyen, D. T., Dmochowski, J. P., Kaneshiro, B. 2017

    View details for DOI 10.5281/zenodo.1417917

  • I Said It First: Topological Analysis of Lyrical Influence Network Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR) Atherton, J., Kaneshiro, B. 2016

    View details for DOI 10.5281/zenodo.1418047

  • Neuroimaging Methods for Music Information Retrieval: Current Findings and Future Prospects Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR) Kaneshiro, B., Dmochowski, J. P. 2015

    View details for DOI 10.5281/zenodo.1416082

  • QBT-Extended: An Annotated Dataset of Melodically Contoured Tapped Queries Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR) Kaneshiro, B., Kim, H., Herrera, J., Oh, J., Berger, J., Slaney, M. 2013

    View details for DOI 10.5281/zenodo.1415756

  • Probing Neural Mechanisms of Music Perception, Cognition, and Performance Using Multivariate Decoding Psychomusicology: Music, Mind, and Brain Schaefer, R. S., Furuya, S., Smith, L. M., Kaneshiro, B. B., Toiviainen, P. 2012; 22 (2)

    View details for DOI 10.1037/a0031014