Bio


Dan Jurafsky is Professor and Chair of Linguistics and Professor of Computer Science at Stanford University.

He is the recipient of a 2002 MacArthur Fellowship, is the co-author with Jim Martin of the widely used textbook "Speech and Language Processing", and co-created with Chris Manning one of the first massive open online courses, Stanford's course in Natural Language Processing. His trade book "The Language of Food: A Linguist Reads the Menu" was a finalist for the 2015 James Beard Award.

Dan received a B.A. in Linguistics in 1983 and a Ph.D. in Computer Science in 1992 from the University of California at Berkeley, was a postdoc 1992-1995 at the International Computer Science Institute, and was on the faculty of the University of Colorado, Boulder until moving to Stanford in 2003.

His research ranges widely across computational linguistics; special interests include natural language understanding, human-human conversation, the relationship between human and machine processing, and the application of natural language processing to the social and behavioral sciences. He also works on the linguistics of food and the linguistics of Chinese.

Academic Appointments


Administrative Appointments


  • Professor and Chair of Linguistics and Professor of Computer Science, Stanford University (2014 - Present)
  • Professor of Linguistics and (by courtesy) of Computer Science, Stanford University (2010 - 2014)
  • Associate Professor of Linguistics and (by courtesy) of Computer Science, Stanford University (2004 - 2010)
  • Associate Professor of Linguistics, Computer Science, and Cognitive Science, University of Colorado (2001 - 2003)
  • Assistant Professor of Linguistics, Computer Science, and Cognitive Science, University of Colorado (1996 - 2001)
  • Assistant Professor of Linguistics, UC Berkeley (1993 - 1994)

Honors & Awards


  • MacArthur Fellowship, MacArthur Foundation (2002)
  • James Beard Award Finalist, James Beard Foundation (2015)
  • Fellow, Center for Advanced Study in the Behavioral Sciences (2012-2013)
  • Fillmore Professor, Linguistic Society of America (2015)
  • NSF CAREER Award, National Science Foundation (1998)
  • Roger V. Gould Prize, American Journal of Sociology (2015)
  • Cozzarelli Prize, Proceedings of the National Academy of Sciences (2017)
  • Best Paper, EMNLP 2013 (2013)
  • Best Paper, WWW 2013 (2013)
  • Best Paper, ACL/COLING 2006 (2006)
  • Distinguished paper, IJCAI 2001 (2001)
  • Marr Prize Honorable Mention, Cognitive Science Society (1998)

Boards, Advisory Committees, Professional Organizations


  • Member, Editorial Boards, Annual Review of Linguistics, Computer Speech and Language, Computational Linguistics
  • Chair, ACL SIGHAN (2009 - 2011)
  • Associate Director, LSA Summer Institute, Stanford (2007)
  • Member, Executive Committee, North American Association of Computational Linguistics (2001 - 2002)
  • Chair, Linguistic Society of America Committee on Computing, Linguistic Society (2000)

Program Affiliations


  • Symbolic Systems Program

Professional Education


  • Postdoc, International Computer Science Institute, Berkeley (1995)
  • Ph.D., University of California at Berkeley, Computer Science (1992)
  • B.A., University of California at Berkeley, Linguistics (1983)

Stanford Advisees


  • Doctoral Dissertation Reader (AC)
    Johannes Birgmeier, Braden Hancock, Albert Haque, Ed King, Matt Lamm, Panupong Pasupat, Emma Pierson, Peng Qi, Arianna Yuan
  • Postdoctoral Faculty Sponsor
    Dallas Card, Vivek Kulkarni, Kyle Mahowald
  • Doctoral Dissertation Advisor (AC)
    Ignacio Cases, Urvashi Khandelwal, Rob Voigt
  • Orals Evaluator
    Braden Hancock, Panupong Pasupat
  • Master's Program Advisor
    Nishit Asnani, William Hang, John Kamalu, Sahil Yakhmi
  • Doctoral (Program)
    Dora Demszky, Peter Henderson, Dan Iter, Pratyusha Kalluri, Yiwei Luo, Reid Pryzant

All Publications


  • Building DNN acoustic models for large vocabulary speech recognition COMPUTER SPEECH AND LANGUAGE Maas, A. L., Qi, P., Xie, Z., Hannun, A. Y., Lengerich, C. T., Jurafsky, D., Ng, A. Y. 2017; 41: 195-213
  • Cans and cants: Computational potentials for multimodality with a case study in head position JOURNAL OF SOCIOLINGUISTICS Voigt, R., Eckert, P., Jurafsky, D., Podesva, R. J. 2016; 20 (5): 677-711

    View details for DOI 10.1111/josl.12216

    View details for Web of Science ID 000389052600005

  • Deterministic Coreference Resolution Based on Entity-Centric, Precision-Ranked Rules COMPUTATIONAL LINGUISTICS Lee, H., Chang, A., Peirsman, Y., Chambers, N., Surdeanu, M., Jurafsky, D. 2013; 39 (4): 885-916
  • Differentiating language usage through topic models POETICS McFarland, D. A., Ramage, D., Chuang, J., Heer, J., Manning, C. D., Jurafsky, D. 2013; 41 (6): 607-625
  • Making the Connection: Social Bonding in Courtship Situations AMERICAN JOURNAL OF SOCIOLOGY McFarland, D. A., Jurafsky, D., Rawlings, C. 2013; 118 (6): 1596-1649

    View details for DOI 10.1086/670240

    View details for Web of Science ID 000321045300004

  • Detecting friendly, flirtatious, awkward, and assertive speech in speed-dates COMPUTER SPEECH AND LANGUAGE Ranganath, R., Jurafsky, D., McFarland, D. A. 2013; 27 (1): 89-115
  • Positive Diversity Tuning for Machine Translation System Combination WMT Cer, D., Manning, C. D., Jurafsky, D. 2013
  • No country for old members Proceedings of WWW 2013 Danescu-Niculescu-Mizil, C., West, R., Jurafsky, D., Leskovec, J., Potts, C. 2013
  • Same Referent, Different Words: Unsupervised Mining of Opaque Coreferent Mentions Recasens, M., Can, M., Jurafsky, D. 2013
  • Linguistic Models for Analyzing and Detecting Biased Language Recasens, M., Danescu-Niculescu-Mizil, C., Jurafsky, D. 2013
  • Emergence of Gricean maxims from multi-agent decision theory Vogel, A., Bodoia, M., Jurafsky, D., Potts, C. 2013
  • A computational approach to politeness with application to social factors Danescu-Niculescu-Mizil, C., Sudhof, M., Jurafsky, D., Leskovec, J., Potts, C. 2013
  • Citation-based bootstrapping for large-scale author disambiguation JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY Levin, M., Krawczyk, S., Bethard, S., Jurafsky, D. 2012; 63 (5): 1030-1047

    View details for DOI 10.1002/asi.22621

    View details for Web of Science ID 000303500300012

  • Bootstrapping Dependency Grammar Inducers from Incomplete Sentence Fragments via Austere Models Spitkovsky, V. I., Alshawi, H., Jurafsky, D. 2012
  • Joint Entity and Event Coreference Resolution across Documents Lee, H., Recasens, M., Chang, A., Surdeanu, M., Jurafsky, D. 2012
  • Unsupervised Dependency Parsing without Gold Part-of-Speech Tags Spitkovsky, V. I., Alshawi, H., Chang, A. X., Jurafsky, D. 2011
  • Authenticity in America: Class Distinctions in Potato Chip Advertising Gastronomica Freedman, J., Jurafsky, D. 2011; 11 (4): 46-54
  • Using query patterns to learn the duration of events Gusev, A., Chambers, N., Khilnani, D. R., Khaitan, P., Bethard, S., Jurafsky, D. 2011
  • LeadLag LDA: Estimating Topic Specific Leads and Lags of Information Outlets Nallapati, R., Shi, X., McFarland, D., Leskovec, J., Jurafsky, D. 2011
  • Stanford’s Multi-Pass Sieve Coreference Resolution System at the CoNLL-2011 Shared Task Lee, H., Peirsman, Y., Chang, A., Chambers, N., Surdeanu, M., Jurafsky, D. 2011
  • Lateen EM: Unsupervised Training with Multiple Objectives, Applied to Dependency Grammar Induction Spitkovsky, V. I., Alshawi, H., Jurafsky, D. 2011
  • Punctuation: Making a Point in Unsupervised Dependency Parsing Spitkovsky, V. I., Alshawi, H., Jurafsky, D. 2011
  • The NXT-format Switchboard Corpus: a rich resource for investigating the syntax, semantics, pragmatics and prosody of dialogue LANGUAGE RESOURCES AND EVALUATION Calhoun, S., Carletta, J., Brenier, J. M., Mayo, N., Jurafsky, D., Steedman, M., Beaver, D. 2010; 44 (4): 387-419
  • Which words are hard to recognize? Prosodic, lexical, and disfluency factors that increase speech recognition error rates SPEECH COMMUNICATION Goldwater, S., Jurafsky, D., Manning, C. D. 2010; 52 (3): 181-200
  • How Good are Humans at Solving CAPTCHAs? A Large Scale Evaluation Symposium on Security and Privacy Bursztein, E., Bethard, S., Fabry, C., Mitchell, J. C., Jurafsky, D. IEEE COMPUTER SOC. 2010: 399–413

    View details for DOI 10.1109/SP.2010.31

    View details for Web of Science ID 000287456100027

  • From Baby Steps to Leapfrog: How “Less is More” in Unsupervised Dependency Parsing Spitkovsky, V. I., Alshawi, H., Jurafsky, D. 2010
  • Proceedings of the 23rd International Conference on Computational Linguistics Jurafsky, D. edited by Huang, C., Jurafsky, D. 2010: 1387
  • Measuring Machine Translation Quality as Semantic Equivalence: A Metric Based on Entailment Features Machine Translation Padó, S., Cer, D., Galley, M., Jurafsky, D., Manning, C. D. 2010; 23: 181-193
  • Which words are hard to recognize? Prosodic, lexical, and disfluency factors that increase speech recognition error rates Goldwater, S., Jurafsky, D., Manning, C. D. 2010: 181–200
  • Parsing to Stanford Dependencies: Trade-offs between speed and accuracy Cer, D., de Marneffe, M., Jurafsky, D., Manning, C. D. 2010
  • Learning to Follow Navigational Directions Vogel, A., Jurafsky, D. 2010
  • Profiting from Mark-Up: Hyper-Text Annotations for Guided Parsing Spitkovsky, V. I., Jurafsky, D., Alshawi, H. 2010
  • The Best Lexical Metric for Phrase-Based Statistical MT System Optimization Cer, D., Jurafsky, D., Manning, C. 2010
  • How good are humans at solving CAPTCHAs? A large scale evaluation Bursztein, E., Bethard, S., Mitchell, J. C., Jurafsky, D., Fabry, C. 2010
  • A Multi-Pass Sieve for Coreference Resolution Raghunathan, K., Lee, H., Rangarajan, S., Chambers, N., Surdeanu, M., Jurafsky, D., Manning, C. 2010
  • The effect of lexical frequency and Lombard reflex on tone hyperarticulation JOURNAL OF PHONETICS Zhao, Y., Jurafsky, D. 2009; 37 (2): 231-247
  • Predictability effects on durations of content and function words in conversational English JOURNAL OF MEMORY AND LANGUAGE Bell, A., Brenier, J. M., Gregory, M., Girand, C., Jurafsky, D. 2009; 60 (1): 92-111
  • Robust Machine Translation Evaluation with Entailment Features Pado, S., Galley, M., Jurafsky, D., Manning, C. 2009
  • Disambiguating "DE" for Chinese-English Machine Translation Proceedings of the EACL Chang, P., Jurafsky, D., Manning, C. D. 2009
  • Extracting Social Meaning: Identifying Interactional Style in Spoken Conversation Jurafsky, D., Ranganath, R., McFarland, D. 2009
  • Speech and Language Processing Jurafsky, D. edited by Jurafsky, D., Martin, J. H. 2009
  • Hidden Conditional Random Fields for Phone Recognition IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2009) Sung, Y., Jurafsky, D. IEEE. 2009: 107–112
  • It's Not You, It's Me: Automatically Extracting Social Meaning from Speed Dates IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2009) Jurafsky, D. IEEE. 2009: 11–11
  • It’s Not You, it’s Me: Detecting Flirting and its Misperception in Speed-Dates Ranganath, R., Jurafsky, D., McFarland, D. 2009
  • Unsupervised Learning of Narrative Schemas and their Participants Chambers, N., Jurafsky, D. 2009
  • Distant supervision for relation extraction without labeled data Mintz, M., Bills, S., Snow, R., Jurafsky, D. 2009
  • Measuring Machine Translation Quality as Semantic Equivalence: A Metric Based on Entailment Features Machine Translation Pado, S., Cer, D., Galley, M., Jurafsky, D., Manning, C. D. 2009; 23 (2-3): 181-193
  • Robust Machine Translation Evaluation with Entailment Features Pado, S., Galley, M., Jurafsky, D., Manning, C. D. 2009
  • Textual Entailment Features for Machine Translation Evaluation Pado, S., Galley, M., Jurafsky, D., Manning, C. D. 2009
  • Hidden Conditional Random Fields for Phone Recognition Sung, Y., Jurafsky, D. 2009
  • Detecting prominence in conversational speech: pitch accent, givenness and focus Kumar, V., Sridhar, R., Nenkova, A., Narayanan, S., Jurafsky, D. 2008
  • Regularization and Search for Minimum Error Rate Training Cer, D., Jurafsky, D., Manning, C. D. 2008
  • Maximum Conditional Likelihood Linear Regression and Maximum A Posteriori for Hidden Conditional Random Fields Speaker Adaptation Sung, Y., Boulis, C., Jurafsky, D. 2008: 4293–96
  • Unsupervised Learning of Narrative Event Chains Chambers, N., Jurafsky, D. 2008: 789–97
  • Studying the History of Ideas Using Topic Models Hall, D., Jurafsky, D., Manning, C. D. 2008
  • Which words are hard to recognize? Lexical, prosodic, and disfluency factors that increase ASR error rates Goldwater, S., Jurafsky, D., Manning, C. D. 2008: 380–88
  • Maximum Conditional Likelihood Linear Regression and Maximum A Posteriori for Hidden Conditional Random Fields speaker adaptation 33rd IEEE International Conference on Acoustics, Speech and Signal Processing Sung, Y., Boulis, C., Jurafsky, D. IEEE. 2008: 4293–4296
  • Jointly Combining Implicit Constraints Improves Temporal Ordering Chambers, N., Jurafsky, D. 2008: 698–706
  • Cheap and Fast - But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks Snow, R., O’Connor, B., Jurafsky, D., Ng, A. Y. 2008
  • Automatic detection of contrastive elements in spontaneous speech IEEE Workshop on Automatic Speech Recognition and Understanding Nenkova, A., Jurafsky, D. IEEE. 2007: 201–206
  • Measuring Importance and Query Relevance in Topic-focused Multi-document Summarization Gupta, S., Nenkova, A., Jurafsky, D. 2007
  • Regularization, Adaptation, and Non-Independent Features Improve Hidden Conditional Random Fields for Phone Classification Sung, Y., Boulis, C., Manning, C., Jurafsky, D. 2007: 347–52
  • Disambiguating Between Generic and Referential “You” in Dialog Gupta, S., Purver, M., Jurafsky, D. 2007
  • Regularization, adaptation, and non-independent features improve Hidden Conditional Random Fields for phone classification IEEE Workshop on Automatic Speech Recognition and Understanding Sung, Y., Boulis, C., Manning, C., Jurafsky, D. IEEE. 2007: 347–352
  • The Effect of Lexical Frequency on Tone Production Zhao, Y., Jurafsky, D. 2007: 477–80
  • Learning to merge word senses Snow, R., Prakash, S., Jurafsky, D., Ng, A. Y. 2007
  • Classifying Temporal Relations Between Events Chambers, N., Wang, S., Jurafsky, D. 2007
  • Automated Methods for Processing Arabic Text: From Tokenization to Base Phrase Chunking Arabic Computational Morphology: Knowledge-based and Empirical Methods Diab, M., Hacioglu, K., Jurafsky, D., Neumann, G. edited by Soudi, A., van den Bosch, A. Springer. 2007: 159–180
  • A dialectal Chinese speech recognition framework JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY Li, J., Zheng, T. F., Byrne, W., Jurafsky, D. 2006; 21 (1): 106-115
  • Limitations of MLLR Adaptation with Spanish-Accented English: An Error Analysis Clarke, C., Jurafsky, D. 2006
  • Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing Jurafsky, D. edited by Jurafsky, D., Gaussier, E. 2006
  • The (non)utility of linguistic features for predicting prominence in spontaneous speech 1st Workshop on Spoken Language Technology Brenier, J. M., Nenkova, A., Kothari, A., Whitton, L., Beaver, D., Jurafsky, D. IEEE. 2006: 54–57
  • Have we met? MDP Based Speaker ID for Robot Dialogue 9th International Conference on Spoken Language Processing/INTERSPEECH 2006 Krsmanovic, F., Spencer, C., Jurafsky, D., Ng, A. Y. ISCA-INST SPEECH COMMUNICATION ASSOC. 2006: 461–464
  • Limitations of MLLR Adaptation with Spanish-Accented English: An Error Analysis 9th International Conference on Spoken Language Processing/INTERSPEECH 2006 Clarke, C., Jurafsky, D. ISCA-INST SPEECH COMMUNICATION ASSOC. 2006: 1117–1120
  • Have we met? MDP Based Speaker ID for Robot Dialogue Krsmanovic, F., Spencer, C., Jurafsky, D., Ng, A. Y. 2006
  • Detection of Word Fragments in Mandarin Telephone Conversation 9th International Conference on Spoken Language Processing/INTERSPEECH 2006 Chu, C., Sung, Y., Zhao, Y., Jurafsky, D. ISCA-INST SPEECH COMMUNICATION ASSOC. 2006: 2334–2337
  • Extracting opinion propositions and opinion holders using syntactic and lexical cues Symposium on Computing Attitude and Affect in Text Bethard, S., Yu, H., Thornton, A., Hatzivassiloglou, V., Jurafsky, D. SPRINGER. 2006: 125–141
  • Semantic Taxonomy Induction from Heterogenous Evidence 21st International Conference on Computational Linguistics/44th Annual Meeting of the Association for Computational Linguistics Snow, R., Jurafsky, D., Ng, A. Y. ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2006: 801–808
  • Support vector learning for semantic argument classification MACHINE LEARNING Pradhan, S., Hacioglu, K., Krugler, V., Ward, W., Martin, J., Jurafsky, D. 2005; 60 (1-3): 11-39
  • Special issue on pronunciation modeling and lexicon adaptation SPEECH COMMUNICATION Fosler-Lussier, E., Byrne, W., Jurafsky, D. 2005; 46 (2): 117-118
  • Extracting opinion propositions and opinion holders using syntactic and lexical cues Computing Attitude and Affect in Text: Theory and Applications Bethard, S., Yu, H., Thornton, A., Hatzivassiloglou, V., Jurafsky, D. edited by Shanahan, J. G., Qu, Y., Wiebe, J. Springer. 2005: 125–142
  • Integrating advanced models of syntax, phonology, and accent/dialect with a speech recognizer Jurafsky, D., Wooters, C., Tajchman, G., Segal, J., Stolcke, A., Morgan, N. 2005: 107–15
  • Morphological features help POS tagging of unknown words across language varieties Tseng, H., Jurafsky, D., Manning, C. 2005
  • A Conditional Random Field Word Segmenter Tseng, H., Chang, P., Andrew, G., Jurafsky, D., Manning, C. 2005
  • Semantic Role Labeling Using Different Syntactic Views Pradhan, S., Ward, W., Hacioglu, K., Martin, J., Jurafsky, D. 2005
  • Detection of questions in Chinese conversational speech IEEE Workshop on Automatic Speech Recognition and Understanding Yuan, J. H., Jurafsky, D. IEEE. 2005: 47–52
  • Detection of Questions in Chinese Conversation Yuan, J., Jurafsky, D. 2005
  • Accent Detection and Speech Recognition for Shanghai-Accented Mandarin Zheng, Y., Sproat, R., Gu, L., Shafran, I., Zhou, H., Su, Y., Jurafsky, D., Starr, R., Yoon, S. 2005
  • The Detection of Emphatic Words Using Acoustic and Lexical Features Brenier, J. M., Cer, D., Jurafsky, D. 2005
  • Pitch Accent Prediction: Effects of Genre and Speaker Yuan, J., Brenier, J. M., Jurafsky, D. 2005
  • Speech Communication Special Issue on Pronunciation Modeling and Lexicon Adaptation Jurafsky, D. edited by Fosler-Lussier, E., Byrne, W., Jurafsky, D. Elsevier. 2005; 46 (2)
  • Verb subcategorization frequencies: American English corpus data, methodological studies, and cross-corpus comparisons BEHAVIOR RESEARCH METHODS INSTRUMENTS & COMPUTERS Gahl, S., Jurafsky, D., Roland, D. 2004; 36 (3): 432-443

    Abstract

    Verb subcategorization frequencies (verb biases) have been widely studied in psycholinguistics and play an important role in human sentence processing. Yet available resources on subcategorization frequencies suffer from limited coverage, limited ecological validity, and divergent coding criteria. Prior estimates of verb transitivity, for example, vary widely with corpus size, coverage, and coding criteria. This article provides norming data for 281 verbs of interest to psycholinguistic research, sampled from a corpus of American English, along with a detailed coding manual. We examine the effect on transitivity bias of various coding decisions and methods of computing verb biases.

    View details for Web of Science ID 000225848300009

    View details for PubMedID 15641433

  • Pragmatics and Computational Linguistics Handbook of Pragmatics Jurafsky, D. edited by Horn, L. R., Ward, G. Blackwell. 2004: 578–604
  • Parsing Arguments of Nominalizations in English and Chinese Pradhan, S., Sun, H., Ward, W., Martin, J. H., Jurafsky, D. 2004
  • Automatic Extraction of Opinion Propositions and their Holders Bethard, S., Yu, H., Thornton, A., Hatzivassiloglou, V., Jurafsky, D. 2004
  • Shallow semantic parsing using support vector machines Human Language Technology Conference of the North American Chapter of the Association-for-Computational-Linguistics Pradhan, S., Ward, W., Hacioglu, K., Martin, J. H., Jurafsky, D. ASSOCIATION COMPUTATIONAL LINGUISTICS. 2004: 233–240
  • Shallow semantic parsing of Chinese Human Language Technology Conference of the North American Chapter of the Association-for-Computational-Linguistics Sun, H., Jurafsky, D. ASSOCIATION COMPUTATIONAL LINGUISTICS. 2004: 249–256
  • Automatic Tagging of Arabic Text: From Raw Text to Base Phrase Chunks Diab, M., Hacioglu, K., Jurafsky, D. 2004
  • Learning syntactic patterns for automatic hypernym discovery Snow, R., Jurafsky, D., Ng, A. Y. 2004
  • Effects of disfluencies, predictability, and utterance position on word form variation in English conversation JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA Bell, A., Jurafsky, D., Fosler-Lussier, E., Girand, C., Gregory, M., Gildea, D. 2003; 113 (2): 1001-1024

    Abstract

    Function words, especially frequently occurring ones such as (the, that, and, and of), vary widely in pronunciation. Understanding this variation is essential both for cognitive modeling of lexical production and for computer speech recognition and synthesis. This study investigates which factors affect the forms of function words, especially whether they have a fuller pronunciation (e.g., thi, thaet, aend, inverted-v v) or a more reduced or lenited pronunciation (e.g., thax, thixt, n, ax). It is based on over 8000 occurrences of the ten most frequent English function words in a 4-h sample from conversations from the Switchboard corpus. Ordinary linear and logistic regression models were used to examine variation in the length of the words, in the form of their vowel (basic, full, or reduced), and whether final obstruents were present or not. For all these measures, after controlling for segmental context, rate of speech, and other important factors, there are strong independent effects that made high-frequency monosyllabic function words more likely to be longer or have a fuller form (1) when neighboring disfluencies (such as filled pauses uh and um) indicate that the speaker was encountering problems in planning the utterance; (2) when the word is unexpected, i.e., less predictable in context; (3) when the word is either utterance initial or utterance final. Looking at the phenomenon in a different way, frequent function words are more likely to be shorter and to have less-full forms in fluent speech, in predictable positions or multiword collocations, and utterance internally. Also considered are other factors such as sex (women are more likely to use fuller forms, even after controlling for rate of speech, for example), and some of the differences among the ten function words in their response to the factors.

    View details for DOI 10.1121/1.1534836

    View details for Web of Science ID 000180874900032

    View details for PubMedID 12597194

  • Semantic role parsing: Adding semantic structure to unstructured text 3rd IEEE International Conference on Data Mining Pradhan, S., Hacioglu, K., Ward, W., Martin, J. H., Jurafsky, D. IEEE COMPUTER SOC. 2003: 629–632
  • Issues in Recognition of Spanish-Accented Spontaneous English Ikeno, A., Pellom, B., Cer, D., Thornton, A., Brenier, J. M., Jurafsky, D., Ward, W., Byrne, W. 2003
  • Syntactic frame and verb bias in aphasia: Plausibility judgments of undergoer-subject sentence Theoretical and Experimental Neuropsychology (TENNET) Conference Proceedings special issue Gahl, S., Menn, L., Ramsberger, G., Jurafsky, D., Elder, E., Rewega, M., Holland, A. L. 2003: 223–28
  • The Effect of Rhythm on Structural Disambiguation in Chinese Sun, H., Jurafsky, D. 2003
  • Probabilistic Modeling in Psycholinguistics: Linguistic Comprehension and Production Probability Theory in Linguistics Jurafsky, D. edited by Bod, R., Hay, J., Jannedy, S. The MIT Press. 2003: 39–96
  • Automatic Labeling of semantic roles COMPUTATIONAL LINGUISTICS Gildea, D., Jurafsky, D. 2002; 28 (3): 245-288
  • Identifying Semantic Relations in Text Exploring AI in the New Millennium Gildea, D., Jurafsky, D. edited by Lakemeyer, G., Nebel, B. Morgan Kaufmann. 2002: 69–102
  • Which predictability measures affect content word durations? PMLA Bell, A., Gregory, M. L., Jurafsky, D., Girand, C., Brenier, J., Ikeno, A. 2002
  • A Bayesian Model Predicts Human Parse Preference and Reading Time in Sentence Processing Narayanan, S., Jurafsky, D., Ghahramani, Z. edited by Dietterich, T. G., Becker, S. 2002: 59–65
  • Lexicon adaptation for LVCSR: Speaker idiosyncracies, non-native speakers, and pronunciation choice PMLA Ward, W., Krech, H., Yu, X., Herold, K., Figgs, G., Ikeno, A., Jurafsky, D. 2002
  • The Role of the Lemma in Form Variation Papers in Laboratory Phonology VII Jurafsky, D., Bell, A., Girand, C., Warner, N. edited by Gussenhoven, C. Berlin/New York: Mouton de Gruyter. 2002: 1–34
  • Verb sense and verb subcategorization probabilities The Lexical Basis of Sentence Processing: Formal, Computational, and Experimental Issues Roland, D., Jurafsky, D. edited by Stevenson, S., Merlo, P. Amsterdam: John Benjamins. 2002: 325–346
  • Probabilistic Relations between Words: Evidence from Reduction in Lexical Production Frequency and the emergence of linguistic structure Jurafsky, D., Bell, A., Gregory, M., Raymond, W. D. edited by Bybee, J., Hopper, P. Amsterdam: John Benjamins. 2001: 229–254
  • Knowledge-Free Induction of Inflectional Morphologies Jurafsky, D. 2001
  • Is knowledge-free induction of multiword unit dictionary headwords a solved problem? Conference on Empirical Methods in Natural Language Processing Schone, P., Jurafsky, D. ASSOCIATION COMPUTATIONAL LINGUISTICS. 2001: 100–108
  • The effect of language model probability on pronunciation reduction IEEE International Conference on Acoustics, Speech, and Signal Processing Jurafsky, D., Bell, A., Gregory, M., Raymond, W. D. IEEE. 2001: 801–804
  • What kind of pronunciation variation is hard for triphones to model? IEEE International Conference on Acoustics, Speech, and Signal Processing Jurafsky, D., Ward, W., Zhang, J. P., Herold, K., Yu, X. Y., Zhang, S. IEEE. 2001: 577–580
  • Dialog act modeling for automatic tagging and recognition of conversational speech Computational Linguistics Stolcke, A., Ries, K., Coccaro, N., Shriberg, E., Bates, R., Jurafsky, D., Taylor, P., Martin, R., Meteer, M., Van Ess-Dykema, C. 2000; 26 (3): 339–371
  • Verb Subcategorization Frequency Differences between Business-News and Balanced Corpora: the role of verb sense Roland, D., Jurafsky, D., Menn, L., Gahl, S., Elder, E., Riddoch, C. 2000
  • Knowledge-Free Induction of Morphology using Latent Semantic Analysis CoNLL Jurafsky, D. 2000
  • The effects of collocational strength and contextual predictability in lexical production Gregory, M. L., Raymond, W. D., Bell, A., Fosler-Lussier, E., Jurafsky, D. 2000: 151–66
  • The American National Corpus: An outline of the project ACIDCA Ide, N., Macleod, C., Fillmore, C., Jurafsky, D. 2000
  • Automatic labeling of semantic roles 38th Annual Meeting of the Association-for-Computational-Linguistics Gildea, D., Jurafsky, D. ASSOCIATION COMPUTATIONAL LINGUISTICS. 2000: 512–520
  • Forms of English function words – Effects of disfluencies, turn position, age and sex, and predictability Bell, A., Jurafsky, D., Fosler-Lussier, E., Girand, C., Gildea, D. 1999: 395–98
  • Cognition and Function in Language Jurafsky, D. edited by Fox, B. A., Jurafsky, D., Michaelis, L. A. CSLI Publications, Stanford, CA. 1999
  • Can prosody aid the automatic classification of dialog acts in conversational speech? LANGUAGE AND SPEECH Shriberg, E., Bates, R., Stolcke, A., Taylor, P., Jurafsky, D., Ries, K., Coccaro, N., Martin, R., Meteer, M., Van Ess-Dykema, C. 1998; 41: 443-492

    Abstract

    Identifying whether an utterance is a statement, question, greeting, and so forth is integral to effective automatic understanding of natural dialog. Little is known, however, about how such dialog acts (DAs) can be automatically classified in truly natural conversation. This study asks whether current approaches, which use mainly word information, could be improved by adding prosodic information. The study is based on more than 1000 conversations from the Switchboard corpus. DAs were hand-annotated, and prosodic features (duration, pause, F0, energy, and speaking rate) were automatically extracted for each DA. In training, decision trees based on these features were inferred; trees were then applied to unseen test data to evaluate performance. Performance was evaluated for prosody models alone, and after combining the prosody models with word information--either from true words or from the output of an automatic speech recognizer. For an overall classification task, as well as three subtasks, prosody made significant contributions to classification. Feature-specific analyses further revealed that although canonical features (such as F0 for questions) were important, less obvious features could compensate if canonical features were removed. Finally, in each task, integrating the prosodic model with a DA-specific statistical language model improved performance over that of the language model alone, especially for the case of recognized words. Results suggest that DAs are redundantly marked in natural conversation, and that a variety of automatically extractable prosodic features could aid dialog processing in speech applications.

    View details for Web of Science ID 000079598500010

    View details for PubMedID 10746366

  • Reduction of English function words in Switchboard ICSLP Jurafsky, D., Bell, A., Fosler-Lussier, E., Girand, C., Raymond, W. D. 1998: 3111–14
  • On the semantics of the Cantonese changed tone BLS Jurafsky, D. 1998: 304–18
  • Towards Better Integration of Semantic Predictors in Statistical Language Modeling ICSLP Coccaro, N., Jurafsky, D. 1998: 2403–6
  • Dialog act modeling for conversational speech TRSS Stolcke, A., Shriberg, E., Bates, R., Coccaro, N., Jurafsky, D., Martin, R., Meteer, M., Ries, K., Taylor, P., Van Ess-Dykema, C. 1998
  • An American National Corpus: A Proposal Fillmore, C., Ide, N., Jurafsky, D., Macleod, C. 1998: 965–70
  • Bayesian models of human sentence processing 20th Annual Conference of the Cognitive-Science-Society Narayanan, S., Jurafsky, D. LAWRENCE ERLBAUM ASSOC PUBL. 1998: 752–757
  • Lexical, Prosodic, and Syntactic Cues for Dialog Acts Jurafsky, D., Shriberg, E. E., Fox, B., Curl, T. 1998: 114–20
  • How Verb Subcategorization Frequencies Are Affected By Corpus Choice COLING/ACL Roland, D., Jurafsky, D. 1998: 1122–28
  • Automatic detection of discourse structure for speech recognition and understanding IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU-97) Jurafsky, D., Bates, R., Coccaro, N., Martin, R., Meteer, M., Ries, K., Shriberg, E., Stolcke, A., Taylor, P., Van Ess-Dykema, C. IEEE. 1997: 88–95
  • Universal tendencies in the semantics of the diminutive LANGUAGE Jurafsky, D. 1996; 72 (3): 533-578
  • A probabilistic model of lexical and syntactic access and disambiguation COGNITIVE SCIENCE Jurafsky, D. 1996; 20 (2): 137-194
  • Learning bias and phonological induction Computational Linguistics Gildea, D., Jurafsky, D. 1996; 22: 497-530
  • Building multiple pronunciation models for novel words using exploratory computational phonology Tajchman, G., Fosler, E., Jurafsky, D. 1995: 2247–50
  • Using a stochastic context-free grammar as a language model for speech recognition 1995 International Conference on Acoustics, Speech, and Signal Processing Jurafsky, D., Wooters, C., Segal, J., Stolcke, A., Fosler, E., Tajchman, G., Morgan, N. IEEE. 1995: 189–192
  • Automatic induction of finite state transducers for simple phonological rules ACL Gildea, D., Jurafsky, D. 1995: 9–15
  • Learning phonological rule probabilities from speech corpora with exploratory computational phonology ACL Tajchman, G., Jurafsky, D., Fosler, E. 1995: 1–8
  • Type underspecification and on-line type construction in the lexicon Koenig, J., Jurafsky, D. 1995: 270–85
  • The Berkeley restaurant project ICSLP Jurafsky, D., Wooters, C., Tajchman, G., Segal, J., Stolcke, A., Fosler, E., Morgan, N. 1994: 2139–42
  • Universals in the semantics of the diminutive BLS Jurafsky, D. 1993: 423–36
  • An on-line model of human sentence interpretation AAAI Jurafsky, D. 1992: 302–8
  • An online model of human sentence interpretation 13th Annual Conference of the Cognitive Science Society Jurafsky, D. LAWRENCE ERLBAUM ASSOC PUBL. 1991: 449–454