Christopher Manning
Thomas M. Siebel Professor of Machine Learning, Professor of Linguistics, of Computer Science and Senior Fellow at the Stanford Institute for HAI
Web page: https://nlp.stanford.edu/~manning/
Bio
Christopher Manning is the inaugural Thomas M. Siebel Professor of Machine Learning in the Departments of Linguistics and Computer Science at Stanford University, Director of the Stanford Artificial Intelligence Laboratory (SAIL), and an Associate Director of the Stanford Institute for Human-Centered Artificial Intelligence (HAI). His research goal is computers that can intelligently process, understand, and generate human languages. Manning was an early leader in applying Deep Learning to Natural Language Processing (NLP), with well-known research on the GloVe model of word vectors, attention, machine translation, question answering, self-supervised model pre-training, tree-recursive neural networks, machine reasoning, dependency parsing, sentiment analysis, and summarization. He also focuses on computational linguistic approaches to parsing, natural language inference and multilingual language processing, including being a principal developer of Stanford Dependencies and Universal Dependencies. Manning has coauthored leading textbooks on statistical approaches to NLP (Manning and Schütze, 1999) and information retrieval (Manning, Raghavan, and Schütze, 2008), as well as linguistic monographs on ergativity and complex predicates. His online CS224N Natural Language Processing with Deep Learning videos have been watched by hundreds of thousands of people. He is an ACM Fellow, a AAAI Fellow, an ACL Fellow, and a Past President of the ACL (2015). His research has won ACL, Coling, EMNLP, and CHI Best Paper Awards, an ACL Test of Time Award, and the IEEE John von Neumann Medal (2024). He has a B.A. (Hons) from The Australian National University, a Ph.D. from Stanford (1994), and an Honorary Doctorate from the University of Amsterdam (2023), and he held faculty positions at Carnegie Mellon University and the University of Sydney before returning to Stanford.
He is the founder of the Stanford NLP group (@stanfordnlp) and manages development of the Stanford CoreNLP and Stanza software.
Academic Appointments
-
Professor, Linguistics
-
Professor, Computer Science
-
Senior Fellow, Institute for Human-Centered Artificial Intelligence (HAI)
-
Member, Bio-X
Administrative Appointments
-
Associate Director, Human-Centered Artificial Intelligence Initiative (HAI) (2018 - Present)
-
Director, Stanford Artificial Intelligence Laboratory (2018 - Present)
Honors & Awards
-
John von Neumann Medal, IEEE (2024)
-
Honorary Doctorate, University of Amsterdam (2023)
-
Fellow, ACL
-
Fellow, AAAI
-
Fellow, ACM
Program Affiliations
-
Symbolic Systems Program
Professional Education
-
PhD, Stanford University (1994)
2024-25 Courses
- Foundations of Linguistic Theory
LINGUIST 200 (Aut)
Independent Studies (25)
- Advanced Reading and Research
CS 499 (Aut, Win, Spr)
- Advanced Reading and Research
CS 499P (Aut, Win, Spr)
- Biomedical Informatics Teaching Methods
BIOMEDIN 290 (Aut, Win, Spr, Sum)
- Curricular Practical Training
CS 390A (Aut, Win, Spr)
- Curricular Practical Training
CS 390B (Aut, Win, Spr)
- Curricular Practical Training
CS 390C (Aut, Win, Spr)
- Directed Reading
LINGUIST 397 (Aut, Win, Spr)
- Directed Reading and Research
BIOMEDIN 299 (Aut, Win, Spr, Sum)
- Directed Research
LINGUIST 398 (Aut, Win, Spr)
- Dissertation Research
LINGUIST 399 (Aut, Win, Spr)
- Honors Research
LINGUIST 198 (Aut, Win, Spr)
- Independent Project
CS 399 (Aut, Win, Spr)
- Independent Project
CS 399P (Aut, Win, Spr)
- Independent Study
LINGUIST 199 (Aut, Win, Spr)
- Independent Work
CS 199 (Aut, Win, Spr)
- Independent Work
CS 199P (Aut, Win, Spr)
- M.A. Project
LINGUIST 390 (Aut, Win, Spr)
- Master's Degree Project
SYMSYS 290 (Aut, Win, Spr)
- Medical Scholars Research
BIOMEDIN 370 (Aut, Win, Spr, Sum)
- Part-time Curricular Practical Training
CS 390D (Aut, Win, Spr)
- Programming Service Project
CS 192 (Aut, Win, Spr)
- Research Projects in Linguistics
LINGUIST 396 (Win)
- Senior Project
CS 191 (Aut, Win, Spr)
- Supervised Undergraduate Research
CS 195 (Aut, Win, Spr)
- Writing Intensive Senior Research Project
CS 191W (Aut, Win, Spr)
Prior Year Courses
2023-24 Courses
- History of Natural Language Processing
CS 324H (Win)
- Natural Language Processing with Deep Learning
CS 224N, LINGUIST 284, SYMSYS 195N (Spr)
2022-23 Courses
- Foundations of Linguistic Theory
LINGUIST 200 (Aut)
- Natural Language Processing with Deep Learning
CS 224N, LINGUIST 284, SYMSYS 195N (Win)
2021-22 Courses
- Natural Language Processing with Deep Learning
CS 224N, LINGUIST 284, SYMSYS 195N (Win)
- Transformers United
CS 25 (Aut)
Stanford Advisees
-
Postdoctoral Faculty Sponsor
Robert Csordas
-
Doctoral Dissertation Advisor (AC)
Anna Goldie
-
Orals Evaluator
Omar Khattab
-
Master's Program Advisor
Stanley Cao, Riley Carlson, Jayendra Chauhan, Afnaan Hashmi, Mingjian Jiang, Yvette Lin, Juliana Ma, Wesley Tjangnaka, Matthew Villescas, Dylan Zhou
-
Doctoral Dissertation Co-Advisor (AC)
Moussa Doumbouya, Chaofei Fan, Demi Guo, Omar Khattab, Tolulope Ogunremi, Zen Wu
-
Doctoral (Program)
Anna Goldie, Lucia Zheng, Shikhar Murty
All Publications
-
Handwritten Code Recognition for Pen-and-Paper CS Education
ASSOC COMPUTING MACHINERY. 2024: 200-210
View details for DOI 10.1145/3657604.3662027
View details for Web of Science ID 001276257200003
-
ReCOGS: How Incidental Details of a Logical Form Overshadow an Evaluation of Semantic Interpretation
TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS
2023; 11: 1719-1733
View details for DOI 10.1162/tacl_a_00623
View details for Web of Science ID 001157063100002
-
Backpack Language Models
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2023: 9103-9125
View details for Web of Science ID 001190962500038
-
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
View details for Web of Science ID 001220818800032
-
Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models
ASSOC COMPUTING MACHINERY. 2023: 287-296
View details for DOI 10.1145/3600211.3604690
View details for Web of Science ID 001117838100023
-
Mini But Mighty: Efficient Multilingual Pretraining with Linguistically-Informed Data Selection
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2023: 1251-1266
View details for Web of Science ID 001181085100091
-
Grokking of Hierarchical Structure in Vanilla Transformers
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2023: 439-448
View details for Web of Science ID 001181088800038
-
Human Language Understanding & Reasoning
DAEDALUS
2022; 151 (2): 127-138
View details for DOI 10.1162/daed_a_01905
View details for Web of Science ID 000786702600009
-
Synthetic Disinformation Attacks on Automated Fact Verification Systems
ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2022: 10581-10589
View details for Web of Science ID 000893639103066
-
Memory-Based Model Editing at Scale
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2022
View details for Web of Science ID 000900064905041
-
Biomedical and clinical English model packages for the Stanza Python NLP library.
Journal of the American Medical Informatics Association : JAMIA
2021
Abstract
OBJECTIVE: The study sought to develop and evaluate neural natural language processing (NLP) packages for the syntactic analysis and named entity recognition of biomedical and clinical English text. MATERIALS AND METHODS: We implement and train biomedical and clinical English NLP pipelines by extending the widely used Stanza library originally designed for general NLP tasks. Our models are trained with a mix of public datasets such as the CRAFT treebank as well as with a private corpus of radiology reports annotated with 5 radiology-domain entities. The resulting pipelines are fully based on neural networks, and are able to perform tokenization, part-of-speech tagging, lemmatization, dependency parsing, and named entity recognition for both biomedical and clinical text. We compare our systems against popular open-source NLP libraries such as CoreNLP and scispaCy, state-of-the-art models such as the BioBERT models, and winning systems from the BioNLP CRAFT shared task. RESULTS: For syntactic analysis, our systems achieve much better performance compared with the released scispaCy models and CoreNLP models retrained on the same treebanks, and are on par with the winning system from the CRAFT shared task. For NER, our systems substantially outperform scispaCy, and are better or on par with the state-of-the-art performance from BioBERT, while being much more computationally efficient. CONCLUSIONS: We introduce biomedical and clinical NLP packages built for the Stanza library. These packages offer performance that is similar to the state of the art, and are also optimized for ease of use. To facilitate research, we make all our models publicly available. We also provide an online demonstration (http://stanza.run/bio).
View details for DOI 10.1093/jamia/ocab090
View details for PubMedID 34157094
-
Universal Dependencies
COMPUTATIONAL LINGUISTICS
2021; 47 (2): 255-308
View details for DOI 10.1162/COLI_a_00402
View details for Web of Science ID 000753208400002
-
Answering Open-Domain Questions of Varying Reasoning Steps from Text
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2021: 3599-3614
View details for Web of Science ID 000855966303064
-
DReCa: A General Task Augmentation Strategy for Few-Shot Natural Language Inference
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2021: 1113-1125
View details for Web of Science ID 000895685601021
-
Human-like informative conversations: Better acknowledgements using conditional mutual information
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2021: 768-781
View details for Web of Science ID 000895685600059
-
Challenges for Information Extraction from Dialogue in Criminal Law
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2021: 71-81
View details for Web of Science ID 000696679700008
-
Large-Scale Quantitative Evaluation of Dialogue Agents' Response Strategies against Offensive Users
ASSOC COMPUTATIONAL LINGUISTICS. 2021: 556-561
View details for Web of Science ID 000707001800058
-
Effective Social Chatbot Strategies for Increasing User Initiative
ASSOC COMPUTATIONAL LINGUISTICS. 2021: 99-110
View details for Web of Science ID 000707001800011
-
Understanding and predicting user dissatisfaction in a neural generative chatbot
ASSOC COMPUTATIONAL LINGUISTICS. 2021: 1-12
View details for Web of Science ID 000707001800001
-
Conditional probing: measuring usable information beyond a baseline
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2021: 1626-1639
View details for Web of Science ID 000855966301055
-
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
INTERNATIONAL CONFERENCE ON LEARNING REPRESENTATIONS (ICLR)
2020
View details for DOI 10.48550/arXiv.2003.10555
View details for Web of Science ID 001046971300001
-
Emergent linguistic structure in artificial neural networks trained by self-supervision.
Proceedings of the National Academy of Sciences of the United States of America
2020
Abstract
This paper explores the knowledge of linguistic structure learned by large artificial neural networks, trained via self-supervision, whereby the model simply tries to predict a masked word in a given context. Human language communication is via sequences of words, but language understanding requires constructing rich hierarchical structures that are never observed explicitly. The mechanisms for this have been a prime mystery of human language acquisition, while engineering work has mainly proceeded by supervised learning on treebanks of sentences hand labeled for this latent structure. However, we demonstrate that modern deep contextual language models learn major aspects of this structure, without any explicit supervision. We develop methods for identifying linguistic hierarchical structure emergent in artificial neural networks and demonstrate that components in these models focus on syntactic grammatical relationships and anaphoric coreference. Indeed, we show that a linear transformation of learned embeddings in these models captures parse tree distances to a surprising degree, allowing approximate reconstruction of the sentence tree structures normally assumed by linguists. These results help explain why these models have brought such large improvements across many language-understanding tasks.
View details for DOI 10.1073/pnas.1907367117
View details for PubMedID 32493748
-
Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection
EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA. 2020: 4034-4043
View details for Web of Science ID 000724697205003
-
Pre-Training Transformers as Energy-Based Cloze Models
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2020: 285-294
View details for Web of Science ID 000855160700020
-
RNNs can generate bounded hierarchical languages with optimal memory
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2020: 1978-2010
View details for Web of Science ID 000855160702013
-
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2020: 101–8
View details for Web of Science ID 000563368700014
-
Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2020: 752–65
View details for Web of Science ID 000570978201004
-
SLM: Learning a Discourse Language Representation with Sentence Unshuffling
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2020: 1551-1562
View details for Web of Science ID 000855160701060
-
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering
IEEE. 2019: 6693–6702
View details for DOI 10.1109/CVPR.2019.00686
View details for Web of Science ID 000542649300016
-
CoQA: A Conversational Question Answering Challenge
TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS
2019; 7: 249-266
View details for DOI 10.1162/tacl_a_00266
View details for Web of Science ID 000736523200016
-
A Structural Probe for Finding Syntax in Word Representations
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2019: 4129-4138
View details for Web of Science ID 000900116904031
-
Answering Complex Open-domain Questions Through Iterative Query Generation
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2019: 2590-2602
View details for Web of Science ID 000854193302069
-
What does BERT look at? An Analysis of BERT's Attention
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2019: 276–86
View details for DOI 10.18653/v1/w19-4828
View details for Web of Science ID 000538563900029
-
Learning by Abstraction: The Neural State Machine
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2019
View details for Web of Science ID 000534424305085
-
BAM! Born-Again Multi-Task Networks for Natural Language Understanding
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2019: 5931–37
View details for Web of Science ID 000493046108045
-
Simpler but More Accurate Semantic Dependency Parsing
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2018: 484–90
View details for Web of Science ID 000493913100077
-
Semi-Supervised Sequence Modeling with Cross-View Training
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2018: 1914-1925
View details for Web of Science ID 000865723402005
-
Graph Convolution over Pruned Dependency Trees Improves Relation Extraction
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2018: 2205-2215
View details for Web of Science ID 000865723402032
-
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2018: 2369-2380
View details for Web of Science ID 000865723402047
-
Textual Analogy Parsing: What's Shared and What's Compared among Analogous Facts
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2018: 82-92
View details for Web of Science ID 000865723400008
-
Arc-swift: A Novel Transition System for Dependency Parsing
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2017: 110–17
View details for DOI 10.18653/v1/P17-2018
View details for Web of Science ID 000493992300018
-
Key-Value Retrieval Networks for Task-Oriented Dialogue
ASSOC COMPUTATIONAL LINGUISTICS. 2017: 37-49
View details for Web of Science ID 000708086400007
-
Get To The Point: Summarization with Pointer-Generator Networks
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2017: 1073–83
View details for DOI 10.18653/v1/P17-1099
View details for Web of Science ID 000493984800099
-
Naturalizing a Programming Language via Interactive Learning
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2017: 929–38
View details for DOI 10.18653/v1/P17-1086
View details for Web of Science ID 000493984800086
-
A comparison of Named-Entity Disambiguation and Word Sense Disambiguation
EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA. 2016: 860–67
View details for Web of Science ID 000526952501013
-
A Fast Unified Model for Parsing and Sentence Understanding
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2016: 1466–77
View details for Web of Science ID 000493806800139
-
Learning Language Games through Interaction
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2016: 2368–78
View details for Web of Science ID 000493806800224
-
Understanding Human Language: Can NLP and Deep Learning Help?
ASSOC COMPUTING MACHINERY. 2016: 1
View details for DOI 10.1145/2911451.2926732
View details for Web of Science ID 000455100800001
-
Combining Natural Logic and Shallow Reasoning for Question Answering
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2016: 442–52
View details for Web of Science ID 000493806800042
-
A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2016: 2358–67
View details for Web of Science ID 000493806800223
-
Improving Coreference Resolution by Learning Entity-Level Distributed Representations
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2016: 643–53
View details for Web of Science ID 000493806800061
-
Universal Dependencies v1: A Multilingual Treebank Collection
EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA. 2016: 1659–66
View details for Web of Science ID 000526952501136
-
Enhanced English Universal Dependencies: An Improved Representation for Natural Language Understanding Tasks
EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA. 2016: 2371–78
View details for Web of Science ID 000526952502095
-
Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2016: 1054–63
View details for Web of Science ID 000493806800100
-
Computational Linguistics and Deep Learning
COMPUTATIONAL LINGUISTICS
2015; 41 (4): 701-707
View details for DOI 10.1162/COLI_a_00239
View details for Web of Science ID 000367813400006
-
Natural Language Translation at the Intersection of AI and HCI
COMMUNICATIONS OF THE ACM
2015; 58 (9): 47-54
View details for DOI 10.1145/2767151
View details for Web of Science ID 000360214000018
-
Advances in natural language processing
SCIENCE
2015; 349 (6245): 261-266
Abstract
Natural language processing employs computational techniques for the purpose of learning, understanding, and producing human language content. Early computational approaches to language research focused on automating the analysis of the linguistic structure of language and developing basic technologies such as machine translation, speech recognition, and speech synthesis. Today's researchers refine and make use of such tools in real-world applications, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services. We describe successes and challenges in this rapidly advancing area.
View details for DOI 10.1126/science.aaa8685
View details for Web of Science ID 000358218600041
View details for PubMedID 26185244
-
Text to 3D Scene Generation with Rich Lexical Grounding
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2015: 53–62
View details for Web of Science ID 000493808900006
-
Forum77: An Analysis of an Online Health Forum Dedicated to Addiction Recovery
ASSOC COMPUTING MACHINERY. 2015: 1511–26
View details for DOI 10.1145/2675133.2675146
View details for Web of Science ID 000371990400129
-
Robust Subgraph Generation Improves Abstract Meaning Representation Parsing
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2015: 982–91
View details for Web of Science ID 000493808900095
-
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2015: 1556–66
View details for Web of Science ID 000493808900150
-
Leveraging Linguistic Structure For Open Domain Information Extraction
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2015: 344–54
View details for Web of Science ID 000493808900034
-
On-the-Job Learning with Bayesian Decision Theory
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2015
View details for Web of Science ID 000450913102009
-
Entity-Centric Coreference Resolution with Model Stacking
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2015: 1405–15
View details for Web of Science ID 000493808900136
-
Induced lexico-syntactic patterns improve information extraction from online medical forums.
Journal of the American Medical Informatics Association
2014; 21 (5): 902-909
Abstract
To reliably extract two entity types, symptoms and conditions (SCs), and drugs and treatments (DTs), from patient-authored text (PAT) by learning lexico-syntactic patterns from data annotated with seed dictionaries. Despite the increasing quantity of PAT (eg, online discussion threads), tools for identifying medical entities in PAT are limited. When applied to PAT, existing tools either fail to identify specific entity types or perform poorly. Identification of SC and DT terms in PAT would enable exploration of efficacy and side effects for not only pharmaceutical drugs, but also for home remedies and components of daily care. We use SC and DT term dictionaries compiled from online sources to label several discussion forums from MedHelp (http://www.medhelp.org). We then iteratively induce lexico-syntactic patterns corresponding strongly to each entity type to extract new SC and DT terms. Our system is able to extract symptom descriptions and treatments absent from our original dictionaries, such as 'LADA', 'stabbing pain', and 'cinnamon pills'. Our system extracts DT terms with 58-70% F1 score and SC terms with 66-76% F1 score on two forums from MedHelp. We show improvements over MetaMap, OBA, a conditional random field-based classifier, and a previous pattern learning approach. Our entity extractor based on lexico-syntactic patterns is a successful and preferable technique for identifying specific entity types in PAT. To the best of our knowledge, this is the first paper to extract SC and DT entities from PAT. We exhibit learning of informal terms often used in PAT but missing from typical dictionaries.
View details for DOI 10.1136/amiajnl-2014-002669
View details for PubMedID 24970840
View details for PubMedCentralID PMC4147618
-
Two Knives Cut Better Than One: Chinese Word Segmentation with Dual Decomposition
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2014: 193–98
View details for Web of Science ID 000493811100032
-
Robust Logistic Regression using Shift Parameters
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2014: 124-129
View details for Web of Science ID 000493811100021
-
Word Segmentation of Informal Arabic with Domain Adaptation
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2014: 206-211
View details for Web of Science ID 000493811100034
-
TransPhoner: Automated Mnemonic Keyword Generation
ASSOC COMPUTING MACHINERY. 2014: 3725-3734
View details for DOI 10.1145/2556288.2556985
View details for Web of Science ID 000773858603083
-
Faster Phrase-Based Decoding by Refining Feature State
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2014: 130-135
View details for Web of Science ID 000493811100022
-
Natural Logic and Natural Language Inference
COMPUTING MEANING, VOL 4
2014; 47: 129–47
View details for DOI 10.1007/978-94-007-7284-7_8
View details for Web of Science ID 000336508700008
-
A Gold Standard Dependency Corpus for English
EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA. 2014: 2897–2904
View details for Web of Science ID 000355611004085
-
Event Extraction Using Distant Supervision
EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA. 2014: 4527–31
View details for Web of Science ID 000355611006024
-
Universal Stanford Dependencies: A cross-linguistic typology
EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA. 2014: 4585–92
View details for Web of Science ID 000355611006033
-
Learning Distributed Representations for Structured Output Prediction
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2014
View details for Web of Science ID 000452647100102
-
Global Belief Recursive Neural Networks
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2014
View details for Web of Science ID 000452647100054
-
Simple MAP Inference via Low-Rank Relaxations
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2014
View details for Web of Science ID 000452647100040
-
The Stanford CoreNLP Natural Language Processing Toolkit
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2014: 55–60
View details for DOI 10.3115/v1/p14-5010
View details for Web of Science ID 000538328300010
-
Differentiating language usage through topic models
POETICS
2013; 41 (6): 607-625
View details for DOI 10.1016/j.poetic.2013.06.004
View details for Web of Science ID 000329558200003
-
Parsing Models for Identifying Multiword Expressions
COMPUTATIONAL LINGUISTICS
2013; 39 (1): 195-227
View details for Web of Science ID 000315648000009
- The Efficacy of Human Post-editing for Language Translation. CHI 2013
- Learning a Product of Experts with Elitist Lasso. 2013
- Bilingual Word Embeddings for Phrase-Based Machine Translation 2013
- Effect of Nonlinear Deep Architecture in Sequence Labeling. 2013
- Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank 2013
- Named Entity Recognition with Bilingual Constraints. 2013
- Effective Bilingual Constraints for Semi-supervised Learning of Named Entity Recognizers. 2013
- Parsing With Compositional Vector Grammars 2013
- Differentiating Language Usage through Topic Models Poetics Daniel A. McFarland, Daniel Ramage, Jason Chuang, Jeffrey Heer, and Christopher D. Manning 2013
- Fast and Adaptive Online Training of Feature-Rich Translation Models. Proc. ACL 2013
- Joint Word Alignment and Bilingual Named Entity Recognition Using Dual Decomposition. 2013
- Same Referent, Different Words: Unsupervised Mining of Opaque Coreferent Mentions. 2013
- Tradition and Modernity in 20th Century Chinese Poetry. 2013
- Generating Recommendation Dialogs by Extracting Information from User Reviews. 2013
- Linguistic Models for Analyzing and Detecting Biased Language. 2013
- A computational approach to politeness with application to social factors 2013
- Philosophers are Mortal: Inferring the Truth of Unseen Facts. 2013
- Semantic Parsing on Freebase from Question-Answer Pairs. 2013
- Learning Biological Processes with Global Constraints. 2013
- Language-Independent Discriminative Parsing of Temporal Expressions 2013
- Parsing entire discourses as very long strings: Capturing topic continuity in grounded language learning. Transactions of the Association for Computational Linguistics 2013; 3 (1): 315-323
- Zero Shot Learning Through Cross-Modal Transfer. In Advances in Neural Information Processing Systems 2013; 26
- The Life and Death of Discourse Entities: Identifying Singleton Mentions 2013
- Implicatures and Nested Beliefs in Approximate Decentralized-POMDPs. 2013
- Learning New Facts From Knowledge Bases With Neural Tensor Networks and Semantic Word Vectors 2013
- Better Word Representations with Recursive Neural Networks for Morphology. 2013
- No country for old members: User lifecycle and linguistic change in online communities. 2013
- Breaking Out of Local Optima with Count Transforms and Model Recombination: A Study in Grammar Induction. 2013
- Reasoning With Neural Tensor Networks For Knowledge Base Completion. In Advances in Neural Information Processing Systems 2013; 26
- Crowdsourcing and the Crisis-affected Population. Information Retrieval 2013; 2 (16): 210-266
-
"Without the Clutter of Unimportant Words": Descriptive Keyphrases for Text Visualization
ACM TRANSACTIONS ON COMPUTER-HUMAN INTERACTION
2012; 19 (3)
View details for DOI 10.1145/2362364.2362367
View details for Web of Science ID 000310780700003
-
Combining joint models for biomedical event extraction
Conference on BioNLP Shared Task
BIOMED CENTRAL LTD. 2012
Abstract
We explore techniques for performing model combination between the UMass and Stanford biomedical event extraction systems. Both sub-components address event extraction as a structured prediction problem, and use dual decomposition (UMass) and parsing algorithms (Stanford) to find the best scoring event structure. Our primary focus is on stacking where the predictions from the Stanford system are used as features in the UMass system. For comparison, we look at simpler model combination techniques such as intersection and union which require only the outputs from each system and combine them directly. First, we find that stacking substantially improves performance while intersection and union provide no significant benefits. Second, we investigate the graph properties of event structures and their impact on the combination of our systems. Finally, we trace the origins of events proposed by the stacked model to determine the role each system plays in different components of the output. We learn that, while stacking can propose novel event structures not seen in either base model, these events have extremely low precision. Removing these novel events improves our already state-of-the-art F1 to 56.6% on the test set of Genia (Task 1). Overall, the combined system formed via stacking ("FAUST") performed well in the BioNLP 2011 shared task. The FAUST system obtained 1st place in three out of four tasks: 1st place in Genia Task 1 (56.0% F1) and Task 2 (53.9%), 2nd place in the Epigenetics and Post-translational Modifications track (35.0%), and 1st place in the Infectious Diseases track (55.6%). We present a state-of-the-art event extraction system that relies on the strengths of structured prediction and model combination through stacking. Akin to results on other tasks, stacking outperforms intersection and union and leads to very strong results.
The utility of model combination hinges on complementary views of the data, and we show that our sub-systems capture different graph properties of event structures. Finally, by removing low precision novel events, we show that performance from stacking can be further improved.
View details for Web of Science ID 000306140800009
View details for PubMedID 22759463
View details for PubMedCentralID PMC3395172
-
Did It Happen? The Pragmatic Complexity of Veridicality Assessment
COMPUTATIONAL LINGUISTICS
2012; 38 (2): 301-333
View details for Web of Science ID 000306100900003
- Capitalization Cues Improve Dependency Grammar Induction. 2012
-
Short message communications: users, topics, and in-language processing
ASSOC COMPUTING MACHINERY. 2012
View details for Web of Science ID 000395793500004
- Towards a Literary Machine Translation: The Role of Referential Cohesion 2012
- Multi-instance Multi-label Learning for Relation Extraction 2012
- Learning Constraints for Consistent Timeline Extraction. 2012
- SPEDE: Probabilistic Edit Distance Metrics for MT Evaluation. 2012
- A Comparison of Chinese Parsers for Stanford Dependencies 2012
- Stanford’s System for Parsing the English Web 2012
- Improving Word Representations via Global Context and Multiple Word Prototypes. 2012
- Semantic Compositionality Through Recursive Matrix-Vector Spaces 2012
- Towards a Computational History of the ACL: 1980-2008. 2012
- Joint Entity and Event Coreference Resolution across Documents. 2012
- Entity Clustering Across Languages 2012
- A Cross-Lingual Dictionary for English Wikipedia Concepts 2012
- Annotating Near-Identity from Coreference Disagreements. 2012
- Baselines and Bigrams: Simple, Good Sentiment and Topic Classification. 2012
- Probabilistic Finite State Machines for Regression-based MT Evaluation. 2012
- Three Dependency-and-Boundary Models for Grammar Induction 2012
- Automatic animacy classification 2012
- Citation-based bootstrapping for large-scale author disambiguation Journal of the American Society for Information Science and Technology 2012; 5 (63): 301-333
- Coreference resolution: an empirical study based on SemEval-2010 shared Task 1 Language Resources and Evaluation. 2012
- Learning Attitudes and Attributes from Multi-Aspect Reviews In International Conference on Data Mining. 2012
- Without the clutter of unimportant words ACM Transactions on Computer-Human Interaction (TOCHI) 2012; 3 (19)
- Stanford: Probabilistic Edit Distance Metrics for STS. 2012
- A Computational Analysis of Style, Affect, and Imagery in Contemporary Poetry. 2012
- Bootstrapping Dependency Grammar Inducers from Incomplete Sentence Fragments via Austere Models 2012
- Parsing Time: Learning to Interpret Time Expressions 2012
- He Said, She Said: Gender in the ACL Anthology 2012
-
SUTIME: A Library for Recognizing and Normalizing Time Expressions
8th International Conference on Language Resources and Evaluation (LREC)
EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA. 2012: 3735–3740
View details for Web of Science ID 000323927703131
- SUTIME: A Library for Recognizing and Normalizing Time Expressions. In Eighth International Conference on Language Resources and Evaluation (LREC 2012) 2012
- Convolutional-Recursive Deep Learning for 3D Object Classification. In Advances in Neural Information Processing Systems 2012; 25
-
Termite: Visualization Techniques for Assessing Textual Topic Models
International Working Conference on Advanced Visual Interfaces (AVI)
ASSOC COMPUTING MACHINERY. 2012: 74–77
View details for DOI 10.1145/2254556.2254572
View details for Web of Science ID 000323214900014
-
Part-of-Speech Tagging from 97% to 100%: Is It Time for Some Linguistics?
12th Annual Conference on Intelligent Text Processing and Computational Linguistics
SPRINGER-VERLAG BERLIN. 2011: 171–189
View details for Web of Science ID 000302390500014
-
Veridicality and utterance understanding
IEEE COMPUTER SOC. 2011: 430–37
View details for DOI 10.1109/ICSC.2011.10
View details for Web of Science ID 000410187400074
- Punctuation: Making a Point in Unsupervised Dependency Parsing 2011
- Lateen EM: Unsupervised Training with Multiple Objectives, Applied to Dependency Grammar Induction. 2011
- Stanford's Distantly-Supervised Slot-Filling System. 2011
- Parsing Natural Scenes and Natural Language with Recursive Neural Networks. 2011
- Analyzing the Dynamics of Research by Extracting Key Aspects of Scientific Papers. 2011
- Model Combination for Event Extraction in BioNLP 2011 2011
- Stanford-UBC Entity Linking at TAC-KBP, Again 2011
- Event Extraction as Dependency Parsing 2011
- Subword and spatiotemporal models for identifying actionable information in Haitian Kreyol. 2011
- Unsupervised Dependency Parsing without Gold Part-of-Speech Tags. 2011
- Strong Baselines for Cross-Lingual Entity Linking. 2011
- LeadLag LDA: Estimating Topic Specific Leads and Lags of Information Outlets 2011
- Spectral Chinese Restaurant Processes: Nonparametric Clustering Based on Similarities 2011
- Event Extraction as Dependency Parsing for BioNLP 2011 2011
- Customizing an Information Extraction System to a New Domain 2011
- A Study of Academic Collaborations in Computational Linguistics using a Latent Mixture of Authors Model 2011
- Stanford's Multi-Pass Sieve Coreference Resolution System at the CoNLL-2011 Shared Task 2011
- Template-Based Information Extraction without the Templates. 2011
- Parsing Natural Scenes and Natural Language with Recursive Neural Networks 2011
- Multiword Expression Identification with Tree Substitution Grammars: A Parsing tour de force with French In EMNLP. 2011
- Risk Analysis for Intellectual Property Litigation 2011
- Veridicality and utterance understanding. 2011
- The Role of Social Networks in Online Shopping: Information Passing, Price of Trust, and Consumer Choice 2011
- Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions 2011
- Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection Advances in Neural Information Processing Systems 2011
- Using Evolutive Summary Counters for Efficient Cooperative Caching in Search Engines IEEE Transactions on Parallel and Distributed Systems 99(PrePrints) 2011
- Learning to Rank Answers to Non-Factoid Questions from Web Collections Computational Linguistics 2011; 2 (37)
- Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection In Advances in Neural Information Processing Systems 2011; 24
- TopicFlow model: Unsupervised learning of topic specific influences of hyperlinked documents Artificial Intelligence and Statistics 2011
- Veridicality and utterance understanding CA: IEEE Computer Society Press 2011
-
Assessing the relationship between excess argon content and recrystallization of ultrahigh-pressure metamorphic rocks
Conference on Goldschmidt 2010 - Earth, Energy, and the Environment
PERGAMON-ELSEVIER SCIENCE LTD. 2010: A698–A698
View details for Web of Science ID 000283941402100
-
Which words are hard to recognize? Prosodic, lexical, and disfluency factors that increase speech recognition error rates
SPEECH COMMUNICATION
2010; 52 (3): 181-200
View details for DOI 10.1016/j.specom.2009.10.001
View details for Web of Science ID 000274888900001
- The NXT-format Switchboard Corpus: a rich resource for investigating the syntax, semantics, pragmatics and prosody of dialogue. Language Resources & Evaluation 44: 2010: 387-419
-
Parsing to Stanford Dependencies: Trade-offs between speed and accuracy
EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA. 2010
View details for Web of Science ID 000356879504010
-
Legal Claim Identification: Information Extraction with Hierarchically Labeled Data
EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA. 2010: I22–I29
View details for Web of Science ID 000356879501131
-
Hierarchical Joint Learning: Improving Joint Parsing and Named Entity Recognition with Non-Jointly Labeled Data
ASSOC COMPUTATIONAL LINGUISTICS. 2010: 720–28
View details for Web of Science ID 000391195300074
-
"Was it good? It was provocative." Learning the meaning of scalar adjectives
ASSOC COMPUTATIONAL LINGUISTICS. 2010: 167–76
View details for Web of Science ID 000391195300018
- Characterizing Microblogs with Topic Models 2010
- Learning Continuous Phrase Representations and Syntactic Parsing with Recursive Neural Networks 2010
- Legal Claim Identification: Information Extraction with Hierarchically Labeled Data 2010
- Learning to Follow Navigational Directions 2010
- Hierarchical Joint Learning: Improving Joint Parsing and Named Entity Recognition with Non-Jointly Labeled Data 2010
- Accurate Non-Hierarchical Phrase-Based Translation 2010
- Better Arabic Parsing: Baselines, Evaluations, and Analysis 2010
- Who should I cite? Learning literature search models from citation behavior. 2010
- Automatic Domain Adaptation for Parsing 2010
- How good are humans at solving CAPTCHAs? A large scale evaluation 2010
- Improved Models of Distortion Cost for Statistical Machine Translation 2010
- Ensemble Models for Dependency Parsing: Cheap and Good? 2010
- Stanford-UBC Entity Linking at TAC-KBP 2010
- Parsing to Stanford Dependencies: Trade-offs between speed and accuracy 2010
- Improving Semantic Role Classification with Selectional Preferences. 2010
- Profiting from Mark-Up: Hyper-Text Annotations for Guided Parsing 2010
- Was it good? It was provocative. Learning the meaning of scalar adjectives 2010
- Improving the Use of Pseudo-Words for Evaluating Selectional Preferences 2010
- Probabilistic Tree-Edit Models with Structured Latent Variables for Textual Entailment and Question Answering 2010
- A Simple Distant Supervision Approach for the TAC-KBP Slot Filling Task 2010
- Phrasal: a toolkit for statistical machine translation with facilities for extraction and incorporation of arbitrary model features 2010
- A Database of Narrative Schemas 2010
- Was it good? It was provocative. Learning the meaning of scalar adjectives 2010
- From Baby Steps to Leapfrog: How “Less is More” in Unsupervised Dependency Parsing 2010
- The best lexical metric for phrase-based statistical MT system optimization 2010
- Viterbi Training Improves Unsupervised Dependency Parsing 2010
- A Multi-Pass Sieve for Coreference Resolution 2010
- Subword Variation in Text Message Classification 2010
- Crowdsourced translation for emergency response in Haiti: the global collaboration of local knowledge 2010
-
Measuring Machine Translation Quality as Semantic Equivalence: A Metric Based on Entailment Features
Machine Translation
2010; 23: 181-193
View details for DOI 10.1007/s10590-009-9060-y
- Phrasal: A Toolkit for Statistical Machine Translation with Facilities for Extraction and Incorporation of Arbitrary Model Features 2010
- Predictability Effects on Durations of Content and Function Words in Conversational English. Journal of Memory and Language 2009; 1 (60): 92-111
- Joint Parsing and Named Entity Recognition 2009
- Clustering the Tagged Web. 2009
- Hidden Conditional Random Fields for Phone Recognition 2009
- Robust Machine Translation Evaluation with Entailment Features 2009
- Distant supervision for relation extraction without labeled data 2009
- Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora 2009
- NP subject detection in verb-initial Arabic clauses 2009
- Topic Modeling for the Social Sciences 2009
- Disambiguating “DE” for Chinese-English Machine Translation 2009
- Extracting Social Meaning: Identifying Interactional Style in Spoken Conversation 2009
- An extended model of natural logic 2009
- Textual Entailment Features for Machine Translation Evaluation 2009
- Discriminative Reordering with Chinese Grammatical Relations Features. 2009
- Quadratic-Time Dependency Parsing for Machine Translation 2009
- Multi-word expressions in textual inference: Much ado about nothing? 2009
- WikiWalk: Random walks on Wikipedia for Semantic Relatedness 2009
- Random Walks for Text Semantic Similarity 2009
- Stanford-UBC at TAC-KBP 2009
- Unsupervised Learning of Narrative Schemas and their Participants 2009
- Revisiting Graphemes with Increasing Amounts of Data 2009
- Clustering the Tagged Web 2009
- Discriminative Reordering with Chinese Grammatical Relations Features 2009
- Nested Named Entity Recognition 2009
- Baby Steps: How “Less is More” in Unsupervised Dependency Parsing 2009
- Disambiguating DE for Chinese-English Machine Translation 2009
- Stanford University's Arabic-to-English Statistical Machine Translation System for the 2009 NIST Evaluation The 2009 NIST Open Machine Translation Evaluation Workshop 2009
- Hierarchical Bayesian Domain Adaptation. 2009
- It's Not You, it's Me: Detecting Flirting and its Misperception in Speed-Dates 2009
-
A global joint model for semantic role labeling
COMPUTATIONAL LINGUISTICS
2008; 34 (2): 161-191
View details for Web of Science ID 000257139400002
- The Stanford typed dependencies representation 2008
- Studying the History of Ideas Using Topic Models. 2008
- Modeling semantic containment and exclusion in natural language inference 2008
- Optimizing Chinese Word Segmentation for Machine Translation Performance. 2008
- Parsing Three German Treebanks: Lexicalized and Unlexicalized Baselines. 2008
- Deciding Entailment and Contradiction with Stochastic and Edit Distance-based Alignment 2008
- Legal Docket Classification: Where Machine Learning Stumbles 2008
- A phrase-based alignment model for natural language inference 2008
- Maximum Conditional Likelihood Linear Regression and Maximum a Posteriori for Hidden Conditional Random Fields Speaker Adaptation 2008
- Detecting prominence in conversational speech: pitch accent, givenness and focus 2008
- Which words are hard to recognize? Lexical, prosodic, and disfluency factors that increase ASR error rates 2008
- Finding Contradictions in Text 2008
- Social Tag Prediction 2008
- Semantic Role Assignment for Event Nominalisations by Leveraging Verbal Data 2008
- Jointly Combining Implicit Constraints Improves Temporal Ordering 2008
- Regularization and Search for Minimum Error Rate Training 2008
- Studying the History of Ideas Using Topic Models 2008
- A Structured Vector Space Model for Word Meaning in Context 2008
- Efficient, Feature-based, Conditional Random Field Parsing. 2008
- Speech and Language Processing: An Introduction to Natural Language Processing, Speech Recognition, and Computational Linguistics. Prentice-Hall. 2008
- Constructing Integrated Corpus and Lexicon Models for Multi-Layer Annotation in OWL DL Linguistic Issues in Language Technologies 2008; 1: 1-33
- Comparing and Combining Semantic Verb Classifications. Journal of Language Resources and Evaluation 2008; 3 (42)
- Introduction to Information Retrieval. Cambridge: Cambridge University Press. 2008
- A Simple and Effective Hierarchical Phrase Reordering Model 2008
- Enforcing Transitivity in Coreference Resolution 2008
- Cheap and Fast - But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks 2008
- Unsupervised Learning of Narrative Event Chains 2008
- Optimizing Chinese Word Segmentation for Machine Translation Performance 2008
- Enforcing Transitivity in Coreference Resolution. 2008
- Which Words Are Hard to Recognize? Prosodic, Lexical, and Disfluency Factors that Increase ASR Error Rates 2008
- Efficient, Feature-based, Conditional Random Field Parsing 2008
-
Regularization, adaptation, and non-independent features improve Hidden Conditional Random Fields for phone classification
IEEE Workshop on Automatic Speech Recognition and Understanding
IEEE. 2007: 347–352
View details for Web of Science ID 000255861600062
- Modelling Prominence and Emphasis Improves Unit-Selection Synthesis 2007
- Aligning semantic graphs for textual inference and machine reading. 2007
- Measuring Importance and Query Relevance in Topic-focused Multi-document Summarization 2007
- Classifying Temporal Relations Between Events 2007
- Learning Alignments and Leveraging Natural Logic 2007
- Aligning semantic graphs for textual inference and machine reading 2007
- The Infinite Tree 2007
- Learning to Merge Word Senses 2007
- The Effect of Lexical Frequency on Tone Production 2007
- Lexical Semantic Relatedness with Random Graph Walks 2007
- Disambiguating Between Generic and Referential “You” in Dialog 2007
- Natural logic for textual inference 2007
- A fully Bayesian approach to unsupervised part-of-speech tagging. 2007
- A Discriminative Syntactic Word Order Model for Machine Translation 2007
- Natural logic for textual inference. 2007
-
Probabilistic models of language processing and acquisition
Workshop on Probabilistic Models of Cognition - The Mathematics of Mind
ELSEVIER SCIENCE LONDON. 2006: 335–44
Abstract
Probabilistic methods are providing new explanatory approaches to fundamental cognitive science questions of how humans structure, process and acquire language. This review examines probabilistic models defined over traditional symbolic structures. Language comprehension and production involve probabilistic inference in such models; and acquisition involves choosing the best model, given innate constraints and linguistic and other input. Probabilistic models can account for the learning and processing of language, while maintaining the sophistication of symbolic models. A recent burgeoning of theoretical developments and online corpus creation has enabled large models to be tested, revealing probabilistic constraints in processing, undermining acquisition arguments based on a perceived poverty of the stimulus, and suggesting fruitful links with probabilistic theories of categorization and ambiguity resolution in perception.
View details for DOI 10.1016/j.tics.2006.05.006
View details for Web of Science ID 000239648200008
View details for PubMedID 16784883
-
An Effective Two-Stage Model for Exploiting Non-Local Dependencies in Named Entity Recognition
21st International Conference on Computational Linguistics/44th Annual Meeting of the Association for Computational Linguistics
ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2006: 1121–1128
View details for Web of Science ID 000274500200141
-
Graphical model representations of word lattices
IEEE. 2006: 162-+
View details for DOI 10.1109/SLT.2006.326842
View details for Web of Science ID 000245891500040
- Detection of Word Fragments in Mandarin Telephone Conversation 2006
- Learning to distinguish valid textual entailments 2006
- Automatically Detecting Action Items in Audio Meeting Recordings 2006
- Unsupervised Discovery of a Statistical Verb Lexicon. 2006
- Tregex and Tsurgeon: tools for querying and manipulating tree data structures. 2006
- Generating Typed Dependency Parses from Phrase Structure Parses 2006
- Solving the Problem of Cascading Errors: Approximate Bayesian Inference for Linguistic Annotation Pipelines 2006
- The (Non)Utility of Linguistic Features for Predicting Prominence in Spontaneous Speech 2006
- Learning to recognize features of valid textual entailments. 2006
- Learning to distinguish valid textual entailments. 2006
- Solving the Problem of Cascading Errors: Approximate Bayesian Inference for Linguistic Annotation Pipelines. 2006
- Graphical Model Representations of Word Lattices 2006
- Ergativity Encyclopedia of Language & Linguistics, Second Edition. edited by Brown, K. Oxford: Elsevier. 2006: 210–217
- Learning to recognize features of valid textual entailments 2006
- Semantic Taxonomy Induction from Heterogenous Evidence 2006
- Local Textual Inference: It's hard to circumscribe, but you know it when you see it - and NLP needs it MS, Stanford University 2006
- Unsupervised Discovery of a Statistical Verb Lexicon 2006
-
Programming for linguists: Java (TM) technology for language researchers. (Book Review)
LANGUAGE
2005; 81 (3): 740-742
View details for Web of Science ID 000232076300010
-
Natural language grammar induction with a generative constituent-context model
40th Annual Meeting of the Association-for-Computational-Linguistics
ELSEVIER SCI LTD. 2005: 1407–19
View details for DOI 10.1016/j.patcog.2004.03.023
View details for Web of Science ID 000230047900007
-
Exploring the boundaries: gene and protein identification in biomedical text
BMC BIOINFORMATICS
2005; 6
Abstract
Good automatic information extraction tools offer hope for automatic processing of the exploding biomedical literature, and successful named entity recognition is a key component for such tools. We present a maximum-entropy based system incorporating a diverse set of features for identifying gene and protein names in biomedical abstracts. This system was entered in the BioCreative comparative evaluation and achieved a precision of 0.83 and recall of 0.84 in the "open" evaluation and a precision of 0.78 and recall of 0.85 in the "closed" evaluation. Central contributions are rich use of features derived from the training data at multiple levels of granularity, a focus on correctly identifying entity boundaries, and the innovative use of several external knowledge sources including full MEDLINE abstracts and web searches.
View details for DOI 10.1186/1471-2105-6-S1-S5
View details for Web of Science ID 000236061400005
View details for PubMedID 15960839
View details for PubMedCentralID PMC1869019
-
A system for identifying named entities in biomedical text: how results from two evaluations reflect on both the system and the evaluations
ISMB BioLink 2004 Meeting
HINDAWI PUBLISHING CORPORATION. 2005: 77–85
Abstract
We present a maximum entropy-based system for identifying named entities (NEs) in biomedical abstracts and present its performance in the only two biomedical named entity recognition (NER) comparative evaluations that have been held to date, namely BioCreative and Coling BioNLP. Our system obtained an exact match F-score of 83.2% in the BioCreative evaluation and 70.1% in the BioNLP evaluation. We discuss our system in detail, including its rich use of local features, attention to correct boundary identification, innovative use of external knowledge resources, including parsing and web searches, and rapid adaptation to new NE sets. We also discuss in depth problems with data annotation in the evaluations which caused the final performance to be lower than optimal.
View details for DOI 10.1002/cfg.457
View details for Web of Science ID 000227860600007
View details for PubMedID 18629295
View details for PubMedCentralID PMC2448599
- Robust Textual Inference using Diverse Knowledge Sources 2005
- Incorporating non-local information into information extraction systems by Gibbs sampling. 2005
- Accent Detection and Speech Recognition for Shanghai-Accented Mandarin 2005
- A Joint Model for Semantic Role Labeling 2005
- A Conditional Random Field Word Segmenter 2005
- Stochastic HPSG Parse Disambiguation using the Redwoods Corpus. Research in Language and Computation 2005
- LinGO Redwoods: A Rich and Dynamic Treebank for HPSG Research in Language and Computation 2005
- Robust Textual Inference via Graph Matching HLT-EMNLP 2005: 387-394
- Pitch Accent Prediction: Effects of Genre and Speaker 2005
- The Detection of Emphatic Words Using Acoustic and Lexical Features 2005
- Unsupervised learning of field segmentation models for information extraction 2005
- Stochastic HPSG Parse Disambiguation using the Redwoods Corpus 2005
- Morphological features help POS tagging of unknown words across language varieties 2005
- Unsupervised Learning of Field Segmentation Models for Information Extraction 2005
- Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling. 2005
- Template Sampling for Leveraging Domain Knowledge in Information Extraction 2005
- A preliminary study of Mandarin filled pauses. 2005
- Robust textual inference via learning and abductive reasoning. 2005
- Joint learning improves semantic role labeling 2005
- Semantic Role Labeling Using Different Syntactic Views 2005
- Joint Learning Improves Semantic Role Labeling 2005
- Learning syntactic patterns for automatic hypernym discovery 2005
- A Rich and Dynamic Treebank for HPSG 2005
-
How useful and usable are dictionaries for speakers of Australian indigenous languages?
INTERNATIONAL JOURNAL OF LEXICOGRAPHY
2004; 17 (1): 33-68
View details for Web of Science ID 000222885100002
-
Using feature conjunctions across examples for learning pairwise classifiers
15th European Conference on Machine Learning/8th European Conference on Principles and Practice of Knowledge Discovery in Databases
SPRINGER-VERLAG BERLIN. 2004: 322–333
View details for Web of Science ID 000223999500031
-
Log-linear models for label ranking
MIT PRESS. 2004: 497-504
View details for Web of Science ID 000225309500063
- Solving logic puzzles: from robust processing to precise semantics. 2004
- Deep dependencies from context-free statistical parsers: correcting the surface dependency approximation. 2004
- A System For Identifying Named Entities in Biomedical Text: How Results From Two Evaluations Reflect on Both the System and the Evaluations 2004
- Exploiting Context for Biomedical Entity Recognition: From Syntax to the Web 2004
- Shallow semantic parsing using support vector machines. 2004
- Deep dependencies from context-free statistical parsers: correcting the surface dependency approximation 2004
- Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency 2004
- Solving Logic Puzzles: From Robust Processing to Precise Semantics 2004
- Max-Margin Parsing 2004
- Log-Linear Models for Label Ranking Advances in Neural Information Processing Systems 16 (NIPS 2003). edited by Thrun, S., Saul, Lawrence, K., Schölkopf, B. Cambridge, MA: MIT Press. 2004: 497–504
- Corpus-Based Induction of Syntactic Structure: Models of Constituency and Dependency Language Learning: An Interdisciplinary Perspective. edited by Cohen, P., Clark, A., Hovy, E. AAAI Spring Symposium. 2004: 32–38
- Exploring Sentiment Summarization Exploring Attitude and Affect in Text: Theories and Applications edited by Qu, I. Y., Shanahan, J., Wiebe, J. AAAI Spring Symposium Technical Report SS-04-07. 2004: 12–15
- Automatic extraction of opinion propositions and their holders 2004
- Parsing arguments of nominalizations in English and Chinese. 2004
- Verb Sense and Subcategorization: Using Joint Inference to Improve Performance on Complementary Tasks 2004
- Parsing and Hypergraphs. New Developments in Parsing Technology. edited by Bunt, H., Carroll, J., Satta, G. Dordrecht: Kluwer Academic Publishers. 2004: 351–372
- Automatic tagging of arabic text: from raw text to base phrase chunks. 2004
- Log-Linear Models for Label Ranking Advances in Neural Information Processing Systems 16 (NIPS 2003) edited by Thrun, S., Saul, Lawrence, K., Schölkopf, B. Cambridge. 2004: 497–504
- Exploring the Boundaries: Gene and Protein Identification in Biomedical Text 2004
- Learning Random Walk Models for Inducing Word Dependency Distributions 2004
- The Leaf Projection Path View of Parse Trees: Exploring String Kernels for HPSG Parse Selection 2004
-
Parsing and hypergraphs
7th International Workshop on Parsing Technology
SPRINGER. 2004: 351–372
View details for Web of Science ID 000223030900018
-
Is it harder to parse Chinese, or the Chinese treebank?
41st Annual Meeting of the Association-for-Computational-Linguistics
ASSOCIATION COMPUTATIONAL LINGUISTICS. 2003: 439–446
View details for Web of Science ID 000223097500056
- Spectral Learning 2003
- The EigenTrust Algorithm for Reputation Management in P2P Networks 2003
- Addressing the Non-Cooperation Problem in Competitive P2P Networks 2003
- Incentives for Combatting Freeriding on P2P Networks 2003
- Adaptive Methods for the Computation of PageRank In Linear Algebra and its Applications, Special Issue on the Numerical Solution of Markov Chains 2003
- The Second Eigenvalue of the Google Matrix Stanford University Technical Report, March 2003
- Named Entity Recognition with Character-Level Models 2003
- Fast Exact Inference with a Factored Model for Natural Language Parsing Advances in Neural Information Processing Systems 15 (NIPS 2002) edited by Becker, S., Thrun, S., Obermayer, K. Cambridge, MA: MIT Press. 2003: 3–10
- An Analytical Comparison of Approaches to Personalizing PageRank Stanford University Technical Report, June 2003
- Parse Selection on the Redwoods Corpus: 3rd Growth Results. Technical report dbpubs 2003-64, Stanford University. 2003
- Extrapolation Methods for Accelerating PageRank Computations 2003
- Extrapolation Methods for Accelerating PageRank Computations. 2003
- Factored A* Search for Models over Sequences and Trees 2003
- Computing PageRank using Power Extrapolation Stanford University Technical Report dbpubs/2003-45 2003
- Finding Educational Resources on the Web: Exploiting Automatic Extraction of Metadata 2003
- Exploiting the Block Structure of the Web for Computing PageRank Stanford University Technical Report, June 2003
- The Condition Number of the PageRank Problem Stanford University Technical Report, June 2003
- Exploiting the Block Structure of the Web for Computing PageRank Stanford University Technical Report dbpubs/2003-17 2003
-
A generative model for semantic role labeling
14th European Conference on Machine Learning
SPRINGER-VERLAG BERLIN. 2003: 397–408
View details for Web of Science ID 000187061900036
-
Accurate unlexicalized parsing
41st Annual Meeting of the Association-for-Computational-Linguistics
ASSOCIATION COMPUTATIONAL LINGUISTICS. 2003: 423–430
View details for Web of Science ID 000223097500054
-
Probabilistic syntax
Symposium on Probability Theory in Linguistics held at the Linguistic-Society-of-America Meeting
M I T PRESS. 2003: 289–341
View details for Web of Science ID 000187006000008
-
A* parsing: Fast exact Viterbi parse selection
Human Language Technology Conference
ASSOCIATION COMPUTATIONAL LINGUISTICS. 2003: 119–126
View details for Web of Science ID 000223096100016
-
Feature-rich part-of-speech tagging with a cyclic dependency network
Human Language Technology Conference
ASSOCIATION COMPUTATIONAL LINGUISTICS. 2003: 252–259
View details for Web of Science ID 000223096100033
-
Optimizing local probability models for statistical parsing
14th European Conference on Machine Learning
SPRINGER-VERLAG BERLIN. 2003: 409–420
View details for Web of Science ID 000187061900037
- Statistical approaches to natural language processing Encyclopedia of Cognitive Science. edited by Nadel, L. London: Nature Publishing Group. 2003: 1
-
Beyond grammar: An experience-based theory of language (Book Review)
JOURNAL OF LINGUISTICS
2002; 38 (2): 441-442
View details for Web of Science ID 000178288900017
-
Extensions to HMM-based statistical word alignment models
Conference on Empirical Methods in Natural Language Processing
ASSOCIATION COMPUTATIONAL LINGUISTICS. 2002: 87–94
View details for Web of Science ID 000223079900012
- Parse Disambiguation for a Rich HPSG Grammar. 2002
- From Instance-level Constraints to Space-level Constraints: Making the Most of Prior Knowledge in Data Clustering 2002
- LinGO Redwoods. A Rich and Dynamic Treebank for HPSG 2002
- Combining Heterogeneous Classifiers for Word-Sense Disambiguation 2002
- Evaluating Strategies for Similarity Search on the Web 2002
- LinGO Redwoods: A Rich and Dynamic Treebank for HPSG 2002
- Simulating a File-Sharing P2P Network 2002
- Pronunciation Modeling for Improved Spelling Correction 2002
- Interpreting and Extending Classical Agglomerative Clustering Algorithms using a Model-Based Approach 2002
- The LinGO Redwoods Treebank: Motivation and Preliminary Applications 2002
- LinGO Redwoods. A Rich and Dynamic Treebank for HPSG. 2002
- Parse Disambiguation for a Rich HPSG Grammar 2002
- Natural Language Grammar Induction using a Constituent-Context Model. Advances in Neural Information Processing Systems 14 (NIPS 2001) edited by Dietterich, Thomas, G., Becker, S., Ghahramani, Z. Cambridge, MA: MIT Press. 2002: 35–42
- Review of Rens Bod, Beyond Grammar: An Experience-based Theory of Language. Journal of Linguistics 2002; 2 (38): 441-442
- The LinGO Redwoods Treebank: Motivation and Preliminary Applications. 2002
- Feature Selection for a Rich HPSG Grammar Using Decision Trees 2002
- Inducing Novel Gene-Drug Interactions from the Biomedical Literature Stanford University Technical Report, December 2002
- Combining Heterogeneous Classifiers for Word-Sense Disambiguation. 2002
- Dictionaries and Endangered Languages Language Endangerment and Language Maintenance. edited by Bradley, D., Bradley, M. London: RoutledgeCurzon. 2002: 329–347
- A* Parsing: Fast Exact Viterbi Parse Selection. Stanford University Technical Report dbpubs/2002-16 2002
- From Instance-level Constraints to Space-level Constraints: Making the Most of Prior Knowledge in Data Clustering. 2002
-
Natural language grammar induction using a constituent-context model
15th Annual Conference on Neural Information Processing Systems (NIPS)
MIT PRESS. 2002: 35–42
View details for Web of Science ID 000180520100005
-
Conditional structure versus conditional estimation in NLP models
Conference on Empirical Methods in Natural Language Processing
ASSOCIATION COMPUTATIONAL LINGUISTICS. 2002: 9–16
View details for Web of Science ID 000223079900002
-
A generative constituent-context model for improved grammar induction
40th Annual Meeting of the Association for Computational Linguistics
ASSOCIATION COMPUTATIONAL LINGUISTICS. 2002: 128–135
View details for Web of Science ID 000223096700017
-
Parsing with treebank grammars: Empirical bounds, theoretical models, and the structure of the Penn Treebank
39th Annual Meeting of the Association for Computational Linguistics
ASSOCIATION COMPUTATIONAL LINGUISTICS. 2001: 330–337
View details for Web of Science ID 000223267600042
- Parsing and Hypergraphs 2001
- Text Classification in a Hierarchical Mixture Model for Small Training Sets 2001
- An O(n³) Agenda-Based Chart Parser for Arbitrary Probabilistic Context-Free Grammars 2001
- What's needed for lexical databases? Experiences with Kirrkirr 2001
- An O(n³) Agenda-Based Chart Parser for Arbitrary Probabilistic Context-Free Grammars Stanford Technical Report dbpubs/2001-16 2001
- Kirrkirr: Software for browsing and visual exploration of a structured Warlpiri dictionary. Literary and Linguistic Computing 2001; 16 (2): 123-139
- Soft Constraints Mirror Hard Constraints: Voice and Person in English and Lummi 2001
- An Oncology Patient Interface to Medline 2001
- Combining Heterogeneous Classifiers for Word-Sense Disambiguation 2001
- Distributional Phrase Structure Induction 2001
- Kirrkirr: Software for browsing and visual exploration of a structured Warlpiri dictionary Literary and Linguistic Computing 2001; 16 (2): 135-151
-
Synovial tissue in rheumatoid arthritis is a source of osteoclast differentiation factor
4th International Synovitis Workshop
WILEY-BLACKWELL. 2000: 250–258
Abstract
Osteoclast differentiation factor (ODF; also known as osteoprotegerin ligand, receptor activator of nuclear factor kappaB ligand, and tumor necrosis factor-related activation-induced cytokine) is a recently described cytokine known to be critical in inducing the differentiation of cells of the monocyte/macrophage lineage into osteoclasts. The role of osteoclasts in bone erosion in rheumatoid arthritis (RA) has been demonstrated, but the exact mechanisms involved in the formation and activation of osteoclasts in RA are not known. These studies address the potential role of ODF and the bone and marrow microenvironment in the pathogenesis of osteoclast-mediated bone erosion in RA. Tissue sections from the bone-pannus interface at sites of bone erosion were examined for the presence of osteoclast precursors by the colocalization of messenger RNA (mRNA) for tartrate-resistant acid phosphatase (TRAP) and cathepsin K in mononuclear cells. Reverse transcriptase-polymerase chain reaction (RT-PCR) was used to identify mRNA for ODF in synovial tissues, adherent synovial fibroblasts, and activated T lymphocytes derived from patients with RA. Multinucleated cells expressing both TRAP and cathepsin K mRNA were identified in bone resorption lacunae in areas of pannus invasion into bone in RA patients. In addition, mononuclear cells expressing both TRAP and cathepsin K mRNA (preosteoclasts) were identified in bone marrow in and adjacent to areas of pannus invasion in RA erosions. ODF mRNA was detected by RT-PCR in whole synovial tissues from patients with RA but not in normal synovial tissues. In addition, ODF mRNA was detected in cultured adherent synovial fibroblasts and in activated T lymphocytes derived from RA synovial tissue, which were expanded by exposure to anti-CD3. TRAP-positive, cathepsin K-positive osteoclast precursor cells are identified in areas of pannus invasion into bone in RA. ODF is expressed by both synovial fibroblasts and by activated T lymphocytes derived from synovial tissues from patients with RA. These synovial cells may contribute directly to the expansion of osteoclast precursors and to the formation and activation of osteoclasts at sites of bone erosion in RA.
View details for Web of Science ID 000085362800003
View details for PubMedID 10693863
-
What's related? Generalizing approaches to related articles in medicine
Annual Symposium of the American Medical Informatics Association
HANLEY & BELFUS INC. 2000: 838–842
Abstract
We did formative evaluations of several variations to the computation of related articles for non-bibliographic resources in the medical domain. A binary model and several variations of the vector space model were used to measure similarity between documents. Two corpora were studied, using a human expert as the gold standard. Variations in term weights and stopword choices made little difference to performance. Performance was worse when documents were characterized by title words alone or by MeSH terms extracted from document references. Further studies are needed to evaluate these methods in medical information retrieval systems.
View details for Web of Science ID 000170207500171
View details for PubMedID 11080002
- Using XSL And XQL For Efficient Customised Access To Dictionary Information. 2000
- Kirrkirr: Software for browsing and visual exploration of a structured Warlpiri dictionary 2000
- What's related? Generalizing approaches to related articles in medicine. 2000
- Probabilistic Parsing Using Left Corner Language Models Advances in Probabilistic and Other Parsing Technologies edited by Bunt, H., Nijholt, A. Kluwer Academic Publishers. 2000: 105–124
- What's related? Generalizing approaches to related articles in medicine 2000
- Using XSL And XQL For Efficient Customised Access To Dictionary Information 2000
- Medline IRaCS: An Information Retrieval and Clustering System for Genomic Knowledge Acquisition 2000
- Bilingual Dictionaries for Australian Languages: User studies on the place of paper and electronic dictionaries 2000
- Probabilistic Parsing Using Left Corner Language Models. Advances in Probabilistic and Other Parsing Technologies. Kluwer Academic Publishers. 2000: 105–124
- Kirrkirr: Software for browsing and visual exploration of a structured Warlpiri dictionary Paper presented at ALLC/ACH 2000. Revised version appears in Literary and Linguistic Computing 2000; 16 (1): 123-139
-
Enriching the knowledge sources used in a maximum entropy part-of-speech tagger
Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora held in Conjunction with the 38th Annual Meeting of the Association for Computational Linguistics
ASSOCIATION COMPUTATIONAL LINGUISTICS. 2000: 63–70
View details for Web of Science ID 000223092900008
- Cognition and Function in Language. edited by Fox, Barbara, A., Jurafsky, D., Michaelis, Laura, A. Stanford, CA: CSLI Publications. 1999
- Dissociations between Argument Structure and Grammatical Relations Lexical And Constructional Aspects of Linguistic Explanation edited by Webelhuth, G., Koenig, J. P., Kathol, A. CSLI Publications. 1999: 63–78
- Dictionaries and endangered languages Paper presented at the Endangered Languages Workshop, La Trobe University (published version above, under 2002; earlier version presented at the 1999 Perth Congress of the Applied Linguistics Association of Australia). 1999
- Complex Predicates and Information Spreading in LFG Stanford, CA: CSLI Publications 1999
- The Lexical Integrity of Japanese Causatives Studies in Contemporary Phrase Structure Grammar edited by Levine, Robert, D., Green, Georgia, M. Cambridge: Cambridge University Press. 1999: 39–79
- Kirrkirr: Interactive Visualisation And Multimedia From A Structured Warlpiri Dictionary 1999
- Dictionaries and endangered languages 1999
- Dissociations between Argument Structure and Grammatical Relations. Lexical And Constructional Aspects of Linguistic Explanation CSLI Publications. 1999: 63–78
- Rethinking text segmentation models: An information extraction case study. Technical report SULTRY-98-07-01, University of Sydney 1998
- A dictionary database template for Australian Languages 1998
- Argument Structure, Valence, and Binding. Nordic Journal of Linguistics 1998; 21 (2): 107-144
- Voice and grammatical relations in Indonesian: A new perspective edited by Austin, Peter, K., Musgrave, S. 1998
- Review of David Pesetsky, Zero Syntax: Experiencers and Cascades. Language 1997; 73: 608-611
-
Grammatical relations versus binding: On the distinctness of argument structure
PETER LANG AG. 1997: 79–101
View details for Web of Science ID 000074024800004
- Probabilistic Parsing Using Left Corner Language Models 1997
- Grammatical Relations versus Binding: On the Distinctness of Argument Structure. Empirical Issues in Formal Syntax and Semantics edited by Corblin, F., Godard, D., Marandin, J.-M. Bern: Peter Lang ISBN 3-906757-73-0. 1997: 1
- Ergativity: Argument Structure and Grammatical Relations Stanford, CA: CSLI Publications/Cambridge University Press Dissertations in Linguistics series. ISBN: 1575860368 (pbk), 1575860376 (hbk). 1996
- Argument structure as a locus for binding theory 1996
- Romance Complex Predicates: In defence of the right-branching structure 1996
- A Theory of Non-constituent Coordination based on Finite State Rules 1996
- Ergativity: Argument Structure and Grammatical Relations 1995
- Dissociations between Argument Structure and Grammatical Relations 1995
- Dissociating functor-argument structure from surface phrase structure: 1995
- Valency versus binding: On the distinctness of argument structure 1995
- Ergativity: Argument Structure and Grammatical Relations, PhD Thesis, Stanford. The revised version has been published by CSLI Publications (see 1996), and this version is not available on the web. 1994
- The lexical integrity of Japanese causatives 1994
- Information Spreading and Levels of Representation in LFG CSLI Technical Report CSLI-93-176, Stanford University, Stanford CA. 1993
- Automatic acquisition of a large subcategorization dictionary from corpora 1993
- Analyzing the verbal noun: Internal and external constraints Japanese/Korean Linguistics 3 edited by Choi, S. Stanford, CA: Stanford Linguistics Association. 1993: 236–253
- Romance is so complex Technical Report CSLI-92-168, Stanford University, Stanford CA. 1992
- Presents embedded under pasts ms., Stanford University 1992
- LFG within King's descriptive formalism ms. Stanford University, Stanford CA. 1991
- Lexical Conceptual Structure and Marathi ms. Stanford University, Stanford CA 1991