Teaching Machines to Understand Us: Event Noun Extraction by Automatically Identifying and Assessing New Linguistic Patterns
My research focuses on Information Extraction, a subfield of Natural Language Processing. Using the more than 9 million newswire documents in the English Gigaword Fifth Edition data set, I seek to automate the retrieval of ‘event nouns’. My algorithm detects words such as ‘wedding’, ‘conference’, and ‘graduation’ by identifying new linguistic patterns and analyzing syntactic structures. It then compiles the detected words into a large list and independently scores each entry’s accuracy. To my knowledge, no collection this rich and comprehensive has been compiled before, so my work will aid future researchers in their effort to give machines a deeper understanding of human language.
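As a rough illustration of the detect-and-score idea described above, the following minimal sketch matches a few hand-written cue patterns and ranks the captured words. It is not the algorithm from this research. The patterns in `EVENT_PATTERNS`, the function name `score_event_nouns`, and the scoring rule (pattern hits divided by total occurrences) are all assumptions made for the example. The actual system identifies new patterns automatically and analyzes syntactic structure, neither of which this sketch attempts.

```python
import re
from collections import Counter

# Hypothetical cue patterns, hand-written for illustration only.
# Each one captures the word that appears in an "event-like" slot.
EVENT_PATTERNS = [
    re.compile(r"\b(?:attend|attended|attending) (?:the|a|an) (\w+)"),
    re.compile(r"\b(?:during|after|before) (?:the|a|an) (\w+)"),
    re.compile(r"\bthe (\w+) (?:took place|was held)"),
]


def score_event_nouns(sentences):
    """Return (word, score) pairs, highest score first.

    The score is an assumed heuristic: the share of a word's
    occurrences that fall inside one of the cue patterns, capped at 1.0.
    """
    pattern_hits = Counter()  # times a word filled a pattern slot
    total_counts = Counter()  # times a word appeared at all

    for sentence in sentences:
        text = sentence.lower()
        total_counts.update(re.findall(r"\w+", text))
        for pattern in EVENT_PATTERNS:
            for match in pattern.finditer(text):
                pattern_hits[match.group(1)] += 1

    scores = {
        word: min(1.0, hits / total_counts[word])
        for word, hits in pattern_hits.items()
    }
    # Sort by descending score, then alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample = [
        "She attended the wedding in June.",
        "The conference was held in Paris.",
        "The table was old, and the wedding cake sat on the table.",
    ]
    for word, score in score_event_nouns(sample):
        print(f"{word}\t{score:.2f}")
```

In this toy input, ‘table’ never fills a pattern slot and so never enters the list, while ‘wedding’ is penalized for also appearing outside any pattern. A real system would need far richer patterns and a better-founded scoring scheme.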