Evaluating Textual Features and Oversampling for Automatic Stance Detection

by Jongwon Lee | about 1 year ago
#NLP #Linguistics #TermPaper
Term Paper for Computation & Linguistic Analysis LING-L545, Indiana University

We describe a series of experiments that evaluate basic textual features for automatic stance detection. Specifically, we measure the impact of bag-of-words (BoW) features, sentiment lexicon features, and syntactic features on the performance of a Support Vector Machine (SVM). Our analysis shows that the words in a tweet offer the most insight into its stance and that adding features from sentiment lexicons can improve performance. For one target, adding syntactic dependency features also increased performance. Finally, we identify challenges related to class imbalance, the generally small volume of data, and data quality.

Keywords: Stance Detection, Sentiment Analysis, Social Media, Support Vector Machine, Subjectivity and Arguing Lexicon, Synthetic Minority Oversampling, Term Frequency-Inverse Document Frequency
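
To make two of the techniques named above concrete, here is a minimal sketch of TF-IDF weighting over bag-of-words features followed by SMOTE-style oversampling of the minority class. It is an illustration under stated assumptions, not the implementation used in the paper: the toy tweets, the labels, the function names `tfidf_vectors` and `smote_like`, the smoothed IDF variant, and the choice of `k` are all hypothetical. It uses only the Python standard library; libraries such as scikit-learn and imbalanced-learn provide tested implementations that a real experiment would more likely rely on.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, dense TF-IDF vectors) for whitespace-tokenized docs."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF (one common variant; an assumption, other weightings exist).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[t] * idf[t] for t in vocab])
    return vocab, vectors


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [v for j, v in enumerate(minority) if j != i]
        others.sort(key=lambda v: math.dist(base, v))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: four AGAINST tweets and two FAVOR tweets.
tweets = [
    "this policy is a disaster",
    "terrible idea and bad policy",
    "we must stop this policy",
    "bad idea",
    "great policy we need it",
    "we support this great idea",
]
labels = ["AGAINST"] * 4 + ["FAVOR"] * 2

vocab, X = tfidf_vectors(tweets)
minority = [x for x, y in zip(X, labels) if y == "FAVOR"]
n_new = labels.count("AGAINST") - labels.count("FAVOR")

# Append synthetic FAVOR vectors until both classes are the same size.
X_balanced = X + smote_like(minority, n_new)
y_balanced = labels + ["FAVOR"] * n_new
```

The balanced vectors could then be passed to an SVM. Oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.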