Phonetic-based microtext normalization for Twitter sentiment analysis

Conference Publication ResearchOnline@JCU
Satapathy, Ranjan; Guerreiro, Claudia; Chaturvedi, Iti; Cambria, Erik
Abstract

The proliferation of Web 2.0 technologies and the increasing use of computer-mediated communication have resulted in a new form of written text, termed microtext. Microtext poses new challenges for natural language processing tools, which are usually designed for well-written text. This paper proposes a phonetic-based framework for normalizing microtext to plain English, thereby improving the classification accuracy of sentiment analysis. Results show a high (>0.8) similarity index between tweets normalized by the proposed model and tweets normalized by human annotators in 85.31% of cases, and an increase of more than 4% in polarity detection accuracy after normalization.
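
The idea of phonetic normalization can be illustrated with a minimal sketch. The abstract does not specify the phonetic encoding, the lexicon, or the similarity index used in the paper, so everything below is an assumption made for illustration: the classic Soundex code stands in for the phonetic encoding, `LEXICON` is a tiny hypothetical word list, and `difflib.SequenceMatcher` stands in for the similarity index.

```python
from difflib import SequenceMatcher

# Soundex digit classes (standard American Soundex); an assumed stand-in
# for whatever phonetic encoding the paper actually uses.
CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code of `word` ('' if it has no letters)."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    encoded = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not break a run of equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            encoded += code
        prev = code  # vowels reset the run, so a repeated code is kept
    return (encoded + "000")[:4]


# Hypothetical in-vocabulary word list, for illustration only.
LEXICON = ["good", "love", "please", "tomorrow", "what",
           "friend", "see", "you", "i", "the", "movie"]


def similarity(a, b):
    """String similarity in [0, 1]; an assumed proxy for a similarity index."""
    return SequenceMatcher(None, a, b).ratio()


def normalize(text, lexicon=LEXICON):
    """Replace out-of-vocabulary tokens with a lexicon word that sounds alike."""
    index = {}
    for word in lexicon:
        index.setdefault(soundex(word), []).append(word)
    known = set(lexicon)
    out = []
    for token in text.lower().split():
        if token not in known:
            candidates = index.get(soundex(token), [])
            if candidates:
                # Among phonetic matches, pick the closest spelling.
                token = max(candidates, key=lambda w: similarity(token, w))
        out.append(token)
    return " ".join(out)


print(normalize("i luv the gud movie"))
```

Under these assumptions, `luv` and `gud` share Soundex codes with `love` and `good` and are replaced, while in-vocabulary tokens pass through unchanged. Soundex keeps the first letter and ignores digits, so forms such as `u` for `you` or `2moro` for `tomorrow` are not recovered by this sketch.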

Publication Name

IEEE International Conference on Data Mining Workshops, ICDMW

ISBN/ISSN

2375-9259

Pages Count

7

Location

New Orleans, LA, USA

Publisher

Institute of Electrical and Electronics Engineers

Publisher Location

Piscataway, NJ, USA

DOI

10.1109/ICDMW.2017.59