Sequence encoding incorporated CNN model for Email document sentiment classification

Journal Publication ResearchOnline@JCU
Liu, Sisi; Lee, Ickjai
Abstract

Document sentiment classification has been studied for decades. However, sentiment classification of Email data is a rather specialized field that has not yet been thoroughly studied. Compared with typical social media and review data, Email data is characterized by variance in length, duplication caused by reply and forward messages, and implicit sentiment indicators. Because of these characteristics, existing techniques cannot fully capture the complex syntactic and relational structure among words and phrases in Email documents. In this study, we introduce a dependency graph-based position encoding technique enhanced with weighted sentiment features, and incorporate it into the feature representation process. We combine the encoded sentiment sequence features with traditional word embedding features as input to a revised deep CNN model for Email sentiment classification. Experiments are conducted on three real Email datasets with adequate label conversion processes. Empirical results indicate that our proposed SSE-CNN model achieves the highest accuracy on all three experimental Email datasets (88.6%, 74.3%, and 82.1%) compared with other state-of-the-art algorithms. Furthermore, our performance evaluations of the preprocessing and sentiment sequence encoding steps confirm that Email preprocessing and sentiment sequence encoding with dependency graph-based position and SWN features improve Email document sentiment classification.
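
The feature pipeline summarized in the abstract can be illustrated with a minimal sketch. This is not the SSE-CNN architecture itself: the depth-based position value, the way sentiment scores weight it, the example parse, and all function names below are assumptions made for illustration only. It shows per-token position and sentiment features being concatenated with word embeddings and passed through one convolution layer with max-over-time pooling.

```python
import numpy as np

def tree_depths(heads):
    """Depth of each token in a dependency tree.
    heads[i] is the index of token i's parent, or -1 for the root."""
    depths = []
    for i in range(len(heads)):
        depth, j = 0, i
        while heads[j] != -1:
            j = heads[j]
            depth += 1
        depths.append(depth)
    return depths

def encode_sequence(embeddings, heads, sentiment):
    """Append two hypothetical features to each word embedding:
    a depth-based position value and that value weighted by sentiment."""
    depths = np.array(tree_depths(heads), dtype=float)
    position = 1.0 / (1.0 + depths)  # tokens nearer the root get larger values
    weighted = position * np.asarray(sentiment, dtype=float)
    extra = np.stack([position, weighted], axis=1)
    return np.concatenate([embeddings, extra], axis=1)

def conv_max_pool(features, filters):
    """1D convolution over token windows, ReLU, then max-over-time pooling.
    filters has shape (n_filters, window, dim)."""
    k = filters.shape[1]
    n_windows = features.shape[0] - k + 1
    windows = np.stack([features[i:i + k] for i in range(n_windows)])
    activations = np.einsum("wkd,fkd->wf", windows, filters)
    return np.maximum(activations, 0.0).max(axis=0)

# Toy input: "the service was not good", with an assumed parse rooted at "good".
rng = np.random.default_rng(0)
heads = [1, 4, 4, 4, -1]
sentiment = [0.0, 0.0, 0.0, -0.6, 0.7]   # made-up polarity scores
embeddings = rng.normal(size=(5, 4))     # stand-in for pretrained embeddings
encoded = encode_sequence(embeddings, heads, sentiment)
pooled = conv_max_pool(encoded, rng.normal(size=(3, 2, 6)))
```

In a full model the pooled vector would feed a classification layer; here the filters are random, so only the shapes and the feature construction are meaningful.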

Journal

Applied Soft Computing

Volume

102

ISBN/ISSN

1872-9681

Pages Count

14

Publisher

Elsevier

DOI

10.1016/j.asoc.2021.107104