Title: Text visualisation for feature selection in online review analysis

Authors: Keerthika Koka; Shiaofen Fang

Addresses: Department of Computer and Information Science, Indiana University Purdue University Indianapolis, 723 W. Michigan St., SL 280, Indianapolis, IN 46202, USA; Department of Computer and Information Science, Indiana University Purdue University Indianapolis, 723 W. Michigan St., SL 280, Indianapolis, IN 46202, USA

Abstract: Opinion spamming is a reality, and it can have unpleasant consequences in the retail industry. While there has been promising research on distinguishing fake online reviews from genuine ones, little of it has involved visualisation or visual analytics. The purpose of this work is to show that feature selection through visualisation is at least as powerful as the best automatic feature selection algorithms. This is achieved by applying a radial chart visualisation technique to the classification of online reviews as fake or genuine. The radial chart and its colour overlaps are used to explore the best feature selection for classification through visualisation. Parallel coordinate visualisation of the review data is also explored and compared with the radial chart results. The system gives a structure to each text review based on certain attributes, compares how similar or different the structures are within and across categories, and highlights the key features that contribute most to the classification. Our visualisation technique helps the user gain insight into the high-dimensional data by providing means to eliminate the worst features right away, pick some of the best features without statistical aids, and understand the behaviour of the dimensions in different combinations.

Keywords: text visualisation; feature selection; radial chart; online review analysis.

DOI: 10.1504/IJBDI.2019.100887

International Journal of Big Data Intelligence, 2019 Vol.6 No.3/4, pp.202 - 211

Received: 08 Mar 2018
Accepted: 22 Jun 2018

Published online: 04 Jun 2019
