Title: Analysing sentiments based on multi feature combination with supervised learning

Authors: Monalisha Ghosh; Goutam Sanyal

Addresses: National Institute of Technology Durgapur, Mahatma Gandhi Rd, A-Zone, Durgapur, West Bengal 713209, India; Computer Science and Engineering Department, National Institute of Technology Durgapur, Mahatma Gandhi Rd, A-Zone, Durgapur, West Bengal 713209, India

Abstract: Research on sentiment analysis is growing rapidly and attracting wide attention from both academia and industry. Feature generation and selection are important for text mining because a high-dimensional feature set can affect the performance of sentiment analysis. This paper demonstrates the efficacy of the proposed combined feature selection technique for machine learning classification algorithms, compared with using the individual methods alone. First, we transform the review datasets into feature vectors of unigram features together with bi-tagged features based on POS patterns. Next, the information gain (IG), chi-squared (χ²) and minimum redundancy maximum relevance (mRMR) feature selection methods are applied to obtain an optimal feature subset. The selected features are then given as input to multiple machine learning classifiers, namely support vector machine (SVM), multinomial Naïve Bayes (MNB), Bernoulli Naïve Bayes (BNB) and logistic regression (LR), on multi-domain product review datasets. Performance is measured using precision, recall and F-measure. Experimental results show that the mRMR feature selection method with SVM achieved a better accuracy of 91.39, which is encouraging and comparable to related research.
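
As a rough illustration of the filter-style feature selection step named in the abstract, the sketch below scores unigram presence features with chi-squared and information gain, then keeps the top-ranked terms. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy reviews, the function names and the cut-off `k` are invented for illustration, and the POS-based bi-tagged features, mRMR and the classifiers are left out.

```python
import math

# Toy labelled reviews (invented for illustration; not the paper's datasets).
# Label 1 = positive, 0 = negative.
REVIEWS = [
    ("great battery and great screen", 1),
    ("excellent sound quality", 1),
    ("great value excellent build", 1),
    ("poor battery and poor screen", 0),
    ("terrible sound quality", 0),
    ("poor value terrible build", 0),
]


def contingency(term, docs):
    """Return the 2x2 document counts (a, b, c, d) for a term:
    a = present & positive, b = present & negative,
    c = absent & positive,  d = absent & negative."""
    a = b = c = d = 0
    for text, label in docs:
        present = term in text.split()
        if present and label == 1:
            a += 1
        elif present:
            b += 1
        elif label == 1:
            c += 1
        else:
            d += 1
    return a, b, c, d


def chi_squared(a, b, c, d):
    """Chi-squared statistic of a 2x2 term/class contingency table."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def entropy(counts):
    """Shannon entropy (in bits) of a list of counts."""
    total = sum(counts)
    return -sum((k / total) * math.log2(k / total) for k in counts if k > 0)


def information_gain(a, b, c, d):
    """Class entropy minus class entropy conditioned on term presence."""
    n = a + b + c + d
    h_cond = 0.0
    for pos, neg in ((a, b), (c, d)):
        if pos + neg:
            h_cond += (pos + neg) / n * entropy([pos, neg])
    return entropy([a + c, b + d]) - h_cond


def rank_terms(docs, score_fn, k):
    """Return the k highest-scoring unigrams (ties broken alphabetically)."""
    vocab = sorted({w for text, _ in docs for w in text.split()})
    scored = [(score_fn(*contingency(t, docs)), t) for t in vocab]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [t for _, t in scored[:k]]


print(rank_terms(REVIEWS, chi_squared, 4))
print(rank_terms(REVIEWS, information_gain, 4))
```

On this toy data both scores favour the polarity-bearing words over topic words such as "battery" or "screen", which occur equally often in both classes. The retained terms would then form the reduced feature vector passed to a classifier.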

Keywords: sentiment analysis; opinion mining; text classification; feature selection method; machine learning algorithms; optimal feature vector.

DOI: 10.1504/IJDMMM.2019.102728

International Journal of Data Mining, Modelling and Management, 2019 Vol.11 No.4, pp.391 - 416

Accepted: 08 Oct 2018
Published online: 02 Oct 2019
