Optimal Feature Extraction Technique for Sentiment Analysis of Product Reviews for Product Development

Oliko, Gabriel; Otieno, Calvins; Muhambe, Titus Mukisa

URI: http://41.89.205.12/handle/123456789/2636

Date: 2025-02-07

Abstract:

Consumer review sites, social media and micro-blogs carry a wealth of information on consumers’ general perspectives, experiences and feedback on products. When the volume of product reviews is high, it can be challenging for product developers to sift through them and make decisions based on consumers’ sentiments. Sentiment Analysis, a branch of Artificial Intelligence, provides data that helps businesses understand customers’ desires and track how brands and goods are perceived. In Sentiment Analysis, feature extraction converts raw text input into a format compatible with machine learning. A strong feature set is necessary to achieve high prediction and object classification accuracy, so identifying an optimal combination of features is critical for improving the overall performance of data classification. In this research, we tackle this problem by identifying an optimal feature extraction technique for product review Sentiment Analysis using a feature-level analysis. N-gram, Part of Speech (POS) and techniques based on the lexicons Stanford CoreNLP, TextBlob and SentiWordNet are examined in different combinations. Multinomial Naïve Bayes (MNB), Lexicon and MNB + Unsupervised Lexicon ensemble classifiers were modeled to classify the reviews into positive, neutral and negative classes, thereby identifying the optimal feature combination. We explored the optimal feature extraction technique using real product review datasets for two products: a car make and model, the “Nissan Sentra”, and a mobile phone, the “Samsung Galaxy A12”. The optimal feature extraction technique for the MNB and MNB + Lexicon ensemble classifiers was a combination of N-gram, Part of Speech and TextBlob features, while the optimal technique for the unsupervised Lexicon classifier was a combination of N-gram, Part of Speech and VADER features.
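To make the idea of combining feature types concrete, the following is a minimal sketch, not the authors' implementation. It assumes whitespace tokenization, a tiny hand-made polarity lexicon (`TOY_LEXICON`) standing in for a real lexicon such as TextBlob, and six invented toy reviews; all function and class names are hypothetical. POS features are left out because they require an external tagger. N-gram counts and a lexicon-derived pseudo-token are merged into one feature bag and classified with a small Multinomial Naïve Bayes using Laplace smoothing.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon; a stand-in for a real sentiment lexicon.
TOY_LEXICON = {"great": 1.0, "love": 1.0, "good": 0.5,
               "poor": -0.5, "bad": -1.0, "terrible": -1.0}


def ngrams(tokens, n_max=2):
    """Return all 1..n_max-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def lexicon_feature(tokens):
    """Turn the summed lexicon polarity into one categorical pseudo-token."""
    score = sum(TOY_LEXICON.get(t, 0.0) for t in tokens)
    if score > 0:
        return "LEX_POS"
    if score < 0:
        return "LEX_NEG"
    return "LEX_NEU"


def extract_features(text):
    """Combine n-gram counts with the lexicon pseudo-token in one bag."""
    tokens = text.lower().split()
    return Counter(ngrams(tokens) + [lexicon_feature(tokens)])


class TinyMultinomialNB:
    """Multinomial Naive Bayes over feature counts, with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, feature_bags, labels):
        self.class_counts = Counter(labels)
        self.feature_totals = defaultdict(Counter)
        for bag, label in zip(feature_bags, labels):
            self.feature_totals[label].update(bag)
        self.vocab = {f for c in self.feature_totals.values() for f in c}
        return self

    def predict(self, bag):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.feature_totals[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n / n_docs)  # log prior
            for feat, k in bag.items():
                if feat in self.vocab:  # features unseen in training are skipped
                    score += k * math.log((counts[feat] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy reviews, for illustration only (not the study's data).
TRAIN = [
    ("great phone love the battery", "positive"),
    ("good car smooth ride", "positive"),
    ("terrible battery bad screen", "negative"),
    ("poor mileage bad brakes", "negative"),
    ("the phone arrived on monday", "neutral"),
    ("the car has four doors", "neutral"),
]

model = TinyMultinomialNB().fit(
    [extract_features(text) for text, _ in TRAIN],
    [label for _, label in TRAIN],
)

print(model.predict(extract_features("love this great car")))
```

Encoding the lexicon score as a pseudo-token keeps every feature a non-negative count, which is what a multinomial model expects. Comparing feature combinations, as the study does, would amount to switching individual feature groups on or off inside `extract_features` and re-evaluating the classifier.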
