Amazon Review Classification and Clustering with ML

Binary Classification: Good vs. Bad Reviews:

What I Did

I framed sentiment prediction as a binary classification problem, labeling reviews as good or bad based on multiple rating cutoffs (1, 2, 3, and 4). This allowed me to study how model performance changes as the definition of “positive sentiment” becomes stricter or more lenient.

For each cutoff, I trained and compared three classifiers:

  • Logistic Regression

  • Linear SVM

  • Random Forest

All models were tuned using 5-fold cross-validation and evaluated with these standard classification metrics:

  • ROC curves and AUC

  • Confusion matrices

  • Accuracy and macro F1 score

The ROC plots show how well each model separates positive and negative reviews under different cutoff definitions.
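The setup above can be sketched roughly as follows. This is a minimal illustration, assuming scikit-learn and a tiny synthetic corpus in place of the real Amazon reviews; the model names, the `cutoff` parameter, and the toy data are all stand-ins, not the project's actual code.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
import numpy as np

# Toy stand-in for the real review corpus (texts and star ratings).
reviews = [
    "great product works perfectly", "terrible waste of money",
    "love it highly recommend", "broke after one day awful",
    "excellent quality very happy", "bad quality do not buy",
    "fantastic value great purchase", "horrible experience returned it",
] * 5
stars = [5, 1, 5, 1, 4, 2, 5, 1] * 5

def auc_per_model(cutoff):
    """Label a review 'good' (1) when its rating exceeds `cutoff`,
    then score each classifier by 5-fold cross-validated ROC AUC."""
    y = np.array([1 if s > cutoff else 0 for s in stars])
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "linear_svm": LinearSVC(),
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    }
    # The vectorizer lives inside the pipeline, so each CV fold
    # fits TF-IDF on its own training portion only (no leakage).
    return {
        name: cross_val_score(make_pipeline(TfidfVectorizer(), clf),
                              reviews, y, cv=5, scoring="roc_auc").mean()
        for name, clf in models.items()
    }

# One AUC table per cutoff definition, as in the study above.
aucs = {cutoff: auc_per_model(cutoff) for cutoff in (1, 2, 3, 4)}
```

Sweeping the cutoff this way makes the "strict vs. lenient positive" comparison a one-line loop rather than four separate experiments.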

Multiclass Classification: Predicting 1–5 Star Ratings:

What I Did

I extended the binary setup to a five-class classification problem, predicting exact star ratings (1–5) directly from review text.

Using the same feature pipeline, I trained:

  • Multinomial Logistic Regression

  • Multiclass SVM (one-vs-rest)

  • Random Forest

All models were again tuned with 5-fold cross-validation to ensure fair comparison.

For each classifier, I reported:

  • Multiclass confusion matrices

  • One-vs-rest ROC curves and per-class AUC

  • Macro F1 score and accuracy

The ROC plots illustrate how well each class (rating level) is separated from the others.
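A rough sketch of the five-class evaluation, again assuming scikit-learn and synthetic data (one characteristic phrase per star level); the multinomial logistic regression shown here stands in for all three classifiers:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

# Synthetic five-class corpus: one characteristic phrase per star level.
phrases = ["awful broken junk", "poor quality disappointing",
           "okay average mediocre", "good solid reliable",
           "excellent amazing perfect"]
texts, stars = [], []
for star, phrase in enumerate(phrases, start=1):
    texts += [f"{phrase} review number {i}" for i in range(12)]
    stars += [star] * 12

train_txt, test_txt, y_train, y_test = train_test_split(
    texts, stars, test_size=0.25, random_state=0, stratify=stars)

vec = TfidfVectorizer()
X_train = vec.fit_transform(train_txt)   # fit vocabulary on train only
X_test = vec.transform(test_txt)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)
proba = clf.predict_proba(X_test)

# The three reported artifacts: confusion matrix, macro F1,
# and one-vs-rest AUC for each star level.
cm = confusion_matrix(y_test, y_pred, labels=[1, 2, 3, 4, 5])
macro_f1 = f1_score(y_test, y_pred, average="macro")
y_bin = label_binarize(y_test, classes=[1, 2, 3, 4, 5])
per_class_auc = [roc_auc_score(y_bin[:, k], proba[:, k]) for k in range(5)]
```

Binarizing the test labels one class at a time is what turns the five-class problem into the five one-vs-rest ROC curves described above.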

Clustering: Discovering Structure Without Labels:

What I Did

To explore structure beyond labeled sentiment, I applied k-means clustering to the vectorized review text. No labels were used during training; the resulting clusters were only compared afterward to known product categories to measure alignment.

I experimented with different values of k and evaluated clustering quality using:

  • Silhouette score (cluster separation)

  • Adjusted Rand Index (alignment with known categories)

The results show:

  • Silhouette scores across different cluster counts

  • The selected value of k that maximized cluster quality

  • How well unsupervised clusters aligned with known product labels

These metrics quantify how much structure exists in review text without supervision.
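The k sweep and both quality metrics can be sketched as below, assuming scikit-learn; the three synthetic "product categories" are placeholders for the real ones:

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import adjusted_rand_score, silhouette_score

# Synthetic reviews drawn from three stand-in product categories.
groups = {
    "electronics": "battery charger cable power adapter",
    "books": "novel story chapters author plot",
    "shoes": "shoes fit comfortable size leather",
}
docs, categories = [], []
for cat, words in groups.items():
    docs += [f"{words} sample {i}" for i in range(10)]
    categories += [cat] * 10

X = TfidfVectorizer().fit_transform(docs)

# Sweep k and keep the silhouette-maximizing model.
results = {}
for k in range(2, 6):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    results[k] = (silhouette_score(X, km.labels_), km.labels_)

best_k = max(results, key=lambda k: results[k][0])
best_sil, best_labels = results[best_k]

# Category labels enter only here, for evaluation, never for fitting.
ari = adjusted_rand_score(categories, best_labels)
```

The silhouette score needs no labels at all, while the Adjusted Rand Index uses the held-out categories purely as an after-the-fact yardstick, which is what keeps the clustering itself unsupervised.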

An applied machine learning project analyzing large-scale product reviews to classify sentiment and uncover structure in consumer feedback using supervised and unsupervised learning methods.

Overview:

This project applies machine learning techniques to real-world product review data to understand how textual feedback reflects user sentiment and how reviews naturally cluster based on content and tone.

Using a large corpus of Amazon product reviews, the project explores both supervised learning (sentiment classification) and unsupervised learning (clustering), emphasizing careful feature design, model evaluation, and interpretability rather than black-box performance alone.

The dataset consists of labeled product reviews containing:

  • Review text

  • Star ratings

Key preprocessing steps included:

  • Text cleaning and normalization

  • Vectorization using bag-of-words / TF-IDF representations

  • Dimensionality control (e.g., capping vocabulary size) to balance scalability and generalization

Special care was taken to avoid data leakage (for example, fitting vectorizers on the training split only) and to ensure fair train/test splits.
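The leakage-safe ordering described above (split first, then fit the vectorizer) can be sketched as follows; scikit-learn is assumed, and the placeholder corpus and the specific `TfidfVectorizer` settings are illustrative, not the project's exact configuration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# Placeholder corpus; the real project uses Amazon review text.
texts = [f"sample review text number {i} with mixed sentiment words"
         for i in range(40)]
labels = [i % 2 for i in range(40)]  # stand-in binary labels

# Split FIRST, then fit the vectorizer on the training split only,
# so no test-set vocabulary or document frequencies leak into training.
train_txt, test_txt, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels)

vectorizer = TfidfVectorizer(
    lowercase=True,        # normalization
    stop_words="english",  # basic cleaning
    max_features=5000,     # cap dimensionality for scalability
)
X_train = vectorizer.fit_transform(train_txt)  # fit on train only
X_test = vectorizer.transform(test_txt)        # reuse train vocabulary
```

Calling `transform` (not `fit_transform`) on the test split is the key step: both matrices share the vocabulary learned from the training data alone.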