Multivariate Analysis of Wine Quality
[Python][PCA][Clustering][Random Forest]
§1. Abstract
I analyzed the chemical makeup of thousands of wines to figure out what makes a 'good' wine. By using advanced grouping techniques, I found that you can accurately predict a wine's quality score just by looking at its alcohol content and acidity levels, without ever tasting it.
§2. Methodology & Implementation
Conducted a comprehensive multivariate analysis on the physicochemical properties of Portuguese 'Vinho Verde' wine. Applied Principal Component Analysis (PCA) for dimensionality reduction, K-Means and Hierarchical clustering to identify natural groupings, and trained Random Forest and SVM classifiers to predict sensory quality scores. The Random Forest model achieved 82% accuracy, identifying alcohol and volatile acidity as the primary drivers of quality.
§3. Key Metrics
| Method | PCA & Clustering |
|---|---|
| Model | Random Forest |
| Accuracy | 82% |