
Chipotle Sales Analysis
Project Overview
This project analyzes Chipotle sales data with Python and data science techniques to uncover business insights and customer preferences. It uses real transactional data for sales analysis, customer preference exploration, predictive modeling, and customer segmentation.

What Was Done
1. Data Preparation and Cleaning
Loaded the Chipotle sales dataset and inspected its structure.
Identified and handled missing values, particularly in the choice_description column, replacing them with a placeholder to ensure clean analysis.
Converted the item_price field from string to float for accurate calculations and modeling.
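The cleaning steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the small inline DataFrame stands in for the real file, the placeholder text "No description" and the helper name clean_orders are assumptions, and the "$" prefix on item_price is assumed from the note that the field starts out as a string.

```python
import pandas as pd

# Tiny stand-in for the real dataset (assumption: prices are strings like "$2.39 ").
raw = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "quantity": [1, 1, 2, 1],
    "item_name": ["Chips and Fresh Tomato Salsa", "Izze", "Chicken Bowl", "Chicken Bowl"],
    "choice_description": [None, "[Clementine]", "[Tomatillo Salsa, [Rice]]", None],
    "item_price": ["$2.39 ", "$3.39 ", "$16.98 ", "$8.75 "],
})


def clean_orders(df, placeholder="No description"):
    """Hypothetical helper: fill missing descriptions and parse prices."""
    out = df.copy()
    # Replace missing customizations with a placeholder so grouping keeps those rows.
    out["choice_description"] = out["choice_description"].fillna(placeholder)
    # Strip the currency symbol and surrounding whitespace, then convert to float.
    out["item_price"] = (
        out["item_price"]
        .astype(str)
        .str.replace("$", "", regex=False)
        .str.strip()
        .astype(float)
    )
    return out


orders = clean_orders(raw)
print(orders.dtypes)
print(orders.isna().sum())
```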
2. Sales Analysis
Aggregated sales data by item to calculate total orders and total sales for each menu item.
Sorted and visualized the most popular and highest-grossing items using tables and pie charts, making it easy to see which products drive the most revenue.
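As a rough sketch of the per-item aggregation, assuming a cleaned table with item_name, quantity, and a numeric item_price column. The sample rows are invented for illustration, and the output column names total_orders and total_sales are assumptions:

```python
import pandas as pd

# Invented sample rows; item_price is treated here as the amount paid for the line.
orders = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 3],
    "item_name": ["Chicken Bowl", "Chips", "Chicken Bowl", "Steak Burrito", "Chips"],
    "quantity": [1, 1, 2, 1, 1],
    "item_price": [8.75, 2.15, 17.50, 9.25, 2.15],
})

# One row per menu item: units ordered and revenue, highest revenue first.
item_sales = (
    orders.groupby("item_name")
    .agg(total_orders=("quantity", "sum"), total_sales=("item_price", "sum"))
    .sort_values("total_sales", ascending=False)
)
print(item_sales)
```

With matplotlib installed, a call such as item_sales["total_sales"].plot.pie() is one way to produce the kind of pie chart described here.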
3. Customer Preference Analysis
Identified the top 10 most popular items based on sales volume.
Analyzed customer choices and preferences for these top items, revealing which combinations and customizations customers favor most.
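One possible way to rank items and their customizations is sketched below. The sample data is invented, the helper name top_choices is hypothetical, and n is set to 2 only so the tiny example stays readable (the analysis above uses the top 10):

```python
import pandas as pd

# Invented sample rows with already-cleaned choice_description values.
orders = pd.DataFrame({
    "item_name": ["Chicken Bowl", "Chicken Bowl", "Chicken Bowl",
                  "Steak Burrito", "Chips", "Steak Burrito"],
    "choice_description": [
        "[Fresh Tomato Salsa, [Rice, Cheese]]",
        "[Fresh Tomato Salsa, [Rice, Cheese]]",
        "[Tomatillo Red Chili Salsa, [Rice]]",
        "[Fresh Tomato Salsa, [Rice]]",
        "No description",
        "[Fresh Tomato Salsa, [Rice]]",
    ],
    "quantity": [1, 2, 1, 1, 1, 1],
})


def top_choices(df, n=10):
    """Hypothetical helper: top-n items by units sold, plus their customization counts."""
    top_items = df.groupby("item_name")["quantity"].sum().nlargest(n)
    top_rows = df[df["item_name"].isin(top_items.index)]
    choice_counts = (
        top_rows.groupby(["item_name", "choice_description"])["quantity"]
        .sum()
        .sort_values(ascending=False)
    )
    return top_items, choice_counts


top_items, choice_counts = top_choices(orders, n=2)
print(top_items)
print(choice_counts)
```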
4. Predictive Modeling
Built a linear regression model to predict order quantities based on item price.
Enhanced the model using data scaling and hyperparameter tuning (Grid Search), improving prediction accuracy and demonstrating the application of machine learning in sales forecasting.
5. Order Quantity Prediction with Tuned Linear Regression
Built a linear regression pipeline (with scaling) to predict how item price affects the quantity ordered.
Used Grid Search to tune model hyperparameters for optimal performance.
Evaluated the model using metrics such as Mean Squared Error (MSE), Mean Absolute Error (MAE), and R² score, comparing base and tuned models.
Visualized the relationship between item price and quantity ordered, highlighting model fit and dataset distribution.
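The scaled pipeline, grid search, and base-versus-tuned comparison might look roughly like this with scikit-learn. The data is synthetic, and the parameter grid is an assumption: the write-up does not say which hyperparameters were searched, and plain linear regression exposes only a few, such as fit_intercept.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: quantity loosely related to price (not real results).
rng = np.random.default_rng(0)
price = rng.uniform(1.0, 12.0, size=200)
quantity = 1.0 + 0.05 * price + rng.normal(0.0, 0.3, size=200)
X, y = price.reshape(-1, 1), quantity

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

pipeline = Pipeline([("scaler", StandardScaler()), ("model", LinearRegression())])


def evaluate(model):
    """Return MSE, MAE, and R² on the held-out test split."""
    pred = model.predict(X_test)
    return {
        "mse": float(mean_squared_error(y_test, pred)),
        "mae": float(mean_absolute_error(y_test, pred)),
        "r2": float(r2_score(y_test, pred)),
    }


# Base model: the pipeline with default settings.
base_model = clone(pipeline).fit(X_train, y_train)
base_scores = evaluate(base_model)

# Tuned model: illustrative grid over the few options this pipeline offers.
param_grid = {
    "scaler__with_mean": [True, False],
    "model__fit_intercept": [True, False],
}
grid = GridSearchCV(pipeline, param_grid, scoring="neg_mean_squared_error", cv=5)
grid.fit(X_train, y_train)
tuned_scores = evaluate(grid.best_estimator_)

print("best params:", grid.best_params_)
print("base:", base_scores)
print("tuned:", tuned_scores)
```

With a single feature and so few tunable options, the tuned and base models can easily end up identical, so the comparison is mainly a check that tuning did not hurt.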
6. Customer Segmentation with K-Means Clustering
Aggregated sales data by order to compute total items and total spent per order.
Applied K-Means clustering to segment customers based on their purchasing behavior (number of items and total amount spent).
Visualized customer segments to identify distinct purchasing patterns and potential target groups.
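A minimal sketch of the per-order aggregation and K-Means step follows. The rows are invented, and both the choice of two clusters and the standardization step are assumptions made for the example. Because the data is keyed by order, each order stands in for a customer here.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Invented line items for six orders.
orders = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 3, 3, 4, 5, 5, 6],
    "quantity": [1, 1, 1, 2, 2, 1, 1, 3, 2, 1],
    "item_price": [8.75, 2.15, 9.25, 17.50, 18.50, 4.45, 8.75, 26.25, 22.50, 2.15],
})

# One row per order: number of items and total amount spent.
per_order = orders.groupby("order_id").agg(
    total_items=("quantity", "sum"),
    total_spent=("item_price", "sum"),
)

# Standardize so that neither feature dominates the distance calculation.
features = StandardScaler().fit_transform(per_order[["total_items", "total_spent"]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
per_order["segment"] = kmeans.fit_predict(features)
print(per_order)
```

A scatter plot of total_items against total_spent, colored by segment, is one way to inspect the resulting groups.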
7. Cluster Quality Validation with Silhouette Analysis
Performed silhouette analysis to determine the optimal number of clusters for customer segmentation.
Calculated and plotted average silhouette scores for different cluster counts, ensuring the quality and interpretability of the customer segments.
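The silhouette comparison can be illustrated as below. The features are synthetic (three well-separated groups of "orders"), and the candidate range of 2 to 6 clusters is an assumption. The average silhouette score lies between -1 and 1, with higher values indicating tighter, better-separated clusters.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic (total_items, total_spent) features drawn around three centers.
rng = np.random.default_rng(0)
centers = np.array([[1.5, 10.0], [4.0, 35.0], [8.0, 80.0]])
features = np.vstack(
    [center + rng.normal(0.0, [0.3, 2.0], size=(40, 2)) for center in centers]
)

# Average silhouette score for each candidate cluster count.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
    scores[k] = float(silhouette_score(features, labels))

best_k = max(scores, key=scores.get)
for k, score in scores.items():
    print(f"k={k}: average silhouette = {score:.3f}")
print("best k:", best_k)
```

Plotting scores against k, for example with matplotlib, gives the kind of chart described in this step.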
8. Core Sales and Preference Analysis (from initial phase)
Cleaned and prepared the Chipotle sales dataset for analysis.
Identified top-selling items and customer favorites.
Explored trends in sales and customer preferences to inform business strategies.
How This Is Useful
Business Insights: Reveals which menu items are most popular and profitable, supporting inventory and marketing strategies.
Demand Forecasting: Predicts order quantities based on pricing, aiding in pricing strategy and stock management.
Customer Understanding: Segments customers into meaningful groups, allowing for targeted promotions and personalized marketing.
Operational Efficiency: Helps optimize menu offerings and resource allocation based on data-driven insights.
Data Science Practice: Demonstrates practical use of regression, clustering, and validation techniques on real-world business data.
This project showcases how advanced analytics and machine learning can turn raw sales data into actionable insights for business growth and customer satisfaction.
