
Notebook to perform market segmentation using K-means clustering, PCA, and autoencoders.

BrianMburu/Market_Segmentation

Market Segmentation Notebook

This notebook performs market segmentation using K-means clustering, PCA, and autoencoder methods on sales data.

Libraries Used

  • Pandas
  • Numpy
  • Seaborn
  • Matplotlib.pyplot
  • Tensorflow.keras
  • Sklearn.preprocessing
  • Sklearn.cluster
  • Sklearn.decomposition
  • Sklearn.metrics
  • Plotly.express
  • Plotly.graph_objects

Data Cleaning

Data source: Kaggle

The notebook performs simple data cleaning by dropping null columns and converting the ORDERDATE column to datetime format.
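
The cleaning step could look roughly like the sketch below. It is not taken from the notebook: the toy rows and every column name other than ORDERDATE are hypothetical, and it assumes "null columns" means columns containing any missing value (pass `how="all"` to `dropna` to drop only columns that are entirely empty).

```python
import pandas as pd

# Hypothetical stand-in for the Kaggle sales data; only ORDERDATE is named in this README.
raw = pd.DataFrame({
    "ORDERDATE": ["2003-02-24", "2003-05-07", "2003-07-01"],
    "SALES": [2871.00, 2765.90, 3884.34],
    "ADDRESSLINE2": [None, "Suite 5", None],  # contains nulls, so it is dropped below
})

# Assumption: drop every column that has at least one missing value.
cleaned = raw.dropna(axis=1).copy()

# Convert the ORDERDATE strings to pandas datetime values.
cleaned["ORDERDATE"] = pd.to_datetime(cleaned["ORDERDATE"])

print(cleaned.dtypes)
```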

Exploratory Data Analysis

The notebook performs the following steps for Exploratory Data Analysis:

  • Visualizing the count of items in each column.
  • Performing one-hot encoding on all categorical columns.
  • Grouping data by order date to perform further analysis.
  • Calculating Correlation Matrix.
  • Plotting distplots.
  • Visualizing the relationship between variables using pairplots.
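
As a rough illustration of the encoding, grouping, and correlation steps listed above, here is a minimal pandas sketch. The data is made up, and SALES, QUANTITYORDERED, and PRODUCTLINE are hypothetical column names chosen for the example; the notebook's actual columns may differ.

```python
import pandas as pd

# Hypothetical sample; column names other than ORDERDATE are assumptions.
sales = pd.DataFrame({
    "ORDERDATE": pd.to_datetime(
        ["2003-02-24", "2003-02-24", "2003-05-07", "2003-07-01"]
    ),
    "SALES": [2871.00, 2765.90, 3884.34, 3746.70],
    "QUANTITYORDERED": [30, 34, 41, 45],
    "PRODUCTLINE": ["Motorcycles", "Classic Cars", "Motorcycles", "Classic Cars"],
})

# One-hot encode the categorical column into 0/1 indicator columns.
encoded = pd.get_dummies(sales, columns=["PRODUCTLINE"], dtype=int)

# Group by order date, summing the numeric columns for each day.
daily = encoded.groupby("ORDERDATE").sum()

# Correlation matrix over the numeric columns (the date column is excluded).
corr = encoded.drop(columns="ORDERDATE").corr()

print(daily)
print(corr.round(2))
```

The correlation matrix can then be passed to a heatmap function such as `seaborn.heatmap` for plotting.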

Market Segmentation

The notebook performs market segmentation using the following steps:

  1. Fit K-means Clustering Model
    • Scale the data.
    • Find the optimal number of clusters using the elbow method.
    • Fit the K-means clustering model using the optimal number of clusters (k).
  2. Apply Principal Component Analysis
    • Visualize the results.
  3. Perform Dimensionality Reduction using the Autoencoder
    • Create an autoencoder model using Keras API.
    • Fit and train the autoencoder model.
    • Use the bottleneck layer (after down-scaling layers and before up-scaling layers) to encode the sales dataframe.
    • Use the elbow method to find the optimal number of clusters.
    • Fit the encoded sales dataframe using K-means clustering model with the optimal number of clusters.
    • Apply Principal Component Analysis and visualize the results.
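
Steps 1 and 2 above can be sketched with scikit-learn as follows. This is a minimal example on synthetic data, not the notebook's code: the three artificial groups, the range of k values tried, and the hard-coded choice of k are all assumptions made for illustration. For step 3, the same elbow, K-means, and PCA calls would presumably be applied to the autoencoder's bottleneck encodings instead of `X_scaled`; the autoencoder itself is not shown here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the sales features: three well-separated groups.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0, 0.0], [8.0, 8.0, 0.0], [0.0, 8.0, 8.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 3)) for c in centers])

# Scale the data so every feature has zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# Elbow method: record the within-cluster sum of squares (inertia) for each k.
inertias = []
for k in range(1, 8):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    inertias.append(model.inertia_)

# In practice k is read off a plot of `inertias`; it is hard-coded here
# because the synthetic data was built with three groups.
k_opt = 3
kmeans = KMeans(n_clusters=k_opt, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

# Project onto two principal components for a 2-D scatter plot of the clusters.
coords = PCA(n_components=2).fit_transform(X_scaled)

print(inertias)
print(coords[:5], labels[:5])
```

Plotting `inertias` against k and looking for the point where the curve flattens gives the elbow; `coords` and `labels` can be passed to a scatter plot (for example `plotly.express.scatter`) to inspect the segments.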
