This project is a prototype of a Fashion AI system that processes images, generates semantic relationships between them, simulates user interactions, and provides personalized recommendations. It uses a PostgreSQL database to store image metadata, navigation paths, user interactions, and generated recommendations.
- Generate placeholder fashion images.
- Extract metadata for these images using a Vision Transformer (ViT) model.
- Create semantic links (navigation paths) between images.
- Simulate user browsing behavior.
- Generate personalized recommendations based on user behavior and image semantics.
- `images/`: Contains placeholder images. The `generate_images.py` script (run by the system during initial setup) populates this directory.
- `config.py`: Configuration file for Hugging Face model details (e.g., model name, image size) used by `FashionTagger`.
- `fashion_tagger.py`: Module containing the `FashionTagger` class, which loads a pre-trained Vision Transformer (ViT) model from Hugging Face and extracts metadata (features, dominant colors) from images.
- `generate_images.py`: (Helper script, not part of the main ETL flow but used for initial setup) Generates placeholder fashion images.
- `etl.py`: Extracts metadata for images using `FashionTagger` and loads it into the `image_metadata` table in PostgreSQL.
- `semantic_enrichment.py`: Analyzes image metadata to create potential navigation paths between images, storing them in `image_navigation_paths`.
- `user_simulator.py`: Simulates user browsing sessions and interactions (clicks) with images, storing data in `user_interactions`.
- `recommendation_engine.py`: Generates personalized recommendations for specific users based on their interaction history and the semantic navigation paths, storing results in the `recommendations` table.
- `schema.sql`: Contains the SQL DDL statements to create all necessary tables in the PostgreSQL database.
- `db_setup_instructions.md`: Detailed instructions for setting up the PostgreSQL database using Docker.
- `requirements.txt`: Lists the Python dependencies for this project.
- `README.md`: This file, providing an overview and instructions.
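For orientation, here is a minimal sketch of what a `FashionTagger`-style class could look like. This is an illustration, not the project's actual implementation: the class and method names are invented, and it loads the ImageNet-1k fine-tuned checkpoint `google/vit-base-patch16-224` rather than the in21k checkpoint referenced in `config.py`, since the latter ships without a fine-tuned classification head.

```python
from collections import Counter

import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification


class FashionTaggerSketch:
    """Illustrative stand-in for fashion_tagger.FashionTagger."""

    def __init__(self, model_name="google/vit-base-patch16-224"):
        self.processor = AutoImageProcessor.from_pretrained(model_name)
        self.model = AutoModelForImageClassification.from_pretrained(model_name)
        self.model.eval()

    def tag(self, image_path, top_k=3):
        image = Image.open(image_path).convert("RGB")
        inputs = self.processor(images=image, return_tensors="pt")
        with torch.no_grad():
            logits = self.model(**inputs).logits
        top = logits.squeeze(0).topk(top_k)
        # Top-k ImageNet labels double as coarse "style tags".
        tags = [self.model.config.id2label[i.item()] for i in top.indices]
        # Dominant colors: shrink the image, then count the most frequent pixels.
        small = image.resize((16, 16))
        colors = [rgb for rgb, _ in Counter(small.getdata()).most_common(3)]
        return {"style_tags": tags, "dominant_colors": colors}
```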
- Python 3.8+
- Docker (e.g., Docker Desktop) for running PostgreSQL.
- Internet connection (for the first run of `etl.py` to download the Hugging Face model).
- Clone the repository (if applicable) or ensure all project files are in a local directory.
- Install Python packages:
  Open your terminal or command prompt, navigate to the project's root directory, and run:

  ```bash
  pip install -r requirements.txt
  ```

  This will install all necessary packages, including `psycopg2-binary` (for PostgreSQL interaction), `Pillow` (for image manipulation), and critically, `transformers` and `torch` (for loading and using pre-trained models from Hugging Face).
- Set up PostgreSQL using Docker: Follow the detailed instructions in `db_setup_instructions.md` to get a PostgreSQL instance running in Docker. This guide includes commands to start the container and set up the initial database.
- Set Environment Variables: Before running any of the Python scripts, you need to set the following environment variables in your terminal session. These variables allow the scripts to connect to the PostgreSQL database. Replace the example values with those you configured during the Docker setup (see `db_setup_instructions.md`).

  For Linux/macOS:

  ```bash
  export DB_HOST="localhost"
  export DB_NAME="fashion_db"
  export DB_USER="myuser"          # Or your chosen user from db_setup_instructions.md
  export DB_PASSWORD="mypassword"  # Or your chosen password
  ```

  For Windows (Command Prompt, which treats quotes as part of the value, so omit them):

  ```bat
  set DB_HOST=localhost
  set DB_NAME=fashion_db
  set DB_USER=myuser
  set DB_PASSWORD=mypassword
  ```

  For Windows (PowerShell):

  ```powershell
  $env:DB_HOST="localhost"
  $env:DB_NAME="fashion_db"
  $env:DB_USER="myuser"
  $env:DB_PASSWORD="mypassword"
  ```

  The scripts have default fallbacks for these variables (e.g., "postgres" for user/password, "fashion_db" for dbname, "localhost" for host), but it's best practice to set them explicitly, especially if your Docker setup uses different credentials (see the sketch below for how the fallbacks work).
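That fallback behavior presumably amounts to something like the following inside each script (the helper name `get_db_connection` is invented for illustration):

```python
import os

import psycopg2


def get_db_connection():
    """Open a PostgreSQL connection, preferring environment variables
    and falling back to the defaults described above."""
    return psycopg2.connect(
        host=os.getenv("DB_HOST", "localhost"),
        dbname=os.getenv("DB_NAME", "fashion_db"),
        user=os.getenv("DB_USER", "postgres"),
        password=os.getenv("DB_PASSWORD", "postgres"),
    )
```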
Execute the scripts from the project's root directory in the following order. Each script builds upon the data generated by the previous ones. The `generate_images.py` script is assumed to have been run by the system to create the initial images in the `./images` folder.
- Run `python etl.py` to populate `image_metadata`:
  This script now uses the `FashionTagger` module to extract metadata from images in the `./images` directory using a pre-trained Hugging Face Vision Transformer model. The extracted metadata (including image features converted to tags, and dominant colors) is loaded into the `image_metadata` table. This script also creates all tables defined in `schema.sql` if they don't exist.

  ```bash
  python etl.py
  ```

  Important Notes for `etl.py`:
  - First Run: The first time you run `etl.py`, it will download the pre-trained Vision Transformer model specified in `config.py` (e.g., `google/vit-base-patch16-224-in21k`). This model can be several hundred megabytes, so an internet connection is required. Subsequent runs will use the cached model.
  - Processing Time: Image processing will take longer than plain file scanning due to the model inference step for each image.
  - Metadata Quality: The default model (`google/vit-base-patch16-224-in21k`) is a general-purpose Vision Transformer trained on ImageNet-21k. As such, the `description` and `style_tags` it generates will be based on general ImageNet categories. Specific fashion attributes like `garment_type`, `accessories`, and `gender` are currently set to placeholder values (e.g., "unknown", `[]`, "unisex") by `FashionTagger`, as this base model is not fine-tuned for detailed fashion classification.
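  To make the load step concrete, here is a minimal sketch of the kind of insert `etl.py` performs, reusing the hypothetical `tag` interface from the `FashionTagger` sketch earlier. The column names are assumptions inferred from the attributes described in this README; the authoritative definitions live in `schema.sql`.

  ```python
  import json
  import os

  # Hypothetical column set; see schema.sql for the real DDL.
  INSERT_SQL = """
      INSERT INTO image_metadata (image_id, style_tags, dominant_colors, garment_type, gender)
      VALUES (%s, %s, %s, %s, %s)
      ON CONFLICT (image_id) DO NOTHING  -- assumes image_id is the primary key
  """


  def load_metadata(conn, tagger, image_dir="./images"):
      """Tag every image in image_dir and insert one metadata row per image."""
      with conn.cursor() as cur:
          for name in sorted(os.listdir(image_dir)):
              meta = tagger.tag(os.path.join(image_dir, name))
              cur.execute(INSERT_SQL, (
                  name,
                  json.dumps(meta["style_tags"]),
                  json.dumps(meta["dominant_colors"]),
                  "unknown",  # placeholder, per the notes above
                  "unisex",   # placeholder, per the notes above
              ))
      conn.commit()
  ```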
- Run `python semantic_enrichment.py` to populate `image_navigation_paths`:
  This script analyzes the metadata in `image_metadata` (now including ViT-based features) to establish semantic links, or potential navigation paths, between images. Results are stored in `image_navigation_paths`.

  ```bash
  python semantic_enrichment.py
  ```
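  The linking heuristic itself is not documented in this README; one plausible approach, assuming links are derived from overlapping `style_tags`, is a simple Jaccard similarity over tag sets:

  ```python
  from typing import Dict, Iterator, Set, Tuple


  def jaccard(a: Set[str], b: Set[str]) -> float:
      """Tag-overlap similarity in [0, 1]."""
      if not a and not b:
          return 0.0
      return len(a & b) / len(a | b)


  def candidate_paths(tags_by_image: Dict[str, Set[str]],
                      threshold: float = 0.3) -> Iterator[Tuple[str, str, float]]:
      """Yield (source, target, score) for image pairs above the threshold."""
      ids = sorted(tags_by_image)
      for i, src in enumerate(ids):
          for dst in ids[i + 1:]:
              score = jaccard(tags_by_image[src], tags_by_image[dst])
              if score >= threshold:
                  yield src, dst, score
  ```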
- Run `python user_simulator.py` to populate `user_interactions`:
  This script simulates user browsing behavior. It generates mock user sessions in which users "view" and "click" on images, often following the paths defined in `image_navigation_paths`. These interactions are recorded in the `user_interactions` table.

  ```bash
  python user_simulator.py
  ```
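  A hypothetical core for such a simulator is a short random walk over the navigation paths; the session length and click probability below are invented parameters:

  ```python
  import random


  def simulate_session(user_id, start_image, paths, steps=5, click_prob=0.4):
      """Walk the navigation graph, emitting (user_id, image_id, clicked) events.

      `paths` maps image_id -> list of reachable image_ids."""
      events = []
      current = start_image
      for _ in range(steps):
          events.append((user_id, current, random.random() < click_prob))
          neighbors = paths.get(current)
          if not neighbors:
              break  # dead end: no outgoing navigation paths
          current = random.choice(neighbors)
      return events
  ```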
- Run `python recommendation_engine.py` to generate recommendations:
  This script uses the data from `image_metadata`, `image_navigation_paths`, and `user_interactions` to generate personalized recommendations for a predefined set of target users. The recommendations and the reasoning behind them are stored in the `recommendations` table.

  ```bash
  python recommendation_engine.py
  ```
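  The scoring logic is not spelled out in this README; a minimal hypothetical version recommends unseen neighbors of the images a user clicked, weighted by how many clicked images link to them:

  ```python
  from collections import Counter


  def recommend(clicked_images, paths, top_n=5):
      """Rank unseen images by how many clicked images link to them."""
      seen = set(clicked_images)
      scores = Counter()
      for image_id in clicked_images:
          for neighbor in paths.get(image_id, []):
              if neighbor not in seen:
                  scores[neighbor] += 1
      ranked = [img for img, _ in scores.most_common(top_n)]
      reasoning = "linked via navigation paths from previously clicked images"
      return ranked, reasoning
  ```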
- Model Selection: The Vision Transformer model used by `FashionTagger` is defined in `config.py` (`HUGGING_FACE_MODEL_NAME`). You can experiment by changing this to other models available on Hugging Face. However, be aware that if the output structure of a different model varies significantly, adjustments in `fashion_tagger.py` might be necessary to correctly interpret the model's predictions. (See the illustrative `config.py` sketch after this list.)
- Database Connection: Database connection parameters (host, name, user, password) are managed via environment variables, as detailed in the "Database Setup" section.
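For orientation, `config.py` presumably contains something like the following. `HUGGING_FACE_MODEL_NAME` is named in this README; the image-size constant is an assumption:

```python
# config.py (illustrative sketch, not the actual file)
HUGGING_FACE_MODEL_NAME = "google/vit-base-patch16-224-in21k"
IMAGE_SIZE = (224, 224)  # assumed; ViT-base patch16 models expect 224x224 inputs
```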
- Each script will print status messages, progress, and error information to the console.
- The primary output of the system is the data populated in the PostgreSQL database tables.
- The `recommendation_engine.py` script will print the generated recommendations for the target users to the console.
You can inspect the data in the database using `psql` or any PostgreSQL client. Refer to `db_setup_instructions.md` for `psql` connection commands.

Example `psql` queries:
- Check table structure (e.g., for `image_metadata`):

  ```sql
  \d image_metadata
  ```

- Count rows in a table (e.g., `user_interactions`):

  ```sql
  SELECT COUNT(*) FROM user_interactions;
  ```

- View sample recommendations:

  ```sql
  SELECT user_id, source_image_id, recommended_images, reasoning FROM recommendations LIMIT 5;
  ```

- View navigation paths for a specific image:

  ```sql
  SELECT * FROM image_navigation_paths WHERE source_image_id = 'img_001.jpg';
  ```

- View interactions for a specific user:

  ```sql
  SELECT image_id, clicked, timestamp FROM user_interactions WHERE user_id = 'user001' ORDER BY timestamp DESC;
  ```