BioInsight

Project Overview

BioInsight is a data analytics and machine learning platform for analyzing, visualizing, and predicting bioprocess outcomes. It works with bioprocess data collected at two scales (1 mL and 30 L) and predicts key performance indicators such as Final OD and GFPuv production.

Features

  • Data Processing & Visualization: Intuitive interface for exploring process data with interactive time series visualization
  • Feature Selection: Feature importance rankings, correlation matrices, and distribution plots
  • Model Results: Comparison of multiple machine learning models (Random Forest, XGBoost, SVR, PLS) with performance metrics
  • Interactive Predictions: Make real-time predictions by adjusting feature values
  • PLS Component Analysis: Visualize PLS components and explained variance

Technologies Used

  • Python: Core programming language
  • Streamlit: Interactive dashboard framework
  • Scikit-learn: Machine learning model development
  • XGBoost: Gradient boosting implementation
  • Plotly: Interactive data visualization
  • Pandas/NumPy: Data manipulation and numerical operations

Project Structure

├── Wrangled_Combined_Batch_Dataset.xlsx      # Main dataset 
├── models/                                   # Trained model files
├── src/
│   ├── dashboard/
│   │   └── app.py                            # Streamlit dashboard application
│   └── model_development.py                  # Model training pipeline
└── README.md                                 # Project documentation

Installation & Setup

Prerequisites

  • Python 3.8+
  • pip package manager

Installation

  1. Clone this repository
git clone https://github.com/adityachitlangia/BioInsight.git
cd BioInsight
  2. Create and activate a virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install required packages
pip install -r requirements.txt

Usage

Running the Dashboard

To launch the BioInsight dashboard:

cd src/dashboard 
streamlit run app.py

By default, Streamlit serves the dashboard at http://localhost:8501.

Training Models

To train or retrain the machine learning models, run the following from the repository root:

python src/model_development.py

Dashboard Sections

1. Data Processing

  • View and analyze raw process data
  • Visualize missing values
  • Explore time series data by batch
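
As a rough illustration of what these views compute, the sketch below summarizes missing values per column and reshapes one variable into per-batch time series using pandas. The column names (batch_id, time_h, od600, ph) and values are hypothetical; the actual columns in Wrangled_Combined_Batch_Dataset.xlsx may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical process data with a few gaps.
df = pd.DataFrame({
    "batch_id": ["B1", "B1", "B1", "B2", "B2", "B2"],
    "time_h": [0.0, 1.0, 2.0, 0.0, 1.0, 2.0],
    "od600": [0.1, np.nan, 0.9, 0.1, 0.5, 1.1],
    "ph": [7.0, 6.9, np.nan, 7.0, np.nan, 6.8],
})

# Count and fraction of missing values per column.
missing = pd.DataFrame({
    "n_missing": df.isna().sum(),
    "frac_missing": df.isna().mean(),
})

# One time series per batch: rows are time points, columns are batches.
od_by_batch = df.pivot(index="time_h", columns="batch_id", values="od600")

print(missing)
print(od_by_batch)
```

The wide table produced by the pivot is a convenient input for a line plot with one trace per batch.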

2. Feature Selection

  • Examine feature importance rankings
  • Analyze feature correlations
  • Explore feature distributions with statistical insights
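
The sketch below shows one common way to obtain importance rankings and a correlation matrix: impurity-based importances from a Random Forest and Pearson correlations from pandas. Whether the dashboard uses exactly this method is an assumption, and the feature names and synthetic data are invented for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: the target depends strongly on feed_rate, weakly on
# temperature, and not at all on noise. All names are hypothetical.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "feed_rate": rng.normal(size=200),
    "temperature": rng.normal(size=200),
    "noise": rng.normal(size=200),
})
y = 3.0 * X["feed_rate"] + 0.5 * X["temperature"] + rng.normal(scale=0.1, size=200)

# Impurity-based feature importances, sorted from most to least important.
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
importance = pd.Series(model.feature_importances_, index=X.columns)
importance = importance.sort_values(ascending=False)

# Pearson correlation matrix over the features and the target.
corr = X.assign(target=y).corr()

print(importance)
print(corr.round(2))
```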

3. Model Results

  • Compare model performance (RMSE, R² score)
  • Visualize actual vs predicted values
  • Analyze PLS components and explained variance
  • Make interactive predictions
  • Enhanced visualization of PLS components with improved error handling
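
For reference, the two comparison metrics can be computed as in the sketch below. RMSE is the square root of the mean squared error, and R² is one minus the ratio of the residual sum of squares to the total sum of squares. The model names and numbers are placeholders, not results from this project.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Placeholder targets and predictions from two hypothetical models.
y_true = [1.0, 2.0, 3.0, 4.0]
predictions = {
    "model_a": [1.0, 2.0, 3.0, 4.0],
    "model_b": [2.0, 2.0, 3.0, 3.0],
}

# Collect both metrics per model and rank by RMSE (lower is better).
results = {
    name: {"rmse": rmse(y_true, pred), "r2": r2(y_true, pred)}
    for name, pred in predictions.items()
}
ranking = sorted(results, key=lambda name: results[name]["rmse"])
```

Lower RMSE and higher R² both indicate a better fit, so the two metrics usually agree on the ranking when models are scored on the same data.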

Recent Updates

  • Enhanced UI/UX:
    • Improved dashboard styling with custom CSS
    • Added card-like structures for better content organization
    • Enhanced visual hierarchy and navigation
    • Responsive design improvements
  • Improved Error Handling:
    • Better handling of data file paths
    • Robust PLS component visualization
    • Enhanced time series analysis functionality
    • Improved batch selection and feature filtering

Contributors

  • Aditya Chitlangia
