BioInsight

Project Overview

BioInsight is a data analytics and machine learning platform designed to analyze, visualize, and predict bioprocess outcomes. This project focuses on analyzing bioprocess data across different scales (1 mL and 30 L) to predict key performance indicators such as Final OD and GFPuv production.

Features

Data Processing & Visualization: Intuitive interface for exploring process data with interactive time series visualization
Feature Selection: Advanced analysis of important features with correlation matrices and distribution plots
Model Results: Comparison of multiple machine learning models (Random Forest, XGBoost, SVR, PLS) with performance metrics
Interactive Predictions: Make real-time predictions by adjusting feature values
PLS Component Analysis: Visualize PLS components and explained variance

Technologies Used

Python: Core programming language
Streamlit: Interactive dashboard framework
Scikit-learn: Machine learning model development
XGBoost: Gradient boosting implementation
Plotly: Interactive data visualization
Pandas/NumPy: Data manipulation and numerical operations

Project Structure

├── Wrangled_Combined_Batch_Dataset.xlsx      # Main dataset
├── models/                                   # Trained model files
├── src/
│   ├── dashboard/
│   │   └── app.py                            # Streamlit dashboard application
│   └── model_development.py                  # Model training pipeline
├── requirements.txt                          # Python dependencies
└── README.md                                 # Project documentation

Installation & Setup

Prerequisites

Python 3.8+
pip package manager

Installation

Clone this repository

git clone https://github.com/adityachitlangia/BioInsight.git
cd BioInsight

Create and activate a virtual environment (optional but recommended)

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install required packages

pip install -r requirements.txt

Usage

Running the Dashboard

To launch the BioInsight dashboard:

cd src/dashboard 
streamlit run app.py

By default, Streamlit serves the dashboard at http://localhost:8501 and opens it in your web browser.

Training Models

To train or retrain the machine learning models:

python src/model_development.py

Dashboard Sections

1. Data Processing

View and analyze raw process data
Visualize missing values
Explore time series data by batch
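
As a rough illustration of the checks listed above, the sketch below summarizes missing values per column and extracts the time series for a single batch with pandas. The column names (`batch_id`, `time_h`, `od600`, `ph`) and the values are hypothetical; the actual columns in Wrangled_Combined_Batch_Dataset.xlsx may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical process data standing in for the real dataset.
df = pd.DataFrame(
    {
        "batch_id": ["B1", "B1", "B1", "B2", "B2", "B2"],
        "time_h": [0.0, 1.0, 2.0, 0.0, 1.0, 2.0],
        "od600": [0.1, 0.4, np.nan, 0.1, 0.5, 1.2],
        "ph": [7.0, np.nan, np.nan, 7.1, 7.0, 6.9],
    }
)

# Fraction of missing values per column, highest first.
missing_fraction = df.isna().mean().sort_values(ascending=False)
print(missing_fraction)

# Time series for one batch, ordered by time.
batch_b2 = df[df["batch_id"] == "B2"].sort_values("time_h")
print(batch_b2[["time_h", "od600"]])
```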

2. Feature Selection

Examine feature importance rankings
Analyze feature correlations
Explore feature distributions with statistical insights
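
The correlation analysis can be sketched with pandas as follows. Ranking features by absolute Pearson correlation with the target is a simple stand-in chosen for this example, not necessarily the importance measure the dashboard uses, and the feature names and synthetic data are assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic features; final_od depends strongly on feed_rate by construction.
rng = np.random.default_rng(0)
n = 200
temperature = rng.normal(37.0, 0.5, n)
feed_rate = rng.normal(1.0, 0.2, n)
noise_feature = rng.normal(0.0, 1.0, n)
final_od = 2.0 * feed_rate + 0.1 * temperature + rng.normal(0.0, 0.05, n)

df = pd.DataFrame(
    {
        "temperature": temperature,
        "feed_rate": feed_rate,
        "noise_feature": noise_feature,
        "final_od": final_od,
    }
)

# Pairwise Pearson correlation matrix across all columns.
corr = df.corr()

# Rank features by the strength of their linear relationship to the target.
ranking = corr["final_od"].drop("final_od").abs().sort_values(ascending=False)
print(ranking)
```

A correlation ranking only captures linear, pairwise relationships, so it is best read alongside model-based importance scores.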

3. Model Results

Compare model performance (RMSE, R² score)
Visualize actual vs predicted values
Analyze PLS components and explained variance
Make interactive predictions
Enhanced visualization of PLS components with improved error handling

Recent Updates

Enhanced UI/UX:
- Improved dashboard styling with custom CSS
- Added card-like structures for better content organization
- Enhanced visual hierarchy and navigation
- Responsive design improvements

Improved Error Handling:
- Better handling of data file paths
- Robust PLS component visualization
- Enhanced time series analysis functionality
- Improved batch selection and feature filtering

Contributors

Aditya Chitlangia
