I hold a PhD in Experimental High Energy Physics, with a strong foundation in research and data analysis. After a career break, I transitioned into data science and machine learning, focusing on projects that combine analytical rigor with clear communication. My work spans predictive modeling, natural language processing, and recommendation systems, each documented with detailed explanations to make complex concepts accessible.
-
Pediatric X-Ray Image Classification
- Objective: Detect pneumonia in pediatric chest X-rays using deep learning.
- Techniques: Image preprocessing, data augmentation, and transfer learning via DenseNet121 CNN.
- Highlights: Achieved strong classification performance with limited data by leveraging pre-trained models; implemented a full pipeline from loading and transforming images to model evaluation.
-
-
- Objective: Recommend movies using the TMDB dataset.
- Techniques: Text analysis with NLTK, CountVectorizer, cosine similarity, Streamlit for deployment.
- Highlights: Built and deployed a content-based recommendation system with an interactive user interface.
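
The content-based approach named above (word counts compared by cosine similarity) can be illustrated with a minimal sketch. This is not the project's code: plain Python stands in for `CountVectorizer` and scikit-learn's similarity routine, and the `MOVIES` catalogue, `bag_of_words`, and `recommend` are hypothetical names invented for the example.

```python
import math
from collections import Counter

# Hypothetical toy catalogue: title -> free-text tags (not taken from TMDB).
MOVIES = {
    "Space Quest": "space adventure alien crew",
    "Galaxy Run": "space alien chase adventure",
    "Love in Paris": "romance paris love story",
}


def bag_of_words(text):
    """Count word occurrences, similar in spirit to a count vectorizer."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(title, top_n=1):
    """Return the top_n other titles ranked by similarity to `title`."""
    query = bag_of_words(MOVIES[title])
    scores = {
        other: cosine_similarity(query, bag_of_words(tags))
        for other, tags in MOVIES.items()
        if other != title
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("Space Quest"))
```

With this toy data, "Space Quest" and "Galaxy Run" share three of their four tags, so "Galaxy Run" ranks first; a real system would build the tag text from dataset fields such as genres, keywords, and overview.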
-
-
- Objective: Predict whether a user will click on an advertisement.
- Techniques: Data preprocessing, OneHotEncoder, StandardScaler, KNN, SVM, Decision Tree, Random Forest.
- Highlights: Tackled a binary classification problem, focusing on model evaluation metrics like ROC curve and AUC score.
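
For readers unfamiliar with the AUC score mentioned above, the following sketch computes it from its rank interpretation: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. It is an illustrative, dependency-free version and not the project's code; `roc_auc` and the sample labels and scores are made up for the example.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical click labels (1 = clicked) and predicted click probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which makes the metric convenient for comparing classifiers such as KNN, SVM, and Random Forest on the same test set.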
-
-
Identifying Gamma vs Hadron Events
- Objective: Classify events detected by the MAGIC gamma telescope as gamma or hadron.
- Techniques: Data analysis, classification algorithms.
- Highlights: Applied machine learning to astrophysical data, bridging the gap between physics and data science.
-
-
- Objective: Predict the number of Instagram likes based on followers, captions, and hashtags.
- Techniques: Text preprocessing, TF-IDF vectorization, regression models (Linear, Ridge, Lasso, KNN, SVM, Decision Tree, Random Forest).
- Highlights: Explored feature extraction from text data and compared multiple regression algorithms to determine the best predictor for likes.
-
-
- Objective: Classify individuals into weight categories (e.g., Normal, Overweight, Obese) based on personal data.
- Techniques: KNN, SVM, Decision Tree, Random Forest, hyperparameter tuning.
- Highlights: Addressed a multiclass classification problem with both numerical and categorical data; Random Forest gave the best results among the models compared.
-
- Scientific Publications: Co-authored several research papers in high-energy physics, including studies of strange particle behavior in heavy-ion collisions and of particle multiplicity distributions in p+p and e+e- collisions using the Weibull function.
- Project Documentation: Each GitHub repository includes a comprehensive README and Jupyter notebooks with step-by-step explanations, making complex analyses understandable.
- Programming Languages: Python
- Data Analysis & Visualization: pandas, NumPy, matplotlib, seaborn
- Machine Learning: scikit-learn, NLTK, Regression Analysis, Classification, Rare Event Classification, Recommendation System, Deep Learning, RNN, CNN, LSTM, Text and Sentiment Analysis.
- Web Deployment: Streamlit
- Version Control: Git, GitHub