I’m an AI researcher and data scientist with an M.Tech in Data Science from IIT Madras and a B.Tech in Computer Science from VIT. My core focus lies at the intersection of Reinforcement Learning (RL), Large Language Models (LLMs), and Deep Learning, where I strive to push the boundaries of AI capabilities.
Research Interests:
- Reinforcement Learning for LLMs (RL4LLMs)
- Multi-Agent Reinforcement Learning (MARL)
- Multi-Agent Reasoning (MAR)
- Communication in Multi-Agent Systems (MAS-Comm)
- Causality in Reinforcement Learning (Causality & RL)
- Representation Learning for Reinforcement Learning (RepL4RL)
I’m seeking a PhD or research position where I can drive innovative AI projects alongside a collaborative team.
- M.Tech Thesis: DNF-Net: A DL Approach for Advancing Breast Cancer Detection in Histopathology Images. (Poster / PPT)
- Built a magnification-invariant hybrid model that combines fuzzy logic, used to explicitly handle diagnostic uncertainty (fuzziness), with deep-learning backbones (Xception, InceptionV3, DenseNet-169) for advanced hierarchical feature extraction. It yields a 5% accuracy gain over SOTA on the BreakHis and BACH histopathology datasets, validated at 40×, 100×, 200×, and 400× magnifications and across 2-/4-/8-class tasks.
- Keywords: deep-learning; fuzzy-logic; magnification-invariance; medical-image-analysis; histopathology; image-classification
- B.Tech Thesis: CXRcovNet: COVID‑19 detection from CXR images using transfer learning approaches. (Repo / PPT)
- Applied Transfer Learning techniques using pre-trained CNN models to classify COVID-19 from Chest X-Ray (CXR) images.
- Keywords: computer-vision, deep-learning, transfer-learning, covid-19, cxr, image-classification
- Reinforcement Fine-Tuning LLMs with GRPO (Repo)
- Investigated the efficacy of Group Relative Policy Optimization (GRPO) for reinforcement fine-tuning (RFT) of LLMs, adapting models for complex reasoning and strategic tasks (demonstrated via a Wordle-style game with Qwen 2.5 7B).
- Tech Stack: Python, PyTorch, RL, LLMs, GRPO
- Keywords: rlft, grpo, llms, reinforcement-learning, fine-tuning, Reward functions, Reward hacking, Calculating loss in GRPO
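
For readers unfamiliar with GRPO, the group-relative advantage at its core can be sketched as below. This is a minimal illustration, not the code from the repository: the function name, the example rewards, and the choice of a population standard deviation are all assumptions made for the sketch.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each completion relative to the other samples in its group.

    `rewards` holds one scalar reward per completion sampled for the
    same prompt. Each advantage is the reward minus the group mean,
    divided by the group standard deviation (eps avoids division by
    zero when every reward in the group is identical).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions sampled for one prompt
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Completions that beat the group average receive a positive advantage and the rest a negative one, so no separate value network is needed to form the baseline.
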
- Hierarchical Reinforcement Learning (IITM CS6700 PA3) (Repo)
- Implemented and evaluated Hierarchical RL techniques (SMDP Q-Learning, Intra-Option Q-Learning) in the Taxi-v3 environment, analyzing the impact of option design on learning efficiency and policy structure.
- Tech Stack: Python, RL (Hierarchical RL, Q-Learning), OpenAI Gym
- Keywords: hierarchical-rl, smdp, intra-option-q-learning, reinforcement-learning, taxi-v3
- Dueling-DQN & Monte Carlo REINFORCE (IITM CS6700 PA2) (Repo)
- Implemented and compared Dueling-DQN (Type-1 vs Type-2) and Monte Carlo REINFORCE (with/without baseline) algorithms on Acrobot-v1 and CartPole-v1 environments.
- Tech Stack: Python, PyTorch, RL (DQN, Policy Gradient), OpenAI Gym
- Keywords: dueling-dqn, reinforce, baseline, deep-reinforcement-learning, acrobot-v1, cartpole-v1
- Temporal Difference Learning (SARSA & Q-Learning) (IITM CS6700 PA1) (Repo)
- Implemented and compared TD algorithms (SARSA and Q-Learning) in a custom 10x10 Grid World with stochastic transitions and wind effects, building a strong base in core RL concepts.
- Tech Stack: Python, RL (TD Learning, Q-Learning, SARSA), NumPy, Matplotlib
- Keywords: Temporal Difference, SARSA, Q-Learning, Gridworld, Reinforcement Learning, Stochastic Environments
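
The difference between the two TD algorithms compared in this project comes down to the bootstrap target. The sketch below is illustrative only, assuming a tabular Q stored as a list of per-state action-value lists; the function names and default hyperparameters are hypothetical and not taken from the assignment code.

```python
def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """On-policy update: bootstrap from the action actually taken next."""
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])


def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Off-policy update: bootstrap from the greedy action in the next state."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


# Two states, two actions; state 1 already has non-zero value estimates
Q_sarsa = [[0.0, 0.0], [1.0, 2.0]]
Q_qlearn = [[0.0, 0.0], [1.0, 2.0]]

# Same transition (s=0, a=0, r=1, s'=1); SARSA's next action is the non-greedy one
sarsa_update(Q_sarsa, 0, 0, 1.0, 1, 0, alpha=0.5, gamma=1.0)
q_learning_update(Q_qlearn, 0, 0, 1.0, 1, alpha=0.5, gamma=1.0)
```

When the next action is exploratory, SARSA's estimate reflects the behavior policy while Q-learning's reflects the greedy policy, which is why the two can learn different paths in stochastic or windy grid worlds.
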
- Feedforward Neural Networks (FNN) from Scratch (IITM CS6910 PA1) (Repo / W&B Report)
- Built an end-to-end NumPy-only FNN for Fashion-MNIST classification, integrating six optimizers (SGD, Momentum, NAG, RMSProp, Adam, Nadam), four activations (sigmoid, tanh, ReLU, softmax), two losses (MSE, Cross-Entropy), weight initialization (Xavier, random), regularization (L1, L2), early stopping, and W&B-driven hyperparameter sweeps.
- Tech Stack: Python, NumPy, Matplotlib, Seaborn, Scikit-learn, Weights & Biases
- Keywords: feedforward-NN, backpropagation, optimizers, activation-functions, initialization, regularization, hyperparameter-tuning
- Convolutional Neural Networks (CNN) (IITM CS6910 PA2) (Repo / W&B Report)
- A two-part project: (i) trained a CNN from scratch in PyTorch with Bayesian hyperparameter optimization via W&B sweeps (tuning filters, kernel sizes, batch norm, dropout, augmentation), with filter visualization and guided backpropagation for interpretability, and (ii) fine-tuned a pre-trained CNN for performance benchmarking and comparison.
- Tech Stack: Python, PyTorch, OpenCV, Weights & Biases
- Keywords: CNN, Hyperparameter Optimization, Bayesian Optimization, Data Augmentation, Filter Visualization, Guided Backpropagation, Interpretability, W&B
- Sequence-to-Sequence Learning (RNN) (IITM CS6910 PA3) (Repo / W&B Report)
- A four-part project on English-to-Malayalam transliteration using the Aksharantar dataset: (i) model seq2seq tasks with RNNs, (ii) compare vanilla RNN, LSTM, and GRU architectures, (iii) explore how attention mechanisms address the limitations of basic seq2seq models, and (iv) visualize component interactions within RNN-based models.
- Tech Stack: Python, PyTorch, Weights & Biases
- Keywords: Seq2Seq, Attention Mechanisms, RNN, LSTM, GRU, Transliteration, Encoder-Decoder, Attention Heatmaps, NLP
- Advanced Information Retrieval System (IITM CS6370) (Repo / Report)
- Built a hybrid search engine combining TF–IDF VSM, LSA, and a BERT-based reranker for top-k retrieval, with end-to-end evaluation (Precision@k, MAP, nDCG) on the Cranfield and Brown corpora.
- Tech Stack: Python, Scikit-learn, Gensim, PyTorch, Transformers
- Keywords: Information Retrieval, TF–IDF, LSA, ESA, Word2Vec, BERT Reranking, Evaluation Metrics, NLP, Semantic Search
- Dell Tweets Sentiment Analysis (Kaggle)
- Performed end-to-end NLP on 25k Dell tweets (2022), including text cleaning (tokenization, lemmatization, stop-word filtering), TF-IDF vectorization, word cloud visualization, hybrid CNN-LSTM model training (cross-entropy loss, Adam optimizer), and real-time deployment via Streamlit.
- Tech Stack: Python, NLTK, Scikit-learn, TensorFlow/Keras, Streamlit
- Keywords: Sentiment Analysis, NLP, TF-IDF, CNN-LSTM, Word Cloud, Text Preprocessing, Streamlit Deployment
- Cereals Recommendation System (Repo / Report / PPT)
- Built a clustering-based recommendation engine using K-Means (Jaccard/Euclidean) and Hierarchical clustering (Ward linkage), validated with elbow method and silhouette scores.
- Tech Stack: Python, Scikit-learn, Pandas, Matplotlib
- Keywords: Recommendation Systems, Clustering Algorithms, Cluster Validation, Unsupervised Learning, Pattern Recognition
- House Price Prediction – Kaggle (Top 35%) (Repo)
- Ranked in the top 35% (1427/4264) of the Kaggle competition by applying XGBoost, Random Forest, and regularized regression (Lasso/Ridge/Elastic Net) with GridSearchCV/RandomizedSearchCV tuning (RMSE: 0.13).
- Tech Stack: Python, Scikit-learn, XGBoost, Pandas, NumPy
- Keywords: Regression Analysis, Hyperparameter Optimization, Regularization Techniques, Model Stacking, Kaggle Competition
- Mathematical Essays on Core ML Algorithms
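- Authored a series of mathematical essays (formatted in IEEE style using LaTeX) dissecting the theoretical underpinnings, derivations, and applications of fundamental ML algorithms, including linear regression, logistic regression, decision trees, random forests, Naive Bayes, and SVMs.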
- Tech Stack: LaTeX, Python (for supporting visualizations/analysis)
- Keywords: ML Theory, Math Foundations, Linear Regression, Logistic Regression, Decision Trees, Random Forest, Naive Bayes, SVM, LaTeX
- Beyond the Horizon: Exploring the Impact of AI on Early Cancer Detection & Diagnosis — A Comprehensive Review
- Journal: Computers in Biology and Medicine
- Submission Date: January 2025
- Manuscript ID: CIBM-D-25-00543
- Status: Under Review
Certificate/Specialization | Provider | Date Completed | Link ID |
---|---|---|---|
Advanced Large Language Model Agents | UC Berkeley | May 2025 | Expected May 31, 2025 |
Linguistic Linked Data – Advanced Topics | German UDS Academy | May 2025 | View Certificate |
Linguistic Linked Data – Essentials | German UDS Academy | Apr 2025 | View Certificate |
Natural Language Processing | Udemy, Inc. | Aug 2023 | View Certificate |
The Complete Python Bootcamp | Udemy, Inc. | Aug 2023 | View Certificate |
Mathematics for ML & DS Specialization | DeepLearning.AI | Jun 2023 | View Certificate |
Machine Learning Specialization | DeepLearning.AI | Jan 2023 | View Certificate |
Google Digital Marketing & E-commerce Specialization | Google | Jan 2023 | View Certificate |
Google Data Analytics Specialization | Google | Apr 2022 | View Certificate |