Matthew Hondrakis MattHondrakis

👋 Hi, I’m @MattHondrakis
🧠 I’m interested in Probability/Statistics, Actuarial Science, and Data Science.
🌱 I received a Data Scientist Associate Certification from DataCamp and the Google Data Analytics Certificate from Coursera.
🏆 I have a BS in Applied Mathematics from The City College of New York.
📫 How to reach me: hondrakma@gmail.com

Datasets I found most interesting:

🧩 => Structured Analysis
💫 => Unstructured Chronological Analysis
💻 => Machine Learning Model

NYC House Prices (DataAnalysis) 💫 💻
- GAM, Random Forest, and Linear Regression models predicting the prices of real estate properties in NYC. The type of property (Condo, Apartment, etc.) is extracted from the home_details variable, which plays a crucial role in the modeling process. The models are then compared against each other on key metrics, such as R² and Root Mean Squared Error.
Job Placement (DataAnalysis) 🧩 💻
- Validated the data by checking for, and appropriately dealing with, missing values and outliers. Explored trends and correlations between variables using visualizations and statistical tests. Subsequently, created two models (Random Forest and Logistic Regression) predicting whether an individual received a job offer; the best model reached an accuracy of 0.853 and an AUC of 0.932. Finally, analyzed the variable importance of each predictor in both models and compared the results.
Coursera Case Study: Bikes (DataAnalysis) 🧩
- Google Analytics Case Study (fictional company Cyclistic), analyzing a large dataset of more than 6 million rows of data using R to extract insights. The purpose of the Case Study is to get casual users to convert to memberships. Thorough exploratory data analysis with a conclusion providing suggestions for improvement and steps moving forward.
Starbucks (First-Git) 💫 💻
- One of the first real-world datasets I ever worked on.
- Logistic Regression model predicting whether a drink is a Frappuccino based on sodium (mg). The status of 'Frappuccino' is extracted from the name of the drink using text manipulation.
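A minimal sketch in Python of the two steps described for the Starbucks dataset: flagging Frappuccinos from the drink name, then fitting a one-predictor logistic regression on sodium. This is not the original analysis code. The rows in `drinks` are made-up values for illustration, and the helper names `is_frappuccino` and `fit_logistic` are hypothetical. The fit uses plain gradient descent on standardized sodium so that the example needs only the standard library.

```python
import math
import re

# Hypothetical rows (drink name, sodium in mg); NOT taken from the real dataset.
drinks = [
    ("Caffe Latte", 150),
    ("Caramel Frappuccino", 220),
    ("Brewed Coffee", 10),
    ("Mocha Frappuccino Light", 200),
    ("Iced Tea", 5),
    ("Java Chip Frappuccino", 260),
    ("Hot Chocolate", 230),
    ("Coffee Frappuccino", 140),
]


def is_frappuccino(name: str) -> int:
    """Return 1 if the drink name mentions 'frappuccino' (case-insensitive), else 0."""
    return int(re.search(r"frappuccino", name, flags=re.IGNORECASE) is not None)


def fit_logistic(x, y, lr=0.1, steps=5000):
    """Fit p(y=1) = sigmoid(b0 + b1*z) by gradient descent, with z = standardized x.

    Returns a function mapping a raw x value to a predicted probability.
    """
    mean = sum(x) / len(x)
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    z = [(v - mean) / sd for v in x]
    b0 = b1 = 0.0
    for _ in range(steps):
        p = [1 / (1 + math.exp(-(b0 + b1 * zi))) for zi in z]
        # Gradient of the mean negative log-likelihood with respect to b0 and b1.
        g0 = sum(pi - yi for pi, yi in zip(p, y)) / len(y)
        g1 = sum((pi - yi) * zi for pi, yi, zi in zip(p, y, z)) / len(y)
        b0 -= lr * g0
        b1 -= lr * g1

    def predict(value):
        return 1 / (1 + math.exp(-(b0 + b1 * (value - mean) / sd)))

    return predict


sodium = [mg for _, mg in drinks]
labels = [is_frappuccino(name) for name, _ in drinks]
predict = fit_logistic(sodium, labels)

# Predicted probability of being a Frappuccino at two sodium levels.
print(round(predict(20), 3), round(predict(250), 3))
```

In this toy data the Frappuccinos tend to have more sodium, so the predicted probability rises with sodium. Whether that holds in the real dataset depends on the actual values.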

Most Recent/Actively working on:

Analyses are usually done in R but sometimes replicated in Python

Dataset: Premier League

R (TidyTuesday)

Dataset: Egg Production

R (TidyTuesday)

Note: For the sake of practice, I tend to jump from one dataset to the next.

Featured Visuals: Click Image to View Analysis
