GitHub - Reddi-Srija-R/Data-wrangling: Comprehensive Data Wrangling Techniques

Reddi-Srija-R/Data-wrangling

Data Wrangling

Overview

This repository applies data wrangling techniques to a diabetes prediction machine learning project. It is organized into two main parts:

  1. CSV Data Processing: End-to-end handling of the CSV data, covering exploratory data analysis (EDA), data cleaning, transformation, and analysis.
  2. JSON Data Handling: A Python script that extracts, cleans, and transforms JSON data, built on Object-Oriented Programming (OOP) principles.

CSV Data Processing

This part of the project handles the CSV data used for diabetes prediction, covering:

a. Exploratory Data Analysis (EDA):

  • Analyzed dataset characteristics.
  • Generated summary statistics and visualizations.
  • Identified patterns, trends, and outliers.
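
As an illustration of the EDA steps above, here is a minimal sketch using pandas. The sample values and the column names (`glucose`, `bmi`, `outcome`) are assumptions made for this example and are not taken from the repository's dataset. It computes summary statistics, counts the target classes, and flags outliers with the common 1.5 × IQR rule.

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
df = pd.DataFrame({
    "glucose": [85, 168, 122, 99, 300],
    "bmi": [26.6, 43.1, 27.6, 25.4, 31.0],
    "outcome": [0, 1, 0, 0, 1],
})

# Summary statistics per numeric column (count, mean, std, quartiles, ...).
summary = df.describe()

# Class balance of the target column.
class_counts = df["outcome"].value_counts()

# Flag glucose outliers with the 1.5 * IQR rule.
q1, q3 = df["glucose"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["glucose"] < q1 - 1.5 * iqr) | (df["glucose"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]

print(summary)
print(class_counts)
print(outliers)
```

The IQR rule is only one possible outlier criterion; the repository may use a different one.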

b. Data Cleaning:

  • Managed missing values and outliers.
  • Corrected data inconsistencies and removed duplicates.
  • Filtered out irrelevant features.
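
The cleaning steps above could look roughly like the following pandas sketch. The data, the column names (`patient_id`, `notes`, and so on), and the choice of median imputation are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data with one duplicated row and one missing value.
df = pd.DataFrame({
    "patient_id": [1, 2, 2, 3, 4],
    "glucose": [85.0, 168.0, 168.0, np.nan, 99.0],
    "bmi": [26.6, 43.1, 43.1, 27.6, 25.4],
    "notes": ["a", "b", "b", "c", "d"],
})

cleaned = (
    df.drop_duplicates()          # remove exact duplicate rows
      .drop(columns=["notes"])    # drop a feature assumed to be irrelevant
      .reset_index(drop=True)
)

# Impute missing glucose values with the column median (an assumed strategy).
cleaned["glucose"] = cleaned["glucose"].fillna(cleaned["glucose"].median())

print(cleaned)
```

Median imputation is used here because it is robust to outliers; dropping the affected rows or using a model-based imputer are equally valid alternatives.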

c. Data Transformation:

  • Applied normalization and scaling techniques.
  • Encoded categorical variables.
  • Engineered new features for improved model performance.
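
A minimal sketch of the three transformation steps above, assuming hypothetical columns (`glucose`, `weight_kg`, `height_m`, `gender`). Min-max scaling, one-hot encoding, and a derived BMI feature are chosen here as examples; the repository may apply different techniques.

```python
import pandas as pd

# Hypothetical input; column names and values are assumptions.
df = pd.DataFrame({
    "glucose": [80.0, 120.0, 160.0],
    "weight_kg": [60.0, 80.0, 100.0],
    "height_m": [1.60, 1.80, 2.00],
    "gender": ["F", "M", "F"],
})

# Min-max scaling: map glucose onto the range [0, 1].
g = df["glucose"]
df["glucose_scaled"] = (g - g.min()) / (g.max() - g.min())

# One-hot encode the categorical column into gender_F / gender_M.
df = pd.get_dummies(df, columns=["gender"], prefix="gender")

# Engineered feature: body mass index from weight and height.
df["bmi"] = df["weight_kg"] / df["height_m"] ** 2

print(df)
```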

d. Data Analysis:

  • Conducted statistical analyses and hypothesis testing.
  • Evaluated the influence of features on diabetes prediction.
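
One way to test whether a feature differs between diabetic and non-diabetic groups is Welch's t-test. The sketch below computes only the t-statistic, using the standard library; the sample values are made up, and the README does not say which tests the project actually runs.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples with unequal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical glucose readings for the two outcome groups.
diabetic = [150.0, 160.0, 170.0, 180.0]
non_diabetic = [90.0, 100.0, 110.0, 120.0]

t_stat = welch_t(diabetic, non_diabetic)
print(f"t = {t_stat:.4f}")
```

A large absolute t-statistic suggests the group means differ. Obtaining a p-value additionally requires the t distribution, for example via `scipy.stats.ttest_ind` with `equal_var=False`.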

JSON Data Handling

This part provides a Python script for working with JSON data, featuring:

a. Data Extraction:

  • Reads and parses JSON data files.

b. Data Cleaning:

  • Cleans JSON data using custom methods.

c. Data Transformation:

  • Converts data into a format suitable for further analysis or model training.

d. OOP Concepts:

  • Uses classes and objects to manage data processing tasks.

e. Logging and Exception Handling:

  • Implements logging for execution tracking and debugging.
  • Includes exception handling to address potential errors.
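
The pieces described above can be combined in a single class. The following is a minimal sketch, not the repository's script: the class name `JsonWrangler`, its methods, and the field names are hypothetical, and the JSON is parsed from a string rather than a file to keep the example self-contained.

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("json_wrangler")


class JsonWrangler:
    """Hypothetical helper that extracts, cleans, and transforms JSON records."""

    def __init__(self, required_fields):
        self.required_fields = list(required_fields)

    def extract(self, text):
        """Parse a JSON array of objects; return [] if the text is invalid."""
        try:
            records = json.loads(text)
        except json.JSONDecodeError as exc:
            logger.error("Invalid JSON: %s", exc)
            return []
        logger.info("Parsed %d records", len(records))
        return records

    def clean(self, records):
        """Keep only records where every required field is present and non-null."""
        kept = [
            r for r in records
            if all(r.get(field) is not None for field in self.required_fields)
        ]
        logger.info("Dropped %d incomplete records", len(records) - len(kept))
        return kept

    def transform(self, records):
        """Pivot a list of records into a column-oriented dict of lists."""
        return {field: [r[field] for r in records] for field in self.required_fields}


raw = '[{"age": 50, "glucose": 148}, {"age": 31, "glucose": null}]'
wrangler = JsonWrangler(["age", "glucose"])
columns = wrangler.transform(wrangler.clean(wrangler.extract(raw)))
print(columns)
```

The column-oriented output is one format that loads directly into a DataFrame or a model's feature matrix; a list of cleaned records would work just as well.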

Thank you!
