
dyeee/MEV-Boost-Project

Ethereum MEV Blocks Analysis Project

This project includes several Python scripts that manage and analyze Ethereum MEV block data.

Jupyter Notebook versions of the MEV_boost_EDA, MEV_boost_ML, and feature_select scripts are also provided for easier viewing and interaction. Development was done mainly in the Spyder IDE, whose variable explorer is convenient for debugging and which gave faster model training. This matters because the dataset exceeds 3 million entries, which can strain the Jupyter kernel and slow down processing.

For a deeper understanding of this research, watch the video on YouTube: Ethereum MEV Blocks Analysis

Current Status

  • The get_parquet.py script has completed execution. The output file is stored in the data folder.
  • The data_process.py script is imported by the other scripts and does not need to be run on its own.
  • The images generated by the MEV_boost_EDA.py and MEV_boost_ML.py scripts are stored in the graphs folder.

Scripts Description

get_parquet.py

  • Purpose: Verifies the format of the file ethereum_mev_blocks_19580000_to_19589999.parquet.
  • Functionality:
    • Performs basic data processing.
    • Saves the processed data as a CSV file.

data_process.py

  • Purpose: Handles data cleaning and feature engineering.
  • Functionality:
    • Processes the payload data to identify the winning bids within the bids data.
    • Stores the results in a DataFrame called matched_df.
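
The matching step can be illustrated with a small pandas join. This is a minimal sketch, not the code in data_process.py: it assumes both tables carry `slot` and `block_hash` columns (names borrowed from the SQL query in this README), and the helper name `match_winning_bids` and the toy rows are hypothetical.

```python
import pandas as pd


def match_winning_bids(bids: pd.DataFrame, payloads: pd.DataFrame) -> pd.DataFrame:
    """Return the bids whose (slot, block_hash) matches a delivered payload."""
    # Keep one payload row per key so the join cannot duplicate bid rows.
    winners = payloads[["slot", "block_hash"]].drop_duplicates()
    # Inner join: only bids that correspond to a delivered payload survive.
    matched = bids.merge(winners, on=["slot", "block_hash"], how="inner")
    return matched.sort_values("slot").reset_index(drop=True)


# Toy data, made up for illustration only.
bids = pd.DataFrame({
    "slot": [100, 100, 100, 101],
    "block_hash": ["0xa", "0xb", "0xb", "0xc"],
    "relay": ["r1", "r1", "r2", "r1"],
    "value": [5, 7, 7, 3],
})
payloads = pd.DataFrame({"slot": [100, 101], "block_hash": ["0xb", "0xd"]})

matched_df = match_winning_bids(bids, payloads)
print(matched_df)
```

Joining on both `slot` and `block_hash` is a design choice of this sketch: it avoids treating a losing bid in the same slot as a winner.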

MEV_boost_EDA.py

  • Purpose: Performs statistical analysis on various datasets.
  • Functionality:
    • Analyzes the bids data, payload data, and matched_df DataFrame.
    • Provides insights and visualizations to understand the characteristics and distributions within these datasets.
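
As a rough idea of the kind of summary such an analysis might start from, the sketch below aggregates bid values per relay. The `relay` and `value` column names come from the SQL query in this README. The helper `summarize_by_relay` and the sample rows are hypothetical and are not taken from MEV_boost_EDA.py.

```python
import pandas as pd


def summarize_by_relay(df: pd.DataFrame) -> pd.DataFrame:
    """Count, mean, median and max of `value` for each relay, busiest relay first."""
    return (
        df.groupby("relay")["value"]
        .agg(["count", "mean", "median", "max"])
        .sort_values("count", ascending=False)
    )


# Toy rows, made up for illustration only.
sample = pd.DataFrame({
    "relay": ["r1", "r1", "r1", "r2"],
    "value": [1.0, 2.0, 6.0, 4.0],
})

summary = summarize_by_relay(sample)
print(summary)
```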

MEV_boost_ML.py

  • Purpose: Dedicated to model training, evaluation, and optimisation.
  • Functionality:
    • Uses cleaned and processed data to train machine learning models.
    • Evaluates model performance.
    • Selects the best hyperparameters and slot ranges.

Dataset Sources

The data used in this project is sourced from the following:

This repository contains a collection of public domain Ethereum MEV-Boost winning bid data.

Important Note

The original dataset is large; model training was originally set up with 3 million records. The copy included in this GitHub repository has been reduced to 500,000 records for efficiency. The dataset file used is Eden_MEV-Boost_bid_20240404.csv.

Due to the reduced sample size, you might encounter errors in the slot_range parameter while running the script MEV_boost_ML.py. Specifically, if you use the following lines:

best_train_size1, best_rf1 = RF_turning(8787590, 1201, ...)
best_train_size2, best_rf2 = RF_turning(8787590, 1201, ...)
best_train_size3, best_rf3 = RF_turning(8787590, 1201, ...)

You may encounter issues because the reduced dataset is too small for the 1201 setting. To resolve this, you can:

  1. Adjust the Slot Range Parameter: Change the 1201 value to 201 in the RF_turning function calls:

    best_train_size1, best_rf1 = RF_turning(8787590, 201, ...)
    best_train_size2, best_rf2 = RF_turning(8787590, 201, ...)
    best_train_size3, best_rf3 = RF_turning(8787590, 201, ...)
  2. Download the Full Dataset: Alternatively, you can query and download the complete dataset from Eden Public Data. Use the following SQL query to retrieve the data:

    SELECT block_timestamp, relay, slot, block_hash, gas_used, value, num_tx, block_number, timestamp, optimistic_submission
    FROM `eden-data-public.mev_boost.bids`
    WHERE TIMESTAMP_TRUNC(block_timestamp, DAY) BETWEEN TIMESTAMP("2024-02-01") AND TIMESTAMP("2024-04-04")
    ORDER BY block_timestamp DESC
    LIMIT 3000000

    After downloading the complete dataset, replace the Eden_MEV-Boost_bid_20240404.csv file in your local directory with the newly downloaded file.
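
Before choosing between the two options, it can help to check how many distinct slots the local CSV actually contains. The sketch below is only a heuristic. It assumes the failure occurs when the file holds fewer distinct slots than the requested range; how RF_turning uses its arguments is not shown in this README. It also assumes the `slot` column holds integers. The helper names and the toy CSV are hypothetical.

```python
import csv
import io


def distinct_slots(csv_file) -> set:
    """Collect the distinct values of the `slot` column from an open CSV file."""
    reader = csv.DictReader(csv_file)
    return {int(row["slot"]) for row in reader}


def largest_usable_range(slots, candidates=(1201, 201)):
    """Return the largest candidate range that does not exceed the number of
    distinct slots available, or None if no candidate fits."""
    available = len(slots)
    fitting = [c for c in candidates if c <= available]
    return max(fitting) if fitting else None


# Toy CSV standing in for Eden_MEV-Boost_bid_20240404.csv (values are made up).
rows = "\n".join(f"{8787590 + i},1" for i in range(300))
sample = io.StringIO("slot,value\n" + rows)

slots = distinct_slots(sample)
print(len(slots), largest_usable_range(slots))
```

With a real file, replace the `io.StringIO` object with `open("Eden_MEV-Boost_bid_20240404.csv", newline="")`.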

Installation

Ensure you have Python 3.6+ installed. You can install the required dependencies via pip (note that the scikit-learn package is installed as scikit-learn and imported as sklearn):

pip install pandas numpy matplotlib seaborn scikit-learn
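
To confirm the environment is ready, a quick check like the following can be run after installation. It is a small sketch using only the standard library. The module list mirrors the packages named above, and the helper name `missing_modules` is hypothetical.

```python
import importlib.util

# Import names for the packages listed above; scikit-learn is imported as `sklearn`.
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn", "sklearn"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
print("missing:", missing or "none")
```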
