10000 GitHub - EhsanMehdipour/PFT_gapfilling: This repository provides scripts for applying DINEOF and DINCAE to gap-fill total chlorophyll-a (TChla) and chlorophyll-a concentrations of five major phytoplankton functional types (PFTs) in the Atlantic Ocean.
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

This repository provides scripts for applying DINEOF and DINCAE to gap-fill total chlorophyll-a (TChla) and chlorophyll-a concentrations of five major phytoplankton functional types (PFTs) in the Atlantic Ocean.

License

Notifications You must be signed in to change notification settings

EhsanMehdipour/PFT_gapfilling

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

84 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

DOI DOI

Gapfilling of phytoplankton functional types (PFT)

Objective: This repository contains scripts for implementing and analyzing two well-established satellite gap-filling methods, DINEOF and DINCAE. These methods are applied to fill gaps in total chlorophyll-a (TChla) and chlorophyll-a concentrations for five major phytoplankton functional type (PFT) datasets in the Atlantic Ocean, sourced from the Copernicus Marine Service.

Project: Assessment of gap-filling techniques applied to satellite phytoplankton composition products for the Atlantic Ocean

Gradient-filed

Time-series of Diatom

Requirements

Models:

DINEOF

DINCAE

Datasets:

PFT Dataset ID: cmems_obs-oc_glo_bgc-plankton_my_l3-multi-4km_P1D

SST Dataset ID: METOFFICE-GLO-SST-L4-NRT-OBS-SST-V2

Installation

Use Mamba or Conda for installation of necessary packages

mamba create --name PFT_gapfilling -c conda-forge --file requirements.txt

or

conda create --name PFT_gapfilling -c conda-forge --file requirements.txt

Type II regression: For type II regression in matchup analysis use pylr2. The installation is explained in the package or directly use pip and git:

pip install git+https://github.com/OceanOptics/pylr2.git

Interactive Environment

You can connect the environment to JupyterLab using:

python -m ipykernel install --user --name=PFT_gapfilling --display-name "PFT_gapfilling"

Execute

All the necessary parameters for running all the scripts are stored in the params.py file. You can define the regions of interest (ROI) by changing the data/regions.csv file for defining different regions. The data pipeline is described in the corresponding publication. A brief introduction to the workflow and all the scripts:

flowchart TD
    subgraph **Datasets**
        A1@{shape: cyl, label: "SST
        satellite product" }
        %% A1[SST]
        A2@{shape: cyl, label: "TChla + PFTs 
        satellite products" }
        %% A2[TChla+5PFT]
        A3@{shape: cyl, label: "In situ measurement" }
        %% A3[In situ]
    end

    A1-->D[Resample]
    A2-->E[Log-transform]

    subgraph **Preprocessing**

        D-->F[Merging]
        E-->F
        F-->G[ROI extraction]
        G-->H{Partitioning}

        H-->I1@{shape: cyl, label: "Training dataset"}
        H-->I2@{shape: cyl, label: "Validation dataset"}
        H-->I3@{shape: cyl, label: "Test dataset"}
    end
    I1-->J[Gap-filling]
    I2-->J
    subgraph **Processing**
        J
    end
    J-->K@{shape: cyl, label: "Reconstructed 
    satellite dataset"}
    subgraph **Postprocessing**
        K-->L[Blending]
        L-->N[Evaluation using 
        test]
        K-->M
        I3-->N
    end
    A3-->M[Evaluation using 
        in situ]
Loading
  • params.py: Dictionary containing all necessary directories and parameters.
  • function.py: Extra functions used in several scripts.
  • 1_1_preprocessing (input_analysis).ipynb: Analysis the PFT satellite dataset for missing rate and uncertainty.
  • 1_2_preprocessing (data_partitioning).ipynb: Partitioning the data into training, validation and test datasets.
  • 2_1_processing (DINCAE_random_search).jl: Setting in Julia for random hyperparameters search for DINCAE gap-filling method.
  • 2_2_processing (DINCAE_random_search).sh: SLURM parallelization of DINCAE random hyperparameter search to several HPC nodes.
  • 2_3_processing (DINEOF_random_search).py: Setting in Python for random hyperparameters search for DINEOF gap-filling method.
  • 2_4_processing (DINEOF_random_search).sh: SLURM parallelization of DINEOF hyperparameter random search to several HPC nodes.
  • 2_5_processing (DINCAE_loss_function).ipynb: Visualisation of the DINCAE loss function progress.
  • 3_1_processing (DINCAE_final).jl: Optimal reconstruction run setting in Julia for DINCAE gap-filling method.
  • 3_2_processing (DINCAE_final).sh: SLURM parallelization of DINCAE optimal reconstruction of different areas to several HPC nodes.
  • 3_3_processing (DINEOF_final).py: Optimal reconstruction run setting in Python for DINEOF gap-filling method.
  • 3_4_processing (DINEOF_final).sh: SLURM parallelization of DINEOF optimal reconstruction of different areas to several HPC nodes.
  • 4_1_postprocessing (cross_validation_DINCAE).ipynb: Cross-validation between validation and test dataset with DINCAE gap-filled product.
  • 4_1_postprocessing (cross_validation_DINEOF).ipynb: Cross-validation between validation and test dataset with DINEOF gap-filled product.
  • 4_2_postprocessing (cross_validation_error_distribution).ipynb: Plot the spatial distribution of cross-validation error.
  • 5_1_postprocessing (degree_of_smoothing).ipynb: Compute the degree of smoothing.
  • 5_2_postprocessing (gradient_field).ipynb: Compute the gradient field using sobel edge detection algorithm.
  • 6_1_postprocessing (validation_merging_regions).ipynb: Merging the original satellite dataset and gap-filled product by using feathering.
  • 6_2_postprocessing (validation_extract_matchups).ipynb: Extracting the matchups between in situ measurement and merged satellite products.
  • 6_3_postprocessing (validation_plot_matchups).ipynb: Independent validation using matchups between in situ measurement and satellite products.
  • 7_1_postprocessing (time-series).ipynb: Visualisation of the time series of the satellite products.

Credit

© Ehsan Mehdipour, 2025. (ehsan.mehdipour@awi.de)

Alfred Wegener Insitute for Polar and Marine Research, Bremerhaven, Germany

This work is licensed under the GNU General Public License v3.0 (GPL-3.0).

About

This repository provides scripts for applying DINEOF and DINCAE to gap-fill total chlorophyll-a (TChla) and chlorophyll-a concentrations of five major phytoplankton functional types (PFTs) in the Atlantic Ocean.

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

0