hydrodatasource

Free software: BSD license
Documentation: https://WenyuOuyang.github.io/hydrodatasource

Overview

Although numerous public watershed hydrological datasets are available, there are still challenges in this field:

Many datasets are not updated, or are left out of subsequent versions, after they are first organized.
Some datasets are not covered by any existing collection.
Non-public datasets cannot be directly shared.

To address these issues, hydrodatasource provides a framework to organize and manage these datasets so that they can be used more efficiently in watershed-based research and production scenarios.

This repository works in conjunction with hydrodataset, which focuses on public datasets for hydrological modeling. In contrast, hydrodatasource integrates a broader range of data resources, including non-public and custom datasets.

Data Classification and Sources

hydrodatasource processes data that primarily falls into three categories:

Category A Data (Public Data)

These are typically publicly available hydrological datasets from academic papers, currently including:

GAGES dataset
GRDC dataset
CRD and other reservoir datasets

Category B Data (Non-Public Data)

These datasets are often proprietary or confidential and require specific tools for formatting and integration, including:

Custom Station Data: User-prepared station data formatted according to standard specifications and converted to NetCDF format.

Category C Custom Datasets

Building on Category A and Category B data, we also organize custom hydrological datasets: datasets constructed for specific research needs that follow agreed standard formats.

Features and Highlights

Unified Data Management

hydrodatasource provides standardized methods for:

Structuring datasets according to predefined conventions.
Integrating various data sources into a unified framework.
Supporting data access and processing for hydrological modeling.

Compatibility with Local and Cloud Resources

Public Data: Supports data format conversion and local file operations.
Non-Public Data: Provides tools to format and integrate user-prepared data.

Modular Design

The repository structure supports diverse workflows, including:

Category A Datasets: Tools to organize and access public hydrological datasets.
Category B Data: Custom tools to clean and process station, reservoir, and basin time-series data.
Category C Custom Datasets: Support for reading data in defined standard dataset formats.

Other Interactions

hydrodatasource interacts with the following components:

hydrodataset: Gives hydrodatasource access to public watershed datasets for hydrological modeling.
HydroDataCompiler: Supports semi-automated processing of non-public and custom data (currently not public).

Installation

Install the package via pip:

pip install hydrodatasource

Note: The project is still in the early stages of development, so installing it in development mode is recommended.

Usage

Data Organization

The repository adopts the following directory structure for organizing data:

├── ClassA
│   ├── 1st_origin
│   └── 2nd_process
├── ClassB
│   ├── 1st_origin
│   └── 2nd_process
└── ClassC

1st_origin: Raw data, often from proprietary sources, in unified formats.
2nd_process: Intermediate results after initial processing, as well as data ready for analysis or modeling.
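
The layout described above can be created on disk with a few lines of standard-library Python. This is a minimal sketch, not part of hydrodatasource itself: the folder names come from the structure shown here, while the helper name `create_data_layout`, the `LAYOUT` mapping, and the choice to leave ClassC without subfolders are assumptions made for illustration.

```python
from pathlib import Path
import tempfile

# Folder names follow the layout described in this README. The mapping itself
# and the empty ClassC entry are assumptions made for this sketch.
LAYOUT = {
    "ClassA": ["1st_origin", "2nd_process"],
    "ClassB": ["1st_origin", "2nd_process"],
    "ClassC": [],
}


def create_data_layout(root):
    """Create the category and stage folders under root and return their paths."""
    root = Path(root)
    created = []
    for category, stages in LAYOUT.items():
        category_dir = root / category
        # exist_ok=True makes the helper safe to call on an existing layout.
        category_dir.mkdir(parents=True, exist_ok=True)
        created.append(category_dir)
        for stage in stages:
            stage_dir = category_dir / stage
            stage_dir.mkdir(exist_ok=True)
            created.append(stage_dir)
    return created


if __name__ == "__main__":
    # Build the layout in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        for path in create_data_layout(tmp):
            print(path.relative_to(tmp))
```

Because existing folders are left untouched, the helper can be re-run when a new category or stage is added to the mapping.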

Data Reading

The data reading code is mainly located in the reader folder. Currently, the main interface functions provided are:

Reading GRDC, GAGES, CRD and other datasets
Reading custom station data
Reading custom datasets for hydrological modeling

We will provide more detailed documentation in the future.
