Stars
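A knowledge base for data engineering and big data technology, covering mainstream frameworks such as Hadoop, Hive, Spark, and Flink and their related frameworks, as well as data middle platforms, data lakes, data governance, data warehouse construction, and data-driven transformation.
A collection of Apache Hudi demos to help beginners get started with Apache Hudi quickly.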
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
The official home of the Presto distributed SQL query engine for big data
Virtual whiteboard for sketching hand-drawn like diagrams
Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
Make stream processing easier! Easy-to-use streaming application development framework and operation platform.
An experimental materialized view solution based on TiDB/TiKV and Flink with strong consistency support.
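Mirrors: https://scaffrey.coding.net/p/hosts/git / https://git.qvq.network/googlehosts/hosts
A visual web interface for DataX: select data sources to generate data synchronization tasks in one click. Supports data sources such as RDBMS, Hive, HBase, ClickHouse, and MongoDB; batch creation of RDBMS synchronization tasks; integration with an open-source scheduling system; distributed execution; incremental data synchronization; real-time viewing of run logs; monitoring of executor resources; killing running processes; and encryption of data source information.
[Java Interview + Java Learning Guide] Covers the core knowledge that most Java programmers need to master.
A web system built for Flink. Supports defining UDFs in the web page and submitting SQL and JAR jobs; supports management of sources, sinks, and jobs; can manage Flink clusters on OpenShift.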
Logback appender for Apache Kafka
Java library and command-line application for converting LightGBM models to PMML
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
Wormhole is a SPaaS (Stream Processing as a Service) Platform
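Supports agile DataOps based on Flink, DataX, Flink-CDC, and Chunjun, with a Web UI
Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, plus large-scale production case studies of Flink (PVUV, log storage, tens of billions of records in real time…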
SQL-based streaming analytics platform at scale