Stars
A better notebook for Scala (and more)
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials,…
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Fault-tolerant job scheduler for Mesos which handles dependencies and ISO 8601-based schedules
Painlessly create beautiful matplotlib plots.
Recipes for using Python's pandas library
An interactive data visualization tool that brings matplotlib graphics to the browser using D3.