8000 GitHub - jklukas/sql-mysteries: Materials for SQL Mysteries Revealed: Window functions, Explain Plans, and Distributed Data
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
This repository was archived by the owner on Jan 22, 2022. It is now read-only.
/ sql-mysteries Public archive

Materials for SQL Mysteries Revealed: Window functions, Explain Plans, and Distributed Data

Notifications You must be signed in to change notification settings

jklukas/sql-mysteries

Repository files navigation

SQL Mysteries Revealed

Window functions, Explain Plans, and Distributed Data

Iterations

This talk has been presented at:

Viewing content

This repo collects some Jupyter notebooks used for teaching. You should be able to view the static notebooks directly in GitHub without installing anything on your machine.

Overview

While the examples given here use PostgreSQL and Amazon Redshift, the concepts and syntax mostly adhere to the SQL standard, so are applicable to other relational databases like MySQL and SQL server.

Here's the general flow of the talk:

  • Define and get familiar with the small dataset we'll be using
  • Review the syntax for SQL aggregations (function like SUM and COUNT)
  • Introduce window functions as an extension of aggregations
  • Use a window function to build a query for recreating current state of a table from a history of changes
  • Introduce Amazon Redshift and data warehouses
  • Iterate on our query and table structure to make them perform well with 1 TB+ of data

Further reading

Setting up the environment

If you want to execute these notebooks, you'll need to have a few things installed.

First, you'll need python 3 installed (which is beyond the scope of these instructions), then:

git clone https://github.com/jklukas/sql-mysteries
cd sql-mysteries/
python3 -m venv ./venv
source venv/bin/activate
pip install -r requirements.txt

Then fire up the Jupyter notebook server:

jupyter notebook

You'll also need an instance of Postgres to connect to. A good option is to install Docker and then launch:

docker run -e POSTGRES_PASSWORD=mysecretpassword -d -p5432:5432 postgres

To delete the Postgres container:

docker ps  # Look to see what the CONTAINER ID for postgres is
docker stop f0a2e8b1a83e  # replace with the relevant id
docker rm f0a2e8b1a83e

About

Materials for SQL Mysteries Revealed: Window functions, Explain Plans, and Distributed Data

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •  
0