tea-tasting is a Python package for the statistical analysis of A/B tests featuring:
- Student's t-test, Z-test, bootstrap, and quantile metrics out of the box.
- Extensible API that lets you define and use statistical tests of your choice.
- Delta method for ratio metrics.
- Variance reduction using CUPED/CUPAC, which can be combined with the Delta method for ratio metrics.
- Confidence intervals for both absolute and percentage changes.
- Checks for sample-ratio mismatches.
- Power analysis.
- Multiple hypothesis testing (family-wise error rate and false discovery rate).
- Simulated experiments, including A/A tests.
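As an illustration of one feature from the list, a sample-ratio mismatch check for two variants can be sketched as a chi-square goodness-of-fit test using only the standard library. The function name `srm_pvalue` and the user counts are hypothetical, and this is a minimal sketch rather than the package's implementation.

```python
import math


def srm_pvalue(n_control: int, n_treatment: int, expected_ratio: float = 0.5) -> float:
    """Chi-square goodness-of-fit p-value for a two-variant split.

    expected_ratio is the planned share of users in the treatment variant.
    """
    total = n_control + n_treatment
    expected_treatment = total * expected_ratio
    expected_control = total - expected_treatment
    chi2 = (
        (n_control - expected_control) ** 2 / expected_control
        + (n_treatment - expected_treatment) ** 2 / expected_treatment
    )
    # With two variants there is one degree of freedom, and the chi-square
    # survival function reduces to erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical user counts for a planned 50/50 split.
balanced = srm_pvalue(2_000, 2_000)
skewed = srm_pvalue(2_000, 2_300)
```

A very small p-value for the skewed split suggests the observed allocation is unlikely under the planned ratio, which usually points to a problem in assignment or logging rather than a treatment effect.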
tea-tasting calculates statistics directly within data backends such as BigQuery, ClickHouse, DuckDB, PostgreSQL, Snowflake, Spark, and many other backends supported by Ibis. This approach eliminates the need to import granular data into a Python environment.
tea-tasting also accepts dataframes supported by Narwhals: cuDF, Dask, Modin, pandas, Polars, PyArrow.
uv pip install tea-tasting
>>> import tea_tasting as tt
>>> data = tt.make_users_data(seed=42)
>>> experiment = tt.Experiment(
...     sessions_per_user=tt.Mean("sessions"),
...     orders_per_session=tt.RatioOfMeans("orders", "sessions"),
...     orders_per_user=tt.Mean("orders"),
...     revenue_per_user=tt.Mean("revenue"),
... )
>>> result = experiment.analyze(data)
>>> result
            metric control treatment rel_effect_size rel_effect_size_ci pvalue
 sessions_per_user    2.00      1.98          -0.66%      [-3.7%, 2.5%]  0.674
orders_per_session   0.266     0.289            8.8%      [-0.89%, 19%] 0.0762
   orders_per_user   0.530     0.573            8.0%       [-2.0%, 19%]  0.118
  revenue_per_user    5.24      5.73            9.3%       [-2.4%, 22%]  0.123
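The `orders_per_session` metric in this example is a ratio of two per-user means, which is the case the Delta method addresses. The sketch below shows the standard first-order Delta-method approximation of the variance of such a ratio, using only the standard library. The function name `ratio_of_means_variance` and the sample values are hypothetical, and the code illustrates the idea rather than the package's internals.

```python
import statistics


def ratio_of_means_variance(numer: list[float], denom: list[float]) -> float:
    """Approximate Var(mean(numer) / mean(denom)) with the Delta method."""
    n = len(numer)
    mean_n = statistics.fmean(numer)
    mean_d = statistics.fmean(denom)
    var_n = statistics.variance(numer)  # sample variance, n - 1 denominator
    var_d = statistics.variance(denom)
    # Sample covariance between the per-user numerator and denominator.
    cov = sum((x - mean_n) * (y - mean_d) for x, y in zip(numer, denom)) / (n - 1)
    # First-order Taylor expansion of the ratio around the two means.
    return (
        var_n / mean_d**2
        - 2 * mean_n * cov / mean_d**3
        + mean_n**2 * var_d / mean_d**4
    ) / n


# Hypothetical per-user values: orders and sessions for six users.
orders = [0.0, 1.0, 0.0, 2.0, 1.0, 0.0]
sessions = [1.0, 3.0, 2.0, 4.0, 2.0, 1.0]
variance = ratio_of_means_variance(orders, sessions)
```

The covariance term matters: when orders are exactly proportional to sessions, the ratio is constant and the approximation returns zero variance.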
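Learn more in the detailed user guide. Additionally, see the guides on more specific topics: data backends, power analysis, multiple hypothesis testing, custom metrics, and simulated experiments.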
The tea-tasting repository includes examples, which are copies of the guides in the marimo notebook format. You can either download them from GitHub and run them in your marimo environment, or run them as WASM notebooks in the online playground.
To run the examples in your marimo environment, clone the repository and change the directory:
git clone git@github.com:e10v/tea-tasting.git && cd tea-tasting
Install marimo, tea-tasting, and other packages used in the examples:
uv venv && uv pip install marimo tea-tasting polars "ibis-framework[duckdb]" tqdm
Launch the notebook server:
uv run marimo edit examples
Now you can choose and run the example notebooks.
To run the examples as WASM notebooks in the online playground, open the following links:
- User guide.
- Data backends.
- Power analysis.
- Multiple hypothesis testing.
- Custom metrics.
- Simulated experiments.
WASM notebooks run entirely in the browser on Pyodide and thus have some limitations. In particular:
- Tables and dataframes render less attractively because Pyodide doesn't always include the latest package versions.
- You can't simulate experiments in parallel because Pyodide currently doesn't support multiprocessing.
- Other unpredictable issues may arise, such as the inability to use duckdb with ibis.
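If a notebook has to run both locally and in the playground, one option is to fall back to a single worker under Pyodide. The following is a minimal sketch, assuming Pyodide reports `sys.platform` as `"emscripten"`; `pick_n_workers` is a hypothetical helper and not part of tea-tasting or marimo.

```python
import os
import sys


def pick_n_workers(platform: str = sys.platform) -> int:
    """Return 1 under Pyodide (no multiprocessing), else the CPU count."""
    if platform == "emscripten":
        return 1
    # os.cpu_count() may return None, so fall back to a single worker.
    return os.cpu_count() or 1


n_workers = pick_n_workers()
```

Code that simulates experiments can then run sequentially whenever `n_workers` is 1.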
The package name "tea-tasting" is a play on words that refers to two subjects:
- Lady tasting tea is a famous experiment devised by Ronald Fisher. In this experiment, Fisher developed the null hypothesis significance testing framework to analyze a lady's claim that she could discern whether the tea or the milk had been added to the cup first.
- "tea-tasting" phonetically resembles "t-testing", referencing Student's t-test, a statistical method developed by William Gosset.