Clover: Closed-Loop Verifiable Code Generation [paper]

Abstract

The use of large language models for code generation is a rapidly growing trend in software development. However, without effective methods for ensuring the correctness of generated code, this trend could lead to any number of undesirable outcomes. In this paper, we lay out a vision for addressing this challenge: the Clover paradigm, short for Closed-Loop Verifiable Code Generation, which reduces correctness checking to the more accessible problem of consistency checking. At the core of Clover lies a checker that performs consistency checks among code, docstrings, and formal annotations. The checker is implemented using a novel integration of formal verification tools and large language models. We provide a theoretical analysis to support our thesis that Clover should be effective at consistency checking. We also empirically investigate its feasibility on a hand-designed dataset (CloverBench) featuring annotated Dafny programs at a textbook level of difficulty. Experimental results show that for this dataset, (i) LLMs are reasonably successful at automatically generating formal specifications; and (ii) our consistency checker achieves a promising acceptance rate (up to 87%) for correct instances while maintaining zero tolerance for incorrect ones (no false positives).

Setup Guide

1. Environment Setup

Option 1: Using Python venv

# Create and activate a virtual environment
python3 -m venv clover_env
source clover_env/bin/activate

# Install requirements
pip install "sglang[openai]" tqdm

Option 2: Using the provided configuration

We've created a ZSH configuration file to make working with Clover easier:

source .clover_zshrc

2. Install Dafny

Dafny is required to run the Clover experiments. You can install it with:

macOS

brew install dafny

Linux

For Linux installation (Ubuntu, Debian, Arch), see the detailed instructions in clover_env_setup.md.

Basic Ubuntu/Debian installation:

# Install dependencies
sudo apt install -y mono-complete dotnet-sdk-7.0

# Download and install Dafny
VERSION=4.3.0
wget https://github.com/dafny-lang/dafny/releases/download/v${VERSION}/dafny-${VERSION}-x64-ubuntu-20.04.zip
unzip dafny-${VERSION}-x64-ubuntu-20.04.zip -d dafny
sudo mv dafny /usr/local/
echo 'export PATH="$PATH:/usr/local/dafny"' >> ~/.bashrc
source ~/.bashrc

Other platforms

Download from the Dafny GitHub releases and add it to your PATH.

3. Set OpenAI API Key

Set your OpenAI API key as an environment variable:

export OPENAI_API_KEY=your_api_key_here

Recommended: Use a private key file (not committed to git)

For security, create a private key file that won't be committed to git:

# Create the private key file
cat > private_key.sh << 'EOF'
#!/bin/bash
export OPENAI_API_KEY="your-actual-api-key-here"
EOF

# Make it executable
chmod +x private_key.sh

# Use it
source private_key.sh

This approach is automatically supported by the included .clover_zshrc and set_openai_key.sh scripts.

4. Run Experiments

Quick Testing

cd clover
python exps.py --num-trial 1 --dafny-path $(which dafny) --model gpt-4o

Full Experiment

cd clover
python exps.py --num-trial 50 --dafny-path $(which dafny) --model gpt-4o

Debug Mode

cd clover
python debug_exp.py --num-trial 1 --verbose 2 --model gpt-4o --dafny-path $(which dafny)

Model Options

By default, the code now uses the gpt-4o model. You can specify a different model with the --model parameter:

python exps.py --model gpt-4-1106-preview --dafny-path $(which dafny)

Will receive 87% acceptance rate on CloverBench (Ground-Truth) dataset with gpt-4-1106-preview in March 2024.

Citation

@misc{sun2023clover,
      title={Clover: Closed-Loop Verifiable Code Generation}, 
      author={Chuyue Sun and Ying Sheng and Oded Padon and Clark Barrett},
      year={2023},
      eprint={2310.17807},
      archivePrefix={arXiv},
      primaryClass={cs.SE}
}

LLM Cache Feature

To reduce costs and speed up experiments, Clover now includes a caching system for LLM calls. This system:

Saves responses from OpenAI API calls to avoid redundant requests
Significantly speeds up repeated benchmark runs
Provides detailed statistics on cache performance

Running with Cache

Use the provided script to run Clover with caching enabled:

./run_clover_cached.sh --test abs

Controlling the Cache

You can enable or disable the cache with environment variables:

# Enable caching (default)
export ENABLE_LLM_CACHE=1

# Disable caching
export ENABLE_LLM_CACHE=0

For more details, see README_LLM_CACHE.md.

Name		Name	Last commit message	Last commit date
Latest commit History 96 Commits
clover		clover
clover_end2end		clover_end2end
clover_mbpp		clover_mbpp
dataset		dataset
figures		figures
.DS_Store		.DS_Store
.clover_zshrc		.clover_zshrc
.gitignore		.gitignore
HOW_TO_RUN.md		HOW_TO_RUN.md
LICENSE		LICENSE
README.md		README.md
README_LLM_CACHE.md		README_LLM_CACHE.md
clover_env_setup.md		clover_env_setup.md
run_clover.sh		run_clover.sh
run_clover_cached_with_logs.sh		run_clover_cached_with_logs.sh
set_openai_key.sh		set_openai_key.sh
test_llm_cache.sh		test_llm_cache.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Clover: Closed-Loop Verifiable Code Generation [paper]

Abstract

Setup Guide

1. Environment Setup

Option 1: Using Python venv

Option 2: Using the provided configuration

2. Install Dafny

macOS

Linux

Other platforms

3. Set OpenAI API Key

Recommended: Use a private key file (not committed to git)

4. Run Experiments

Quick Testing

Full Experiment

Debug Mode

Model Options

Citation

LLM Cache Feature

Running with Cache

Controlling the Cache

About

Uh oh!

Releases

Packages

Contributors 3

Uh oh!

Languages

License

ChuyueSun/Clover

Folders and files

Latest commit

History

Repository files navigation

Clover: Closed-Loop Verifiable Code Generation [paper]

Abstract

Setup Guide

1. Environment Setup

Option 1: Using Python venv

Option 2: Using the provided configuration

2. Install Dafny

macOS

Linux

Other platforms

3. Set OpenAI API Key

Recommended: Use a private key file (not committed to git)

4. Run Experiments

Quick Testing

Full Experiment

Debug Mode

Model Options

Citation

LLM Cache Feature

Running with Cache

Controlling the Cache

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Uh oh!

Languages

Packages