DR.FIX is a research project that provides a collection of data race skeletons extracted from Uber's Go codebase. These skeletons are created through program slicing to retain key elements such as racy variables and concurrency constructs, making them valuable for research in concurrency bug detection and analysis.
This repository is part of our research on automated data race detection and fixing. For more details, please refer to our paper: DR.FIX: Automatically Fixing Data Races at Industry Scale.
These data race skeletons are valuable for:
- Training data race detectors
- Analyzing common patterns in Go data races
- Studying the relationship between write operations and race conditions
- Developing new techniques for race detection and prevention
cmd/analyzer/
: Contains the Go analyzer that detects write operations on racy variablesinternal/analyzer/
: Core analysis logic for detecting write operationsscripts/
: Python scripts for processing and verifying the skeletonsdata/skeletons/
: Directory containing the data race skeletonsoutput/
: Directory for analysis results and combined files
- Go 1.16 or later
- Python 3.8 or later
- Make
-
Clone the repository:
git clone https://github.com/uber-research/drfix.git cd drfix
-
Create and activate the Python virtual environment:
python3 -m venv venv source venv/bin/activate # On Windows, use: venv\Scripts\activate pip install -r requirements.txt
Note: You need to activate the virtual environment in each new terminal session where you want to run Python commands.
Build the Go analyzer:
make build
This will compile the analyzer binary and place it in the bin/
directory.
Run the analysis:
make analyze
This will:
- Process all skeleton pairs in
data/skeletons/
- Generate analysis results in
output/analyzer_results.csv
- Create a combined file in
output/combined.go
Run all tests:
make test
This will run both Go and Python tests.
To enable debug output from both the Go analyzer and Python script:
DEBUG=1 make analyze
The Go analyzer (cmd/analyzer/main.go
) performs AST-based analysis to detect write operations on racy variables. It outputs results in JSON format:
{
"file": "path/to/file.go",
"has_write": true,
"line_number": 42,
"line_content": "racyVar0 = 42",
"error": null
}
The Python script (scripts/process.py
) verifies the skeletons and generates a comprehensive CSV report. It checks:
- Write operations on racy variables
- Syntactical validity of Go code
- Presence of required concurrency constructs
- File characteristics and statistics
The verification script generates a CSV file with the following columns:
case_id
: Directory name (e.g., D7959843)file1_name
: Name of the first filefile2_name
: Name of the second file
file1_status
: Boolean indicating if first file contains a write operation (True/False)file1_line
: Line containing write operation in first file (if True)file2_status
: Boolean indicating if second file contains a write operation (True/False)file2_line
: Line containing write operation in second file (if True)has_write
: TRUE if either file contains a write to a racy variablerace_type
: Type of race conditionread-write
: One file reads while the other writes to the same variablewrite-write
: Both files write to the same variable
file1_size
: Size of first file in bytesfile2_size
: Size of second file in bytesfile1_line_count
: Number of lines in first filefile2_line_count
: Number of lines in second filefile1_has_package
: Whether first file has package declaration (True/False)file2_has_package
: Whether second file has package declaration (True/False)file1_has_comments
: Whether first file has comments (True/False)file2_has_comments
: Whether second file has comments (True/False)
error_message
: Any error messages encountered during analysis
Example CSV output:
case_id,file1_name,file1_status,file1_line,file2_name,file2_status,file2_line,has_write,race_type,file1_size,file2_size,file1_line_count,file2_line_count,file1_has_package,file2_has_package,file1_has_comments,file2_has_comments,error_message
D13766163,read2.go,True,"v22, racyVar0 = v2.v37.Func45(v13)",write1.go,True,"v22, racyVar0 = v2.v37.Func45(v13)",TRUE,write-write,1239,1239,60,60,True,True,False,False,
D15314495,read1.go,True,racyVar0 += int64(v10),write2.go,True,racyVar0 += int64(v10),TRUE,read-write,980,980,25,25,True,True,False,False,
D12646115,read2.go,True,"for _, racyVar0 := range v1 {",write1.go,True,"for _, racyVar0 := range v1 {",TRUE,read-write,273,273,13,13,True,True,False,False,
This project is licensed under the MIT License - see the LICENSE file for details.
If you use this work in your research, please cite:
@article{behrang2025drfix,
author = {Behrang, Farnaz and Zhang, Zhizhou and Saioc, Georgian-Vlad and Liu, Peng and Chabbi, Milind},
title = {{DR.FIX: Automatically Fixing Data Races at Industry Scale}},
year = {2025},
issue_date = {June 2025},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {9},
number = {Programming Language Design and Implementation},
articleno = {166},
numpages = {37},
url = {https://doi.org/10.1145/3729265},
doi = {10.1145/3729265},
journal = {Proceedings of the ACM on Programming Languages},
month = {jun},
keywords = {Data Races, Program Analysis, Static Analysis, Dynamic Analysis, Go}
}
We welcome contributions in the following areas:
- New data race skeletons
- Improvements to the verification tools
- Analysis of race patterns and write operations
- Documentation and examples
Please read our Code of Conduct and Contributing Guidelines before contributing.