PRF (Program Repair Framework) provides a framework consisting of various tools from fault localization at different level of granularity to fast, safe patch validation to a highly configurable fix report generation for developing automated program repair (APR) tools. The goal of PRF is to help APR researchers re-use components common to all test-based generate-and-validate APR techniques and make it easier for them to build their prototypes. With the help of extant APR libraries such as ASTOR, developers of APR tools will only need to focus on design and implementation of more effective patch generation algorithms. Furthermore, since PRF is highly configurable, they will be able customize other aspects of their systems, such as fault localization, patch validation, and patch prioritization/filtering.
This document presents an overview of the system. We have also provided the users with a pre-configured patch generation plugin based on the recent APR tool CapGen for demonstration purposes.
This repository contains source code of PRF, so you can directly clone the repository, compile it, and install the Maven plugin on your local machine. You can follow the instructions to clone, compile and install PRF Maven plugin. Along with the Maven plugin, other library files also get installed on your local machine, so the APR system developers can make their own patch generation as well as patch prioritization or filtering plugins. In the section that follows, you will find a couple of example buggy projects that you can try to fix using our patch generation plugin for CapGen.
Step 0: Please read system requirements
section and make sure you have at least Maven, Git, JDK 1.7
(in case you wish to fix Defects4J projects), and JDK 1.8 (to run
the core part of our CapGen plugin) installed on your computer
and the environment variable JAVA_HOME
is pointing to the
installation patch of JDK 1.7. For example, on Unix machine that
I am using, JDK 1.7 is installed under /opt/jdk1.7.0_80
.
So, I use the following command to set JAVA_HOME
variable.
export JAVA_HOME=/opt/jdk1.7.0_80
After this, when I run the command mvn -ver
, I see something like
the following
Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T13:33:14-05:00)
Maven home: /opt/apache-maven-3.5.4
Java version: 1.7.0_80, vendor: Oracle Corporation, runtime: /opt/jdk1.7.0_80/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.15.5", arch: "x86_64", family: "mac"
Step 1: Use the following command to clone the repository on your computer.
git clone https://github.com/ali-ghanbari/prf
Step 2: Navigate to the project directory, compile, and install the project on your local repository by using the following commands.
cd prf
mvn clean install -Dhttps.protocols=TLSv1.2
Downloading all the dependencies and making the JAR files might take up to
a minute depending on the load of your computer and the speed of your
Internet connection. Please note that the command line switch
-Dhttps.protocols=TLSv1.2
is necessary when you are compiling projects
using JDK 1.7 and older. This is a security measure that is in place
since 2018. After seeing (green) BUILD SUCCESS
message on your screen,
your PRF Maven plugin, and the related library files, are ready to use.
During the first reading, you may skip the following subsection and see the examples in the next section so that you can experience PRF in action while a patch generation plugin is installed on it.
Assuming that you have already followed the instructions given in setup section to install PRF Maven plugin, and its companion libraries, in this section we will give detailed instructions for setting up PRF for fixing bugs in arbitrary Maven-based projects.
In order to be able to use PRF Maven plugin, you need to introduce
it as a plugin in the POM file of the target project. This can be done
by adding the following template XML snippet under the <plugins>
tag
in the pom.xml
of the target project.
<plugin>
<groupId>edu.utdallas</groupId>
<artifactId>prf-maven-plugin</artifactId>
<version>1.0-SNAPSHOT</version>
<!-- <configuration> -->
<!-- ***************************** PROFILER OPTIONS ******************************* -->
<!-- <flOptions>OFF</flOptions> -->
<!-- <flStrategy>OCHIAI</flStrategy> -->
<!-- <cgOptions>OFF</cgOptions> -->
<!-- <testCoverage>false</testCoverage> -->
<!-- <failingTests> -->
<!-- <failingTest>fully.qualified.test.Class1::testMethod1</failingTest> -->
<!-- ... -->
<!-- <failingTest>fully.qualified.test.ClassN::testMethodN</failingTest> -->
<!-- </failingTests> -->
<!-- ************************** PATCH GENERATOR OPTIONS *************************** -->
<!-- <patchGenerationPlugin> -->
<!-- <name>dummy-patch-generation-plugin</name> -->
<!-- <parameters> -->
<!-- <parameter1>Value for parameter 1</parameter1> -->
<!-- ... -->
<!-- <parameterN>Value for parameter N</parameterN> -->
<!-- </parameters> -->
<!-- </patchGenerationPlugin> -->
<!-- ************************** PATCH VALIDATOR OPTIONS *************************** -->
<!-- <parallelism>0</parallelism> -->
<!-- <timeoutConstant>5000</timeoutConstant> -->
<!-- <timeoutPercent>0.5</timeoutPercent> -->
<!-- *********************** FIX REPORT GENERATION OPTIONS ************************ -->
<!-- <patchPrioritizationPlugin> -->
<!-- <name>dummy-patch-prioritization-plugin</name> -->
<!-- <parameters> -->
<!-- <parameter1>Value for parameter 1</parameter1> -->
<!-- ... -->
<!-- <parameterK>Value for parameter K</parameterK> -->
<!-- </parameters> -->
<!-- </patchPrioritizationPlugin> -->
<!-- ****************************** GENERAL OPTIONS ******************************* -->
<!-- <targetClasses> -->
<!-- <param>${project.groupId}.*</param> -->
<!-- </targetClasses> -->
<!-- <excludedClasses> </excludedClasses> -->
<!-- <targetTests> -->
<!-- <param>${project.groupId}.*Test</param> -->
<!-- <param>${project.groupId}.*Tests</param> -->
<!-- </targetTests> -->
<!-- <excludedTests> </excludedTests> -->
<!-- <childJVMArgs> -->
<!-- <childJVMArg>-Xmx16g</childJVMArg> -->
<!-- ... -->
<!-- <childJVMArg>Mth argument to the child JVM</childJVMArg> -->
<!-- </childJVMArgs> -->
<!-- </configuration> -->
<!-- ************ DEPENDENCIES FOR EXTERNAL LIBRARIES, E.G., PLUGINS ************** -->
<!-- <dependencies> -->
<!-- <dependency> 1, e.g., your awesome patch generation plugin</dependency> -->
<!-- <dependency> 2, e.g., your cool patch prioritization plugin</dependency>-->
<!-- ... -->
<!-- </dependencies> -->
</plugin>
The parts under the tag <congigutation>
are all optional; we have shown their
default values (or a description of the general form of their values) in comments.
The part for dependencies is also optional. Obviously, if the repair process does
not take advantage of any external library, e.g. a patch generation/prioritization
plugin, there will be no need for a dependency. We stress that the XML snippet
shown above is just a template, and the users will need to specialize values and
get rid of those descriptions and/or ellipsis marks.
In what follows, we explain each group of options starting from top.
The current version of PRF allows the user to fine-tune the profiler via 4 set of
parameters. Using <flOptions>
the user specifies the level of granularity of fault
localization or turn it off altogether. This parameter takes values OFF
for no
fault localization (the default option), CLASS_LEVEL
for class-level fault
localization, METHOD_LEVEL
for method-level fault localization, and LINE_LEVEL
for source line level fault localization. Fault localization is off by default.
It is worth mentioning that the more fine-grained class-level fault localization,
as opposed to file-level fault localization, is not popular in fault localization
literature. However, most of the times these two coincide and we one can always
aggregate class-level suspiciousness values to calculate file-level fault localization
information1️⃣.
Using the parameter <flStrategy>
the user can specify the fault localization
formula to be used for calculating suspiciousness values for each program elements.
In the current version of PRF, this parameter, takes values OCHIAI
and TARANTULA
.
By default, the value OCHIAI
will be used, i.e. PRF will calculate Ochiai suspiciousness
values for each program element. APR prototype developers can extend PRF with more
suspiciousness calculation formulae. Please refer to our JavaDoc shipped with the
source code for more details. It is worth noting that this parameter will make sense
only when fault localization is not off.
Using the parameter <cgOptions>
the user can choose the call-graph analysis they
desire. By default, this parameter is set to OFF
meaning that no call-graph analysis
shall take place. The parameter can take other values, e.g., DYNAMIC
(to do dynamic analysis), CHA
(to do static class hierarchy analysis), RTA
(tom do static rapid type analysis), or ZERO_CFA
(to do static 0-CFA analysis).
Please note that the values CHA
, RTA
, and ZERO_CFA
will make PRF to fork a
thread alongside the profiler to do the static analysis. In case DYNAMIC
is used,
dynamic call-graph analysis shall be done during profiling.
Using <testCoverage>
the user can determine whether any coverage information should
be passed to other components. If true, and fault localization is off, line-level
coverage shall be computed. If this parameter is true or fault localization is activated,
the level of granularity for the coverage information shall be dictated by the parameter
<flOptions>
.
By default, profiler shall infer failing test cases. However, sometimes, due to configuration
issues (that happen a lot in case of Defects4J programs), the inferred results might not
be accurate. Therefore, the users manually list the set of test cases that they expect
to fail. Please note that PRF expects the fully qualified name of test class followed by
a separator, which could be ::
, :
, or simply .
, and the name of the test case.
By default, dummy-patch-generator-plugin
shall be used during patch generation phase.
This plugin does nothing but looking for the patches under the base directory of the
target project. The plugin expects the patches to be located under a directory named
patches-pool
. This directory is expected to contain a subdirectory for ech patch.
class files for each patch should reside inside those subdirectories. In case the
subdirectories contain source code for the patches as well, upon reporting the fix
report, the diff
of the patched file and their corresponding original file shall
be displayed.
The user can pick a patch generation plugin using the parameter <patchGenerationPlugin>
.
Under the tag for this parameter, it is necessary to specify the name of the desired
patch generation plugin (if left unspecified, dummy-patch-generator-plugin
shall be
used). After naming a patch generation plugin, depending on the plugin, the user might
want to pass one or more arguments to the plugin. Arguments are specified using a
key-value format, written in <key>value</key>
form under the tag <parameters>
under
<patchGenerationPlugin>
. The user might pass arbitrary number of parameters to the
plugin. For example, in order to run CapGen patch generation plugin on the bug Chart-1
from Defects4J, I had to add the plugin as follows.
<patchGenerationPlugin>
<name>capgen</name>
<parameters>
<launcherJDKHomeDirectory>/Library/Java/JavaVirtualMachines/jdk1.8.0_231.jdk/Contents/Home</launcherJDKHomeDirectory>
<subject>Chart</subject>
<bugId>1</bugId>
</parameters>
</patchGenerationPlugin>
Please note that if the user wishes to use a custom patch generation plugin, they want
to put the plugin on the classpath by specifying a dependency after closing
<configuration>
tag. For example, for CapGen patch generation plugin we have to
use the following dependency specification.
<dependencies>
<dependency>
<groupId>edu.utdallas</groupId>
<artifactId>capgen-prf-plugin</artifactId>
<version>1.0-SNAPSHOT</version>
</dependency>
</dependencies>
Patch validator is a subsystem of PRF that given a pool of generated patches runs
the existing test suite against each patch to find the patches that pass all the
test cases, a.k.a. plausible patches. In order to do it efficiently, and to use the
full power of modern microprocessors, PRF validates patches in parallel using a
work-stealing algorithm. The degree of parallelism can be specified using the
parameter <parallelism>
. By default, this value is 0, meaning that all the available
CPU cores shall be used to validates patches parallely. Any value less than 0 or
greater than or equal the number of available CPU cores have the same effect.
The user can request for sequential patch validation by setting <parallelism>
to 1 or opt for lower degree of parallelism.
As we have mentioned in our paper, patching sometimes might create infinite loops.
In theory there is no way to know that upfront, but by setting a timeout for test
execution, one is more likely to identify and filter out such patches. To that end,
PRF receives two arguments from the user: <timeoutConstant>
and <timeoutPercent>
.
The timeout for a test case t
is calculated by <timeoutConstant> + (1 + <timeoutPercent>) * t(T)
,
where T(t)
denotes the original execution time for the test case (measured in the
profiling phase). Default values for <timeoutConstant>
and <timeoutPercent>
are
5000 and 0.5, respectively. Therefore, a test case taking more than 1.5 times of its
original execution time plus 5 seconds, is deemed timed out. A patched with at least
one timed out test is deemed timed out and shall not be reported to the user.
Fix report generator, as the name suggests, is responsible for generating fix reports. This is nothing but showing the list of plausible patches (with enough information to enable the users to apply the patches) so that the user can find adopt the genuine fix for the bug among the plausible patches.
By default, PRF uses dummy-patch-prioritization-plugin
which simply puts the
patches in an arbitrary order. However, the users can have their own patch
prioritization plugin. Through the parameter <patchPrioritizationPlugin>
, the
users can specify the name of a patch generation plugin and pass parameters to
the plugin. This is quite similar to what we have in the case of picking a custom
patch generation plugin, and we don't feel it is necessary to repeat explaining the
procedure here.
PRF uses regular expressions to identify the application classes of the target
project during profiling, patch validation, and fix report generation.
The user can add arbitrarily many regular expressions to target as many application
classes as they desire. By default, ${project.groupId}.*
shall be used
as the only regular expression in <targetClasses>
. The user can exclude some
application classes with the help of <excludeClasses>
, thereby achieving desired
precision in specifying application classes.
Last but not least, since test cases of some large and complex projects might be demanding, we have provided the user with a mechanism through which they can specify the maximum amount of allotted heap space (and/or permanent generation space, if applicable) for the child profiler or validator processes.
In order to get a feeling of how PRF operates with a patch generation plugin installed on it, we have shipped this version of PRF with a patch generation plugin for the recent APR tool CapGen. CapGen is a source code level APR system that is able to fix 22 bugs from the well-known bug database Defects4J. We have included two sample bugs in this repository but instructions for other Maven-based projects are also given. Last but not least, we have uploaded a demo video on YouTube demonstrating all these steps.
JFree Chart is a Maven-based real-world Java project with more than 96K LoC and 2,200 unit tests. Chart-1 is one of the 26 bugs of this project in Defects4J bug database, and it is one of the 22 Defects4J bugs that CapGen fixes. CapGen generates 138 candidate patches for this bug among which one of them passes all the test cases and that particular patch happens to be a genuine fix for the bug.
After installing PRF Maven plugin, and CapGen patch generation plugin on your local repository, as described in the previous section, please follow the instructions given below to have PRF fix the bug using its CapGen patch generation plugin.
It is worth noting that if you want to apply PRF with CapGen plugin
on it to a Maven-based projects other than those stated in their original
paper, you want to
have the tool Understand installed on
your computer and configured properly. CapGen uses the tool to generate
fix ingredients used during repair. Furthermore, due to the specific
design on CapGen, the name of the folder containing the buggy project
should be of the form [ProjectName]_[BugId]_buggy
where [ProjectName]
denotes a project name in Defects4J (namely, Chart, Lang, Math, etc.) and
[BugId]
is a number which depending on the project can be any number greater
than or equal 1 (e.g., for Chart this [BugId]
should be between 1 and 26,
while for Math, this could be between 1 and 106). For example, the folder
containing the bug Chart-1 from Defects4J should be named Chart_1_buggy
.
Since we did not have access to the source code of CapGen, we could not
obviate this peculiarity.
Step 1: Navigate to the bug project folder, adjust the POM file, build the project, and invoke PRF.
Please note that in order to be able to invoke CapGen, you will need to make sure
that you have adjusted the POM file for the project correctly so that our CapGen
plugin will know the location of JDK 1.8 home directory. To do so, after navigating
to the example bug project, you can open the file pom.xml
via your favorite editor and
search for the term launcherJDKHomeDirectory
. You will need to update the value
of the corresponding tag so that it denotes the location of JDK 1.8 home directory
on your computer.
cd examples
cd Chart_1_buggy
# Please make sure you have adjusted the POM file as described above
mvn clean test -DskipTests -Dhttps.protocols=TLSv1.2
mvn edu.utdallas:prf-maven-plugin:run -Dhttps.protocols=TLSv1.2
After invoking the commands andseeing (green) BUILD SUCCESS
message due to the
last command, you will be able to review the fix results.
Step 2: Fix report for the current version of PRF appears directly on your
screen right above the BUILD SUCCESS
message. However, you can always redirect
the output of the tool to a file, if you desire so. The fix report for Chart-1 looks
something like the following on my computer.
=====================================
PRF Fix Report
=====================================
1.
File: .../prf/examples/Chart_1_buggy/source/org/jfree/chart/renderer/category/AbstractCategoryItemRenderer.java
Patch:
1797c1797
< if (dataset == null) {
---
> if (dataset != null) {
---------------------------------------------
Apache Commons Math is another Maven-based real-world Java code base with 106 bugs in Defects4J bug database; CapGen is able to fix 13 of them. Math-65 is one bugs fixable by CapGen, and it is due to its size that we included it among our examples. The steps needed to invoke PRF on this bug is more or less the same as in Example 1.
You will need to navigate to the bug project folder, adjust the POM file, and invoke PRF.
cd examples
cd Math_65_buggy
# Please make sure you have configured the POM file as described in the previous example
mvn clean test -DskipTests -Dhttps.protocols=TLSv1.2
mvn edu.utdallas:prf-maven-plugin:run -Dhttps.protocols=TLSv1.2
After seeing (green) BUILD SUCCESS
message, due to the last command,
you will be able to review the fix report that appears above the
BUILD SUCCESS
message. In my case, the fix report looks like the following.
=====================================
PRF Fix Report
=====================================
1.
File: .../prf/examples/Math_65_buggy/src/main/java/org/apache/commons/math/optimization/general/AbstractLeastSquaresOptimizer.java
Patch:
258c258
< chiSquare += ((residualsWeights[i]) * residual) * residual;
---
> chiSquare += residual * residual / residualsWeights[i];
---------------------------------------------
You can watch our demo YouTube video showing how you can setup PRF and apply it on one of the example projects.
Currently, PRF is extendible via two kinds of plugins, namely patch generation plugin and patch prioritization plugin. Developers of APR prototypes, can construct their own patch generation/prioritization plugin by implementing the two interfaces provided by the PRF framework. In this section, we only explain how the developers can make their own patch generation plugin; making a custom patch prioritization plugin is similar, and we omit the details.
The first step is to import the companion library of PRF into the plugin's project. After installing the library will be available at the Maven local repo. Thus, if the developer uses Maven to make their plugin, they can use the following XML snippet to import the library into their project.
<dependencies>
<!-- ... -->
<dependency>
<groupId>edu.utdallas</groupId>
<artifactId>prf-core</artifactId>
<version>1.0-SNAPSHOT</version>
</dependency>
</dependencies>
In case the developer uses Gradle as their build system, they can use the following command in their build script (provided that Gradle is configured to look into the correct repository).
implementation 'edu.utdallas:prf-core:1.0-SNAPSHOT'
After importing the library, implementing the interface PatchGenerationPlugin
is
all the developer needs to do. Implementing the interface is more or less
application-specific and we cannot say much about it here. One thing that we want to
mention at this point is to defining the name of the plugin. Any class implementing
PatchGenerationPlugin
has to implement a method named name
. This method receives
void and returns a String
. The string value is intended to be the name of the plugin.
This value is compared (in a case-insensitive) manner with the name provided in
<patchGenerationPlugin>
in the target POM file. Another method with the same signature
is description
. This method is intended to return a short description of the plugin.
This description is to be displayed on the screen by PRF. We have documented the intent
of the classes and method using JavaDoc, and we suggest reviewing the documentation
in the code.
PRF is a pure Java program, so it is platform independent. Be that as it may, the example programs that we use, i.e. Defects4J, and the CapGen patch generation plugin have certain peculiarities that call for a specific versions of JDK or Maven. Here is the system configuration you might want to have in order to be able to apply PRF with CapGen plugin on example Defects4J bugs.
- JDK: Oracle Java SE Development Kit 7u80 (recommended for Defects4J) and JDK 8u171 or higher needed for running the core part of CapGen patch generation plugin.
- Build System: Maven 3+
- Version Control System: Git 2.17.1+
- OS: Ubuntu Linux or Mac OS X
- RAM: 16+ GB
- Disk Space: 200+ MB free space
1️⃣ J. Sohn and S. Yoo, "FLUCCS: using code and change metrics to improve fault localization," in ISSTA'17.