PostgreSQL's review bottleneck, episode 3
As PostgreSQL gears up for its 8.4 release, contributors to the popular database project are debating on the pgsql-hackers mailing list how to handle two large patches. The immediate issue is whether to include the patches in the 8.4 or 8.5 release, but the larger issue is a review system that suffers from a shortage of peer reviewers and that has improved only marginally in the last two releases, despite concerns raised in 2007 and 2006. The current discussion offers a snapshot of the growing pains that large free software projects find themselves increasingly facing.
PostgreSQL development is based upon a series of CommitFests — periods in which patches are accepted and reviewed that are followed by development releases. Between CommitFests, no new patches are accepted. The trouble is that two patches in particular, SE-PostgreSQL, which adds Security Enhanced Linux's security model, and Hot Standby, which allows queries on databases during archival recoveries, have not been fully reviewed, and have prolonged the current CommitFest. Although developer Robert Haas suggests that at least three other patches may also be delaying the release cycle, most of the discussion has centered on SE-PostgreSQL and Hot Standby.
Part of the debate over the two patches concerns exactly what to do with
them. As Bernd Helmle points out, with the
current CommitFest already over three and a half months old, and the next
one not due until May, "That means we're essentially closed to new
patches for six months, which is a really long time. To put it another way,
for every week the core team spends reworking the existing patches, it will
be another week before someone can get feedback on any new patches
submitted now.
"
Moreover, core team member Tom Lane says, prolonging the current CommitFest until the patches are ready means that the 8.4 release will not happen until the fall of 2009, rather than in late spring. Such a release date would mean that the next release will take almost a year to produce, which is unacceptably long in most contributors' views.
Given this situation, Lane says, community has to decide whether to delay the release of each of these patches to the 8.5 release, delay long enough for the patches to be properly handled, or else include only a limited feature set for each of them as a compromise solution.
In the case of SE-PostgreSQL, several contributors seem open to dropping it
altogether. "To be brutally honest, I don't care about the feature at
all: the only thing I ever do with SE Linux is turn it off,
" Haas says,
and one or two others agree.
Unsurprisingly, this attitude sits poorly with KaiGai Kohei, the developer of
SE-PostgreSQL. Demanding a rationale for the proposed rejection of his
patch, Kohei notes that, given the growing popularity of cloud computing,
database security is becoming increasingly important. "If we can
include these features in a timely fashion,
" Kohei writes,
"it makes PostgreSQL more attractive
".
Kohei is supported by Dave Page, who is concerned
that delaying or rejecting SE-PostgreSQL, which is sponsored by the Exploratory IT
Human Resources Project "will send a fine message to those
companies that have sponsored development work — that we will
arbitrarily reject large patches that have been worked on following the
procedures that we require.
" Page is concerned that "we
will rapidly find that no company wants to sponsor features for PostgreSQL
in the future.
"
In the same thread that these scheduling and content issues are being discussed, PostgreSQL contributors are also debating the reasons that the review system is not working as well as it should. Heikki Linnakangas suggests that the situation was to some extent inevitable because "big patches simply take a long time to mature."
Others suggested that the problem was that final approval of all patches
must be given by the core team, and the work load has simply become too
large. As Helmle says,
"core developers are too busy with reviewing stuff during the
CommitFest. Because of this, it's really hard to get the necessary time of
somebody who is able to evaluate the architecture of a new feature and
(more important) its side-effects on the whole system.
" Under these
circumstances, Helmle questions whether delaying the acceptance of features will do anything to improve the release cycle.
A large part of the discussion of the review system centered on possible
improvements to it. Haas raises the possibility
of adding a "FeedbackFest
" at the end of CommitFests to focus
the entire project on patch reviews, and also a policy that, once a patch
was rejected, it would be declared dead if a corrected version was not
resubmitted within two weeks. In much the same vein, Jaime Casanova
suggested a policy under which large patches submitted late in a CommitFest
would not be guaranteed a review. "Maybe that will [encourage]
authors to send patches more often and more early,
" Casanova writes.
Another possibility, raised
by Riggs is to overlap releases, so that submitters of rejected patches
could move their contributions to a later release and know when it was
likely to be included. However, this idea was quickly shot
down by Lane, who points out that "key committers are
overstressed already.
" In fact, in Lane's view, overlapping releases
would only add to the problem because, "everyone will find it more
interesting/fun to work on new patches instead
" of
reviewing. "The current system at least gives non-committers
developers some motivation to help with that stuff, because they know their
patches won't be looked at until beta is over.
"
Much of the discussion about solutions was about a system that would
automatically send reminders about the status of patches — a solution
that everyone in the discussion seemed to agree would be more efficient
than the present reliance on a wiki page for each CommitFest. Josh Berkus,
who is co-lead for the present CommitFest agreed, writing
that "My inability to systematically send reminder e-mails to
submitters and reviewers — or, for that matter, even track when they
were assigned or last updated — has been a significant drag on the
effectiveness of the CommitFests. Some patches stalled, and I missed
them.
"
Possible solutions for notification included Patchwork and Review Board. However, as the
merits of different solutions were debated, Berkus notes
that "our review/commit process is so peculiar to our project that
using *any* prebuilt solution would require us to change our process to
support the tool. And I can't imagine this group doing that.
" The
possibility of writing a custom application was raised
by Haas, but no decisions were made to start such a project.
At this point, discussion petered out into a discussion of what SE-PostgreSQL and Hot Standby required in order to be included in the 8.4 release. One possible stumbling block for SE-PostgreSQL may have been removed when Kohei explained that the security policy of the patch, which no project member apparently felt competent to review, didn't need review because it had already been tested by SE Linux, the upstream project.
A decision on what to do with the two patches should be made within a week, according to Berkus.
Until then, what is interesting about the discussion to outsiders is how it shows one project attempting to deal with growth. From the discussion, it seems that PostgreSQL has outgrown policies and procedures that once served it well, and is still adjusting to the change. Like many other free software projects these days, PostgreSQL is facing the challenge of its own success.
Index entries for this article | |
---|---|
GuestArticles | Byfield, Bruce |
Posted Feb 5, 2009 3:55 UTC (Thu)
by jamesh (guest, #1159)
[Link]
If a developer can get a patch to the maturity of the SE-PostgreSQL patch and have no idea whether the patch will be merged indicates a problem. If there are problems with the approach of the patch (or if the feature is not considered appropriate), they should be identified as early as possible. Presumably the SE-PostgreSQL guys had some idea of what the implementation would look like, so reviewing that design might have helped here (and would probably be easier than a full code review).
For smaller patches, it probably help to review early rather than waiting for a CodeFest. It might also make sense to merge them if it isn't during a freeze leading up to a release. The longer the submission cycle, the less likely that a contributor will stick around.
Posted Feb 5, 2009 6:00 UTC (Thu)
by cpeterso (guest, #305)
[Link]
Posted Feb 5, 2009 9:11 UTC (Thu)
by ber (subscriber, #2142)
[Link] (1 responses)
Posted Feb 5, 2009 9:21 UTC (Thu)
by dlang (guest, #313)
[Link]
PostgreSQL's review bottleneck, episode 3
Linux developers often ask people to split large patches into many (sometimes many, many) small patches that can be merged incrementally. Developers can more easily review and accept/reject small patches. In comparison, PostgreSQL sounds like it uses a very "Big Bang" integration process compared to Linux's "Katamari Damacy" development process. :)
PostgreSQL's review bottleneck, episode 3
Sounds like they are in need more reviewers - with full professionals
being best. This brings up the question of how to pay the
salaries. Free Software communities around a product of solution like
PostgreSQL (strictly speaking this is not a "project")
are in need of funding at least for coordination and janitory
work. My proposal is that the ones profiting from PostgreSQL are pay for
it (volunteeringly of course).
PostgreSQL's review bottleneck, episode 3
PostgreSQL's review bottleneck, episode 3