Removing uninitialized_var()

By Jonathan Corbet
December 19, 2012

Compiler warnings can be life savers for kernel developers; often a well-placed warning will help to avert a bug that, otherwise, could have been painful to track down. But developers quickly tire of warnings that appear when the relevant code is, in fact, correct. It does not take too many spurious warnings to cause a developer to tune out compiler warnings altogether. So developers will often try to suppress warnings for correct code — a practice which can have undesirable effects in the longer term.

GCC will, when run with suitable options, emit a warning if it believes that the value of a variable might be used before that variable is set. This warning is based on the compiler's analysis of the paths through a function; if it believes it can find a path where the variable is not initialized, an "uninitialized variable" warning will result. The problem is that the compiler is not always smart enough to know that a specific path will never be taken. As a simple example, consider uhid_hid_get_raw() in drivers/hid/uhid.c:

    size_t len;
    /* ... */
    return ret ? ret : len;

A look at the surrounding code makes it clear that, in the case where ret is set to zero, the value of len has been set accordingly. But the compiler is unable to figure that out and warns that len might be used in an uninitialized state.

The obvious response to such a warning is to simply change the declaration of len so that the variable starts out initialized:

    size_t len = 0;

Over the years, though, this practice has been discouraged on the kernel mailing lists. The unneeded initialization results in larger code and a (slightly) longer run time. And, besides, it is most irritating to be pushed around by a compiler that is not smart enough to figure out that the code is correct; Real Kernel Hackers don't put up with that kind of thing. So, instead, a special macro was added to the kernel:

    /* <linux/compiler-gcc.h> */
    #define uninitialized_var(x) x = x

It is used in declarations in this manner:

    size_t uninitialized_var(len);

This macro has the effect of suppressing the warning, but it doesn't cause any additional code to be generated by the compiler. This macro has proved reasonably popular; a quick grep shows over 280 instances in the 3.7+ mainline repository. That popularity is not surprising: it allows a kernel developer to turn off a spurious warning and to document the fact that the use of the variable is, indeed, correct.

Unfortunately, there are a couple of problems with uninitialized_var(). One is that, at the same time that it is fooling GCC into thinking that the variable is initialized, it is also fooling it into thinking that the variable is used. If the variable is never referenced again, the compiler will still not issue an "unused variable" warning. So, chances are, there are a number of excess variables that have not been removed because nobody has noticed that they are not actually used. That is a minor irritation, but one could easily decide that it is tolerable if it were the only problem.

The other problem, of course, is that the compiler might just be right. During the 3.7 merge window, a patch was merged that moved some extended attribute handling code from the tmpfs filesystem into common code. In the process of moving that code, the developer noticed that one variable initialization could be removed, since, it seemed, it would pick up a value in any actual path through the function. GCC disagreed, issuing a warning, so, when this developer wrote a second patch to remove the initialization, he also suppressed the warning with uninitialized_var(). Unfortunately, GCC knew what it was talking about in this case; that code had just picked up a bug where, in a specific set of circumstances, an uninitialized value would be passed to kfree() with predictably pyrotechnic results. That bug had to be tracked down by other developers; it was fixed by David Rientjes on October 17. At that time, Hugh Dickins commented that it was a good example of how uninitialized_var() can go wrong.

And, of course, this kind of problem need not be there from the outset. The code for a given function might indeed be correct when uninitialized_var() is employed to silence a warning. Future changes could introduce a bug that the compiler would ordinarily warn about, except that the warning will have been suppressed. So, in a sense, every uninitialized_var() instance is a trap for the unwary.

That is why Linus threatened to remove it later in October, calling it "an abomination" and saying:

The thing is moronic. The whole thing is almost entirely due to compiler bugs (*stupid* gcc behavior), and we would have been better off with an explicit (unnecessary) initialization that at least doesn't cause random crashes etc if it turns out to be wrong.

In response, Ingo Molnar put together a patch removing uninitialized_var() outright. Every use is replaced with an actual initialization appropriate to the type of the variable in question. A special comment ("/* GCC */") is added as well to make the purpose of the initialization clear.

The patch was generally well received and appears to be ready to go. In October, Ingo said that he would keep it out of linux-next (to avoid creating countless merge conflicts), but would post it for merging right at the end of the 3.8 merge window. As of this writing, that posting has not occurred, but there have been no signs that the plans have changed. So, most likely, the 3.8 kernel will lack the uninitialized_var() macro and developers will have to silence warnings the old-fashioned (and obviously correct) way.

Index entries for this article
Kernel	Development model/Kernel quality
Kernel	uninitialized_var()

Removing uninitialized_var()

Posted Dec 20, 2012 2:47 UTC (Thu) by nevets (subscriber, #11875) [Link] (2 responses)

I remember when uninitialized_var() was introduced. I actually liked the idea. IIRC, the idea was you could turn it off in one place and recheck all the locations that it was used. Even though I liked it, I always avoided using it. I rather be reminded of things that may be uninitialized.

My experience was that, depending on which version of gcc you used, it may or may not complain about a possible uninitialized var. I had some code that gcc 4.6+ had no issue with, but 4.5.x did. I would constantly get a patch to initialize the variable (a pointer) to NULL. I refused each patch simply because 4.6 didn't complain, and more importantly, if the variable was used outside of the expected path, a NULL pointer would crash it. If I later changed the code where the variable was used without the proper initialization, the NULL was no better than anything else, and it would hide the bug just as much as uninitialized_var() would.

I'm a bit weary of default initialization. You need to look at the code to determine if what you initialized the variable to is any better than it being random.

Removing uninitialized_var()

Posted Dec 20, 2012 14:18 UTC (Thu) by clugstj (subscriber, #4020) [Link] (1 responses)

NULL is better than "anything else". The reason to set a variable to a consistent invalid value is that then the code fails consistently. Consistent bugs are much easier to track down.

Removing uninitialized_var()

Posted Dec 21, 2012 5:29 UTC (Fri) by nevets (subscriber, #11875) [Link]

No, NULL may be better than uninitialized_var(), but it is not better than hiding gcc from warning about it.

Yes, a NULL pointer is easy to debug after a crash, but if it requires a tight race to get to a point where NULL will crash, that means you wont detect the bug until the crash happens. If that crash happens while on a production system, it's still a major issue.

My point is that uninitialized_var() isn't very good because it may hide bugs, but so is blindly initializing something, even with NULL. It's still hiding a bug.

Removing uninitialized_var()

Posted Dec 20, 2012 6:50 UTC (Thu) by error27 (subscriber, #8346) [Link]

I've always thought uninitialize_var() macro was a bit ugly to look at.

The problem with setting pointers to NULL is that it silences the GCC warning, but it now triggers a "dereferencing NULL" warning in Smatch. Uninitialize_var() solved this problem nicely. I wish there were some macro that Smatch could understand.

struct foo *p = SILENCE_GCC;

Removing uninitialized_var()

Posted Dec 20, 2012 12:11 UTC (Thu) by mjthayer (guest, #39183) [Link] (8 responses)

I wonder whether it is quite fair to the gcc people to call this a "compiler [bug] (*stupid* gcc behavior)"? Not that I know very much about compiler internals, but instinctively I wouldn't expect getting this sort of warning right to be an entirely trivial task.

Removing uninitialized_var()

Posted Dec 20, 2012 14:48 UTC (Thu) by kpfleming (subscriber, #23250) [Link]

The primarily complication is that developers (and users) use widely varying versions of GCC. Over time, of course, GCC gets better and better at avoiding false warnings, and developers tend to use the latest-and-greatest version on their systems. Users, though, like to stick with long-term-support distributions that have older compiler versions (some still using GCC 4.3.x), but also want their software to build without warnings. When the user reports a bug about a compiler warning, the developer frequently responds that it doesn't happen with the latest GCC, and wants to close it.

Solving the user's problem ends up requiring ugly workarounds that change the code (typically in non-functional ways) just to silence compiler warnings. Given that, it's quite reasonable to classify a compiler warning that is generated from code that cannot be trusted to have exhaustive proof of the potential failure as a 'bug' in the compiler.

Removing uninitialized_var()

Posted Dec 20, 2012 15:03 UTC (Thu) by nix (subscriber, #2304) [Link] (2 responses)

It can, of course, never be right in all circumstances (that would require a solution to the halting problem). It can sometimes be sure, and says as much ('is used uninitialized'). Often, it's not sure. What's odious is that there are lots of apparently simple cases that GCC can't yet handle, e.g.

if (foo)
/* initialize a */

/* later */

if (foo) /* value not affected by code above */
/* use a */

is a particularly notorious case.

Removing uninitialized_var()

Posted Dec 20, 2012 19:38 UTC (Thu) by zlynx (guest, #2285) [Link]

I recently read something discussing common threading pitfalls and code similar to yours came up.

It is kind of amazing how many non-intuitive optimizations can be made on code like that.

For example, it might be rewritten to something like:

initialize a
do stuff
use a
if (!foo)
undo using a

or rewritten into two function blocks, one for if (foo) and one for if (!foo).

or the initialization of a might be moved down into the other if (foo) block.

So anyway, it is entirely possible that after GCC swizzles the code around it cannot tell anymore. It might have actually used a without even looking at foo, intending to undo it later.

Removing uninitialized_var()

Posted Dec 28, 2012 20:38 UTC (Fri) by stevenb (guest, #11536) [Link]

That case is handled just fine, at least AFAICT:

$ cat t.c
extern int foo_p (int *, int);
extern int foo (int, int);
extern void bar (void);

int foo_p (int *x, int y)
{
int a;
if (*x) a = y;
bar ();
if (*x) return a;
bar ();
return -1;
}

int foo (int x, int y)
{
int a;
if (x) a = y;
bar ();
if (x) return a;
bar ();
return -1;
}
$
$ gcc-4.4 -S -O2 -W -Wall -Wextra t.c
t.c: In function ‘foo_p’:
t.c:7: warning: ‘a’ may be used uninitialized in this function
$
$ gcc-4.6 -S -O2 -W -Wall -Wextra t.c
t.c: In function ‘foo_p’:
t.c:7:7: warning: ‘a’ may be used uninitialized in this function [-Wuninitialized]
$
$ ./xgcc -B. -S -O2 -W -Wall -Wextra t.c --version
xgcc (GCC) 4.8.0 20121226 (experimental) [trunk revision 194725]
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ ./xgcc -B. -S -O2 -W -Wall -Wextra t.c
t.c: In function 'foo_p':
t.c:7:7: warning: 'a' may be used uninitialized in this function [-Wmaybe-uninitialized]
int a;
^
$

In foo_p, the compiler cannot know whether bar will change the value at *x and the warning is valid. In foo, which is your example, there is no warning. In general, there will be no warning if the compiler can prove that the code in your "/* later */" doesn't change the result of the tested conditional. This is implemented in tree-ssa-uninit.c.

Still, it's true that there will be false positives, or missed warnings. Proving whether a variable is used uninitialized is not an easy problem.

Removing uninitialized_var()

Posted Jan 4, 2013 10:16 UTC (Fri) by mlopezibanez (guest, #66088) [Link]

There is indeed a limit to what GCC can guess correctly, but there are also quite a number of bugs and deficiencies in the warning machinery. See http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings for a list of problems (the page may be slightly outdated by now) and http://gcc.gnu.org/PR24639 Solving these problems would require substantial work. Unfortunately, there is no enough people working on GCC to even start such work in the near future.

I wonder if kernel devs would be less frustrated with gcc if they tried to fix GCC bugs rather than work-around them. Well, perhaps they will become even more frustrated. ;-)

Removing uninitialized_var()

Posted Jan 4, 2013 12:05 UTC (Fri) by heijo (guest, #88363) [Link] (2 responses)

It's impossible (http://en.wikipedia.org/wiki/Rice_theorem).

Removing uninitialized_var()

Posted Jan 4, 2013 23:24 UTC (Fri) by nix (subscriber, #2304) [Link] (1 responses)

It's impossible to get it right *in the general case*. You can get it right in an arbitrarily large percentage of special cases (and indeed, GCC often does, when it says 'is used uninitialized' rather than 'may be used uninitialized').

Removing uninitialized_var()

Posted Jan 13, 2013 0:39 UTC (Sun) by oak (guest, #2786) [Link]

GCC not detecting all the cases where variable might be uninitialized (and it requiring optimizations enabled to find what it can find) is one thing, but it's not what annoys people. Wrong warnings are what annoy them.

As noted, GCC does these checks using information gathered with the optimization passes. A bug in GCC 4.5 and earlier was that it did the warning before all optimization passes had been done and therefore could give warnings for code paths that weren't relevant (= dead, would never happen). This was (at least mostly) fixed in GCC 4.6 and there is/are bugs about it in the GCC bugzilla.

Even if kernel people wouldn't fix GCC bugs instead of kludging kernel code, at least they could file bugs against GCC, or if there's already a bug on the issue, add a pointer to it and comment in code about which GCC version fixes that bug.

(On quick glance of these kind of bugs in GCC bug tracker, most of them seem to be pretty old and might be already fixed with GCC 4.6, I think they need a bit of a re-testing & cleanup.)

Removing uninitialized_var()

Posted Dec 21, 2012 2:22 UTC (Fri) by giraffedata (guest, #1954) [Link] (6 responses)

The reason I don't initialize a variable to avoid the false positive warning is that it makes the code harder to read. It suggests that the variable's value is meaningful in places where it isn't. That requires more time and short term memory for the reader to understand the code.

I just disable the warning (by compiler option). I do appreciate it finding my used-before-set variables, but it isn't worth the false positives or dirtying of the code.

Removing uninitialized_var()

Posted Dec 21, 2012 5:32 UTC (Fri) by dvdeug (guest, #10998) [Link] (5 responses)

Anytime this comes up, the code is hard to read. Variables that don't have valid values in chunks of code are problematic, and GCC wouldn't issue this error in a case where there's no branching or other stuff to confuse the issue. If you can't rearrange it so that it's clear to GCC that it's set before used, whether you initialize it isn't going to make it significantly more unclear to readers.

Removing uninitialized_var()

Posted Dec 21, 2012 9:02 UTC (Fri) by hummassa (guest, #307) [Link] (4 responses)

My response was going to be similar; "if you managed to confuse the compiler, you lost your human readers a long time ago."

Removing uninitialized_var()

Posted Dec 21, 2012 17:58 UTC (Fri) by bronson (subscriber, #4806) [Link] (3 responses)

Yes, well said! That's worth hanging on the wall.

Removing uninitialized_var()

Posted Dec 21, 2012 20:14 UTC (Fri) by giraffedata (guest, #1954) [Link] (2 responses)

As the article points out, it's amazingly easy to confuse the compiler. In the cases where I've had to deal with this, there was clearly no practical alternative way to write the code. The only practical ways to shut the compiler up were to give a variable a fake value or disable the warning class.

I do agree that code with variables that are sometimes meaningless is harder to read than code without. I don't agree that when such meaningless variables exist, the code is equally hard to read when you assign values to them.

Here's the analogy: You're filling in an online medical history form. Question 1: sex. Question 2: if female, when was your last menstruation? No doctor will be the least bit concerned when he sees a man leave Question 2 blank. He might be confused if he sees a date in there for a man. In fact, he would be well justified in thinking Question 1 might be an error in that case. GCC is the automatic field checker that notices Question 2 is blank and makes the man put a date in there before it will accept the form, because it isn't smart enough to know what menstruation is and simply insists that every field be filled in, to avoid accidental omissions.

Removing uninitialized_var()

Posted Dec 21, 2012 21:22 UTC (Fri) by hummassa (guest, #307) [Link] (1 responses)

My point is that, as easy as it is to confuse gcc or clang, confusong the programmer who has to read the code after you is even easier! Let is substitute your question #2 for "did you ever have had eclampsia?" And now the comparison is fair. 50 percent of male respondents will not know what it is and many will answer wrongly.

Removing uninitialized_var()

Posted Dec 21, 2012 23:13 UTC (Fri) by apoelstra (subscriber, #75205) [Link]

> My point is that, as easy as it is to confuse gcc or clang, confusong the programmer who has to read the code after you is even easier!

This is not always true. For a small-enough function (or small-enough block), a programmer can probably build a mental parse tree just as well as the compiler can -- and then, because he is a human gifted with all manner of high-level knowledge and context, he can easily see things that the compiler might miss.

And in my experience, "small enough" includes many non-trivial functions of 5-10 lines. Also in my experience, these are the sorts of functions that trigger uninitialized variable warnings.