Removing uninitialized_var()
GCC will, when run with suitable options, emit a warning if it believes that the value of a variable might be used before that variable is set. This warning is based on the compiler's analysis of the paths through a function; if it believes it can find a path where the variable is not initialized, an "uninitialized variable" warning will result. The problem is that the compiler is not always smart enough to know that a specific path will never be taken. As a simple example, consider uhid_hid_get_raw() in drivers/hid/uhid.c:
size_t len; /* ... */ return ret ? ret : len;
A look at the surrounding code makes it clear that, in the case where ret is set to zero, the value of len has been set accordingly. But the compiler is unable to figure that out and warns that len might be used in an uninitialized state.
The obvious response to such a warning is to simply change the declaration of len so that the variable starts out initialized:
size_t len = 0;
Over the years, though, this practice has been discouraged on the kernel mailing lists. The unneeded initialization results in larger code and a (slightly) longer run time. And, besides, it is most irritating to be pushed around by a compiler that is not smart enough to figure out that the code is correct; Real Kernel Hackers don't put up with that kind of thing. So, instead, a special macro was added to the kernel:
/* <linux/compiler-gcc.h> */ #define uninitialized_var(x) x = x
It is used in declarations in this manner:
size_t uninitialized_var(len);
This macro has the effect of suppressing the warning, but it doesn't cause any additional code to be generated by the compiler. This macro has proved reasonably popular; a quick grep shows over 280 instances in the 3.7+ mainline repository. That popularity is not surprising: it allows a kernel developer to turn off a spurious warning and to document the fact that the use of the variable is, indeed, correct.
Unfortunately, there are a couple of problems with uninitialized_var(). One is that, at the same time that it is fooling GCC into thinking that the variable is initialized, it is also fooling it into thinking that the variable is used. If the variable is never referenced again, the compiler will still not issue an "unused variable" warning. So, chances are, there are a number of excess variables that have not been removed because nobody has noticed that they are not actually used. That is a minor irritation, but one could easily decide that it is tolerable if it were the only problem.
The other problem, of course, is that the compiler might just be right. During the 3.7 merge window, a patch was merged that moved some extended attribute handling code from the tmpfs filesystem into common code. In the process of moving that code, the developer noticed that one variable initialization could be removed, since, it seemed, it would pick up a value in any actual path through the function. GCC disagreed, issuing a warning, so, when this developer wrote a second patch to remove the initialization, he also suppressed the warning with uninitialized_var(). Unfortunately, GCC knew what it was talking about in this case; that code had just picked up a bug where, in a specific set of circumstances, an uninitialized value would be passed to kfree() with predictably pyrotechnic results. That bug had to be tracked down by other developers; it was fixed by David Rientjes on October 17. At that time, Hugh Dickins commented that it was a good example of how uninitialized_var() can go wrong.
And, of course, this kind of problem need not be there from the outset. The code for a given function might indeed be correct when uninitialized_var() is employed to silence a warning. Future changes could introduce a bug that the compiler would ordinarily warn about, except that the warning will have been suppressed. So, in a sense, every uninitialized_var() instance is a trap for the unwary.
That is why Linus threatened to remove it
later in October, calling it "an abomination
" and saying:
In response, Ingo Molnar put together a
patch removing uninitialized_var() outright. Every use is
replaced with an actual initialization appropriate to the type of the
variable in question. A special comment
("/* GCC */
") is added as well to make the
purpose of the initialization clear.
The patch was generally well received and appears to be ready to go. In
October, Ingo said that he would keep it
out of linux-next (to avoid creating countless merge conflicts), but would
post it for merging right at the end of the 3.8 merge window. As of this
writing, that posting has not occurred, but there have been no signs that
the plans have changed. So, most likely, the 3.8 kernel will lack the
uninitialized_var() macro and developers will have to silence
warnings the old-fashioned (and obviously correct) way.
Index entries for this article | |
---|---|
Kernel | Development model/Kernel quality |
Kernel | uninitialized_var() |
Posted Dec 20, 2012 2:47 UTC (Thu)
by nevets (subscriber, #11875)
[Link] (2 responses)
My experience was that, depending on which version of gcc you used, it may or may not complain about a possible uninitialized var. I had some code that gcc 4.6+ had no issue with, but 4.5.x did. I would constantly get a patch to initialize the variable (a pointer) to NULL. I refused each patch simply because 4.6 didn't complain, and more importantly, if the variable was used outside of the expected path, a NULL pointer would crash it. If I later changed the code where the variable was used without the proper initialization, the NULL was no better than anything else, and it would hide the bug just as much as uninitialized_var() would.
I'm a bit weary of default initialization. You need to look at the code to determine if what you initialized the variable to is any better than it being random.
Posted Dec 20, 2012 14:18 UTC (Thu)
by clugstj (subscriber, #4020)
[Link] (1 responses)
Posted Dec 21, 2012 5:29 UTC (Fri)
by nevets (subscriber, #11875)
[Link]
Yes, a NULL pointer is easy to debug after a crash, but if it requires a tight race to get to a point where NULL will crash, that means you wont detect the bug until the crash happens. If that crash happens while on a production system, it's still a major issue.
My point is that uninitialized_var() isn't very good because it may hide bugs, but so is blindly initializing something, even with NULL. It's still hiding a bug.
Posted Dec 20, 2012 6:50 UTC (Thu)
by error27 (subscriber, #8346)
[Link]
The problem with setting pointers to NULL is that it silences the GCC warning, but it now triggers a "dereferencing NULL" warning in Smatch. Uninitialize_var() solved this problem nicely. I wish there were some macro that Smatch could understand.
struct foo *p = SILENCE_GCC;
Posted Dec 20, 2012 12:11 UTC (Thu)
by mjthayer (guest, #39183)
[Link] (8 responses)
Posted Dec 20, 2012 14:48 UTC (Thu)
by kpfleming (subscriber, #23250)
[Link]
Solving the user's problem ends up requiring ugly workarounds that change the code (typically in non-functional ways) just to silence compiler warnings. Given that, it's quite reasonable to classify a compiler warning that is generated from code that cannot be trusted to have exhaustive proof of the potential failure as a 'bug' in the compiler.
Posted Dec 20, 2012 15:03 UTC (Thu)
by nix (subscriber, #2304)
[Link] (2 responses)
if (foo)
/* later */
if (foo) /* value not affected by code above */
is a particularly notorious case.
Posted Dec 20, 2012 19:38 UTC (Thu)
by zlynx (guest, #2285)
[Link]
It is kind of amazing how many non-intuitive optimizations can be made on code like that.
For example, it might be rewritten to something like:
initialize a
or rewritten into two function blocks, one for if (foo) and one for if (!foo).
or the initialization of a might be moved down into the other if (foo) block.
So anyway, it is entirely possible that after GCC swizzles the code around it cannot tell anymore. It might have actually used a without even looking at foo, intending to undo it later.
Posted Dec 28, 2012 20:38 UTC (Fri)
by stevenb (guest, #11536)
[Link]
$ cat t.c
int foo_p (int *x, int y)
int foo (int x, int y)
$ ./xgcc -B. -S -O2 -W -Wall -Wextra t.c
In foo_p, the compiler cannot know whether bar will change the value at *x and the warning is valid. In foo, which is your example, there is no warning. In general, there will be no warning if the compiler can prove that the code in your "/* later */" doesn't change the result of the tested conditional. This is implemented in tree-ssa-uninit.c.
Still, it's true that there will be false positives, or missed warnings. Proving whether a variable is used uninitialized is not an easy problem.
Posted Jan 4, 2013 10:16 UTC (Fri)
by mlopezibanez (guest, #66088)
[Link]
I wonder if kernel devs would be less frustrated with gcc if they tried to fix GCC bugs rather than work-around them. Well, perhaps they will become even more frustrated. ;-)
Posted Jan 4, 2013 12:05 UTC (Fri)
by heijo (guest, #88363)
[Link] (2 responses)
Posted Jan 4, 2013 23:24 UTC (Fri)
by nix (subscriber, #2304)
[Link] (1 responses)
Posted Jan 13, 2013 0:39 UTC (Sun)
by oak (guest, #2786)
[Link]
As noted, GCC does these checks using information gathered with the optimization passes. A bug in GCC 4.5 and earlier was that it did the warning before all optimization passes had been done and therefore could give warnings for code paths that weren't relevant (= dead, would never happen). This was (at least mostly) fixed in GCC 4.6 and there is/are bugs about it in the GCC bugzilla.
Even if kernel people wouldn't fix GCC bugs instead of kludging kernel code, at least they could file bugs against GCC, or if there's already a bug on the issue, add a pointer to it and comment in code about which GCC version fixes that bug.
(On quick glance of these kind of bugs in GCC bug tracker, most of them seem to be pretty old and might be already fixed with GCC 4.6, I think they need a bit of a re-testing & cleanup.)
Posted Dec 21, 2012 2:22 UTC (Fri)
by giraffedata (guest, #1954)
[Link] (6 responses)
I just disable the warning (by compiler option). I do appreciate it finding my used-before-set variables, but it isn't worth the false positives or dirtying of the code.
Posted Dec 21, 2012 5:32 UTC (Fri)
by dvdeug (guest, #10998)
[Link] (5 responses)
Posted Dec 21, 2012 9:02 UTC (Fri)
by hummassa (guest, #307)
[Link] (4 responses)
Posted Dec 21, 2012 17:58 UTC (Fri)
by bronson (subscriber, #4806)
[Link] (3 responses)
Posted Dec 21, 2012 20:14 UTC (Fri)
by giraffedata (guest, #1954)
[Link] (2 responses)
I do agree that code with variables that are sometimes meaningless is harder to read than code without. I don't agree that when such meaningless variables exist, the code is equally hard to read when you assign values to them.
Here's the analogy: You're filling in an online medical history form. Question 1: sex. Question 2: if female, when was your last menstruation? No doctor will be the least bit concerned when he sees a man leave Question 2 blank. He might be confused if he sees a date in there for a man. In fact, he would be well justified in thinking Question 1 might be an error in that case. GCC is the automatic field checker that notices Question 2 is blank and makes the man put a date in there before it will accept the form, because it isn't smart enough to know what menstruation is and simply insists that every field be filled in, to avoid accidental omissions.
Posted Dec 21, 2012 21:22 UTC (Fri)
by hummassa (guest, #307)
[Link] (1 responses)
Posted Dec 21, 2012 23:13 UTC (Fri)
by apoelstra (subscriber, #75205)
[Link]
This is not always true. For a small-enough function (or small-enough block), a programmer can probably build a mental parse tree just as well as the compiler can -- and then, because he is a human gifted with all manner of high-level knowledge and context, he can easily see things that the compiler might miss.
And in my experience, "small enough" includes many non-trivial functions of 5-10 lines. Also in my experience, these are the sorts of functions that trigger uninitialized variable warnings.
Removing uninitialized_var()
Removing uninitialized_var()
Removing uninitialized_var()
Removing uninitialized_var()
Removing uninitialized_var()
Removing uninitialized_var()
Removing uninitialized_var()
/* initialize a */
/* use a */
Removing uninitialized_var()
do stuff
use a
if (!foo)
undo using a
Removing uninitialized_var()
extern int foo_p (int *, int);
extern int foo (int, int);
extern void bar (void);
{
int a;
if (*x) a = y;
bar ();
if (*x) return a;
bar ();
return -1;
}
{
int a;
if (x) a = y;
bar ();
if (x) return a;
bar ();
return -1;
}
$
$ gcc-4.4 -S -O2 -W -Wall -Wextra t.c
t.c: In function ‘foo_p’:
t.c:7: warning: ‘a’ may be used uninitialized in this function
$
$ gcc-4.6 -S -O2 -W -Wall -Wextra t.c
t.c: In function ‘foo_p’:
t.c:7:7: warning: ‘a’ may be used uninitialized in this function [-Wuninitialized]
$
$ ./xgcc -B. -S -O2 -W -Wall -Wextra t.c --version
xgcc (GCC) 4.8.0 20121226 (experimental) [trunk revision 194725]
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
t.c: In function 'foo_p':
t.c:7:7: warning: 'a' may be used uninitialized in this function [-Wmaybe-uninitialized]
int a;
^
$
Removing uninitialized_var()
Removing uninitialized_var()
Removing uninitialized_var()
Removing uninitialized_var()
The reason I don't initialize a variable to avoid the false positive warning is that it makes the code harder to read. It suggests that the variable's value is meaningful in places where it isn't. That requires more time and short term memory for the reader to understand the code.
Removing uninitialized_var()
Removing uninitialized_var()
Removing uninitialized_var()
Removing uninitialized_var()
As the article points out, it's amazingly easy to confuse the compiler. In the cases where I've had to deal with this, there was clearly no practical alternative way to write the code. The only practical ways to shut the compiler up were to give a variable a fake value or disable the warning class.
Removing uninitialized_var()
Removing uninitialized_var()
Removing uninitialized_var()