vmsplice(): the making of a local root exploit

Posted Feb 15, 2008 6:54 UTC (Fri) by JoeF (guest, #4486)
In reply to: vmsplice(): the making of a local root exploit by jimparis
Parent article: vmsplice(): the making of a local root exploit

Using ">> 9" is essentially a comment saying "I'm converting this to a 512-byte sector count".

No. It only says that you shift a value by 9 bits to the right.
It may say to you that you are converting a value to a 512-byte sector count. It does not necessarily say that to others.

It's such a fundamentally basic operation that anyone working with filesystem or disk code would understand it immediately.

That mindset is what results in a lot of the flaws in all kinds of code. There will always be somebody working on the code for whom it isn't obvious. I venture that if you can't see that, you must still be in your first job and haven't yet had the task to maintain somebody else's code. I can tell you from (painful) experience that this kind of stuff is among the worst stuff out there.
">>9" makes implicit assumptions, and implicit assumptions, even if they may seem reasonable to the original author, are often not obvious to maintainers, possibly years down the road.
Oh, and just in case, I am writing filesystem code. I would never even get the idea to write something like ">>9" without a comment, or better, in a macro.

vmsplice(): the making of a local root exploit

Posted Feb 15, 2008 7:43 UTC (Fri) by jimparis (guest, #38647) [Link] (3 responses)

> I venture that if you can't see that, you must still be in your
> first job and haven't yet had the task to maintain somebody else's code.

And I venture that, in my experience (which is not as limited as you impolitely presume),
"somebody else's code" is a whole lot more difficult to work with when I have to dig through
arbitrary levels of opaque macros to figure out what they're really trying to do.

Clearly we have a difference in opinion.  I prefer in code that is written clearly enough to
be self-explanatory.  Consider:
  inode->i_blocks = bytes >> 9;
vs:
  inode->i_blocks = BYTES_TO_BLOCKS(bytes);

If you maintain that the second version conveys more information than the first, then I'm
afraid we will just have to disagree.

vmsplice(): the making of a local root exploit

Posted Feb 16, 2008 1:52 UTC (Sat) by dododge (guest, #2870) [Link]

> Consider:
>   inode->i_blocks = bytes >> 9;
> vs:
>   inode->i_blocks = BYTES_TO_BLOCKS(bytes);

It's a bit of an unfair example, though, because you're computing
against a value called "bytes" and assigning it to something called
"blocks".  You've put enough context around the expression to make
it clear what the shift is trying to accomplish.

The problem is when someone assumes ">> 9" is inherently
self-documenting and throws it into the middle of a much more
complex statement.  Consider:

  process_frag_2(sig,(get_ent(curr) >> 9) + 2,HEX_ENCODE);
vs:
  process_frag_2(sig,BYTES_TO_BLOCKS(get_ent(curr)) + 2,HEX_ENCODE);

When I'm reading code, I'd much rather see the latter.  It doesn't
just tell me why the shift is being done; it even adds useful
information about the APIs for get_ent() and process_frag_2().

vmsplice(): the making of a local root exploit

Posted Feb 16, 2008 17:13 UTC (Sat) by bronson (subscriber, #4806) [Link] (1 responses)

I guess we will disagree.  Your way is still hostile to new maintainers because it carries so
much implicit information.  What if your code code has this?

   inode->i_blocks = bytes >> 8;

What is a new maintainer to think?  Even if he does recognize that this expression is
different from the others, he's still confused.  Did you auto-optimize the expression (bytes
>> 9)*2?  Is it typoed or a thinko?

Will every potentially confusing use of the shift operator in your code include a comment
stating the author's intention?  If so, then will that convention still be true once someone
else has modified your code?

Also, you've given yourself a rather heavy restriction on variable naming haven't you?  Will
all your variable names really contain "blocks" and "bytes" as appropriate?  You'll probably
want to add this restriction as a comment to each file to which it applies or it won't last
long once other people start committing patches to your code!

Finally, your technique only works for trivial expressions like assignment.  What happens if
you need to actually perform a calculation?

A BYTES_TO_BLOCKS macro solves all these problems by making the conversion explicit.  Implicit
rules and unwritten conventions always make life hard for new maintainers.  Always.

vmsplice(): the making of a local root exploit

Posted Jan 11, 2009 4:06 UTC (Sun) by dibbles_you (guest, #45004) [Link]

cd /to/linux/kernel
# find . -name "*.c" | xargs grep ">>" | awk -F ">>" '{ for(n=2;n<=NF;n++) { $c=$n; sub(/^=/,"",$c); sub(/ /,"",$c); gsub(/$/,"",$c); gsub(/$/,"",$c); $c=substr($c,1,1); if ($c>='0' && $c<='9') print "number"; else print "text"; } }' > a
# cat a| grep text | wc -l
# cat a| grep number | wc -l

number=19877
text=4936

so it would seem the linux kernel believes numbers are usually clearer for others to read and understand.

if you believe otherwise submit a patch to the kernel removing all occurrences and replacing them with meaningful #defines.

#define 'ing all numbers is stupid, using identifiers that describe what they are should allow anybody to understand their context, if you use lots of operators and numbers in a single expression split them up to use multiple variables with appropriate names, if they are in conditions use multiple conditions with comments if necessary. The compiler will optimize all these away and make your code more readable without have 15 editors open or using code navigators to work out that the actual value of CLUSTERS_DIVIDE_BY_BYTES_IN_SECTORS_MULTIPLIED_BY_PI(x) is 2.

PS. the script is an approximation.