Insulating layer?

Posted Oct 13, 2024 19:43 UTC (Sun) by Wol (subscriber, #4433)
In reply to: Insulating layer? by smurf
Parent article: On Rust in enterprise kernels

> Well the workaround is easy[1], you disallow everything, except under proscribed circumstances.

Or you disallow UB inasmuch as you can! And don't create new UB!

If the standard doesn't define it, you have to impose a sensible default (e.g., on a 2s complement machine, you assume 2s complement) and allow a flag to change it, e.g. --assume-no-overflow.

At which point you get a logical machine that you can reason about without repeatedly getting bitten in the bum. And the compiler writers can optimise up the wazoo for the benchmarks, but they can't introduce nasty shocks and make them the default.

Cheers,
Wol



Insulating layer?

Posted Oct 13, 2024 22:33 UTC (Sun) by khim (subscriber, #9252)

> Or you disallow UB inasmuch as you can! And don't create new UB!

How would that work without buy-in on the compiler users side? The Rust handling of UB (which actually works fine so far) hinges not on one, but two pillars:

  1. Language developers try to reduce number of UBs as much as sensible
  2. Language users try to avoid triggering the remaining UBs as much as possible

But the C/C++ community is 100% convinced that doing something about UB is the responsibility of “the other side”. Just read “What Every C Programmer Should Know About Undefined Behavior” (parts one, two, three) and “What every compiler writer should know about programmers”.

Developers demand that compiler developers ignore the language specification and accept programs with “sensible code” as programs without UB and, as we saw, when asked what “sensible code” is, they immediately present something that Rust also declares unspecified. That's very symptomatic: the example shows, again (as if the articles above weren't enough), how both sides are entirely uninterested in working together toward a common goal in the C/C++ world.

Defining padding as containing zeros is extremely non-trivial, because many optimizations rely on the ability of the compiler to “separate” a struct (or union) into a set of fields and then “assemble” them back. And because predictable padding is very rarely needed, the decision, in the Rust world, was to leave it unspecified. Note that this is a compromise: usually reading an uninitialized value is UB in Rust (to help the compiler), but reading padding is not UB. From the reference: Uninitialized memory is also implicitly invalid for any type that has a restricted set of valid values. In other words, the only cases in which reading uninitialized memory is permitted are inside unions and in “padding” (the gaps between the fields/elements of a type).

This decision doesn't make any side 100% happy: compiler makers would like to make reading padding UB to simplify their job, and compiler users would like padding to be zero to simplify theirs, but “unspecified but not UB” is good enough for both sides to, grudgingly, accept it.

Yet such a compromise may never save a world where it's always the other side that has to do the job!
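To make the padding point concrete, here is a minimal Rust sketch. The struct `Padded` and the helper `padding_bytes` are hypothetical names invented for this illustration; the sketch only measures how many bytes of a `#[repr(C)]` struct are padding and deliberately does not try to read them, since their contents are unspecified.

```rust
use std::mem::size_of;

// Hypothetical example type: with the C layout rules, `value` must start at
// an offset that is a multiple of its alignment, so the compiler leaves a
// gap (padding) after `tag` on targets where `u32` is aligned to more than
// one byte.
#[repr(C)]
struct Padded {
    tag: u8,
    value: u32,
}

/// Number of bytes in `Padded` that belong to no field.
/// The contents of these bytes are unspecified; only their count is fixed
/// by the layout rules.
fn padding_bytes() -> usize {
    size_of::<Padded>() - (size_of::<u8>() + size_of::<u32>())
}
```

On a typical target where `u32` has 4-byte alignment, the struct occupies 8 bytes while its fields account for only 5. Those leftover bytes are exactly what the compiler is free to ignore when it splits the struct into fields and reassembles it.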

> At which point you get a logical machine that you can reason about without repeatedly getting bitten in the bum.

Nope. The logical machine that a high-level language uses to define the behavior of a program will always be different from “what the real hardware does”. That's what separates a high-level language from a low-level one, after all. You can make it easier or harder to understand, but someone who refuses to accept that the virtual machine used by a high-level language definition even exists can always find a way to get bitten in the bum.

Insulating layer?

Posted Oct 14, 2024 13:24 UTC (Mon) by Wol (subscriber, #4433)

> How would that work without buy-in on the compiler users side? The Rust handling of UB (which actually works fine so far) hinges not on one, but two pillars:

Well, maybe, if we didn't have stupid rules like "don't use signed integer arithmetic because overflow is undefined", the user space developers might buy in. I'm far more sympathetic to the developers when they claim there's too much UB, than I am to compiler developers trying to introduce even more UB.

In that particular case, I believe we have -fwrapv (or whatever the option is) to tell the compiler what to do, but that should be the norm, not the exception, and it should be the default.
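For comparison, here is a small sketch of how Rust spells out the same choice in code rather than in a compiler flag. Signed overflow is never UB there: a plain `+` panics when overflow checks are enabled (the default in debug builds) and wraps otherwise, and the programmer can pick a behaviour explicitly per operation. The function names below are hypothetical wrappers written for this illustration; `wrapping_add`, `checked_add` and `saturating_add` are the standard integer methods.

```rust
/// Two's-complement wrap-around on overflow, on every target.
fn add_wrapping(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

/// `None` on overflow, so the caller has to handle the failure.
fn add_checked(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}

/// Clamps the result to the nearest representable value on overflow.
fn add_saturating(a: i32, b: i32) -> i32 {
    a.saturating_add(b)
}
```

For instance, `add_wrapping(i32::MAX, 1)` yields `i32::MIN`, while `add_checked(i32::MAX, 1)` yields `None`. Whether this counts as the "sensible default" asked for above is a matter of taste, but every one of these outcomes is defined.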

If the compiler devs change to saying "this is UB. This is what a user-space developer would expect, therefore we'll add a (default) option to do that, and an option that says 'take advantage of UB when compiling'", then I suspect the two sides would come together pretty quickly.

Compiler devs know all about UB because they take advantage of it. User devs don't realise it's there until they get bitten. The power is in the compiler guys hands to make the first move.

And if it really is *undefinable* data (like reading from a write-only i/o port), then user space deserves all the brickbats they'll inevitably get ... :-)

Cheers,
Wol

Insulating layer?

Posted Oct 14, 2024 15:19 UTC (Mon) by khim (subscriber, #9252)

> If the compiler devs change to saying "this is UB. This is what a user-space developer would expect, therefore we'll add a (default) option to do that, and an option that says 'take advantage of UB when compiling'", then I suspect the two sides would come together pretty quickly.

This was tried, too. And, of course, what the people who tried it found out was that every “we code for the hardware” developer has their own idea about what the compiler should do (mostly because they code for different hardware).

> The power is in the compiler guys hands to make the first move.

Why? They have already taken the first step and created a specification that lists what is or isn't UB. If developers have concrete proposals, they could offer changes to it. And yes, some small percentage of the C/C++ community works with compiler developers. But most “we code for the hardware” guys don't even bother to read it… how would compiler developers know that their changes to that list would be treated any better than what they already gave?

> And if it really is *undefinable* data (like reading from a write-only i/o port), then user space deserves all the brickbats they'll inevitably get ... :-)

They would probably find a way to complain even in that case…

> User devs don't realise it's there until they get bitten.

And that's precisely the problem: if a language has UB and its users “only realise it's there when they get bitten”, then such a language can't be made safe. Ever.

Either developers have to accept and handle UB proactively, or the language should have no UB at all. The latter is limiting, because certain low-level things can't be expressed in a language without UB; thus, for low-level work, a safe and reliable language can't be made unless developers think about UB before they are bitten.

And the only known way to introduce such a radical non-technical change is to create a new community. And that's the end of the story: C/C++ can't be made safe, not because there are unsolvable technical issues, but because its community refuses to accept the fact that there are unsolvable technical issues (that have to be worked around on a social level).
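As a rough illustration of what "handling UB proactively" looks like in practice, here is a minimal Rust sketch. The function name `get_or_none` is hypothetical; `get_unchecked` is the standard slice method whose documented contract makes an out-of-bounds index UB. The caller-side convention is to establish the precondition first and record the reasoning in a `SAFETY` comment.

```rust
/// Returns the element at `idx`, or `None` when `idx` is out of bounds.
fn get_or_none(data: &[u32], idx: usize) -> Option<u32> {
    if idx < data.len() {
        // SAFETY: `idx < data.len()` was checked on the line above, which is
        // exactly the precondition `get_unchecked` requires. Calling it with
        // an out-of-bounds index would be undefined behaviour.
        Some(unsafe { *data.get_unchecked(idx) })
    } else {
        None
    }
}
```

The remaining UB is confined to one `unsafe` block, and the person writing that block is expected to think about the contract before anything goes wrong. In this toy case the safe `data.get(idx).copied()` would do the same job; the sketch only shows the shape of the convention.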


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds