8000 runtime: canonical 48-bit address bug in lfstack · Issue #49405 · golang/go · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

runtime: canonical 48-bit address bug in lfstack #49405

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Sonicadvance1 opened this issue Nov 6, 2021 · 44 comments
Closed

runtime: canonical 48-bit address bug in lfstack #49405

Sonicadvance1 opened this issue Nov 6, 2021 · 44 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@Sonicadvance1
Copy link

What version of Go are you using (go version)?

go version go1.16.2 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/ryanh/.cache/go-build"
GOENV="/home/ryanh/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/ryanh/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/ryanh/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go-1.16"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go-1.16/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.16.2"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1471403295=/tmp/go-build -gno-record-gcc-switches"

What did you do?

If lfstack gets any pointer on amd64 platforms where the top bit is set then an assert is thrown.
In most cases this won't occur since the kernel will consume the top bit of VA, leaving the userspace with 47bits of VA.
This is a logic bug around address canonicalization.

Check is here: https://github.com/golang/go/blob/master/src/runtime/lfstack.go#L28
Pointer sign extension is here: https://github.com/golang/go/blob/master/src/runtime/lfstack_64bit.go#L52

I don't have a basic application or environment for reproducing this. Easy to see the logic bug.

What did you expect to see?

While it is correct that you must canonicalize the pointer by sign extending it on 48bit systems (I'm not even going to bring up LA57). The check inside of lfstack is incorrect since it assumes the sign extension will never occur. Thus assuming it will never receive a true 48-bit pointer.
This assumption is broken in environments where you get 48-bit pointers with the high bit set.
This is purely a simple logic bug, it shouldn't assert in this case.

What did you see instead?

An assert on 48-bit address with high bit set.

Sonicadvance1 added a commit to Sonicadvance1/FEX that referenced this issue Nov 6, 2021
Only a partial fix for FEX-Emu#1330, still needs preemption disabled to work.

On x86-64 hosts the Linux kernel resides in the top bit of VA which
isn't mapped in to userspace.
This means that userspace will never receive pointers living with that
top bit set unless you're running a 57bit VA host.

This results in userspace pointers never needing to do the sign
extending pointer canonicalization. But additionally some applications
actually don't understand the pointer canonicalization.
This results in bugs like: golang/go#49405
Now if you're running on a 57bit VA host, this will end up behaving like
FEX but it seems like no one in golang land has really messed with 57bit
VA yet.

In AArch64, when configured with a 48bit VA, the userspace gets the full
48bit VA space and on EL mode switch has the full address range change
to the kernel's 48bit VA.
This means that we will /very/ likely allocate pointers in the high
48bit space since Linux currently allocates top-down.

So behave more like x86-64, hide the to
8000
p 128TB of memory space from the
guest before boot.

Testing: Took the M1Max 15ms to 21ms allocate the top 128TB.
@ianlancetaylor ianlancetaylor changed the title AMD64: Canonical 48-bit address bug in lfstack runtime: canonical 48-bit address bug in lfstack Nov 7, 2021
@ianlancetaylor
Copy link
Contributor

Thanks. Note that this code is deeply system dependent. Can the problem you mention actually happen on any existing system? If not, it's not worth worrying about.

@Sonicadvance1
Copy link
Author

Currently the only real concern is if anyone tries running on Intel's latest server platforms with LA57 enabled.
Which doesn't pass the userspace a 48-bit (or higher) pointer unless explicitly asked.
Otherwise it would likely just be toy environments attempting a 48-bit VA space like AArch64 platforms. Switching memory mappings on CPL change.

@thanm thanm added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Nov 8, 2021
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022
@mknyszek mknyszek moved this to Triage Backlog in Go Compiler / Runtime Jul 15, 2022
@seankhliao seankhliao added this to the Unplanned milestone Aug 20, 2022
@assistcontrol
Copy link

This bug hit FreeBSD, where it caused all Go packages to fail. We have spent a long time without any Go packages being available.

We found this issue yesterday and by working around it we have packages again.

So, it's not theoretical.

@assistcontrol
Copy link

I should mention: our workaround required changing a sysctl, so it's not something we can account for in the ports tree. This will affect end-users too.

@randall77
Copy link
Contributor

@assistcontrol

Can you explain what your addresses look like that are triggering this assert?
You are on amd64, right?

@assistcontrol
Copy link

@randall77 Yeah, this occurred on amd64 builders (jailed on an amd64 host).

You asked a good question, but I don't know how to answer that. How can I get you that information?

@dbaio
Copy link
dbaio commented Apr 13, 2025

Hi @randall77
On FreeBSD (developer branch), we are noticing this issue on newer CPUs that have the LA57 feature.

Building Go cmd/dist using /home/dbaio/ports/lang/go121/work/go-freebsd-amd64-bootstrap. (go1.20 freebsd/amd64)
cmd/dist
runtime: lfstack.push invalid packing: node=0x6d66afc44d8d40 cnt=0x1 packed=0x66afc44d8d400001 -> node=0x66afc44d8d40
fatal error: lfstack.push

runtime stack:
runtime.throw({0xa3a2b1?, 0xaffb50e700000000?})
	/usr/home/dg/go/go-bootstrap/go-freebsd-amd64-bootstrap/src/runtime/panic.go:1047 +0x5d fp=0x84c48df00 sp=0x84c48ded0 pc=0x43457d
runtime.(*lfstack).push(0x2?, 0x84c48dfb0?)
	/usr/home/dg/go/go-bootstrap/go-freebsd-amd64-bootstrap/src/runtime/lfstack.go:29 +0x125 fp=0x84c48df40 sp=0x84c48df00 pc=0x40bbc5
runtime.(*spanSetBlockAlloc).free(...)
	/usr/home/dg/go/go-bootstrap/go-freebsd-amd64-bootstrap/src/runtime/mspanset.go:322
runtime.(*spanSet).reset(0xe828b0)
	/usr/home/dg/go/go-bootstrap/go-freebsd-amd64-bootstrap/src/runtime/mspanset.go:264 +0x87 fp=0x84c48df70 sp=0x84c48df40 pc=0x42f727
runtime.finishsweep_m()
	/usr/home/dg/go/go-bootstrap/go-freebsd-amd64-bootstrap/src/runtime/mgcsweep.go:260 +0x9b fp=0x84c48dfb0 sp=0x84c48df70 pc=0x423b1b
runtime.gcStart.func1()
	/usr/home/dg/go/go-bootstrap/go-freebsd-amd64-bootstrap/src/runtime/mgc.go:668 +0x17 fp=0x84c48dfc0 sp=0x84c48dfb0 pc=0x45ec17
runtime.systemstack()
	/usr/home/dg/go/go-bootstrap/go-freebsd-amd64-bootstrap/src/runtime/asm_amd64.s:496 +0x49 fp=0x84c48dfc8 sp=0x84c48dfc0 pc=0x463d69

Full build log is here: https://people.freebsd.org/~dbaio/go/go121-gohan06-logs.txt

I can run some tests. Just guide me, please.

Details about the CPU:

CPU: Intel(R) Xeon(R) Silver 4314 CPU @ 2.40GHz (2400.00-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x606a6  Family=0x6  Model=0x6a  Stepping=6
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7ffefbff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x121<LAHF,ABM,Prefetch>
  Structured Extended Features=0xf3bfbfff<FSGSBASE,TSCADJ,SGX,BMI1,HLE,AVX2,FDPEXC,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,NFPUSG,PQE,AVX512F,AVX512DQ,RDSEED,ADX,SMAP,AVX512IFMA,CLFLUSHOPT,CLWB,PROCTRACE,AVX512CD,SHA,AVX512BW,AVX512VL>
  Structured Extended Features2=0x40417f5e<AVX512VBMI,UMIP,PKU,OSPKE,AVX512VBMI2,GFNI,VAES,VPCLMULQDQ,AVX512VNNI,AVX512BITALG,TME,AVX512VPOPCNTDQ,LA57,RDPID,SGXLC>
  Structured Extended Features3=0xbc040412<FSRM,MD_CLEAR,PCONFIG,IBPB,STIBP,L1DFL,ARCH_CAP,SSBD>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
  IA32_ARCH_CAPS=0x6bdeb<RDCL_NO,IBRS_ALL,SKIP_L1DFL_VME,MDS_NO,TSX_CTRL,TAA_NO>
  AMD Extended Feature Extensions ID EBX=0x200<WBNOINVD>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
  TSC: P-state invariant, performance statistics

When building on another box (same FreeBSD version) that doesn't have LA57, it works just fine.

@randall77
Copy link
Contributor

The obvious experiment is to change this line

addrBits = 48
from 48 to 57. That almost certainly will fix your problem. The question is, is that safe?
It is reducing the tag space from 19 to 10 bits. I don't think that's a showstopper (some of the other archs have that number) but it is worrying, as those tags are there to defend against the ABA ambiguity in our lock-free linked list. A tag collision that results in not detecting an ABA situation would be really bad.

@randall77
Copy link
Contributor

Is there any way to mark a binary to tell the OS that we don't want LA57?

@arrowd
Copy link
arrowd commented Apr 14, 2025

On FreeBSD it is possible with elfctl -e +la48 /path/to/binary

@prattmic
Copy link
Member
prattmic commented Apr 14, 2025

Just to clarify: on Linux, I believe mmap will only return an address above 1<<47 if you pass a hint address above 1<<47 to mmap. I assume this is not true on FreeBSD? That mmap with hint address of 0 can return a higher address unless the elfctl -e +la48 is set?

@bsdjhb
Copy link
bsdjhb commented Apr 14, 2025

Just to clarify: on Linux, I believe mmap will only return an address above 1<<47 if you pass a hint address above 1<<47 to mmap. I assume this is not true on FreeBSD? That mmap with hint address of 0 can return a higher address unless the elfctl -e +la48 is set?

Correct, if LA57 is enabled, then mmap(NULL, ...) can return addresses above 1 << 47. The main thread stack is also likely to be above 1 << 47 as well since main thread stacks are typically allocated at the "top" of the user address space.

@randall77
Copy link
Contributor

Does elfctl -e +la48 just set this bit?

https://github.com/freebsd/freebsd-src/blob/6fbd1bed6e7bf880a6cc579b06bdc6476983613a/sys/sys/elf_common.h#L825

Can we just do that? Probably just add that constant to the writes on both sides of this if statement:

if *flagRace {

@bsdjhb
Copy link
bsdjhb commented Apr 14, 2025

Does elfctl -e +la48 just set this bit?

https://github.com/freebsd/freebsd-src/blob/6fbd1bed6e7bf880a6cc579b06bdc6476983613a/sys/sys/elf_common.h#L825

Can we just do that? Probably just add that constant to the writes on both sides of this if statement:

go/src/cmd/link/internal/ld/elf.go

Line 798 in 80bff42

if *flagRace {

Yes, you could just set the bit there. Note that this bit is amd64-only. We have not implemented a control for RISC-V. Are you doing anything comparable to disable LA57 on other OS's such as Linux currently?

@randall77
Copy link
Contributor

Yes, you could just set the bit there. Note that this bit is amd64-only.

That's fine.

We have not implemented a control for RISC-V.

We are currently assuming 56 bit (userspace) pointers for riscv. See #54104.

Are you doing anything comparable to disable LA57 on other OS's such as Linux currently?

I don't believe so. Linux is much better about never breaking existing binaries, so any la57 support would require explicit opt-in by the binary.
https://docs.kernel.org/arch/x86/x86_64/5level-paging.html#user-space-and-large-virtual-address-space

@randall77
Copy link
Contributor

The one thing I'm not sure about is how to handle external linking. Is there a command-line flag to the linker to set this bit?
I guess we could shell out to elfctl, but that requires assuming the existence of another tool.

@emaste
Copy link
emaste commented Apr 14, 2025

Linux is much better about never breaking existing binaries, so any la57 support would require explicit opt-in by the binary.

Just to make sure I understand correctly, this is only the case for x86 (i.e., #54104 is the same issue on Linux on RISC-V)?

Is there a command-line flag to the linker to set this bit? I guess we could shell out to elfctl, but that requires assuming the existence of another tool.

There is no cmdline flag. elfctl will exist on any FreeBSD build host, but of course that doesn't help with cross-compiling. I assume go just uses the existing system linker or a cross-linker from e.g. binutils or LLVM?

These flags are just bits in an NT_FREEBSD_FEATURE_CTL ELF note, which comes from /usr/lib/crt1.o (and friends). Note, elfctl requires that the note already exist.

@randall77
Copy link
Contributor

Just to make sure I understand correctly, this is only the case for x86 (i.e., #54104 is the same issue on Linux on RISC-V)?

Yes, this bug is only about x86. Go supports 56-bit pointers on riscv, since the fix for #54104 went in a few years ago.

I assume go just uses the existing system linker or a cross-linker from e.g. binutils or LLVM?

I believe so. I'm not sure the exact search path. I'm sure you can override it with some environment variables.

@kostikbel
Copy link

Or if go developers insist on requiring LA48 on amd64, something like this might work:

diff --git a/src/cmd/link/internal/ld/elf.go b/src/cmd/link/internal/ld/elf.go
index 6ff1d94383..2cebb0ab39 100644
--- a/src/cmd/link/internal/ld/elf.go
+++ b/src/cmd/link/internal/ld/elf.go
@@ -753,6 +753,7 @@ const (
 	ELF_NOTE_FREEBSD_FEATURE_CTL_TAG   = 4
 	ELF_NOTE_FREEBSD_VERSION           = 1203000 // 12.3-RELEASE
 	ELF_NOTE_FREEBSD_FCTL_ASLR_DISABLE = 0x1
+	ELF_NOTE_FREEBSD_FCTL_LA48         = 0x10
 )
 
 const ELF_NOTE_FREEBSD_NAME = "FreeBSD\x00"
@@ -797,9 +798,10 @@ func elfwritefreebsdsig(out *OutBuf) int {
 	out.WriteString(ELF_NOTE_FREEBSD_NAME)
 	if *flagRace {
 		// The race detector can't handle ASLR, turn the ASLR off when compiling with -race.
-		out.Write32(ELF_NOTE_FREEBSD_FCTL_ASLR_DISABLE)
+		out.Write32(ELF_NOTE_FREEBSD_FCTL_ASLR_DISABLE |
+		    ELF_NOTE_FREEBSD_FCTL_LA48)
 	} else {
-		out.Write32(0)
+		out.Write32(ELF_NOTE_FREEBSD_FCTL_LA48)
 	}
 
 	return int(sh.Size)

@randall77
Copy link
Contributor

Yes, that is what I was suggesting here.
That only works for internal linking though. If we're using the system-provided linker that code isn't executed.

@emaste
Copy link
emaste commented Apr 14, 2025

Does go link the system C startup objects when using the system linker? The system linker doesn't have any knowledge of the note, it just passes through from the csu bits.

@kostikbel
Copy link

Yes, that is what I was suggesting here. That only works for internal linking though. If we're using the system-provided linker that code isn't executed.

But then the build already rely on the execution of the external program (ld), and can rely on the elfctl presence and usability as well.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/665475 mentions this issue: cmd/link: add la48 to freebsd note section

@ayang64
Copy link
Member
ayang64 commented Apr 15, 2025

confirmed:

before patch:

ayan@kiwi:/tmp/hello$ go version
go version devel go1.25-ba7b8ca336 Mon Apr 14 15:10:49 2025 -0700 freebsd/amd64
ayan@kiwi:/tmp/hello$ elfctl -l ./hello.local
Known features are:
noaslr          Disable ASLR
noprotmax       Disable implicit PROT_MAX
nostackgap      Disable stack gap
wxneeded        Requires W+X mappings
la48            amd64: Limit user VA to 48bit
File './hello.local' features:
noaslr          'Disable ASLR' is unset.
noprotmax       'Disable implicit PROT_MAX' is unset.
nostackgap      'Disable stack gap' is unset.
wxneeded        'Requires W+X mappings' is unset.
la48            'amd64: Limit user VA to 48bit' is unset.

after patch:

$ elfctl -l ./hello.local
Known features are:
noaslr          Disable ASLR
noprotmax       Disable implicit PROT_MAX
nostackgap      Disable stack gap
wxneeded        Requires W+X mappings
la48            amd64: Limit user VA to 48bit
File './hello.local' features:
noaslr          'Disable ASLR' is unset.
noprotmax       'Disable implicit PROT_MAX' is unset.
nostackgap      'Disable stack gap' is unset.
wxneeded        'Requires W+X mappings' is unset.
la48            'amd64: Limit user VA to 48bit' is set.
6D40

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/665815 mentions this issue: runtime: align taggable pointers more so we can use low bits for tag

@randall77
Copy link
Contributor

I found a few extra bits in the cushions of my couch, so we may be ok here. (See CL above.)

If that CL works out, then we can just bump to 57 address bits on freebsd.

The only hesitation I have is that that CL is nontrivial enough that backporting is dicey. If we want to backport something here (and I suspect we do not, but I'm willing to hear arguments), the la48 bit CL might be safer.

@kostikbel
Copy link

I am still curious why it is fine to use 56/57 bits on RISC-V and PPC AIX, while the tags are considered too short on LA57 amd64.

@assistcontrol
Copy link

Now that Go downloads and builds toolchains as needed, FreeBSD is strongly considering providing only the most recent release, rather than multiple versions.

Not backporting the CL is really all the justification I'd need to drop the old versions, so I wouldn't do any difficult backporting work if it were just for FreeBSD's benefit.

@randall77
Copy link
Contributor

I am still curious why it is fine to use 56/57 bits on RISC-V and PPC AIX, while the tags are considered too short on LA57 amd64.

Different levels of risk tolerance?

We also give second-class ports a fair amount of leeway in deciding what is best for them. That said, I think I would have pushed back on those changes had I seen them. Water under the bridge.

More generally, though, I don't think anyone has an idea about how big a tag space would be "safe". I have the feeling that we probably want more bits for large (in the # of processors sense) machines. Those machines tend to be amd64/arm64. But even there I'm not relying on any concrete arguments, just intuition.
My hesitation comes down to the fact that if we get a tag collision, really bad things happen. Very nondeterministic heap corruption, with no indication at all of the underlying cause. We could already be in a state where we don't have enough tag bits, and we'd never know (as long as the collision rate was not too high). Just a contributor to the background "maybe that was just a cosmic ray hitting the wrong bit?" failures.

@randall77
Copy link
Contributor

Not backporting the CL is really all the justification I'd need to drop the old versions, so I wouldn't do any difficult backporting work if it were just for FreeBSD's benefit.

A backport would be available earlier than the 1.25 release. (In a few weeks, say, versus a few months for 1.25.) But point taken, thanks.

@assistcontrol
Copy link

Not backporting the CL is really all the justification I'd need to drop the old versions, so I wouldn't do any difficult backporting work if it were just for FreeBSD's benefit.

A backport would be available earlier than the 1.25 release. (In a few weeks, say, versus a few months for 1.25.) But point taken, thanks.

Ahh ok, well I was pretty optimistic there. It would of course be nice to get this into the ports tree sooner rather than later so we can stop universally limiting pointer width on the builders (with all of its downstream effects).

So, if it's possible to backport to 1.24, it'd make a difference for us. But we also understand that sometimes dicey is dicey.

gopherbot pushed a commit that referenced this issue Apr 24, 2025
Currently we assume alignment to 8 bytes, so we can steal the low 3 bits.
This CL assumes alignment to 512 bytes, so we can steal the low 9 bits.

That's 6 extra bits!

Aligning to 512 bytes wastes a bit of space but it is not egregious.
Most of the objects that we make tagged pointers to are pretty big.

Update #49405

Change-Id: I66fc7784ac1be5f12f285de1d7851d5a6871fb75
Reviewed-on: https://go-review.googlesource.com/c/go/+/665815
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
@randall77
Copy link
Contributor
randall77 commented May 7, 2025

CL 665815 got us some extra bits, so I will just bump freebsd up to 57.

I'm curious, is @Sonicadvance1 still around? We've kind of co-opted their issue for a freebsd one. The original report was on linux.

@Sonicadvance1
Copy link
Author
Sonicadvance1 commented May 7, 2025

CL 665815 got us some extra bits, so I will just bump freebsd up to 57.

I'm curious, is @Sonicadvance1 still around? We've kind of co-opted their issue for a freebsd one. The original report was on linux.

I haven't been paying attention to this bug since I reported it. Since I have been running on Linux this bug wasn't ever encountered on my 57-bit VA system (AMD Zen 4 Threadripper here) since Linux won't give you 57-bit addresses unless you explicitly ask for it.

If that CL fixes the bug generically then it would be reasonable to close this.

@randall77
Copy link
Contributor

We will still limit linux to 48 bits, but the option to go larger is available.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/670715 mentions this issue: runtime: increase freebsd/amd64 pointer size from 48 to 57 bits

@kev009
Copy link
kev009 commented May 7, 2025

A note, FreeBSD adopted the same behavior as Linux where LA57 is now opt-in to preserve binary compatibility.

https://cgit.freebsd.org/src/commit/?id=7a8440bf0894f910f871e91a660a9f66dbb23f0b

There is a small window in the development branch where this is not the case, but it is not worth worrying about as it will never be in a release.

If the pointer size increase has no drawbacks, CL 670715 looks good for future proofing. Otherwise, if that is some kind of compromise, it may be warranted to go the other route of adding the LA48 ELF note to guarantee the bits for the tag.

@clausecker
Copy link

@kev009 More specifically, there is now a sysctl to configure the behaviour of binaries that neither say that they need la48 nor say that they support la57. The default is to use la48 for such binaries, but it can be overridden. Go should in any case tag binaries explicitly so they run correctly even if this sysctl is set to a non-default value.

@github-project-automation github-project-automation bot moved this from Triage Backlog to Done in Go Compiler / Runtime May 14, 2025
@kev009
Copy link
kev009 commented May 15, 2025

@randall77 you should probably update the comment. FreeBSD is not shipping an LA57 default.

@randall77
Copy link
Contributor

They were 2 weeks ago. Not really inaccurate, just out-of-date.
If anyone is depending on CL descriptions in the Go repository for what FreeBSD is doing, they should rethink their information intake pipeline.

@kev009
Copy link
kev009 commented May 15, 2025

@randall77 interesting take but ok. The intent is to provide accurate information to people working within the go codebase that do not follow FreeBSD development.

Separately it seems more durable to match whatever you do for Linux now and forever since we have adapted the same behavior.. so this change can probably be backed out altogether.

@randall77
Copy link
Contributor

FreeBSD contributors came to us and asked us to fix this a few weeks ago and we did. Maybe they changed their mind since then? (TBH, I'm not sure you speak for them.) In any case, I'm not sure what would be gained by backing this out now.

@kev009
Copy link
kev009 commented May 15, 2025

@randall77 it's a volunteer project so yes and without particular authority. There is a distinction between the base OS and all the third party "ports" software which might be why you feel whip-lashed -- I happen to work on both in an "official" capacity and thought I was clear on the lack of need for this one week ago. In https://cgit.freebsd.org/src/commit/?id=7a8440bf0894f910f871e91a660a9f66dbb23f0b and prior discussion, it was decided that preserving binary compatibility as Linux does is the right default because of this incident.

My fear is that Go under FreeBSD now has an unnecessary difference from amd64 Linux. Will that show up in any way now or in the future as a bug, regression? If yes, reversion is the preferred option and the compiled binaries your earlier approach to stamp with the LA48 ELF note as in CL 665475 seems better. If there is no performance or safety impact then this concern is perhaps just superstition and not worth more churn.

@kostikbel
Copy link

It is fine either way. With the revert, FreeBSD/amd64 is more like Linux. Without the revert, Go programs on FreeBSD/amd64 have access to much larger UVA if supported by hw.

Obviously my initial motivation was to give the access to larger VA by default. Since two issues were found, one with jemalloc and another with Go, the tweak of defaults was applied. Being able to use large amount of memory by Go on FreeBSD sounds like an advantage to me.

@emaste
Copy link
emaste commented May 15, 2025

it is worrying, as those tags are there to defend against the ABA ambiguity in our lock-free linked list

If this is a real concern then we should revert, so that go on FreeBSD/amd64 is not arbitrarily less reliable than go on x86_64 Linux.

Sirherobrine23 pushed a commit to Sirherobrine23/go that referenced this issue May 24, 2025
Because freebsd is now enabling la57 by default.

Fixes golang#49405

Change-Id: I30f7bac8b8a9baa85e0c097e06072c19ad474e5a
Reviewed-on: https://go-review.googlesource.com/c/go/+/670715
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Development

No branches or pull requests

0