hash/maphash: add Bytes and String #42710
I found … Ideally …
See #2205. I didn't propose …
This is not how it's supposed to be used. You're supposed to do:

…

Can you try that instead?
The runtime drops from ~22ns to 20ns, but it's still much slower compared to a direct hash on a …
@dsnet, is there a fundamental reason why Write+Sum64 should be so much slower than a single call?
I see several reasons for slowdown:
…
I see. So the copy during the large writes is a bug - we should just hash those bytes directly. Then the benchmark, which hashes a multiple of 64 bytes, would have no copies at all. At that point it seems like the only win is Write+Sum64 combined knowing that the final copy isn't needed. But maybe that's too small a win.
Change https://golang.org/cl/278758 mentions this issue:
Change https://golang.org/cl/278759 mentions this issue:
I played with this a bit, because I also hit this recently. CLs 278758, 278759, and 278760 were what I could squeeze out of it. They help a fair amount for large writes, but not much for little ones, which is what I'm working on (trying to reduce the performance regression due to tailscale/tailscale@aa9d7f4).
Change https://golang.org/cl/278760 mentions this issue:
Thanks for working on optimizations, @josharian. We should probably put this proposal on hold until we've made the current API as fast as it can be.
FWIW, I've optimized as much as I know how. I don't see a way to make the current API any faster for small writes. The need to make writes work identically regardless of how they got chunked really ties the hands of the implementation.
Placed on hold.
While attempting to fix google/starlark-go#16 I rediscovered this issue and was about to file a bug report, but then stumbled on this one. I'll share my benchmark here in case it's useful.

…
Provides a minor performance win.

```
name            old time/op    new time/op    delta
Hash8Bytes-8      16.5ns ± 2%    16.5ns ± 4%     ~     (p=0.407 n=27+29)
Hash320Bytes-8    58.5ns ± 2%    55.0ns ± 2%   -6.01%  (p=0.000 n=29+28)
Hash1K-8           195ns ± 1%     190ns ± 2%   -2.23%  (p=0.000 n=30+30)
Hash8K-8          1.59µs ± 2%    1.57µs ± 2%   -0.88%  (p=0.002 n=30+30)

name            old speed      new speed      delta
Hash8Bytes-8     484MB/s ± 2%   485MB/s ± 4%    ~      (p=0.417 n=27+29)
Hash320Bytes-8  5.47GB/s ± 2%  5.82GB/s ± 2%  +6.39%  (p=0.000 n=29+28)
Hash1K-8        5.26GB/s ± 1%  5.39GB/s ± 2%  +2.29%  (p=0.000 n=30+30)
Hash8K-8        5.16GB/s ± 2%  5.21GB/s ± 2%  +0.89%  (p=0.002 n=30+30)
```

Updates #42710

Change-Id: Ia0d7264b648f96099202de21c6b69a9c1776f6c8
Reviewed-on: https://go-review.googlesource.com/c/go/+/278759
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The existing code makes copies of every byte it hashes. When passed a large chunk of memory, Write and WriteString can skip the copying and initSeed for most of it. To ensure that Write, WriteByte, and WriteString continue to generate output that depends only on the sequence of bytes, expand the grouping test to include WriteString and interleaved calls. Also, make the test process a lot more data, to ensure that Write* handled full buffers correctly.

```
name            old time/op    new time/op    delta
Hash8Bytes-8      17.1ns ± 3%    16.5ns ± 2%   -3.26%  (p=0.000 n=29+27)
Hash320Bytes-8    74.9ns ± 2%    58.5ns ± 2%  -21.86%  (p=0.000 n=30+29)
Hash1K-8           246ns ± 3%     195ns ± 1%  -20.82%  (p=0.000 n=29+30)
Hash8K-8          1.87µs ± 2%    1.59µs ± 2%  -15.04%  (p=0.000 n=26+30)

name            old speed      new speed      delta
Hash8Bytes-8     468MB/s ± 3%   484MB/s ± 2%   +3.36%  (p=0.000 n=29+27)
Hash320Bytes-8  4.28GB/s ± 2%  5.47GB/s ± 2%  +27.97%  (p=0.000 n=30+29)
Hash1K-8        4.17GB/s ± 3%  5.26GB/s ± 1%  +26.28%  (p=0.000 n=29+30)
Hash8K-8        4.38GB/s ± 2%  5.16GB/s ± 2%  +17.70%  (p=0.000 n=26+30)
```

Updates #42710

Change-Id: If3cdec1580ffb3e36fab9865e5a9d089c0a34bec
Reviewed-on: https://go-review.googlesource.com/c/go/+/278758
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This helps a lot for larger writes.

```
name            old time/op    new time/op    delta
Hash8Bytes-8      16.5ns ± 4%    16.8ns ± 2%   +1.76%  (p=0.000 n=29+30)
Hash320Bytes-8    55.0ns ± 2%    41.4ns ± 2%  -24.64%  (p=0.000 n=28+28)
Hash1K-8           190ns ± 2%     130ns ± 3%  -31.65%  (p=0.000 n=30+30)
Hash8K-8          1.57µs ± 2%    1.05µs ± 3%  -33.01%  (p=0.000 n=30+29)

name            old speed      new speed      delta
Hash8Bytes-8     485MB/s ± 4%   476MB/s ± 2%   -1.73%  (p=0.000 n=29+30)
Hash320Bytes-8  5.82GB/s ± 2%  7.72GB/s ± 3%  +32.55%  (p=0.000 n=28+29)
Hash1K-8        5.39GB/s ± 2%  7.88GB/s ± 3%  +46.32%  (p=0.000 n=30+30)
Hash8K-8        5.21GB/s ± 2%  7.77GB/s ± 3%  +49.28%  (p=0.000 n=30+29)
```

Updates #42710

Change-Id: Idaf4b2a8a41fc62fc16b54c9358cf2cc7009cf29
Reviewed-on: https://go-review.googlesource.com/c/go/+/278760
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
All my optimizations are in. I still think we need new API to make small writes faster.
Ran into this issue when benchmarking my application. Here's the worst case scenario:

```go
func BenchmarkMaphash(b *testing.B) {
	var hash maphash.Hash
	var data [127]byte
	for i := 0; i < b.N; i++ {
		hash.Reset()
		hash.Write(data[:])
		hash.Sum64()
	}
}
```

27% of the time is spent in …

I would propose adding a …

```go
seed := maphash.MakeSeed()
sum := seed.Sum64(data)
```

IMO this fits the API better than a function that takes …
This proposal would also make the code more readable. Before:

```go
var seed = maphash.MakeSeed() // global

var hash maphash.Hash
hash.SetSeed(seed)
_, _ = hash.Write(data)
sum := hash.Sum64()
```

After:

```go
var seed = maphash.MakeSeed() // global

sum := seed.Sum64(data)
```
For some more context, I'm hashing a set of strings and a unix timestamp (int64) to use as a map key. The old code used to sort the strings, append them together, and use strconv to build a string as a key. The new code uses maphash to hash each string and XOR the hashes together (so order is irrelevant). Then I hash the timestamp with a different seed and XOR it to the hash. It's 5x faster now and has reduced my service's CPU usage by 5%. I'm using the … Maybe something like?

```go
var nameSeed = maphash.MakeSeed()
var timeSeed = maphash.MakeSeed()

func myHash(names []string, timestamp int64) (sum uint64) {
	for _, name := range names {
		sum ^= nameSeed.HashString(name)
	}
	sum ^= timeSeed.HashInt64(timestamp)
	return sum
}
```

The current …
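The `HashString`/`HashInt64` methods in the snippet above are proposed API, not something `maphash` provides. The same order-independent XOR trick can be written against the current API; a sketch using `maphash.Hash` plus `encoding/binary` for the timestamp (the `hashString` and `myHash` helper names are mine):

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/maphash"
)

var nameSeed = maphash.MakeSeed()
var timeSeed = maphash.MakeSeed()

// hashString hashes one string under the given seed.
func hashString(seed maphash.Seed, s string) uint64 {
	var h maphash.Hash
	h.SetSeed(seed)
	h.WriteString(s)
	return h.Sum64()
}

// myHash XORs per-string hashes, so the result does not depend on the
// order of names, then folds in the timestamp under a separate seed.
func myHash(names []string, timestamp int64) (sum uint64) {
	for _, name := range names {
		sum ^= hashString(nameSeed, name)
	}
	var buf [8]byte
	binary.LittleEndian.PutUint64(buf[:], uint64(timestamp))
	var h maphash.Hash
	h.SetSeed(timeSeed)
	h.Write(buf[:])
	return sum ^ h.Sum64()
}

func main() {
	fmt.Println(myHash([]string{"a", "b"}, 42) == myHash([]string{"b", "a"}, 42)) // order-independent: prints true
}
```

The per-call `Hash` construction inside `hashString` is exactly the overhead this issue is about; the proposed one-shot API would remove it.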
People were pinging me privately to unhold this per comments above, so done.
Change https://go.dev/cl/392494 mentions this issue:
Thanks @josharian for optimizing the current API. It looks like you got about 2X, and that there's still about 2X on the table. I mailed CL 392494 showing the 2X that remains. The remaining 2X gap seems fundamental, so it seems reasonable to add new API.

I hesitate to add methods directly on maphash.Seed. That seems like confusing different concepts. And adding new concurrency-safe methods on maphash.Hash confuses when a Hash can be shared by multiple goroutines. That leaves top-level functions, analogous to sha1.Sum and so on, as @dsnet originally suggested.
In the CL I used:

…

Thumbs up/down: do people think it would be better to use … ?
What about?

```go
func Sum[Bytes interface{ []byte | string }](b Bytes) uint64
```
It's unclear that generics are making things clearer in that case, and I don't know how to implement the function body.
Does anyone object to adding Bytes and String as described in #42710 (comment)?
Based on the discussion above, this proposal seems like a likely accept.
No change in consensus, so accepted. 🎉
Update, Mar 16 2022 - Current proposal is #42710 (comment).
The overhead of constructing a `Hash` object in order to call `Hash.Write` greatly diminishes its utility for small buffers. Consider the following micro-benchmark:

…

On my machine this produces:

…

Directly using `runtime_memhash` is around ~4x faster since it avoids all the unnecessary state contained by `Hash` that is relatively pointless when hashing a single small string.

I propose adding the following API:

…

where the function is a thin wrapper over `runtime_memhash`. It takes `Seed` as an input to force the user to think about the application of seeds. I chose the name `Sum` to be consistent with `md5.Sum` or `sha1.Sum`.