-
Notifications
You must be signed in to change notification settings - Fork 3.9k
tscache: switch to skiplist implementation by default, remove Request interface #20232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tscache: switch to skiplist implementation by default, remove Request interface #20232
Conversation
8af3889
to
f5f7c2c
Compare
This change just passed Jepsen: https://teamcity.cockroachdb.com/viewLog.html?buildId=421110. I'll run it a few times, but at this point, I don't expect anything to change. Also, to confirm that the change will produce benchmark numbers comparable to what we saw in the initial investigation, I ran
The only surprising part is that my testing here didn't seem to affect 99th percentile latency. I think that's because the initial testing was using a single node cluster and the testing here is on a three node cluster, so there are other dominant factors that are now influencing the 99th percentile latency. |
f5f7c2c
to
5b1f93c
Compare
Reviewed 4 of 4 files at r1, 2 of 2 files at r2, 1 of 1 files at r3, 9 of 9 files at r4, 4 of 4 files at r5, 10 of 10 files at r6. pkg/storage/replica.go, line 1995 at r6 (raw file):
I got confused by these names. Do you need both of these variables or can you get away with a single variable? I don't see any mutations of Comments from Reviewable |
The new `sklImpl` tscache structure includes a large number of preemption points when race tested. This change adds a new `util.EnableRacePreemptionPoints` function so that these preemption points are only used in specific tests.
This change makes `treeImpl` thread-safe, as is reflected by its `ThreadSafe` method. This makes the structure slightly less efficient because it will result in locking at a finer-granularity. Now that it's not used by default, that's no a huge deal.
Because all `Cache` implementation must now be thread-safe, this change removes the `ThreadSafe` method entirely. It then removes the `tsCacheMu` lock on `Store`.
This change removes the `requests` btree from treeImpl. Since `treeImpl` is no longer used by default, this optimization is no longer important. This will allow us to remove `tscache.Request` completely.
Now that all `tscache.Cache` implementations are thread safe and do nothing in their `ExpandRequests` methods, we can remove the idea of a `tscache.Request` entirely. We do this by removing the associated methods and type and instead exporting the `Add` method to be used directly. This avoids all associated overhead of the `Request` struct, which is a big improvement for `sklImpl`.
5b1f93c
to
0cf73ec
Compare
TFTR! Review status: 3 of 17 files reviewed at latest revision, 1 unresolved discussion, all commit checks successful. pkg/storage/replica.go, line 1995 at r6 (raw file): Previously, tschottdorf (Tobias Schottdorf) wrote…
Done. Comments from Reviewable |
🎉 🎉 🎉 |
Follow up to #19508.
Closes #4768.
Closes #6190.
Closes #14518.
Closes #16901.
Closes #16796.
Closes #20123.
Part of #17172.
This change switches the default
tscache.Cache
implementation tosklImpl
. In doing so, it performs a series of cleanup steps that improve the external interface oftscache.Cache
and improvesklImpl's
performance moderately. This comes at the cost of removing some optimizations built into thetreeImpl
. Since we're no longer going to use that outside of testing, its performance matters less, so the tradeoff is worth it. The change is broken into a series of steps so that each one is self-contained:tscache
package to createsklImpl
instances by default.treeImpl
thread-safe by adding internal locking.ThreadSafe
method fromtscache.Cache
. Remove thetsCacheMu
lock fromStore
.requests
tree optimization fromtreeImpl
.Request
type and{Add,Expand}Request
methods from thetscache
package.tscache.Cache
now exports itsAdd
method, which is used byReplica
directly.Before merging this, I want to see at least a few successful runs of Jepsen with the change enabled.
I'll also add some concrete benchmarks from before and after this change.