-
Notifications
You must be signed in to change notification settings - Fork 565
Add preemption logic to GlobalState and typecheck #2407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
main/pipeline/pipeline.h
Outdated
ast::ParsedFilesOrCancelled typecheck(std::unique_ptr<core::GlobalState> &gs, std::vector<ast::ParsedFile> what, | ||
const options::Options &opts, WorkerPool &workers); | ||
const options::Options &opts, WorkerPool &workers, bool cancelable = false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cancelable
is now an explicit parameter because the fast path calls typecheck
during preemption but is not cancelable.
It prevents issues with the following interleaving:
- Preemption task is scheduled.
- Slow path is canceled.
- Preemption task runs, sees that
wasTypecheckingCanceled
istrue
, and aborts fast path.
cc @DarkDimius -- you might have strong opinions on this part of the code, as it impacts the core GlobalState abstraction. |
// [IDE] We want blocking operations to be more responsive to preemption requests. | ||
inline static constexpr std::chrono::milliseconds PREEMPTION_BLOCK_INTERVAL() { | ||
using namespace std::chrono_literals; | ||
return 20ms; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are there any downsides of changing the other value from 250ms to 20ms?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. I can time some runs on pay-server to see if it impacts throughput.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope! Perf seems unaffected running typecheck on pay-server.
Logic is not yet used.
…e slow path cancellation and preemption.
06c4423
to
60f324b
Compare
@@ -1491,85 +1491,6 @@ bool GlobalState::wasModified() const { | |||
return wasModified_; | |||
} | |||
|
|||
bool GlobalState::wasTypecheckingCanceled() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to TypecheckEpochManager
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome cleanup
@@ -250,21 +239,6 @@ class GlobalState final { | |||
UnorderedMap<NameRef, std::string> dslPlugins; | |||
bool wasModified_ = false; | |||
|
|||
// In LSP mode: Used to linearize operations involving lastCommittedLSPEpoch. | |||
const std::shared_ptr<absl::Mutex> epochMutex; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to TypecheckEpochManager
using namespace std; | ||
namespace sorbet::core::lsp { | ||
|
||
void TypecheckEpochManager::assertConsistentThread(optional<thread::id> &expectedThreadId, string_view method, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't know all of the thread IDs at construction time, so this assertion primarily asserts that we call specific routines from a consistent thread ID.
if (preemptionManager.has_value()) { | ||
// Now that we are no longer running a slow path, run a preemption task that might have snuck in while we were | ||
// finishing up. No others can be scheduled. | ||
(*preemptionManager)->tryRunScheduledPreemptionTask(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% happy that TypecheckEpochManager
knows about PreemptionTaskManager
, but I don't have a better solution at the moment.
@@ -915,11 +917,19 @@ ast::ParsedFilesOrCancelled resolve(unique_ptr<core::GlobalState> &gs, vector<as | |||
} | |||
|
|||
ast::ParsedFilesOrCancelled typecheck(unique_ptr<core::GlobalState> &gs, vector<ast::ParsedFile> what, | |||
const options::Options &opts, WorkerPool &workers) { | |||
const options::Options &opts, WorkerPool &workers, bool cancelable, | |||
optional<shared_ptr<core::lsp::PreemptionTaskManager>> preemptionManager) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could make this non-optional and default to a new preemption manager object (so that preemption will never actually occur), but that would add the overhead of a reader lock acquisition to every file.
I believe I've addressed the feedback you've given me, @DarkDimius. I'm working on benchmarking Sorbet w/ the 20ms change. Other than that, this is ready for review! |
// - Worker threads grab as a reader lock, and routinely gives up and re-acquire the lock to allow other requests to | ||
// pre-empt. | ||
// - Typechecking coordinator thread grabs as a writer lock when there's a preemption function, which halts all | ||
// worker threads. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// worker threads. | |
// worker threads. Only typechecking coordinator can do this and this enforces that multiple threads can't get mutable access to global state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like the cleanup.
I think it's safe enough now, especially with thread-id checks.
looking forward to rollout!
The 20ms change has no measurable impact on performance. Typechecking pay-server (on my branch on my devbox, which has some typechecking failures) takes ~same time with and without the change. |
Add preemption logic to GlobalState and typecheck. The new logic is currently unused.
This change introduces two new classes:
TypecheckEpochManager
: Contains all of the logic related to slow path cancelation. Previously, all of this lived directly inGlobalState
.PreemptionTaskManager
: Contains all of the logic related to preemption. Passed into thetypecheck
method.Motivation
This logic is needed to enable requests to preempt the slow path. The flow goes like this:
trySchedulePreemptionTask
), which succeeds only if the slow path is still happening.withEpochLock
protects against races between the slow path ending and a preemption task getting scheduled.pipeline::typecheck
, attempts to run a preemption task with each turn of the coordination loop (it wakes up with maximum latency ~20ms) usingtryRunScheduledPreemptionTask
.tryRunScheduledPreemptionTask
grabstypecheckMutex
as a write lock (exclusive lock). Worker threads that are typechecking individual files conduct their work while holding a read lock (shared lock) ontypecheckMutex
vialockPreemption
. These threads periodically reacquire the lock, and will block if a thread is attempting to acquire a write lock. Thus, the preemption task only begins once all worker threads are blocked and are not performing any work.Test plan
I added automated tests for the core GlobalState logic. There's no easy way to test
typecheck
itself. I intend to add end-to-end tests for this mechanism once I hook up LSP to the preemption logic.