Fix InvalidStateError in manager.py#in_pending_state caused by race condition #967
+30
−7
Background
In my Jupyter deployment, kernels are started, restarted, and shut down much more often than in a typical environment, because of additional features such as an online judge.
In this environment, Sentry reports `InvalidStateError`. It is a very common error, occurring hundreds of times a day.
Introduction
`in_pending_state` is a decorator for async functions. When a decorated function is invoked, `.ready` is instantiated, and it finishes (becomes done) after the function invocation completes.
I think we can isolate `in_pending_state` from kernel management and treat it as a general coroutine management service. `.ready` is a future that is instantiated when the manager becomes pending and finished when it becomes ready (that is, no longer pending).
The Problem and The Solution
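As a rough illustration of the failure and of a counting-based fix, here is a standalone sketch. It is not the code from manager.py or from this PR: `naive_pending`, `counted_pending`, `Demo`, and `run_two` are hypothetical names, and only `_ready` and `_ready_count` mirror attributes discussed in this description. It assumes two decorated calls overlap on the same object.

```python
import asyncio
import functools


def naive_pending(method):
    """Hypothetical simplification: every call resolves the shared future."""

    @functools.wraps(method)
    async def wrapper(self, *args, **kwargs):
        # Reuse the pending future if one exists, otherwise create a new one.
        if self._ready is None or self._ready.done():
            self._ready = asyncio.get_running_loop().create_future()
        out = await method(self, *args, **kwargs)
        # Raises InvalidStateError if an overlapping call already resolved it.
        self._ready.set_result(None)
        return out

    return wrapper


def counted_pending(method):
    """Hypothetical fix: only the last call still in flight resolves the future."""

    @functools.wraps(method)
    async def wrapper(self, *args, **kwargs):
        # The first call to enter creates the future; later calls share it.
        if self._ready_count == 0:
            self._ready = asyncio.get_running_loop().create_future()
        self._ready_count += 1
        try:
            return await method(self, *args, **kwargs)
        finally:
            # Simplification: errors from the method are not stored on the future.
            self._ready_count -= 1
            if self._ready_count == 0 and not self._ready.done():
                self._ready.set_result(None)

    return wrapper


class Demo:
    """Stand-in for a manager object; holds only the pending-state bookkeeping."""

    def __init__(self):
        self._ready = None
        self._ready_count = 0

    @naive_pending
    async def naive_op(self, delay):
        await asyncio.sleep(delay)

    @counted_pending
    async def counted_op(self, delay):
        await asyncio.sleep(delay)


async def run_two(obj, name):
    """Run two overlapping calls and collect results or exceptions."""
    op = getattr(obj, name)
    return await asyncio.gather(op(0.01), op(0.02), return_exceptions=True)


if __name__ == "__main__":
    print(asyncio.run(run_two(Demo(), "naive_op")))
    print(asyncio.run(run_two(Demo(), "counted_op")))
```

In the naive variant the second call to finish hits a future that is already done, so `gather` reports an `InvalidStateError` for it. In the counted variant both calls return normally and the future is resolved exactly once.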
The original implementation does not account for several decorated functions executing at the same time, so a single `.ready` future can have `set_result` called on it twice. This raises `InvalidStateError`. I have written a simple test case for this.
To fix this, a more general mechanism is needed. Additionally, this mechanism does not have to be tied to the kernel lifecycle: the previous `._attempted_start` adds unnecessary coupling to the code. I therefore isolated the mechanism from the details of the decorated function by introducing `._ready_count` and removing `._attempted_start`.
Help Needed
However, I'm not sure how to associate this new mechanism with `owns_kernel`.
Might be related
jupyter-server/jupyter_server#1247