
RFC 2: Skeleton for ExecutionContext #15350

Open · wants to merge 6 commits into base: master

Conversation

ysbaddaden
Contributor

Integrates the skeleton as per RFC #0002.

  • Add the ExecutionContext module;
  • Add the ExecutionContext::Scheduler module;
  • Add the execution_context compile-time flag.

When the execution_context flag is set:

  • Don't load Crystal::Scheduler;
  • Plug ExecutionContext instead of Crystal::Scheduler in spawn, Fiber, ...

This is only the skeleton: there are no implementations (yet). Trying to compile anything with -Dexecution_context will obviously fail until the follow-up pull requests implementing at least the ST and/or MT contexts are merged.
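The flag-based switch described above could look roughly like this (a sketch: the require paths are assumptions, not necessarily the PR's actual file layout):

```crystal
# Hypothetical sketch of the conditional require wiring. When the
# execution_context flag is set, Crystal::Scheduler isn't loaded at all
# and ExecutionContext is plugged in instead.
{% if flag?(:execution_context) %}
  require "./execution_context"
{% else %}
  require "./crystal/scheduler"
{% end %}
```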

refs #15342

The current ST and MT schedulers use a distinct pool per thread, which
means we only need thread safety for execution contexts that will
share a single pool across a whole context.
The point is to avoid parallel enqueues while running the event loop, so
we get better control over where and how the runnable fibers are enqueued;
for example all at once instead of one by one (which may not be as
effective as it sounds).
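The "all at once" idea could be sketched as follows (all names here are hypothetical illustrations, not the PR's actual API):

```crystal
# Hypothetical: while the event loop runs, ready fibers are collected
# into a local buffer instead of being enqueued one by one; the whole
# batch is then handed to the scheduler in a single operation, so other
# threads never enqueue in parallel with the running event loop.
runnables = Array(Fiber).new

event_loop.run do |fiber|
  runnables << fiber             # local append, no synchronization
end

scheduler.enqueue_all(runnables) # one lock acquisition, one wakeup
```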

More importantly for Execution Contexts: it avoids parallel enqueues
while the event loop is running, which sometimes leads to confusing
behavior; for example, when deciding to wake up a scheduler/thread we
mustn't interrupt the event loop (obviously).

This is working correctly for the Polling (Epoll, Kqueue) and IOCP event
loop implementations. I'm less confident with the libevent one where the
external library executes arbitrary callbacks.
src/concurrent.cr (resolved)
src/crystal/system/unix/signal.cr (outdated, resolved)
src/crystal/system/thread.cr (outdated, resolved)

{% raise "ERROR: execution contexts require the `preview_mt` compilation flag" unless flag?(:preview_mt) %}

module ExecutionContext
Member

thought: I'm wondering:

  • Should this be in the top-level namespace or rather in Crystal::ExecutionContext?
  • Should this be a publicly documented module?

For comparison, Scheduler is in the Crystal namespace and nodoc.

Contributor Author

Being the module that holds all the execution contexts, and the public interface definition for all the implementations, I'd say it had better be documented?

Contributor Author

@ysbaddaden Jan 17, 2025

Or maybe its methods could be :nodoc: for starters, but the module itself must appear.

I think having both ExecutionContext and Crystal::ExecutionContext would just be terribly confusing.

Member

Even if documented, it could make some sense to move it into the Crystal namespace because it's a component of the Crystal runtime. Like Crystal::EventLoop.

Contributor Author

I understand the Crystal namespace is to keep internal runtime details, which the event loop currently is: none of its interface is public.

I'd move the global Thread to become Crystal::Thread for example, for the same reason.

But execution contexts have a public interface —at least constructors and #spawn— so taking the global ExecutionContext namespace makes sense to me.

Member

Crystal is not just for internal details. It can also serve for public details of the runtime.
I think it's generally a good idea to collect implicit types of the runtime in the Crystal namespace instead of spreading them out in the top level namespace. Of course it's up to interpretation which types should be considered for that.
Anyway, we may not be able to change this for existing types, but we have the option when introducing a new one and should make a good decision.

Contributor Author

I guess this is one of the first times we've introduced something that's really on the edge between private and public.

Member

I'm inclined to Crystal:: whether the api is public or not.

ExecutionContext is a generic name that might clash with users' code.

I agree with @straight-shoota that the Crystal namespace is not necessarily internal things.

Contributor Author

Alright, let's go for the Crystal namespace.

I'm starting to believe that we should have put things under the Fiber namespace; these are tools to handle fibers after all:

  • Fiber::Channel
  • Fiber::Mutex
  • Fiber::WaitGroup
  • Fiber::ExecutionContext

This is what we did with Thread::Mutex for example.

Member

Fiber:: works as well for me, it has the same benefits as using Crystal. And is clearly related to the runtime.

src/execution_context/scheduler.cr (outdated, resolved)
ysbaddaden and others added 3 commits January 17, 2025 14:09
We need this because we don't load crystal/scheduler anymore when
execution contexts are enabled.
Comment on lines +75 to +76
# that being said, we can still trust the *current_fiber* local variable
# (it's the only exception)
Member

question: Is this note relevant? Even current_fiber isn't used anywhere after the context swap.

Contributor Author

@ysbaddaden Jan 18, 2025

Yes, I believe so. It's a note to the future selves.

Explaining that we switched context and we can't trust the local variables, including self (local var) and ivars (accessed through self), is useful knowledge. Doubly so because I think it used to work: the same thread/scheduler kept resuming the same fibers, but that's no longer true.

It will prevent someone from using @dead_fiber instead of Thread.current.dead_fiber for example, or from trying to cache Thread.current or trusting #thread —yes, I fell for the first one at least 🤕

Member

That mostly concerns the whole comment. I'm particularly wondering about the second paragraph referring to current_fiber.

Comment on lines +61 to +63
{% unless flag?(:interpreted) %}
thread.dead_fiber = current_fiber if current_fiber.dead?
{% end %}
Member

issue: Having two checks for dead fiber cleanup on every single context swap seems like it might be a bit wasteful.
I reckon that in the vast majority of use cases involving any remotely significant amount of IO (or other frequent fiber-swap points), dead fibers appear only at a very low percentage of context swaps. But these checks impact the performance of every single one of them. They're relatively simple, but it still adds up. Fast context swaps are important for efficient concurrency.

There's also another effect: when there actually is a dead fiber to clean up, handling it during the context swap might delay the swap significantly (depending on the release procedure).

This could also cause unnecessary performance penalties. When a thread picks up a fiber, but before actually resuming it, it notices it has a dead fiber's stack to clean up and occupies itself with that. This delays executing the fiber because the thread has already reserved it, while another thread might be ready to run it already.

Overall, dead fiber cleanup shouldn't be time critical, really. The only critical property is that it must be delayed until after the fiber has really terminated (i.e. after we have swapped away from its stack).
So we could do it at opportune, controlled moments instead of checking for it on every stack swap. For example, when a thread is idle, or when spawning a new fiber and the stack pool is empty (maybe we can reclaim the stack of a dead fiber).

Member

I suppose the dead marking could perhaps be moved into the cleanup of Fiber#run so it really only happens when a fiber is dead, avoiding the check on every swapcontext.

Contributor Author

I can indeed move the before-swap check to Fiber#run.

As for the after-swap... I don't see alternatives 😢

  • a queue of dying fibers in the stack pool? we'd still have to set the fiber's state as truly dead after swapcontext;

  • GC finalizers? That would delay the recycling and freeing until the GC decides to collect, and stacks would end up in whatever stack pool (unless we share a single pool for the whole process). Each stack being 8MB of virtual memory, it quickly adds up and we can easily reach OOM.

Maybe we can optimize passing the dead fiber?

  • we can't use @dead_fiber as explained in the note below (a fiber may be resumed by any thread) 😢

  • use a @@dead_fiber thread local instead of having to resolve the @@current thread local on Thread and then check Thread#dead_fiber? 🤔

  • pass the dead fiber as a local variable through assembly... there could be a setcontext function that wouldn't save the current context (why bother?) but would instead set the dead fiber in some register, while swapcontext wouldn't bother (but would still need to tell where to set the dead fiber)... the after-swap check would then be a local compare/jump 🤔

NOTE: we want to recycle the stacks —I measured and it's much faster to avoid the mmap syscall on every spawn— and we'd still need to safely unmap the stacks anyway.

NOTE: Go doesn't have this issue because it allocates a 2KB stack on the GC heap and reallocates when needed; it eventually lets the GC collect it.
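The @@dead_fiber thread-local option above might be sketched like this (hypothetical names, and it assumes a @[ThreadLocal] class var is usable at this spot; the real swapcontext is assembly):

```crystal
# Hypothetical sketch of the thread-local alternative: the dying fiber
# registers itself right before the swap, and whichever fiber resumes on
# this thread releases the stack without resolving Thread.current first.
@[ThreadLocal]
@@dead_fiber : Fiber?

def self.swapcontext(current : Fiber, to : Fiber) : Nil
  @@dead_fiber = current if current.dead?
  # ... the actual assembly context swap happens here ...
  # past this point we may be running on another thread: only the
  # thread-local @@dead_fiber is trustworthy, not ivars nor `current`
  if fiber = @@dead_fiber
    @@dead_fiber = nil
    fiber.execution_context.stack_pool.release(fiber.@stack)
  end
end
```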


{% unless flag?(:interpreted) %}
if fiber = Thread.current.dead_fiber?
fiber.execution_context.stack_pool.release(fiber.@stack)
Contributor Author

Since a fiber can only be resumed by its owning execution context, and #stack_pool is shared by the context, we could merely call stack_pool.release(fiber.@stack) here 🤔

Status: Review

3 participants