[RFC] IO-stealing analogue to work-stealing
IO stealing is analogous to work-stealing and means that worker threads without work will try to steal IO completions (CQEs) from other workers' IoContexts. The work-stealing algorithm is modified to check a victim's CQ after finding its work queue empty.
This approach, in combination with future additions (global notifications on IO completions and lock-free CQE consumption), is a realistic candidate to replace the completer thread without losing its benefits.
To allow IO stealing, the CQ must be synchronized, which is already the case with the IoContext::cq_lock.
Currently stealing workers always try to pop a single CQE (this could
be configurable).
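
The modified steal path can be sketched roughly as follows. This is a minimal illustration, not the actual EMPER code: Cqe, Worker, Stolen, tryStealOne and stealFrom are hypothetical stand-ins, the CQ is modelled as a std::deque instead of an io_uring completion queue, and the use of try_lock (so that a thief never blocks on a busy victim) is an assumption rather than something this change prescribes.

```cpp
#include <cstdint>
#include <deque>
#include <mutex>
#include <optional>

// Hypothetical stand-in for an io_uring completion queue entry.
struct Cqe {
  std::int32_t res;
  std::uint64_t userData;
};

struct IoContext {
  std::mutex cq_lock;   // synchronizes all consumers of the CQ
  std::deque<Cqe> cq;   // stand-in for the real completion queue

  // Pop at most one CQE. Assumption: a thief gives up if the lock is contended.
  std::optional<Cqe> tryStealOne() {
    std::unique_lock<std::mutex> lock(cq_lock, std::try_to_lock);
    if (!lock.owns_lock() || cq.empty()) return std::nullopt;
    Cqe cqe = cq.front();
    cq.pop_front();
    return cqe;
  }
};

struct Worker {
  std::deque<int> workQueue;  // stand-in for the (normally concurrent) work-stealing queue
  IoContext io;
};

// Result of a steal attempt: either a unit of work or a stolen completion.
struct Stolen {
  bool isIo;
  int work;
  Cqe cqe;
};

std::optional<Stolen> stealFrom(Worker& victim) {
  // 1. Regular work stealing comes first.
  if (!victim.workQueue.empty()) {
    int work = victim.workQueue.back();
    victim.workQueue.pop_back();
    return Stolen{false, work, {}};
  }
  // 2. Only if the work queue is empty, check the victim's CQ.
  if (auto cqe = victim.io.tryStealOne()) return Stolen{true, 0, *cqe};
  return std::nullopt;
}
```

In this sketch the stolen CQE would still have to be turned into a continuation Fiber by the thief before it can be scheduled.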
Steal attempts are recorded in the IoContext's Stats object and successfully stolen IO continuations in the AbstractWorkStealingWorkerStats.
I moved the code transforming CQEs into continuation Fibers from reapCompletions into a separate function to make this rather complicated function more readable and thus easier to understand.
Remove the default CallerEnvironment template arguments to make the code more explicit and prevent easy errors (not propagating the caller environment or forgetting the function takes a caller environment).
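
To illustrate why dropping the default helps, here is a small sketch. The enum values EMPER and OWNER are taken from this description, but the two functions (needsCqLock, reapNeedsLock) and their semantics are hypothetical and only demonstrate the propagation issue.

```cpp
enum class CallerEnvironment { EMPER, OWNER };

// Before this change a declaration might have looked like:
//   template <CallerEnvironment callerEnvironment = CallerEnvironment::EMPER>
// With the default removed, every call site has to name the environment.
template <CallerEnvironment callerEnvironment>
constexpr bool needsCqLock() {
  // Hypothetical rule, used only for illustration.
  return callerEnvironment != CallerEnvironment::EMPER;
}

template <CallerEnvironment callerEnvironment>
constexpr bool reapNeedsLock() {
  // Writing `needsCqLock()` here no longer compiles, so the caller
  // environment cannot silently fall back to EMPER.
  return needsCqLock<callerEnvironment>();
}
```

With a default argument, forgetting the `<callerEnvironment>` in the inner call would have compiled and quietly used EMPER.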
io::Stats now needs to use atomics because multiple threads may increment its counters in parallel from EMPER and the OWNER. And since using std::atomic<T*> in std::map is not easily possible, we use the compiler's __atomic_* builtins.
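
A rough sketch of this pattern, assuming GCC or Clang: the map holds plain integers and every access goes through the __atomic_* builtins. CounterMap and its members are hypothetical names and do not mirror the real io::Stats layout. The sketch also assumes that all keys are inserted before any concurrent access, because the builtins only protect the counter values, not the map structure itself.

```cpp
#include <cstdint>
#include <map>

struct CounterMap {
  // Plain integers as mapped values; atomicity comes from the builtins below.
  std::map<int, std::uint64_t> counters;

  // Must be called before any thread starts incrementing (assumption of this sketch).
  void registerKey(int key) { counters.emplace(key, 0); }

  // Safe to call concurrently for already registered keys.
  void increment(int key) {
    __atomic_add_fetch(&counters.at(key), 1, __ATOMIC_RELAXED);
  }

  std::uint64_t load(int key) {
    return __atomic_load_n(&counters.at(key), __ATOMIC_RELAXED);
  }
};
```

Relaxed ordering is used here because the counters are pure statistics and do not order other memory accesses; that choice is part of the sketch, not a statement about the actual implementation.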
Add, adjust and fix some comments.