* Nicholas Mc Guire <der.herr@xxxxxxx> wrote: > Signed-off-by: Nicholas Mc Guire <der.herr@xxxxxxx> > --- > > v3: cleanups and merged review notes from Jonathan Corbet <corbet@xxxxxxx> > v4: english cleanup and an important ommission on scope from > Valdis Kletniek <Valdis.Kletnieks@xxxxxx> > > Documentation/scheduler/completion.txt | 245 ++++++++++++++++++++++++++++++++ > 1 file changed, 245 insertions(+) > create mode 100644 Documentation/scheduler/completion.txt > > diff --git a/Documentation/scheduler/completion.txt b/Documentation/scheduler/completion.txt > new file mode 100644 > index 0000000..72cbda2 > --- /dev/null > +++ b/Documentation/scheduler/completion.txt > @@ -0,0 +1,245 @@ > +completions - wait for completion handling > +========================================== > + > +This document was originally written based on 3.18.0 (linux-next) > + > +Introduction: > +------------- > + > +If you have one or more threads of execution that must wait for some process > +to have reached a point or a specific state, completions can provide a race > +free solution to this problem. Semantically they are somewhat like a > +pthread_barriers and have similar use-cases. > + > +Completions are a code synchronization mechanism that is preferable to any s/that is/which are ? > +misuse of locks. Any time you think of using yield() or some quirky > +msleep(1); loop to allow something else to proceed, you probably want to > +look into using one of the wait_for_completion*() calls instead. The > +advantage of using completions is clear intent of the code but also more s/code but/code, but > +efficient code as both threads can continue until the result is actually > +needed. > + > +Completions are built on top of the generic event infrastructure in Linux, > +with the event reduced to a simple flag appropriately called "done" in > +struct completion, that tells the waiting threads of execution if they > +can continue safely. > + > +As completions are scheduling related the code is found in s/related the/related, the > +kernel/sched/completion.c - for details on completion design and > +implementation see completions-design.txt > + > + > +Usage: > +------ > + > +There are three parts to the using completions, the initialization of the s/to the using/to using > +struct completion, the waiting part through a call to one of the variants of > +wait_for_completion() and the signaling side through a call to complete(), > +or complete_all(). Further there are some helper functions for checking the > +state of completions. > + > +To use completions one needs to include <linux/completion.h> and > +create a variable of type struct completion. The structure used for > +handling of completions is: > + > + struct completion { > + unsigned int done; > + wait_queue_head_t wait; > + }; > + > +providing the wait queue to place tasks on for waiting and the flag for > +indicating the state of affairs. > + > +Completions should be named to convey the intent of the waiter. A good > +example is: > + > + wait_for_completion(&early_console_added); > + > + complete(&early_console_added); > + > +Good naming (as always) helps code readability. > + > + > +Initializing completions: > +------------------------- > + > +Initialization of dynamically allocated completions, often embedded in > +other structures, is done with: > + > + void init_completion(&done); > + > +Initialization is accomplished by initializing the wait queue and setting > +the default state to "not available", that is, "done" is set to 0. > + > +The re-initialization function, reinit_completion(), simply resets the > +done element to "not available", thus again to 0, without touching the > +wait queue. Calling init_completion() on the same completions object is s/completions object/completion object > +most likely a bug as it re-initializes the queue to an empty queue and > +enqueued tasks could get "lost" - use reinit_completion() in that case. > + > +For static declaration and initialization, macros are available. These are: > + > + static DECLARE_COMPLETION(setup_done) > + > +used for static declarations in file scope. Within functions the static > +initialization should always use: > + > + DECLARE_COMPLETION_ONSTACK(setup_done) > + > +suitable for automatic/local variables on the stack and will make lockdep > +happy. Note also that one needs to making *sure* the completion passt to s/needs to making *sure*/needs to make *sure*/ s/passt/passed ? > +work threads remains in-scope, and no references remain to on-stack data > +when the initiating function returns. > + > + > +Waiting for completions: > +------------------------ > + > +For a thread of execution to wait for some concurrent work to finish, it > +calls wait_for_completion() on the initialized completion structure. > +A typical usage scenario is: > + > + > + structure completion setup_done; > + init_completion(&setup_done); > + initialze_work(...,&setup_done,...) > + > + /* run non-dependent code */ /* do setup */ > + > + wait_for_completion(&seupt_done); complete(setup_done) > + > +This is not implying any temporal order of wait_for_completion() and the this does not imply any temporar order on? > +call to complete() - if the call to complete() happened before the call > +to wait_for_completion() then the waiting side simply will continue > +immediately as all dependencies are satisfied. > + > +Note that wait_for_completion() is calling spin_lock_irq/spin_unlock_irq > +so it can only be called safely when you know that interrupts are enabled. > +Calling it from hard-irq context will result in hard to detect spurious > +enabling of interrupts. It's not just about hardirq contexts, but also about irqs-off atomic contexts. > + > + > +wait_for_completion(): > + > + void wait_for_completion(struct completion *done): > + > +The default behavior is to wait without a timeout and mark the task as > +uninterruptible. wait_for_completion() and its variants are only safe > +in soft-interrupt or process context but not in hard-irq context. I don't think wait_for_completion() is safe in softirq context ... > +As all variants of wait_for_completion() can (obviously) block for a long > +time, you probably don't want to call this with held locks - see also > +try_wait_for_completion() below. Not 'probably': you must not call it from atomic contexts, held spinlocks, elevated preempt count, disabled irqs, etc. > + > + > +Variants available: > +------------------- > + > +The below variants all return status and this status should be checked in > +most(/all) cases - in cases where the status is deliberately not checked you > +probably want to make a note explaining this (e.g. see > +arch/arm/kernel/smp.c:__cpu_up()). > + > +A common problem that occurs is to have unclean assignment of return types, > +so care should be taken with assigning return-values to variables of proper > +type. Checking for the specific meaning of return values also has been found > +to be quite inaccurate e.g. constructs like > +if(!wait_for_completion_interruptible_timeout(...)) would execute the same s/if(/if (/ > +code path for successful completion and for the interrupted case - which is > +probably not what you want. > + > + > + int wait_for_completion_interruptible(struct completion *done) > + > +marking the task TASK_INTERRUPTIBLE. If a signal was received while waiting. > +It will return -ERESTARTSYS and 0 otherwise. s/marking/This function marks The two final sentences should probably be one? > + > + > + unsigned long wait_for_completion_timeout(struct completion *done, > + unsigned long timeout) > + > +The task is marked as TASK_UNINTERRUPTIBLE and will wait at most timeout s/timeout/'timeout' > +(in jiffies). If timeout occurs it return 0 else the remaining time in s/it return 0/it returns 0, > +jiffies (but at least 1). Timeouts are preferably passed by msecs_to_jiffies() > +or usecs_to_jiffies(). If the returned timeout value is deliberately ignored > +a comment should probably explain why (e.g. see drivers/mfd/wm8350-core.c > +wm8350_read_auxadc()) > + > + > + long wait_for_completion_interruptible_timeout( > + struct completion *done, unsigned long timeout) > + > +passing a timeout in jiffies and marking the task as TASK_INTERRUPTIBLE. If a s/passing/This function passes > +signal was received it will return -ERESTARTSYS, 0 if completion timed-out and s/timed-out/timed out > +the remaining time in jiffies if completion occurred. > + > +Further variants include _killable which passes TASK_KILLABLE as the > +designated tasks state and will return a -ERESTARTSYS if interrupted or s/return a -ERESTARTSYS/return -ERESTARTSYS > +else 0 if completions was achieved as well as a _timeout variant. s/if completions was achieved/if completion was achieved/ > + > + long wait_for_completion_killable(struct completion *done) > + long wait_for_completion_killable_timeout(struct completion *done, > + unsigned long timeout) > + > + > +The _io variants wait_for_completion_io behave the same as the non-_io s/wait_for_completion_io/wait_for_completion_io() > +variants, except for accounting waiting time as waiting on IO, which has > +an impact on how scheduling is calculated. s/how scheduling is calculated/how the task is accounted in scheduling stats > + > + void wait_for_completion_io(struct completion *done) > + unsigned long wait_for_completion_io_timeout(struct completion *done > + unsigned long timeout) > + > + > +Signaling completions: > +---------------------- > + > +A thread of execution that wants to signal that the conditions for s/thread of execution/thread > +continuation have been achieved calls complete() to signal exactly one > +of the waiters that it can continue. > + > + void complete(struct completion *done) > + > +or calls complete_all to signal all current and future waiters. s/complete_all/complete_all() > + > + void complete_all(struct completion *done) > + > + > +The signaling will work as expected even if completions are signaled before > +a thread starts waiting. This is achieved by the waiter "consuming" > +(decrementing) the done element of struct completion. Waiting threads > +wakeup order is the same in which they were enqueued (FIFO order). > + > +If complete() is called multiple times then this will allow for that number > +of waiters to continue - each call to complete() will simply increment the > +done element. Calling complete_all() multiple times is a bug though. Both > +complete() and complete_all() can be called in hard-irq context safely. > + > +There only can be one thread calling complete() or complete_all() on a > +particular struct completions at any time - serialized through the wait s/struct completions/struct completion > +queue spinlock. Any such concurrent calls to complete() or complete_all() > +probably are a design bug. > + > +Signaling completion from hard-irq context is fine as it will appropriately > +lock with spin_lock_irqsave/spin_unlock_irqrestore. > + > + > +try_wait_for_completion()/completion_done(): > +-------------------------------------------- > + > +The try_wait_for_completion will not put the thread on the wait queue but s/The try_wait_for_completion will/ The try_wait_for_completion() function will/ > +rather returns false if it would need to enqueue (block) the thread, else it s/if it would need to enqueue/if it needed to enqueue/ ? > +consumes any posted completions and returns true. > + > + bool try_wait_for_completion(struct completion *done) > + > + > +Finally to check state of a completions without changing it in any way is s/completions/completion > +provided by completion_done() returning false if there are any posted s/if there are any/if there is any > +completion that was not yet consumed by waiters implying that there are > +waiters and true otherwise; > + > + bool completion_done(struct completion *done) > + > +Both try_wait_for_completion() and completion_done() are safe to be called in > +hard-irq context. irq atomic context != hard-irq context. This needs to be fixed throughout the document. Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html