Hello Marion. On 12/2/20 7:07 PM, Sudvarg, Marion wrote: > Hello Michael, > > I apologize if you're receiving this email a second time. I > accidentally kept HTML formatting enabled the first time I sent it, > causing it to be rejected as spam. > > I am teaching the Operating Systems Organization course at Washington > University in St. Louis, and to supplement a series of lectures on > locking and synchronization, I assigned students to read the futex(7) > manual page. One student, Alex Baker <mailto:alexbaker@xxxxxxxxx>, > pointed out a race condition in the description of how to "down" a > futex, i.e. wait for or acquire a lock, under the Semantics section. > Say we have two threads, T0 and T1, which execute as follows: > > 1. T0 acquires the lock, decrements the futex to 0 > 2. T1 switches in, attempts to acquire the lock, decrements the futex > to -1 > 3. T0 switches in, completes its critical section > 4. T0 unlocks the lock, increments the futex to 0 > 5. Because the futex is 0, T0 assumes threads are waiting on the > futex, but no threads have yet called FUTEX_WAIT > 6. T0 sets the futex to 1 > 7. T0 uses the FUTEX_WAKE operation > 8. T1 switches in, and believing it should still wait for the futex, > it sets the futex to -1 > 9. T1 now uses the FUTEX_WAIT operation > > Because, in step 8, T1 has set the futex to -1, its call to > FUTEX_WAIT in 9 will succeed, as the futex holds the expected value > in the call (-1). But since T0 has already completed its execution > and has called FUTEX_WAKE, T1 may never be woken. > > The fwait and fpost (i.e. lock and release, or down and up) functions > in the Examples section on the futex(2) page seem to be race free, > but use atomic compare exchange functions, instead of the atomic > increment and decrement semantics described in futex(7). > > I've attached a patch for the futex(7) man page, which modifies the > Semantics section to describe a correct, race-free use of a futex for > lock acquisition using atomic increment and decrement operations. > I've also added a code sample to help illustrate this. I hope the > addition of the code sample does not, in your opinion, add > unnecessary length to this manual page, given that the futex(2) page > is already so thorough. > > I have copied Bert Hubert, whom I believe to be the original author > of the futex(7) man page. I have a question: what Linux distro are you using/did you make this patch from? The code example that your patch is using is _not_ in the upstream page that is part of man-pages. Regarding your comments on the race above, I find the text of the manual page a bit unclear, so I'm not sure what kind of fix should be made. Maybe we are lucky and Bert has a long memory and replies to this thread. Thanks, Michael ===== diff --git a/man7/futex.7 b/man7/futex.7 index 22f610646..f59725b61 100644 --- a/man7/futex.7 +++ b/man7/futex.7 @@ -72,19 +72,11 @@ operation. Waiting on a futex, to "down" it, is the reverse operation. Atomically decrement the counter and check if it changed to 0, in which case the operation is done and the futex was uncontended. -In all other circumstances, the process should -request that the kernel wait for another process to up the futex. +In all other circumstances, the process should set the counter to \-1 +and request that the kernel wait for another process to up the futex. This is done using the .B FUTEX_WAIT -operation, -which is provided the return value of the atomic decrement operation. -In the event that another process has modified the value of the futex -between the atomic decrement and the -.B FUTEX_WAIT -operation, this guarantees that the -.B FUTEX_WAIT -fails, and the process may try again to "down" the futex. -.SS +operation. .PP The .BR futex (2) @@ -113,166 +105,6 @@ below. This man page illustrates the most common use of the .BR futex (2) primitives; it is by no means the only one. -.SH EXAMPLES -The program below demonstrates the use of futexes in a program where -threads use a futex to synchronize access to a critical section, -which increments a global integer variable -.IR nloops -(a command-line argument that defaults to 100000 if omitted) -times. -After the parallel section, -the program prints the value of the global variable. -Upon running this program we see output such as the following: -.PP -.in +4n -.EX -$ \fB./futex_demo\fP -Ran with 2 threads -Each thread incremented global_int 1000000 times -Final value of global_int: 2000000 -.EE -.in -.SS Program source -\& -.EX -/* futex_demo.c - - Usage: futex_demo [nloops] - (Default: 100000) - - Demonstrate the use of futexes in a program where multiple threads - use a futex to synchronize access to a global integer variable, which - is initialized to 0. The two threads each increment the variable - \(aqnloops\(aq times, and employ a synchronization protocol that - ensures only one thread can access the global variable at a time. - - We use OpenMP for thread parallelism; - therefore, this program must be compiled with the \-fopenmp flag, - e.g.: - - gcc futex_demo.c \-o futex_demo \-fopenmp - -#define _GNU_SOURCE -#include <stdio.h> -#include <errno.h> -#include <stdatomic.h> -#include <stdint.h> -#include <stdlib.h> -#include <unistd.h> -#include <sys/syscall.h> -#include <linux/futex.h> -#include <omp.h> - -#define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \e - } while (0) - -#define NUM_THREADS 2 -#define LOCKED 0 -#define UNLOCKED 1 - -static int global_int = 0; -static uint32_t lock = UNLOCKED; - -static int -futex(uint32_t *uaddr, int futex_op, uint32_t val, - const struct timespec *timeout, uint32_t *uaddr2, uint32_t val3) -{ - return syscall(SYS_futex, uaddr, futex_op, val, - timeout, uaddr2, val3); -} - -/* Increments the global integer variable nloops times. - Without locking, a race condition may occur. */ - -static void -critical_section(int nloops) -{ - for (int i = 0; i < nloops; i++) { - global_int++; - } -} - - -/* Attempt to lock the futex pointed to by \(aqfutexp\(aq: - The futex value is decremented by 1. - If the futex value is now LOCKED, - the lock was successfully acquired. - Otherwise, wait for the lock to be released. */ - -static void -flock(uint32_t * futexp) -{ - int s; - int futex_val; - - /* Attempt to acquire the lock */ - while ( (futex_val = __atomic_sub_fetch(futexp, 1, __ATOMIC_ACQ_REL)) < LOCKED ) { - - /* If the lock is not available, wait */ - - s = futex(futexp, FUTEX_WAIT, futex_val, NULL, NULL, 0); - if (s == \-1 && errno != EAGAIN) - errExit("futex\-FUTEX_WAIT"); - } -} - -/* Unlock the futex pointed to by \(aqfutexp\(aq: - The futex value is incremented by 1. - - If the futex value is now UNLOCKED, - no threads are waiting for the lock. - Otherwise, another thread is waiting, - so set the value to UNLOCKED and wake. */ - -static void -funlock(uint32_t * futexp) -{ - int s; - - /* Are any threads waiting for the lock? */ - if (__atomic_add_fetch(futexp, 1, __ATOMIC_ACQ_REL) != UNLOCKED) { - - /* If so, unlock and notify */ - __atomic_store_n(futexp, UNLOCKED, __ATOMIC_RELEASE); - s = futex(futexp, FUTEX_WAKE, 1, NULL, NULL, 0); - if (s == \-1) - errExit("futex\-FUTEX_WAKE"); - } -} - -int -main(int argc, char *argv[]) -{ - int nloops; - int n_threads; - - nloops = (argc > 1) ? atoi(argv[1]) : 100000; - - //Begin OpenMP parallel section - omp_set_num_threads(NUM_THREADS); - #pragma omp parallel - { - - //Retrieve the actual number of threads - if (omp_get_thread_num() == 0) { - n_threads = omp_get_num_threads(); - } - - //Lock and run the critical section - flock(&lock); - critical_section(nloops); - funlock(&lock); - - } - - printf("Ran with %d threads\n", n_threads); - printf("Each thread incremented global_int %d times\n", nloops); - printf("Final value of global_int: %d\n", global_int); - - exit(EXIT_SUCCESS); -} - -.EE .\" .SH AUTHORS .\" .PP .\" Futexes were designed and worked on by Hubertus Franke -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/