On 4/8/22 09:54, Thomas Gleixner wrote:
> On Fri, Apr 08 2022 at 04:41, Nico Pache wrote:
>> On 4/8/22 04:15, Peter Zijlstra wrote:
>>>>
>>>> The following case can still fail:
>>>> robust head (skipped) -> private lock (reaped) -> shared lock (skipped)
>>>
>>> This is still all sorts of confused.. it's a list head, the entries can
>>> be in any random other VMA. You must not remove *any* user memory before
>>> doing the robust thing. Not removing the VMA that contains the head is
>>> pointless in the extreme.
>> Not sure how it's pointless if it fixes all the different reproducers we've
>> written for it. As for the private lock case we stated here, we haven't
>> been able to reproduce it, but I can see how it could be a potential issue
>> (which is why it's noted).
>
> The below reproduces the problem nicely, i.e. the lock() in the parent
> times out. So why would the OOM killer fail to cause the same problem
> when it reaps the private anon mapping where the private futex sits?
>
> If you revert the lock order in the child the robust muck works.

Thanks for the reproducer, Thomas :)

I think I need to re-up my knowledge of COW and how it affects this stack.
There are added oddities once the pthread library is involved that I can't
fully wrap my head around at the moment.

My confusion lies in how the parent and child share a robust list here, but
they obviously do. In my mind mut_s would be different in the child and the
parent after the fork, and pthread_mutex_init (and friends) are done in the
child. (Two sketches of the fork() semantics involved follow after the
reproducer below.)

Thanks!
-- Nico

>
> Thanks,
>
>         tglx
> ---
> #include <errno.h>
> #include <fcntl.h>
> #include <pthread.h>
> #include <time.h>
> #include <stdio.h>
> #include <string.h>
> #include <unistd.h>
>
> #include <sys/types.h>
> #include <sys/mman.h>
>
> static char n[4096];
>
> int main(void)
> {
>         pthread_mutexattr_t mat_s, mat_p;
>         pthread_mutex_t *mut_s, *mut_p;
>         pthread_barrierattr_t ba;
>         pthread_barrier_t *b;
>         struct timespec to;
>         void *pri, *shr;
>         int r;
>
>         shr = mmap(NULL, sizeof(n), PROT_READ | PROT_WRITE,
>                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
>
>         pthread_mutexattr_init(&mat_s);
>         pthread_mutexattr_setrobust(&mat_s, PTHREAD_MUTEX_ROBUST);
>         mut_s = shr;
>         pthread_mutex_init(mut_s, &mat_s);
>
>         pthread_barrierattr_init(&ba);
>         pthread_barrierattr_setpshared(&ba, PTHREAD_PROCESS_SHARED);
>         b = shr + 1024;
>         pthread_barrier_init(b, &ba, 2);
>
>         if (!fork()) {
>                 pri = mmap(NULL, 1<<20, PROT_READ | PROT_WRITE,
>                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>                 pthread_mutexattr_init(&mat_p);
>                 pthread_mutexattr_setpshared(&mat_p, PTHREAD_PROCESS_PRIVATE);
>                 pthread_mutexattr_setrobust(&mat_p, PTHREAD_MUTEX_ROBUST);
>                 mut_p = pri;
>                 pthread_mutex_init(mut_p, &mat_p);
>
>                 // With lock order s, p the parent gets a timeout
>                 // With lock order p, s the parent gets owner died
>                 pthread_mutex_lock(mut_s);
>                 pthread_mutex_lock(mut_p);
>                 // Remove the munmap and the lock order does not matter
>                 munmap(pri, sizeof(n));
>                 pthread_barrier_wait(b);
>                 printf("child gone\n");
>         } else {
>                 pthread_barrier_wait(b);
>                 printf("parent lock\n");
>                 clock_gettime(CLOCK_REALTIME, &to);
>                 to.tv_sec += 1;
>                 r = pthread_mutex_timedlock(mut_s, &to);
>                 printf("parent lock returned: %s\n", strerror(r));
>         }
>         return 0;
> }
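
The sharing that confuses here comes from the mapping types, not from the
mutexes themselves: mut_s sits in a MAP_SHARED | MAP_ANONYMOUS mapping, which
after fork() still refers to the same physical pages in parent and child,
while mut_p sits in a MAP_PRIVATE mapping, which is copy-on-write and
diverges on the first write after the fork. A minimal sketch of that
difference (illustrative, not part of the thread):

#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
        /* One shared and one private anonymous page, as in the reproducer */
        int *shr = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        int *pri = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        *shr = 0;
        *pri = 0;

        if (!fork()) {
                *shr = 1;       /* write lands in the shared page */
                *pri = 1;       /* COW: write lands in the child's own copy */
                return 0;
        }

        wait(NULL);
        /* Prints "shr=1 pri=0": only the MAP_SHARED write is visible */
        printf("shr=%d pri=%d\n", *shr, *pri);
        return 0;
}

So pthread_mutex_init(mut_s, ...) before the fork initializes one lock that
both processes genuinely share; only mut_p, initialized after the fork, is
private to the child.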
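
The robust list itself is per task, not per mutex: glibc registers a list
head for each thread with set_robust_list() (and re-registers it in the
child after fork(), since the kernel starts new tasks with an empty list),
and pthread_mutex_lock() links each held robust mutex onto the calling
thread's list. The entries can therefore point into any VMA, including the
shared mapping holding mut_s, which is Peter's point about protecting only
the head. A minimal sketch of the per-task registration, using the raw
get_robust_list() syscall (glibc provides no wrapper for it):

#include <stdio.h>
#include <linux/futex.h>        /* struct robust_list_head */
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
        struct robust_list_head *head;
        size_t len;

        /* pid 0 == the calling task; glibc installed this head at startup */
        if (syscall(SYS_get_robust_list, 0, &head, &len))
                perror("get_robust_list");
        else
                printf("robust list head %p, len %zu\n", (void *)head, len);
        return 0;
}

At exit the kernel walks the dead task's list entry by entry and marks each
held futex OWNER_DIED. In the reproducer the entry for mut_p sits in the
munmap()ed private page, so with lock order s, p (mut_p at the head of the
list) the walk presumably faults before it ever reaches mut_s and the parent
times out instead of seeing EOWNERDEAD, consistent with the lock-order
comments in the code.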