On 02/13/2018 04:42 AM, Benjamin Marzinski wrote: > Commit 6e2423fd fixed a bug where the tur checker could cancel a > detached thread after it had exitted. However in fixing this, the new > code grabbed a mutex (to call condlog) while holding a spin_lock. To > deal with this, I've done away with the holder spin_lock completely, and > replaced it with two atomic variables, based on a suggestion by Martin > Wilck. > > The holder variable works exactly like before. When the checker is > initialized, it is set to 1. When a thread is created it is incremented. > When either the thread or the checker are done with the context, they > atomically decrement the holder variable and check its value. If it > is 0, they free the context. If it is 1, they never touch the context > again. > > The other variable has changed. First, ct->running and ct->thread have > switched uses. ct->thread is now only ever accessed by the checker, > never the thread. If it is non-NULL, a thread has started up, but > hasn't been dealt with by the checker yet. It is also obviously used > by the checker to cancel the thread. > > ct->running is now an atomic variable. When the thread is started > it is set to 1. When the checker wants to kill a thread, it atomically > sets the value to 0 and reads the previous value. If it was 1, > the checker cancels the thread. If it was 0, the nothing needs to be > done. After the checker has dealt with the thread, it sets ct->thread > to NULL. > > Right before the thread finishes and pops the cleanup handler, it > atomically sets the value of ct->running to 0 and reads the previous > value. If it was 1, the thread just pops the cleanup handler and exits. > If it was 0, then the checker is trying to cancel the thread, and so the > thread calls pause(), which is a cancellation point. > > Cc: Martin Wilck <mwilck@xxxxxxxx> > Cc: Bart Van Assche <Bart.VanAssche@xxxxxxx> > Signed-off-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx> > --- > libmultipath/checkers/tur.c | 107 ++++++++++++++++++-------------------------- > 1 file changed, 43 insertions(+), 64 deletions(-) > [ .. ] > @@ -391,19 +368,17 @@ int libcheck_check(struct checker * c) > ct->state = PATH_UNCHECKED; > ct->fd = c->fd; > ct->timeout = c->timeout; > - pthread_spin_lock(&ct->hldr_lock); > - ct->holders++; > - pthread_spin_unlock(&ct->hldr_lock); > + uatomic_add(&ct->holders, 1); > + uatomic_set(&ct->running, 1); > tur_set_async_timeout(c); > setup_thread_attr(&attr, 32 * 1024, 1); > r = pthread_create(&ct->thread, &attr, tur_thread, ct); > pthread_attr_destroy(&attr); > if (r) { > - pthread_spin_lock(&ct->hldr_lock); > - ct->holders--; > - pthread_spin_unlock(&ct->hldr_lock); > - pthread_mutex_unlock(&ct->lock); > + uatomic_sub(&ct->holders, 1); > + uatomic_set(&ct->running, 0); > ct->thread = 0; > + pthread_mutex_unlock(&ct->lock); > condlog(3, "%s: failed to start tur thread, using" > " sync mode", tur_devt(devt, sizeof(devt), ct)); > return tur_check(c->fd, c->timeout, I would rather set 'ct->running' from within the thread; otherwise there is a chance that pthread_create() returns without error, but the thread itself hasn't been run yet. Cheers, Hannes -- Dr. Hannes Reinecke Teamlead Storage & Networking hare@xxxxxxx +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel