On Fri, Oct 05, 2018 at 12:11:18PM +0200, Martin Wilck wrote: > On Thu, 2018-10-04 at 11:31 -0500, Benjamin Marzinski wrote: > > It's been a while, and I'm not exactly sure what I was thinking here. > > But IIRC the idea was that if the state isn't set yet, then the old > > thread could mess with the results of a future thread. Also, it was > > to > > avoid a corner case where the tur checker was called, started the > > thread, got the result before the checker exitted, cancelled the > > thread > > and returned the result and then was called very shortly afterwards, > > before the thread could clean itself up. In that case, the right > > thing > > to do (I thought) would be to start a new thread, because the other > > one > > would be ending soon. In truth, starting a new thread at all is > > probably a bad idea, since the old thread will still mess with the > > checker context on exit. > > > > A better idea might be to simply fail back to a syncronous call to > > tur_check() when you notice a cancelled thread that hasn't exitted. > > That can cause all the other checkers to get delayed, but at least in > > that case you are still checking paths. The other option is to do as > > before and just not check this path, which won't effect the other > > checkers. It seems very unlikely that the thread could remain > > uncancelled forever, especially if the path was usable. > > > > thoughts? > > Generally speaking, we're deeply in the realm of highly unlikely > situations I would say. But we should get it right eventually. > > Maybe we can add logic to the tur thread to keep its hands off the > context if it's a "zombie", like below (just a thought, untested)? This still wouldn't stop a thread from racing with new thread creation to change ct->holders or ct->running. -Ben > Martin > > > diff --git a/libmultipath/checkers/tur.c b/libmultipath/checkers/tur.c > index bf8486d3..e8493ca8 100644 > --- a/libmultipath/checkers/tur.c > +++ b/libmultipath/checkers/tur.c > @@ -219,11 +219,18 @@ static void cleanup_func(void *data) > rcu_unregister_thread(); > } > > +#define tur_thread_quit_unless_owner(__ct) \ > + if (__ct->thread != pthread_self()) { \ > + pthread_mutex_unlock(&__ct->lock); \ > + pthread_exit(NULL); \ > + } > + > static void copy_msg_to_tcc(void *ct_p, const char *msg) > { > struct tur_checker_context *ct = ct_p; > > pthread_mutex_lock(&ct->lock); > + tur_thread_quit_unless_owner(ct); > strlcpy(ct->message, msg, sizeof(ct->message)); > pthread_mutex_unlock(&ct->lock); > } > @@ -243,6 +250,7 @@ static void *tur_thread(void *ctx) > > /* TUR checker start up */ > pthread_mutex_lock(&ct->lock); > + tur_thread_quit_unless_owner(ct); > ct->state = PATH_PENDING; > ct->message[0] = '\0'; > pthread_mutex_unlock(&ct->lock); > @@ -252,6 +260,7 @@ static void *tur_thread(void *ctx) > > /* TUR checker done */ > pthread_mutex_lock(&ct->lock); > + tur_thread_quit_unless_owner(ct); > ct->state = state; > pthread_cond_signal(&ct->active); > pthread_mutex_unlock(&ct->lock); > > > > > -- > Dr. Martin Wilck <mwilck@xxxxxxxx>, Tel. +49 (0)911 74053 2107 > SUSELinux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton > HRB 21284 (AG Nürnberg) > -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel