Programming question/problem with RH9's POSIX thread library

Neil Bird <neil@xxxxxxxxxx> · Mon, 14 Jul 2003 11:29:59 +0100

  I'm hoping I'll find someone out there who knows the POSIX/Linux 
thread implementation we've now got in RH9 (NPTL?) in some detail!

  We have an app. we're porting to RH9 using the new implementation, 
and we're getting a lock-up that I can't reproduce with a noddy example.

  Essentially, one 'parent' thread is trying to kill a 'child' thread 
(that it created earlier) with 'pthread_cancel()'.  The child thread, if 
gdb (and strace IIRC) is to be believed, is sat in a sem_wait().  The 
child thread has set its cancel type to asynchronous 
(PTHREAD_CANCEL_ASYNCHRONOUS), so it should go away as soon as it's told.

  The parent thread then immediately does a pthread_join() to ensure 
the child's gone - then nothing.  The child stays in sem_wait(), the 
parent never returns from pthread_join().

  Anyone know what can cause this?  I think it's some funny race 
condition, as an occasional sprinkling of printf() debug can make it, if 
not go away, then at least less likely.  However, a small test prog. I 
wrote that does the same thing always kills the child thread as I'd 
expect, both if it's killed within sem_wait() and if it's killed just 
beforehand.  AIUI, sem_wait() is supposed to be a cancellation point anyway.
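
  For reference, here is a minimal sketch of the kind of small test 
described above.  It is not the real application code: the names 
child_main and run_cancel_test, the two semaphores and the 100 ms delay 
are all assumptions made for illustration.  The child selects 
asynchronous cancellation and then either blocks in sem_wait() or spins 
just before it; the parent cancels it and joins.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <semaphore.h>
#include <time.h>

/* Hypothetical names throughout; this is a sketch, not the real app. */
static sem_t never_posted;  /* nobody posts this, so sem_wait() blocks */
static sem_t child_ready;   /* child posts this once it has started    */

static void *child_main(void *arg)
{
    int spin_before_wait = *(int *)arg;
    volatile unsigned long spin = 0;
    int old_type;

    /* Tell the parent we are running, then switch from the default
     * deferred cancellation to asynchronous cancellation. */
    sem_post(&child_ready);
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old_type);

    if (spin_before_wait) {
        /* Never reaches sem_wait(): models "killed just beforehand". */
        for (;;)
            spin++;
    }

    /* Models "killed within sem_wait()". */
    sem_wait(&never_posted);
    return NULL;
}

/* Returns 0 if the child was cancelled and joined, non-zero otherwise. */
int run_cancel_test(int cancel_before_wait)
{
    pthread_t child;
    void *result = NULL;
    struct timespec pause = { 0, 100 * 1000 * 1000 };  /* 100 ms */

    if (sem_init(&never_posted, 0, 0) != 0)
        return -1;
    if (sem_init(&child_ready, 0, 0) != 0)
        return -1;
    if (pthread_create(&child, NULL, child_main, &cancel_before_wait) != 0)
        return -1;

    /* Wait until the child has started. */
    sem_wait(&child_ready);

    /* Assumption: 100 ms is long enough for the child to block. */
    if (!cancel_before_wait)
        nanosleep(&pause, NULL);

    if (pthread_cancel(child) != 0)
        return -1;
    if (pthread_join(child, &result) != 0)
        return -1;

    sem_destroy(&never_posted);
    sem_destroy(&child_ready);
    return result == PTHREAD_CANCELED ? 0 : 1;
}
```

  Calling run_cancel_test(0) exercises the cancel-inside-sem_wait() 
case and run_cancel_test(1) the cancel-just-before case; build with 
-pthread.  If the hang described above were to occur here, the call 
would simply never return from pthread_join().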

--
[neil@xxx ~]# rm -f .signature
[neil@xxx ~]# ls -l .signature
ls: .signature: No such file or directory
[neil@xxx ~]# exit

--
Shrike-list mailing list
Shrike-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/shrike-list