Hello Mike, On 11/21/20 6:54 PM, Mike Crowe wrote: > Hi Michael, > > On Saturday 21 November 2020 at 07:59:04 +0100, Michael Kerrisk (man-pages) wrote: >> I've been taking a closer look at the the new pthread*clock*() APIs: >> pthread_clockjoin_np() >> pthread_cond_clockwait() >> pthread_mutex_clocklock() >> pthread_rwlock_clockrdlock() >> pthread_rwlock_clockwrlock() >> sem_clockwait() >> >> I've noticed some oddities, and at least a couple of bugs. >> >> First off, I just note that there's a surprisingly wide variation in >> the low-level futex calls being used by these APIs when implementing >> CLOCK_REALTIME support: >> >> pthread_rwlock_clockrdlock() >> pthread_rwlock_clockwrlock() >> sem_clockwait() >> pthread_cond_clockwait() >> futex(addr, >> FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 3, >> {abstimespec}, FUTEX_BITSET_MATCH_ANY) >> (This implementation seems to be okay) >> >> pthread_clockjoin_np() >> futex(addr, FUTEX_WAIT, 48711, {reltimespec}) >> (This is buggy; see below.) >> >> pthread_mutex_clocklock() >> futex(addr, FUTEX_WAIT_PRIVATE, 2, {reltimespec}) >> (There's bugs and strangeness here; see below.) > > Yes, I found it very confusing when I started adding the new > pthread*clock*() functions, and it still takes me a while to find the right > functions when I look now. I believe that Adhemerval was talking about > simplifying some of this. > >> === Bugs === >> >> pthread_clockjoin_np(): >> As already recognized in another mail thread [1], this API accepts any >> kind of clockid, even though it doesn't support most of them. > > Well, it sort of does support them at least as well as many other > implementations of such functions do - it just calculates a relative > timeout using the supplied lock and then uses that. But, ... > >> A further bug is that even if CLOCK_REALTIME is specified, >> pthread_clockjoin_np() sleeps against the CLOCK_MONOTONIC clock. >> (Currently it does this for *all* clockid values.) The problem here is >> that the FUTEX_WAIT operation sleeps against the CLOCK_MONOTONIC clock >> by default. At the least, the FUTEX_CLOCK_REALTIME is required for >> this case. Alternatively, an implementation using >> FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME (like the first four >> functions listed above) might be appropriate. > > ...this is one downside of that. That bug was inherited from the > existing pthread_clock_timedjoin_np implementation. Oh -- that's pretty sad. I hadn't considered the possibility that the (longstanding) "timed" functions might have the same bug. > I was planning to write a patch to just limit the supported clocks, but > I'll have a go at fixing the bug you describe properly instead first which > will limit the implementation to CLOCK_REALTIME and CLOCK_MONOTONIC anyway. > >> === >> >> pthread_mutex_clocklock(): >> First of all, there's a small oddity. Suppose we specify the clockid >> as CLOCK_REALTIME, and then while the call is blocked, we set the >> clock realtime backwards. Then, there will be further futex calls to >> handle the modification to the clock (and possibly multiple futex >> calls if the realtime clock is adjusted repeatedly): >> >> futex(addr, FUTEX_WAIT_PRIVATE, 2, {reltimespec1}) >> futex(addr, FUTEX_WAIT_PRIVATE, 2, {reltimespec2}) >> ... >> >> Then there seems to be a bug. If we specify the clockid as >> CLOCK_REALTIME, and while the call is blocked we set the realtime >> clock forwards, then the blocking interval of the call is *not* >> adjusted (shortened), when of course it should be. > > This is because __lll_clocklock_wait ends up doing a relative wait rather > than an absolute one so it suffers from the same problem as > pthread_clockjoin_np. > >> === >> >> I've attached a couple of small test programs at the end of this mail. > > Thanks for looking at this in detail. > > AFAIK, all of these bugs also affected the corresponding existing > pthread*timed*() functions. When I added the new pthread*clock*() functions > I was trying to keep my changes to the existing code as small as possible. > (I started out trying to "scratch the itch" of libstdc++ > std::condition_variable::wait_for misbehaving[2] when the system clock was > warped in 2015 and all of this ballooned from that.) Now that the functions > are in, I think there's definitely scope for improving the implementation > and I will try to do so as time and confidence allows - the implementation > of __pthread_mutex_clocklock_common scares me greatly! Yeah, a lot of glibc code is not so easy to follow... Thank you for taking a look. Cheers, Michael > [1] https://lore.kernel.org/linux-man/20201119120034.GA20599@xxxxxxxxxx/ > [2] https://randombitsofuselessinformation.blogspot.com/2018/06/its-about-time-monotonic-time.html > -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/