On Monday, January 2, 2017 11:00:44 AM CET Kamil Dudka wrote:
> > Note this test was just changed upstream to use locks instead of volatiles,
> > which significantly improved performance on a 40 core NUMA system at least:
> > http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=480d374
>
> Thanks for letting me know! I have applied the above patch on the coreutils
> package in Fedora rawhide:
>
> http://pkgs.fedoraproject.org/cgit/rpms/coreutils.git/commit/?id=b3b3da0c
>
> ... and it seems to have resolved the issue with hanging Koji builds.

As Kamil and I discussed in person, the upstream fix unfortunately doesn't
help. Also, I was most probably wrong earlier when I said that this is _not_
arch-specific; it looks like the hang actually is ppc64le-only. It is true,
though, that the test_rwlock() test took incredibly long before the upstream
fix (and not only on ppc64le). So that is a good performance speedup done
upstream, thanks!

But yes, there's still a problem with test_rwlock() on ppc64le. Usually I see
this in build.log [1]:

    ...
    + set +x
    1 Starting test_lock ... OK
    9 Starting test_rwlock ... OK
    10 Starting test_recursive_lock ... OK
    13 Starting test_once ... OK
    13 ====
    ...

but sometimes the build hangs forever [2]:

    ...
    + set +x
    1 Starting test_lock ... OK
    [HANG HERE]

The integer at the start of each line means "how many seconds had elapsed
when that line (being written to stdout) finished" [3].

Kamil pointed out that test-lock.c itself might be wrong. The point is that
there are multiple independent read-only threads (up to 10?) running in an
"infinite loop until all read-write threads finish", which probably causes
writer starvation. POSIX only says that "Implementations may favor writers
over readers to avoid writer starvation.", so this is not guaranteed in
general.

More precisely, we seem to have PTHREAD_RWLOCK_PREFER_READER_NP in glibc by
default, so I'm not sure whether we should set PTHREAD_RWLOCK_PREFER_WRITER_NP
somewhere in test-lock.c... at least if possible.
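
To make the idea concrete, here is a minimal sketch of how a writer-preferring
rwlock could be requested on glibc. It is not taken from test-lock.c; the
helper name init_writer_preferring_rwlock() is hypothetical. One caveat, going
by the pthread_rwlockattr_setkind_np(3) man page: glibc treats
PTHREAD_RWLOCK_PREFER_WRITER_NP the same as the reader-preferring default, and
only PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP actually favors writers, so
the sketch uses the latter. Whether this avoids the ppc64le hang is untested.

```c
#define _GNU_SOURCE   /* exposes the non-portable *_NP rwlock kinds in glibc */
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

/* Hypothetical helper (not part of gnulib): initialize *lock so that
   waiting writers are preferred over newly arriving readers.
   Returns 0 on success, or a pthread error number on failure.  */
int
init_writer_preferring_rwlock (pthread_rwlock_t *lock)
{
  pthread_rwlockattr_t attr;
  int err;

  assert (lock != NULL);

  err = pthread_rwlockattr_init (&attr);
  if (err != 0)
    return err;

  /* glibc-specific: only the NONRECURSIVE variant really favors writers.
     Readers must then not take the read lock recursively.  */
  err = pthread_rwlockattr_setkind_np
          (&attr, PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
  if (err == 0)
    err = pthread_rwlock_init (lock, &attr);

  /* The attribute object is no longer needed once the lock exists.  */
  pthread_rwlockattr_destroy (&attr);
  return err;
}
```

A caller would declare a pthread_rwlock_t, pass its address to the helper
instead of using PTHREAD_RWLOCK_INITIALIZER, and then use the usual
pthread_rwlock_rdlock() / pthread_rwlock_wrlock() / pthread_rwlock_unlock()
calls. Since the attribute is non-portable, real test code would need a
configure-time check or an #ifdef around it.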

Also, the question is whether this isn't really a glibc bug, because the
schedulers on non-ppc64le architectures look to be more "fair" regardless of
the default. Also, I was unable to reproduce it anywhere other than in Koji.

[1] https://koji.fedoraproject.org/koji/taskinfo?taskID=17147902
[2] https://koji.fedoraproject.org/koji/taskinfo?taskID=17147995
[3] http://pkgs.fedoraproject.org/cgit/rpms/gettext.git/commit/?h=private-praiskup-test-lock-upstream-fix&id=937f421ab3276c7fc363c1af9e

Pavel