On 04/12/2010 12:25 AM, John David Anglin wrote:
> On Sun, 11 Apr 2010, Helge Deller wrote:
>
>> Nevertheless, I still see the crashes with all kernel patches applied.
>>
>> What I usually do is to start up more than 8 screen sessions. In each of the
>> sessions I start the bash loop:
>> -> i=0; while true; do i=$(($i+1)); echo Run $i; ./minifail; done;
>> and detach from the screen sessions.
>> After some time, the load goes up to 8-16 and a few crashes fill the syslog.
>> I'm sure the crashes are related to how much load the machine is, and how
>> often process switches will happen.
>> How many minifail testcases do you run in parallel?
>
> Sigh, never more than one...
>
> That said, I did realize last night that the cache flush in ptep_set_wrprotect
> based on pte_dirty was flawed. In a SMP kernel with a user on a different
> cpu pounding on the page to be write protected, there was a race between
> the pte_dirty check and the write protect.
>
> Further, I don't believe the dirty bit is reliable. Our cmpxchg is not
> atomic with respect to changes in the dirty bit. Thus, there is a small
> window where a change in the dirty bit could get lost.
>
> So for now, I think it safest to move the flush after the setting of the
> write protect bit, and do it unconditionally. This should be ok since
> page faults are disabled. I recognize that this will hurt performance.
>
> I'm going to test the following on my rp3440. The flushing has greatly
> improved SMP userspace stability. However, I have still seen a few issues
> in the GCC testsuite.
>
> Maybe it will help your B2000. However, let's just go one step at a time.

Sadly no luck :-(
minifail still crashes...

Helge
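
P.S. For anyone who wants to reproduce the load without screen, the parallel setup from the quoted message can probably be approximated by backgrounding the loops from a single shell. This is only a sketch: run_parallel and its arguments are names made up for illustration, and it assumes ./minifail sits in the current directory. It has not been checked on the machines discussed here.

```shell
# Sketch only: start several copies of the test loop in the background
# instead of several detached screen sessions.
# run_parallel is a hypothetical helper, not part of any existing tool.
# usage: run_parallel <workers> <runs-per-worker> <command> [args...]
# The function body is a subshell, so its variables do not leak out.
run_parallel() (
    workers=$1
    runs=$2
    shift 2
    w=1
    while [ "$w" -le "$workers" ]; do
        (
            i=0
            # runs=0 means loop forever, like the original "while true"
            while [ "$runs" -eq 0 ] || [ "$i" -lt "$runs" ]; do
                i=$((i + 1))
                echo "Worker $w run $i"
                "$@"    # a failing or crashing command does not stop the loop
            done
        ) &
        w=$((w + 1))
    done
    wait    # returns once every worker loop has finished
)
```

Calling `run_parallel 8 0 ./minifail` would then mimic eight sessions looping forever; a non-zero second argument bounds the number of runs per worker, which is handy for a quick check.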