Re: threads and fork on machine with VIPT-WB cache

Helge Deller <deller@xxxxxx> · Mon, 12 Apr 2010 23:02:34 +0200

On 04/12/2010 12:25 AM, John David Anglin wrote:
> On Sun, 11 Apr 2010, Helge Deller wrote:
> 
>> Nevertheless, I still see the crashes with all kernel patches applied.
>>
>> What I usually do is to start up more than 8 screen sessions. In each of the
>> sessions I start the bash loop:
>> -> i=0; while true; do i=$(($i+1)); echo Run $i; ./minifail; done;
>> and detach from the screen sessions.
>> After some time, the load goes up to 8-16 and a few crashes fill the syslog.
>> I'm sure the crashes are related to how much load the machine is under, and
>> how often process switches happen.
>> How many minifail testcases do you run in parallel?
> 
> Sigh, never more than one...
> 
> That said, I did realize last night that the cache flush in ptep_set_wrprotect
> based on pte_dirty was flawed.  In an SMP kernel with a user on a different
> cpu pounding on the page to be write protected, there was a race between
> the pte_dirty check and the write protect.
> 
> Further, I don't believe the dirty bit is reliable.  Our cmpxchg is not
> atomic with respect to changes in the dirty bit.  Thus, there is a small
> window where a change in the dirty bit could get lost.
> 
> So for now, I think it safest to move the flush after the setting of the
> write protect bit, and do it unconditionally.  This should be ok since
> page faults are disabled.  I recognize that this will hurt performance.
> 
> I'm going to test the following on my rp3440.  The flushing has greatly
> improved SMP userspace stability.  However, I have still seen a few issues
> in the GCC testsuite.
> 
> Maybe it will help your B2000.  However, let's just go one step at a time.

Sadly no luck :-(
minifail still crashes...

Helge