On Mon, 28 Aug 2006 00:39:08 -0700 (PDT), David Miller wrote: >Ok, I finally figured this one out and reproduced it on my >ultra5. > >If, for example, we have a 4MB PTE and we do a write >we'll only update one of the 8KB sub-PTEs of that >mapping. > >This is fine until that new mapping gets displaced from >the TLB and someone does a read that ends up hitting one >of the sub-PTEs that didn't get it's write-enable bit >set yet. > >At this point we have a problem, because if a write is >made to the original address, the kernel says "the writable >bit is set, nothing to do". So it won't flush the TLB, >and therefore it won't kick out the TLB mapping brought >in by the read. > >So we just get wedged here until something displaces that >TLB entry. This is why X acts sluggish and since it can >loop like this for quite a while the X server and the >hardware can get plenty confused. > >The end result is that we have to make sure any PTE updates >propagate to all sub-PTEs of a large mapping during any >change. That's really expensive and we'd have to add some >complex code to the set_pte_at() code path just to handle >this. > >So the easiest way to fix this, without having to disable >largepage PTE mappings of I/O devices, is the patch below. >I will push this to Linus for 2.6.18 and -stable so that >2.6.17 gets it too. > >commit 6ad7d29d2edd8c3d632e71454f619f5c0c6c2703 >Author: David S. Miller <davem@xxxxxxxxxxxxxxxxxxxx> >Date: Mon Aug 28 00:33:03 2006 -0700 > > [SPARC64]: Fix X server hangs due to large pages. > > This problem was introduced by changeset > 14778d9072e53d2171f66ffd9657daff41acfaed > > Unlike the hugetlb code paths, the normal fault code is not setup to > propagate PTE changes for large page sizes correctly like the ones we > make for I/O mappings in io_remap_pfn_range(). > > It is absolutely necessary to update all sub-ptes of a largepage > mapping on a fault. Adding special handling for this would add > considerably complexity to tlb_batch_add(). So let's just side-step > the issue and forcefully dirty any writable PTEs created by > io_remap_pfn_range(). > > The only other real option would be to disable to large PTE code of > io_remap_pfn_range() and we really don't want to do that. > > Much thanks to Mikael Pettersson for tracking down this problem and > testing debug patches. > > Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx> > >diff --git a/arch/sparc64/mm/generic.c b/arch/sparc64/mm/generic.c >index 8cb0620..af9d81d 100644 >--- a/arch/sparc64/mm/generic.c >+++ b/arch/sparc64/mm/generic.c >@@ -69,6 +69,8 @@ static inline void io_remap_pte_range(st > } else > offset += PAGE_SIZE; > >+ if (pte_write(entry)) >+ entry = pte_mkdirty(entry); > do { > BUG_ON(!pte_none(*pte)); > set_pte_at(mm, address, pte, entry); > Thanks. X works fine on my U5 now with 2.6.18-rc5 + this patch. /Mikael - To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html