Re: overzealous TLB flushing by lazy VMAP flushing

From: David Miller <davem@xxxxxxxxxxxxx>
Date: Mon, 04 Aug 2014 16:23:14 -0700 (PDT)

Sorry, I screwed up the lkml CC:, fixing that here.

> Hey Nick,
> 
> The lazy VMAP flushing in mm/vmalloc.c seems to make various
> assumptions about vmalloc area layout.
> 
> In particular it assumes that if there are pending VMAP flushes
> in multiple regions managed by vmap/vunmap, it's safe to queue
> up a range flush from the lowest such address to the highest
> such address.
> 
> This assumption does not hold on sparc64 and causes problems there,
> as diagnosed by Christopher (CC:'d).
> 
> On sparc64 we have the following regions:
> 
> modules		0x010000000 --> 0x0f0000000
> openfirmware	0x0f0000000 --> 0x100000000
> vmalloc		0x100000000 --> 0x10000000000
> 
> So if a module is unloaded and some vfree()s occur as well, the next
> lazy VMAP flush will flush a range that covers all of openfirmware.
> 
> This will flush the firmware's locked TLB entries, which in turn causes
> all sorts of problems.
> 
> It is not possible to adjust where these ranges are in order to make
> the vmalloc and module ranges be right next to each other.  The
> firmware area is fixed, first of all.  Second of all the module area
> has to be in the low 4GB because of the code model we compile the
> kernel with (all symbols are 32-bit), and we want to use as little of
> the sub-4GB area as possible because it has to fit the main kernel
> image, modules, and the firmware region.
> 
> We could add all sorts of range logic to the flush_tlb_range()
> implementation on sparc64, but I really think that the kernel should
> not trigger a TLB flush across a range for which it never managed any
> mappings.
> 
> I also think that the lazy VMAP flusher should be mindful of this for
> another reason.  Specifically, issuing such an enormous flush range is
> going to be expensive, more expensive than whatever we were gaining by
> batching these flushes.
> 
> Unlike for userspace mappings, for kernel mappings we can't have a
> cutoff for page-by-page flushes and just do a context based TLB flush.
> We always have to do page-by-page flushes.  So these huge ranges
> really do hurt.
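
To make the two complaints above concrete, here is a small userspace
sketch (not kernel code) of the lowest-to-highest merge as described in
the message.  The region boundaries are the ones listed above; the struct
and helper names are hypothetical, and the 8KB page size is an assumption
used only to count pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Region boundaries as listed in the message (end is exclusive). */
#define MODULES_START  0x010000000ULL
#define MODULES_END    0x0f0000000ULL
#define OFW_START      0x0f0000000ULL
#define OFW_END        0x100000000ULL
#define VMALLOC_START  0x100000000ULL
#define VMALLOC_END    0x10000000000ULL

/* Assumption for illustration only: 8KB base pages. */
#define ASSUMED_PAGE_SHIFT 13

/* Hypothetical type: a half-open address range [start, end). */
struct va_range {
	uint64_t start;
	uint64_t end;
};

/* Merge pending ranges the way the message describes: from the
 * lowest start address to the highest end address, ignoring gaps. */
static struct va_range merge_min_max(const struct va_range *r, int n)
{
	struct va_range m = { UINT64_MAX, 0 };

	for (int i = 0; i < n; i++) {
		if (r[i].start < m.start)
			m.start = r[i].start;
		if (r[i].end > m.end)
			m.end = r[i].end;
	}
	return m;
}

/* True if the two half-open ranges share at least one address. */
static bool ranges_overlap(struct va_range a, struct va_range b)
{
	return a.start < b.end && b.start < a.end;
}

/* Number of assumed-size pages a page-by-page flush would visit. */
static uint64_t pages_in_range(struct va_range r)
{
	return (r.end - r.start) >> ASSUMED_PAGE_SHIFT;
}
```

With one pending page in the module area and two in the vmalloc area, the
merged range spans the entire openfirmware region, and a page-by-page
flush of it visits several hundred thousand pages instead of three.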



