On Tue, Jan 26, 2016 at 8:40 PM, Matthew Wilcox <willy@xxxxxxxxxxxxxxx> wrote: > On Mon, Jan 25, 2016 at 09:33:35AM -0800, Andy Lutomirski wrote: >> On Mon, Jan 25, 2016 at 9:25 AM, Matthew Wilcox >> <matthew.r.wilcox@xxxxxxxxx> wrote: >> > From: Matthew Wilcox <willy@xxxxxxxxxxxxxxx> >> > >> > track_pfn_insert() overwrites the pgprot that is passed in with a value >> > based on the VMA's page_prot. This is a problem for people trying to >> > do clever things with the new vm_insert_pfn_prot() as it will simply >> > overwrite the passed protection flags. If we use the current value of >> > the pgprot as the base, then it will behave as people are expecting. >> > >> > Also fix track_pfn_remap() in the same way. >> >> Well that's embarrassing. Presumably it worked for me because I only >> overrode the cacheability bits and lookup_memtype did the right thing. >> >> But shouldn't the PAT code change the memtype if vm_insert_pfn_prot >> requests it? Or are there no callers that actually need that? (HPET >> doesn't, because there's a plain old ioremapped mapping.) > > I'm confused. Here's what I understand: > > - on x86, the bits in pgprot can be considered as two sets of bits; > the 'cacheability bits' -- those in _PAGE_CACHE_MASK and the > 'protection bits' -- PRESENT, RW, USER, ACCESSED, NX > - The purpose of track_pfn_insert() is to ensure that the cacheability bits > are the same on all mappings of a given page, as strongly advised by the > Intel manuals [1]. So track_pfn_insert() is really only supposed to > modify _PAGE_CACHE_MASK of the passed pgprot, but in fact it ends up > modifying the protection bits as well, due to the bug. > > I don't think you overrode the cacheability bits at all. It looks to > me like your patch ends up mapping the HPET into userspace writable. I sure hope not. If vm_page_prot was writable, something was already broken, because this is the vvar mapping, and the vvar mapping is VM_READ (and not even VM_MAYREAD). > > I don't think the vm_insert_pfn_prot() call gets to change the memtype. > For one, that page may already be mapped into a differet userspace using > the pre-existing memtype, and [1] continues to bite you. Then there > may be outstanding kernel users of the page that's being mapped in. So why was remap_pfn_range different? I'm sure there was a reason. I don't think that whatever_pfn_prot should ever map a page inconsistently, but I find it surprising that some of the variants call reserve_memtype to change the memtype and others don't. Anyway, this is in no way an objection to your patches. --Andy -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>