On Mon, Jan 25, 2016 at 09:33:35AM -0800, Andy Lutomirski wrote: > On Mon, Jan 25, 2016 at 9:25 AM, Matthew Wilcox > <matthew.r.wilcox@xxxxxxxxx> wrote: > > From: Matthew Wilcox <willy@xxxxxxxxxxxxxxx> > > > > track_pfn_insert() overwrites the pgprot that is passed in with a value > > based on the VMA's page_prot. This is a problem for people trying to > > do clever things with the new vm_insert_pfn_prot() as it will simply > > overwrite the passed protection flags. If we use the current value of > > the pgprot as the base, then it will behave as people are expecting. > > > > Also fix track_pfn_remap() in the same way. > > Well that's embarrassing. Presumably it worked for me because I only > overrode the cacheability bits and lookup_memtype did the right thing. > > But shouldn't the PAT code change the memtype if vm_insert_pfn_prot > requests it? Or are there no callers that actually need that? (HPET > doesn't, because there's a plain old ioremapped mapping.) I'm confused. Here's what I understand: - on x86, the bits in pgprot can be considered as two sets of bits; the 'cacheability bits' -- those in _PAGE_CACHE_MASK and the 'protection bits' -- PRESENT, RW, USER, ACCESSED, NX - The purpose of track_pfn_insert() is to ensure that the cacheability bits are the same on all mappings of a given page, as strongly advised by the Intel manuals [1]. So track_pfn_insert() is really only supposed to modify _PAGE_CACHE_MASK of the passed pgprot, but in fact it ends up modifying the protection bits as well, due to the bug. I don't think you overrode the cacheability bits at all. It looks to me like your patch ends up mapping the HPET into userspace writable. I don't think the vm_insert_pfn_prot() call gets to change the memtype. For one, that page may already be mapped into a differet userspace using the pre-existing memtype, and [1] continues to bite you. Then there may be outstanding kernel users of the page that's being mapped in. So I think track_pfn_insert() is doing the right thing with respect to the cacheability bits (overwrite the ones passed in), it's just doing an unexpected thing with regard to the protection bits, which my patch should fix. [1] "The PAT allows any memory type to be specified in the page tables, and therefore it is possible to have a single physical page mapped to two or more different linear addresses, each with different memory types. Intel does not support this practice because it may lead to undefined operations that can result in a system failure. In particular, a WC page must never be aliased to a cacheable page because WC writes may not check the processor caches." -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>