On Tue, Jul 17, 2018 at 09:00:53AM +0000, Michal Hocko wrote:
> On Mon 16-07-18 23:38:46, Kirill A. Shutemov wrote:
> > On Mon, Jul 16, 2018 at 07:40:42PM +0200, Michal Hocko wrote:
> > > On Mon 16-07-18 17:47:39, Kirill A. Shutemov wrote:
> > > > On Mon, Jul 16, 2018 at 04:22:45PM +0200, Michal Hocko wrote:
> > > > > On Mon 16-07-18 17:04:41, Kirill A. Shutemov wrote:
> > > > > > On Mon, Jul 16, 2018 at 01:30:28PM +0000, Michal Hocko wrote:
> > > > > > > On Tue 10-07-18 13:48:58, Andrew Morton wrote:
> > > > > > > > On Tue, 10 Jul 2018 16:48:20 +0300 "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > > vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous
> > > > > > > > > VMA. This is unreliable as ->mmap may not set ->vm_ops.
> > > > > > > > >
> > > > > > > > > False-positive vma_is_anonymous() may lead to crashes:
> > > > > > > > >
> > > > > > > > > ...
> > > > > > > > >
> > > > > > > > > This can be fixed by assigning anonymous VMAs own vm_ops and not relying
> > > > > > > > > on it being NULL.
> > > > > > > > >
> > > > > > > > > If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
> > > > > > > > > dummy_vm_ops. This way we will have non-NULL ->vm_ops for all VMAs.
> > > > > > > >
> > > > > > > > Is there a smaller, simpler fix which we can use for backporting
> > > > > > > > purposes and save the larger rework for development kernels?
> > > > > > >
> > > > > > > Why cannot we simply keep anon vma with null vm_ops and set dummy_vm_ops
> > > > > > > for all users who do not initialize it in their mmap callbacks?
> > > > > > > Basically have a sanity check&fixup in call_mmap?
> > > > > >
> > > > > > As I said, there's a corner case of MAP_PRIVATE of /dev/zero.
> > > > >
> > > > > This is really creative. I really didn't think about that. I am
> > > > > wondering whether this really has to be handled as a private anonymous
> > > > > mapping implicitly. Why does vma_is_anonymous has to succeed for these
> > > > > mappings? Why cannot we simply handle it as any other file backed
> > > > > PRIVATE mapping?
> > > >
> > > > Because it's established way to create anonymous mappings in Linux.
> > > > And we cannot break the semantics.
> > >
> > > How exactly would semantic break? You would still get zero pages on read
> > > faults and anonymous pages on CoW. So basically the same thing as for
> > > any other file backed MAP_PRIVATE mapping.
> >
> > You are wrong about zero page.
>
> Well, if we redirect ->fault to do_anonymous_page and

Yeah. And it will make write fault to allocate *two* pages. One in
do_anonymous_page() and one in do_cow_fault().

Just no. We have a reason why anon VMAs handled separately. It's possible
to unify them, but it requires substantial ground work.

> > And you won't get THP.
>
> huge_fault to do_huge_pmd_anonymous_page then we should emulate the
> standard anonymous mapping.
>
> > And I'm sure there's more differences. Just grep for
> > vma_is_anonymous().
>
> I am sorry to push on this but if we have one odd case I would rather
> handle it and have a simple _rule_ that every mmap provide _has_ to
> provide vm_ops and have a trivial fix up at a single place rather than
> patch a subtle placeholders you were proposing.
>
> I will not insist of course but this looks less fragile to me.

You propose quite a big redesign on how we handle anonymous VMAs.
Feel free to propose the patch(set). But I don't think it would fly for
stable@.

-- 
 Kirill A. Shutemov