On Tue, 31 Jul 2018 at 09:55, John Stultz <john.stultz@xxxxxxxxxx> wrote: > > On Mon, Jul 30, 2018 at 8:26 PM, Hugh Dickins <hughd@xxxxxxxxxx> wrote: > > On Mon, 30 Jul 2018, Linus Torvalds wrote: > >> On Mon, Jul 30, 2018 at 2:53 PM Hugh Dickins <hughd@xxxxxxxxxx> wrote: > >> > > >> > I have no problem with reverting -rc7's vma_is_anonymous() series. > >> > >> I don't think we need to revert the whole series: I think the rest are > >> all fairly obvious cleanups, and shouldn't really have any semantic > >> changes. > > > > Okay. > > > >> > >> It's literally only that last patch in the series that then changes > >> that meaning of "vm_ops". And I don't really _mind_ that last step > >> either, but since we don't know exactly what it was that it broke, and > >> we're past rc7, I don't think we really have any option but the revert > >> it. > > > > It took me a long time to grasp what was happening, that that last > > patch bfd40eaff5ab was fixing. Not quite explained in the commit. > > > > I think it was that by mistakenly passing the vma_is_anonymous() test, > > create_huge_pmd() gave the MAP_PRIVATE kcov mapping a THP (instead of > > COWing pages from kcov); which the truncate then had to split, and in > > going to do so, again hit the mistaken vma_is_anonymous() test, BUG. > > > >> > >> And if we revert it, I think we need to just remove the > >> VM_BUG_ON_VMA() that it was supposed to fix. Because I do think that > >> it is quite likely that the real bug is that overzealous BUG_ON(), > >> since I can't see any reason why anonymous mappings should be special > >> there. > > > > Yes, that probably has to go: but it's not clear what state it leaves > > us in, with an anon THP being split by a truncate, without the expected > > locking; I don't remember offhand, probably a subtler bug than that BUG, > > which you may or may not consider an improvement. > > > > I fear that Kirill has not missed inserting a vma_set_anonymous() from > > somewhere that it should be, but rather that zygote is working with some > > special mapping which used to satisfy vma_is_anonymous(), faults supplying > > backing pages, but now comes out as !vma_is_anonymous(), so do_fault() > > finds !dummy_vm_ops.fault hence SIGBUS. > > I've been only casually following this thread (mostly just glad Amit > caught it and I could avoid having to bisect the issue in my own > Android testing), but this bit starting to shake a few old cobwebs > loose in my brain. > > I'm wondering if Zygote is utilizing ashmem here, and we're somehow > traversing ashmem purged memory, or due to some setup issue the > initial traverse isn't being zero-filled as expected? > > ashmem ranges are created using: shmem_file_setup() and shmem_zero_setup() > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n377 > > > If we purge pages, it punches it out with: > vfs_fallocate(range->asma->file, > FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, > start, end - start); > here: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n447 > > But in ashmem_pin(), we don't do anything other then returning if we > purged any page in the range. > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n577 > > And I believe the future assumption is the if we traverse those pages > they will be zero filled (if purged or even during the initial > traversal after mmap) > > Its been a long time, and I've not looked at the code in question but > it sounds from Hugh's comments above that we might instead get a > SIGBUS here. > > Looking more at the problematic patch.. > Amit: Does adding something like (whitespace damaged, apologies): > > index a1a0025..1af6915 100644 > --- a/drivers/staging/android/ashmem.c > +++ b/drivers/staging/android/ashmem.c > @@ -402,7 +402,8 @@ static int ashmem_mmap(struct file *file, struct > vm_area_struct *vma) > fput(asma->file); > goto out; > } > - } > + } else > + vma_set_anonymous(vma); > > if (vma->vm_file) > fput(vma->vm_file); > This ashmem change ^^ worked too. Regards, Amit Pundir > > Seem to resolve it? (Sorry, I'd test it myself, but I'm away from my > desk for the night). > thanks > -john