On Tue, 14 Jul 2020 10:32:20 -0700 Suren Baghdasaryan wrote:
> On Tue, Jul 14, 2020 at 9:41 AM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> >
> > On Tue, Jul 14, 2020 at 8:47 AM Todd Kjos <tkjos@xxxxxxxxxx> wrote:
> > >
> > > +Suren Baghdasaryan +Hridya Valsaraju who support the ashmem driver.
> >
> > Thanks for looping me in.
> >
> > > On Tue, Jul 14, 2020 at 7:18 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > > >
> > > > On Tue 14-07-20 22:08:59, Hillf Danton wrote:
> > > > >
> > > > > On Tue, 14 Jul 2020 10:26:29 +0200 Michal Hocko wrote:
> > > > > > On Tue 14-07-20 13:32:05, Hillf Danton wrote:
> > > > > > >
> > > > > > > On Mon, 13 Jul 2020 20:41:11 -0700 Eric Biggers wrote:
> > > > > > > > On Tue, Jul 14, 2020 at 11:32:52AM +0800, Hillf Danton wrote:
> > > > > > > > >
> > > > > > > > > Add FALLOC_FL_NOBLOCK and on the shmem side try to lock inode upon the
> > > > > > > > > new flag. And the overall upside is to keep the current gfp either in
> > > > > > > > > the khugepaged context or not.
> > > > > > > > >
> > > > > > > > > --- a/include/uapi/linux/falloc.h
> > > > > > > > > +++ b/include/uapi/linux/falloc.h
> > > > > > > > > @@ -77,4 +77,6 @@
> > > > > > > > >   */
> > > > > > > > >  #define FALLOC_FL_UNSHARE_RANGE 0x40
> > > > > > > > >
> > > > > > > > > +#define FALLOC_FL_NOBLOCK 0x80
> > > > > > > > > +
> > > > > > > >
> > > > > > > > You can't add a new UAPI flag to fix a kernel-internal problem like this.
> > > > > > >
> > > > > > > Sounds fair, see below.
> > > > > > >
> > > > > > > What the report indicates is a missing PF_MEMALLOC_NOFS and it's
> > > > > > > checked on the ashmem side and added as an exception before going
> > > > > > > to filesystem. On shmem side, no more than a best effort is paid
> > > > > > > on the intended exception.
> > > > > > >
> > > > > > > --- a/drivers/staging/android/ashmem.c
> > > > > > > +++ b/drivers/staging/android/ashmem.c
> > > > > > > @@ -437,6 +437,7 @@ static unsigned long
> > > > > > >  ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> > > > > > >  {
> > > > > > >  	unsigned long freed = 0;
> > > > > > > +	bool nofs;
> > > > > > >
> > > > > > >  	/* We might recurse into filesystem code, so bail out if necessary */
> > > > > > >  	if (!(sc->gfp_mask & __GFP_FS))
> > > > > > > @@ -445,6 +446,11 @@ ashmem_shrink_scan(struct shrinker *shri
> > > > > > >  	if (!mutex_trylock(&ashmem_mutex))
> > > > > > >  		return -1;
> > > > > > >
> > > > > > > +	/* enter filesystem with caution: nonblock on locking */
> > > > > > > +	nofs = current->flags & PF_MEMALLOC_NOFS;
> > > > > > > +	if (!nofs)
> > > > > > > +		current->flags |= PF_MEMALLOC_NOFS;
> > > > > > > +
> > > > > > >  	while (!list_empty(&ashmem_lru_list)) {
> > > > > > >  		struct ashmem_range *range =
> > > > > > >  			list_first_entry(&ashmem_lru_list, typeof(*range), lru);
> > > > > >
> > > > > > I do not think this is an appropriate fix. First of all is this a real
> > > > > > deadlock or a lockdep false positive? Is it possible that ashmem just
> > > > >
> > > > > The warning matters and we can do something to quiesce it.
> > > >
> > > > The underlying issue should be fixed rather than _something_ done to
> > > > silence it.
> > > >
> > > > > > needs to properly annotate its shmem inodes? Or is it possible that
> > > > > > the internal backing shmem file is visible to the userspace so the write
> > > > > > path would be possible?
> > > > > >
> > > > > > If this is a real problem then the proper fix would be to set internal
> > > > > > shmem mapping's gfp_mask to drop __GFP_FS.
> > > > >
> > > > > Thanks for the tip, see below.
> > > > >
> > > > > Can you expand a bit on how it helps direct reclaimers like khugepaged
> > > > > in the syzbot report wrt deadlock?
> > > >
> > > > I do not understand your question.
> > > >
> > > > > TBH I have a difficult time following
> > > > > up after staring at the chart below for quite a while.
> > > >
> > > > Yes, lockdep reports are quite hard to follow and they tend to confuse
> > > > one hell out of me. But this one says that there is a reclaim dependency
> > > > between the shmem inode lock and the reclaim context.
> > > >
> > > > > Possible unsafe locking scenario:
> > > > >
> > > > >        CPU0                    CPU1
> > > > >        ----                    ----
> > > > >   lock(fs_reclaim);
> > > > >                                lock(&sb->s_type->i_mutex_key#15);
> > > > >                                lock(fs_reclaim);
> > > > >
> > > > >   lock(&sb->s_type->i_mutex_key#15);
> > > >
> > > > Please refrain from proposing fixes until the actual problem is
> > > > understood. I suspect that this might be just a false positive because the
> > > > lockdep cannot tell the backing shmem which is internal to ashmem(?)
> > > > with any general shmem.
>
> Actually looking some more into this, I think you are right. Ashmem
> currently does not redirect writes into the backing shmem and
> fallocate call from ashmem_shrink_scan is always performed against
> asma->file, which is the backing shmem. IOW writes into the backing
> shmem are not supported, therefore this concurrent locking can't
> happen.

The print of generic_file_write_iter in the syzbot report backs that
concurrency because of f_op::fallocate and another is

Reported-by: syzbot+7a0d9d0b26efefe61780@xxxxxxxxxxxxxxxxxxxxxxxxx

>
> I'm not sure how we can annotate the fact that the inode_lock in
> generic_file_write_iter and in shmem_fallocate always operate on
> different inodes. Ideas?
>
> > > > But somebody really familiar with ashmem code
> > > > should have a look I believe.
> >
> > I believe the deadlock is possible if a write to ashmem fd coincides
> > with shrinking of ashmem caches. I just developed a possible fix here
> > https://android-review.googlesource.com/c/kernel/common/+/1361205 but
> > wanted to test it before posting upstream.
> > The idea is to detect such
> > a race between write and cache shrinking operations and let
> > ashmem_shrink_scan bail out if the race is detected instead of taking
> > inode_lock. AFAIK writing ashmem files is not a usual usage for ashmem
> > (standard usage is to mmap it and use as shared memory), therefore
> > this bailing out early should not affect ashmem cache maintenance
> > much. Besides ashmem_shrink_scan already bails out early if a
> > contention on ashmem_mutex is detected, which is a much more probable
> > case (see: https://elixir.bootlin.com/linux/v5.8-rc4/source/drivers/staging/android/ashmem.c#L497).
> >
> > I'll test and post the patch here in a day or so if there are no early
> > objections to it.
> > Thanks!
> >
> > > >
> > > > --
> > > > Michal Hocko
> > > > SUSE Labs
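
For readers following along, the bail-out idea quoted above (let the shrinker give up rather than take inode_lock while a write is in flight) can be sketched in userspace C. This is only an illustration under stated assumptions, not the patch behind the android-review link: struct sketch_area, the writers counter, the sketch_* helpers and SKETCH_SHRINK_STOP are all hypothetical names invented here, and the sketch shows the decision logic only. It is not race-free on its own, since a writer could start right after the check.

```c
#include <assert.h>
#include <stdatomic.h>

/* Sketch-local sentinel meaning "could not make progress, stop scanning". */
#define SKETCH_SHRINK_STOP (~0UL)

/* Hypothetical stand-in for an ashmem area; not the real struct ashmem_area. */
struct sketch_area {
	atomic_int writers;         /* assumed count of writes in flight */
	unsigned long cached_pages; /* pages the shrinker could release */
};

/* A writer announces itself before it would take the inode lock. */
static void sketch_write_begin(struct sketch_area *a)
{
	atomic_fetch_add(&a->writers, 1);
}

/* The writer withdraws once its write has finished. */
static void sketch_write_end(struct sketch_area *a)
{
	int prev = atomic_fetch_sub(&a->writers, 1);

	assert(prev > 0); /* every end must pair with a begin */
}

/*
 * Shrinker side: bail out instead of locking if a write is in flight;
 * otherwise release everything and report how much was freed.
 */
static unsigned long sketch_shrink_scan(struct sketch_area *a)
{
	unsigned long freed;

	if (atomic_load(&a->writers) > 0)
		return SKETCH_SHRINK_STOP;

	freed = a->cached_pages;
	a->cached_pages = 0;
	return freed;
}
```

In this sketch, a scan that runs between sketch_write_begin() and sketch_write_end() returns the sentinel and leaves cached_pages untouched; a scan with no writer present frees the whole cache. That mirrors the existing early return on mutex_trylock(&ashmem_mutex) failure in the quoted diff.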