On Tue, 14 Jul 2020 10:26:29 +0200 Michal Hocko wrote: > On Tue 14-07-20 13:32:05, Hillf Danton wrote: > > > > On Mon, 13 Jul 2020 20:41:11 -0700 Eric Biggers wrote: > > > On Tue, Jul 14, 2020 at 11:32:52AM +0800, Hillf Danton wrote: > > > > > > > > Add FALLOC_FL_NOBLOCK and on the shmem side try to lock inode upon the > > > > new flag. And the overall upside is to keep the current gfp either in > > > > the khugepaged context or not. > > > > > > > > --- a/include/uapi/linux/falloc.h > > > > +++ b/include/uapi/linux/falloc.h > > > > @@ -77,4 +77,6 @@ > > > > */ > > > > #define FALLOC_FL_UNSHARE_RANGE 0x40 > > > > > > > > +#define FALLOC_FL_NOBLOCK 0x80 > > > > + > > > > > > You can't add a new UAPI flag to fix a kernel-internal problem like this. > > > > Sounds fair, see below. > > > > What the report indicates is a missing PF_MEMALLOC_NOFS and it's > > checked on the ashmem side and added as an exception before going > > to filesystem. On shmem side, no more than a best effort is paid > > on the inteded exception. > > > > --- a/drivers/staging/android/ashmem.c > > +++ b/drivers/staging/android/ashmem.c > > @@ -437,6 +437,7 @@ static unsigned long > > ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) > > { > > unsigned long freed = 0; > > + bool nofs; > > > > /* We might recurse into filesystem code, so bail out if necessary */ > > if (!(sc->gfp_mask & __GFP_FS)) > > @@ -445,6 +446,11 @@ ashmem_shrink_scan(struct shrinker *shri > > if (!mutex_trylock(&ashmem_mutex)) > > return -1; > > > > + /* enter filesystem with caution: nonblock on locking */ > > + nofs = current->flags & PF_MEMALLOC_NOFS; > > + if (!nofs) > > + current->flags |= PF_MEMALLOC_NOFS; > > + > > while (!list_empty(&ashmem_lru_list)) { > > struct ashmem_range *range = > > list_first_entry(&ashmem_lru_list, typeof(*range), lru); > > I do not think this is an appropriate fix. First of all is this a real > deadlock or a lockdep false positive? Is it possible that ashmem just The warning matters and we can do something to quiesce it. > needs to properly annotate its shmem inodes? Or is it possible that > the internal backing shmem file is visible to the userspace so the write > path would be possible? > > If this a real problem then the proper fix would be to set internal > shmem mapping's gfp_mask to drop __GFP_FS. Thanks for the tip, see below. Can you expand a bit on how it helps direct reclaimers like khugepaged in the syzbot report wrt deadlock? TBH I have difficult time following up after staring at the chart below for quite a while. Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(fs_reclaim); lock(&sb->s_type->i_mutex_key#15); lock(fs_reclaim); lock(&sb->s_type->i_mutex_key#15); --- a/drivers/staging/android/ashmem.c +++ b/drivers/staging/android/ashmem.c @@ -381,6 +381,7 @@ static int ashmem_mmap(struct file *file if (!asma->file) { char *name = ASHMEM_NAME_DEF; struct file *vmfile; + gfp_t gfp; if (asma->name[ASHMEM_NAME_PREFIX_LEN] != '\0') name = asma->name; @@ -392,6 +393,10 @@ static int ashmem_mmap(struct file *file goto out; } vmfile->f_mode |= FMODE_LSEEK; + gfp = mapping_gfp_mask(vmfile->f_mapping); + if (gfp & __GFP_FS) + mapping_set_gfp_mask(vmfile->f_mapping, + gfp & ~__GFP_FS); asma->file = vmfile; } get_file(asma->file);