On Fri, Oct 07, 2016 at 07:15:18PM +0200, Oleg Nesterov wrote: > On 10/07, Dave Chinner wrote: > > > > On Thu, Oct 06, 2016 at 07:17:58PM +0200, Oleg Nesterov wrote: > > > Probably false positive? Although when I look at the comment above xfs_sync_sb() > > > I think that may be sometging like below makes sense, but I know absolutely nothing > > > about fs/ and XFS in particular. > > > > > > Oleg. > > > > > > > > > --- x/fs/xfs/xfs_trans.c > > > +++ x/fs/xfs/xfs_trans.c > > > @@ -245,7 +245,8 @@ xfs_trans_alloc( > > > atomic_inc(&mp->m_active_trans); > > > > > > tp = kmem_zone_zalloc(xfs_trans_zone, > > > - (flags & XFS_TRANS_NOFS) ? KM_NOFS : KM_SLEEP); > > > + (flags & (XFS_TRANS_NOFS | XFS_TRANS_NO_WRITECOUNT)) > > > + ? KM_NOFS : KM_SLEEP); > > > tp->t_magic = XFS_TRANS_HEADER_MAGIC; > > > tp->t_flags = flags; > > > tp->t_mountp = mp; > > > > Brief examination says caller should set XFS_TRANS_NOFS, not change > > the implementation to make XFS_TRANS_NO_WRITECOUNT flag to also mean > > XFS_TRANS_NOFS. > > I didn't mean the change above can fix the problem, and I don't really > understand your suggestion. xfs_syncsb() does: tp = xfs_trans_alloc(... , XFS_TRANS_NO_WRITECOUNT, ....); but it's running in a GFP_NOFS context when a freeze is being finalised. SO, rather than changing what XFS_TRANS_NO_WRITECOUNT does in xfs_trans_alloc(), we should tell it to do a GFP_NOFS allocation. i.e. tp = xfs_trans_alloc(... , XFS_TRANS_NOFS | XFS_TRANS_NO_WRITECOUNT, ....); and nothing inside xfs_trans_alloc() changes at all. > Obviously any GFP_FS allocation in xfs_fs_freeze() > paths will trigger the same warning. Of which there should be none except for that xfs_trans_alloc() call. > I added this hack > > --- a/fs/xfs/xfs_super.c > +++ b/fs/xfs/xfs_super.c > @@ -1333,10 +1333,15 @@ xfs_fs_freeze( > struct super_block *sb) > { > struct xfs_mount *mp = XFS_M(sb); > + int ret; > > + current->flags |= PF_FSTRANS; // tell kmem_flags_convert() to remove GFP_FS > xfs_save_resvblks(mp); > xfs_quiesce_attr(mp); > - return xfs_sync_sb(mp, true); > + ret = xfs_sync_sb(mp, true); > + current->flags &= ~PF_FSTRANS; > + > + return ret; > } /me shudders > just for testing purposes and after that I got another warning below. I didn't > read it carefully yet, but _at first glance_ it looks like the lock inversion > uncovered by 2/2, although I can be easily wrong. cancel_delayed_work_sync(l_work) > under sb_internal can hang if xfs_log_worker() waits for this rwsem?` Actually: I *can't read it*. I've got no fucking clue what lockdep is trying to say here. This /looks/ like a lockdep is getting confused between memory reclaim contexts (which /aren't locks/ but overload interrupt levels) and freeze contexts (which /aren't locks/) and workqueue locks which /aren't nested/ inside transactions or freeze contexts. But, really, I can't follow it because I have to guess at what "lock contexts" are not locks but are something else. cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html