Re: 'mv' of ./tests/bugs/posix/bug-1113960.t causes 100% CPU

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




----- Original Message -----
> From: "Nithya Balachandran" <nbalacha@xxxxxxxxxx>
> To: "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>
> Cc: gluster-devel@xxxxxxxxxxx
> Sent: Tuesday, May 17, 2016 2:55:35 PM
> Subject: Re:  'mv' of ./tests/bugs/posix/bug-1113960.t causes 100% CPU
> 
> On Tue, May 17, 2016 at 2:52 PM, Raghavendra Gowdappa <rgowdapp@xxxxxxxxxx>
> wrote:
> 
> >
> >
> > ----- Original Message -----
> > > From: "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>
> > > To: "Nithya Balachandran" <nbalacha@xxxxxxxxxx>
> > > Cc: gluster-devel@xxxxxxxxxxx
> > > Sent: Tuesday, May 17, 2016 2:40:32 PM
> > > Subject: Re:  'mv' of ./tests/bugs/posix/bug-1113960.t
> > causes 100% CPU
> > >
> > >
> > >
> > > ----- Original Message -----
> > > > From: "Nithya Balachandran" <nbalacha@xxxxxxxxxx>
> > > > To: "Niels de Vos" <ndevos@xxxxxxxxxx>
> > > > Cc: gluster-devel@xxxxxxxxxxx
> > > > Sent: Tuesday, May 17, 2016 2:25:20 PM
> > > > Subject: Re:  'mv' of ./tests/bugs/posix/bug-1113960.t
> > > > causes 100% CPU
> > > >
> > > > Hi,
> > > >
> > > > I have looked into this on another system earlier and this is what I
> > have
> > > > so
> > > > far:
> > > >
> > > > 1. The test involves moving and renaming directories and files within
> > those
> > > > dirs.
> > > > 2. A rename dir operation failed on one subvol. So we have 3 subvols
> > where
> > > > the directory has the new name and one where it has the old name.
> > > > 3. Some operation - perhaps a revalidate - has added a dentry with the
> > old
> > > > name to the inode . So there are now 2 dentries for the same inode for
> > a
> > > > directory.
> > >
> > > I think the stale dentry is caused by a racing lookup and rename.
> >
> > Please refer to bz 1335373 [1] for more details.
> >
> > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1335373
> 
> 
> 
> I have posted a "hacky" patch at [2]
> 
> [2] http://review.gluster.org/#/c/14373/
> 
> 
> 
> >
> >
> > > Apart from
> > > that, I don't know of any other reasons for stale dentries in inode
> > table.
> > > "Dentry fop serializer" (DFS) [1], aims to solve these kind of races.
> > >
> > > [1] http://review.gluster.org/14286
> > >
> > > > 4. Renaming a file inside that directory calls an inode_link which end
> > up
> > > > traversing the dentry list for each entry all the way up to the root
> > in the
> > > > __foreach_ancestor_dentry function. If there are multiple deep
> > directories
> > > > with the same problem in the path, this takes a very long time (hours)
> > > > because of the number of times the function is called.
> > > >
> > > > I do not know why the rename dir failed. However, is the following a
> > > > correct/acceptable fix for the traversal issue?
> > > >
> > > > 1. A directory should never have more than one dentry
> > > > 2. __foreach_ancestor_dentry uses the dentry list of the parent inode.
> > > > Parent
> > > > inode will always be a directory.
> > > > 3. Can we just take the first dentry in the list for the cycle check
> > as we
> > > > are really only comparing inodes? In the scenarios I have tried, all
> > the
> > > > dentries in the dentry_list always have the same inode. This would
> > prevent
> > > > the hang. If there is more than one dentry for a directory, flag an
> > error
> > > > somehow.
> > > > 4. Is there any chance that a dentry in the list can have a different
> > > > inode?
> > > > If yes, that is a different problem and 3 does not apply.
> > > >
> > > > It would work like this:
> > > >
> > > >
> > > > last_parent_inode = NULL;
> > > >
> > > > list_for_each_entry (each, &parent->dentry_list, inode_list) {
> > > >
> > > > //Since we are only using the each->parent to check, stop if we have
> > > > already
> > > > checked it
> > > >
> > > > if(each->parent != last_parent_inode) {
> > > > ret = __foreach_ancestor_dentry (each, per_dentry_fn, data);
> > > > if (ret)
> > > > goto out;
> > > > }
> > > > last_parent_inode = each->parent;

Some of the issues with this approach:
1. If dentries have different parents (for eg., rename across directories and lookup race), this won't help.
2. Even if dentries have same parent, this assumes those dentries are adjacent and assumes that only two such dentries exist.

2 is not much of an issue as most of the cases we've seen have two dentries, but assumption 1 is weak.

> > > > }
> > > >
> > > >
> > > > This would prevent the hang but leads to other issues which would
> > exist in
> > > > the current code anyway - mainly, which dentry is the correct one and
> > how
> > > > do
> > > > we recover?
> > > >
> > > >
> > > > Regards,
> > > > Nithya
> > > >
> > > > On Wed, May 11, 2016 at 7:09 PM, Niels de Vos < ndevos@xxxxxxxxxx >
> > wrote:
> > > >
> > > >
> > > > Could someone look into this busy loop?
> > > > https://paste.fedoraproject.org/365207/29732171/raw/
> > > >
> > > > This was happening in a regression-test burn-in run, occupying a
> > Jenkins
> > > > slave for 2+ days:
> > > > https://build.gluster.org/job/regression-test-burn-in/936/
> > > > (run with commit f0ade919006b2581ae192f997a8ae5bacc2892af from master)
> > > >
> > > > A coredump of the mount process is available from here:
> > > > http://slave20.cloud.gluster.org/archived_builds/crash.tar.gz
> > > >
> > > > Thanks misc for reporting and gathering the debugging info.
> > > > Niels
> > > >
> > > > _______________________________________________
> > > > Gluster-devel mailing list
> > > > Gluster-devel@xxxxxxxxxxx
> > > > http://www.gluster.org/mailman/listinfo/gluster-devel
> > > >
> > > >
> > > > _______________________________________________
> > > > Gluster-devel mailing list
> > > > Gluster-devel@xxxxxxxxxxx
> > > > http://www.gluster.org/mailman/listinfo/gluster-devel
> > > _______________________________________________
> > > Gluster-devel mailing list
> > > Gluster-devel@xxxxxxxxxxx
> > > http://www.gluster.org/mailman/listinfo/gluster-devel
> > >
> >
> 
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel



[Index of Archives]     [Gluster Users]     [Ceph Users]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux