----- Original Message -----
> From: "Nithya Balachandran" <nbalacha@xxxxxxxxxx>
> To: "Niels de Vos" <ndevos@xxxxxxxxxx>
> Cc: gluster-devel@xxxxxxxxxxx
> Sent: Tuesday, May 17, 2016 2:25:20 PM
> Subject: Re: 'mv' of ./tests/bugs/posix/bug-1113960.t causes 100% CPU
>
> Hi,
>
> I have looked into this on another system earlier, and this is what I
> have so far:
>
> 1. The test involves moving and renaming directories and files within
>    those dirs.
> 2. A directory rename operation failed on one subvol. So we have 3
>    subvols where the directory has the new name and one where it has
>    the old name.
> 3. Some operation - perhaps a revalidate - has added a dentry with the
>    old name to the inode. So there are now 2 dentries for the same
>    inode for a directory.

I think the stale dentry is caused by a racing lookup and rename. Apart
from that, I don't know of any other reason for stale dentries in the
inode table. The "Dentry fop serializer" (DFS) [1] aims to solve these
kinds of races.

[1] http://review.gluster.org/14286

> 4. Renaming a file inside that directory calls inode_link(), which
>    ends up traversing the dentry list for each entry all the way up to
>    the root in the __foreach_ancestor_dentry() function. If several
>    directories in a deep path have the same problem, this takes a very
>    long time (hours) because of the number of times the function is
>    called.
>
> I do not know why the directory rename failed. However, is the
> following a correct/acceptable fix for the traversal issue?
>
> 1. A directory should never have more than one dentry.
> 2. __foreach_ancestor_dentry() uses the dentry list of the parent
>    inode. The parent inode will always be a directory.
> 3. Can we just take the first dentry in the list for the cycle check,
>    as we are really only comparing inodes? In the scenarios I have
>    tried, all the dentries in the dentry_list always have the same
>    inode. This would prevent the hang. If there is more than one
>    dentry for a directory, flag an error somehow.
> 4. Is there any chance that a dentry in the list can have a different
>    inode? If yes, that is a different problem and 3 does not apply.
>
> It would work like this:
>
>         last_parent_inode = NULL;
>
>         list_for_each_entry (each, &parent->dentry_list, inode_list) {
>                 /* Since we only use each->parent for the check, skip
>                  * this entry if we have already checked that parent. */
>                 if (each->parent != last_parent_inode) {
>                         ret = __foreach_ancestor_dentry (each,
>                                                          per_dentry_fn,
>                                                          data);
>                         if (ret)
>                                 goto out;
>                 }
>                 last_parent_inode = each->parent;
>         }
>
> This would prevent the hang, but it leads to other issues that exist
> in the current code anyway - mainly, which dentry is the correct one
> and how do we recover?
>
> Regards,
> Nithya
>
> On Wed, May 11, 2016 at 7:09 PM, Niels de Vos <ndevos@xxxxxxxxxx> wrote:
>
> Could someone look into this busy loop?
> https://paste.fedoraproject.org/365207/29732171/raw/
>
> This was happening in a regression-test burn-in run, occupying a
> Jenkins slave for 2+ days:
> https://build.gluster.org/job/regression-test-burn-in/936/
> (run with commit f0ade919006b2581ae192f997a8ae5bacc2892af from master)
>
> A coredump of the mount process is available from here:
> http://slave20.cloud.gluster.org/archived_builds/crash.tar.gz
>
> Thanks misc for reporting and gathering the debugging info.
> Niels
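
To make the cost described in point 4 concrete, here is a small
standalone model of the traversal (illustrative C only, not the
libglusterfs code; DEPTH and NDENTRIES are made-up parameters): each
call visits one dentry and then recurses into every dentry of the
parent directory, so a second dentry per directory doubles the work at
every level.

    /* Toy model of the ancestor traversal, NOT the real
     * __foreach_ancestor_dentry(): count how often the per-dentry
     * callback would run when every directory in a chain of depth
     * DEPTH has NDENTRIES dentries pointing at it. */
    #include <stdio.h>

    #define DEPTH     20  /* directories between the file and the root */
    #define NDENTRIES 2   /* 1 = healthy, 2 = one stale dentry per dir */

    static unsigned long long calls;

    static void foreach_ancestor(int level)
    {
        calls++;            /* one per_dentry_fn() invocation */
        if (level == 0)     /* reached the root */
            return;
        /* Recurse into every dentry of the parent directory. */
        for (int i = 0; i < NDENTRIES; i++)
            foreach_ancestor(level - 1);
    }

    int main(void)
    {
        foreach_ancestor(DEPTH);
        printf("depth %d, %d dentries/dir: %llu callback calls\n",
               DEPTH, NDENTRIES, calls);
        return 0;
    }

With NDENTRIES set to 1 this prints 21 calls; with 2 it prints
2097151 (2^21 - 1), and every extra level of nesting doubles that
again, which is consistent with a rename several directories deep
hanging for hours.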
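
And a matching model of the guard proposed above (again illustrative
only; in this toy every dentry of a directory has the same parent, so
the parent is identified by its depth):

    /* Toy model of the skip-duplicate-parent guard, NOT the real
     * code: recurse only when the dentry's parent differs from the
     * parent we just recursed through. */
    #include <stdio.h>

    #define DEPTH     20
    #define NDENTRIES 2

    static unsigned long long calls;

    static void foreach_ancestor(int level)
    {
        int last_parent = -1;   /* models last_parent_inode */

        calls++;
        if (level == 0)
            return;

        for (int i = 0; i < NDENTRIES; i++) {
            int parent = level - 1;   /* all dentries share a parent */

            if (parent != last_parent)
                foreach_ancestor(parent);
            last_parent = parent;
        }
    }

    int main(void)
    {
        foreach_ancestor(DEPTH);
        printf("depth %d with guard: %llu callback calls\n",
               DEPTH, calls);
        return 0;
    }

This prints 21 calls for any NDENTRIES, i.e. the traversal is linear
in the depth again. Note that the guard as sketched only skips
consecutive dentries with the same parent; if same-parent dentries
were not adjacent in the list, the fan-out would not collapse.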

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel