Hi,
I looked into this on another system earlier, and this is what I have so far:
1. The test involves moving and renaming directories and files within those dirs.
2. A rename dir operation failed on one subvol. So we have 3 subvols where the directory has the new name and one where it has the old name.
3. Some operation - perhaps a revalidate - has added a dentry with the old name to the inode. So there are now 2 dentries for the same inode for a directory.
4. Renaming a file inside that directory calls inode_link, which ends up traversing the dentry list for each entry all the way up to the root in the __foreach_ancestor_dentry function. If there are multiple deep directories with the same problem in the path, this takes a very long time (hours) because of the number of times the function is called (see the sketch of the recursion just after this list).
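For reference, the traversal in question looks roughly like this (paraphrased from __foreach_ancestor_dentry in libglusterfs/src/inode.c and trimmed for the discussion, so not verbatim):

static int
__foreach_ancestor_dentry (dentry_t *dentry,
                           int (per_dentry_fn) (dentry_t *dentry, void *data),
                           void *data)
{
        inode_t  *parent = NULL;
        dentry_t *each   = NULL;
        int       ret    = 0;

        ret = per_dentry_fn (dentry, data);
        if (ret)
                goto out;

        parent = dentry->parent;
        if (!parent)
                goto out;

        /* One recursive call per dentry of the parent inode - this is the
         * multiplication point when a directory has more than one dentry. */
        list_for_each_entry (each, &parent->dentry_list, inode_list) {
                ret = __foreach_ancestor_dentry (each, per_dentry_fn, data);
                if (ret)
                        goto out;
        }
out:
        return ret;
}

With two dentries on every directory inode along a path of depth d, the walk makes on the order of 2^d recursive calls, which is why a handful of deep directories with this problem is enough to keep the mount process spinning for hours.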
I do not know why the rename dir failed. However, is the following a correct/acceptable fix for the traversal issue?
1. A directory should never have more than one dentry.
2. __foreach_ancestor_dentry uses the dentry list of the parent inode. The parent inode will always be a directory.
3. Can we just take the first dentry in the list for the cycle check, as we are really only comparing inodes? In the scenarios I have tried, all the dentries in the dentry_list always have the same inode. This would prevent the hang. If there is more than one dentry for a directory, flag an error somehow (a sketch of this variant appears further below, after the loop change).
4. Is there any chance that a dentry in the list can have a different inode? If yes, that is a different problem and point 3 does not apply.
It would work like this:
        inode_t *last_parent_inode = NULL;

        list_for_each_entry (each, &parent->dentry_list, inode_list) {
                /* We only use each->parent for the check, so skip this
                 * dentry if we have already recursed for the same parent
                 * inode. */
                if (each->parent != last_parent_inode) {
                        ret = __foreach_ancestor_dentry (each, per_dentry_fn,
                                                         data);
                        if (ret)
                                goto out;
                }
                last_parent_inode = each->parent;
        }
This would prevent the hang, but it leads to other issues that exist in the current code anyway - mainly, which dentry is the correct one and how do we recover?
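For comparison, a minimal sketch of the point-3 variant (walk only the first dentry of the parent inode and flag any extras) could look like this; the list helpers are the same ones used above, while the log domain and message are purely illustrative, not a finished patch:

        /* Hypothetical point-3 variant: recurse on the first dentry only and
         * warn if the directory inode unexpectedly has more than one dentry.
         * The gf_log domain/message here is illustrative only. */
        dentry_t *first = NULL;

        if (!list_empty (&parent->dentry_list)) {
                first = list_entry (parent->dentry_list.next, dentry_t,
                                    inode_list);

                /* More than one entry in the circular list? */
                if (first->inode_list.next != &parent->dentry_list)
                        gf_log ("inode", GF_LOG_WARNING,
                                "directory inode has more than one dentry");

                ret = __foreach_ancestor_dentry (first, per_dentry_fn, data);
                if (ret)
                        goto out;
        }

Like the loop change above, this only avoids the blow-up; it does not answer which dentry is the correct one or how to recover.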
Regards,
Nithya
On Wed, May 11, 2016 at 7:09 PM, Niels de Vos <ndevos@xxxxxxxxxx> wrote:
Could someone look into this busy loop?
https://paste.fedoraproject.org/365207/29732171/raw/
This was happening in a regression-test burn-in run, occupying a Jenkins
slave for 2+ days:
https://build.gluster.org/job/regression-test-burn-in/936/
(run with commit f0ade919006b2581ae192f997a8ae5bacc2892af from master)
A coredump of the mount process is available from here:
http://slave20.cloud.gluster.org/archived_builds/crash.tar.gz
Thanks misc for reporting and gathering the debugging info.
Niels
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel