Re: Freeze with cluster-2.03.11

Abhijith Das <adas@xxxxxxxxxx> · Mon, 30 Mar 2009 13:45:08 -0500

Wendy Cheng wrote:
> Kadlecsik Jozsef wrote:
>
>>> You mean the part of the patch
>>>
>>> @@ -1503,6 +1503,15 @@ gfs_getattr(struct vfsmount *mnt, struct dentry *dentry, struct
>>>         error = gfs_glock_nq_init(ip->i_gl, LM_ST_SHARED, LM_FLAG_ANY, &gh);
>>>         if (!error) {
>>>                 generic_fillattr(inode, stat);
>>> +               if (S_ISREG(inode->i_mode) && dentry->d_parent 
>>> +                   && dentry->d_parent->d_inode) {
>>> +                       p_inode = igrab(dentry->d_parent->d_inode);
>>> +                       if (p_inode) {
>>> +                               pi = get_v2ip(p_inode);
>>> +                               pi->i_dir_stats++;
>>> +                               iput(p_inode);
>>> +                       }
>>> +               }
>>>                 gfs_glock_dq_uninit(&gh);
>>>         }
>>>  
>>> might cause a deadlock: if the parent directory inode is already locked, 
>>> then this part will wait forever to get the lock, won't it?
>>>
>>> If I open a directory and then stat a file in it, is that enough to 
>>> trigger the deadlock?
>>>
>> No, that's too simple and should have come up much earlier; the patch is 
>> from Nov 6 2008. Something like creating files in a directory by one 
>> process and statting at the same time by another one, in a loop?
>>
>>
>
> It would be a shame if GFS(1/2) ends up losing you as a user - not many 
> users can delve into the bits and bytes like you.
>
> My suggestion is that you work directly with GFS engineers, particularly 
> the one who submitted this patch. He is bright and hardworking - one of 
> the best among young engineers within Red Hat. This patch is a good 
> "start" to get into the root cause (as gfs readdir is hung in *every* 
> console log you generated). Maybe a bugzilla would be a good start?
Jozsef,

Could you remove the patch associated with bz 466645 and see if you can
hit the hang again? I've looked at the patch and I can't spot anything
obvious. If this patch is causing your problems, I'll work on
reproducing the problem on my setup here and try to fix it.
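
For reference, the create-while-stat scenario suggested in the quoted text
above could be sketched roughly as below. This is only an illustration and
not a confirmed reproducer: the file names, NFILES, and run_stat_stress()
are hypothetical, and nothing here is taken from the GFS sources. One
process creates and unlinks regular files in a directory while the other
stat()s the same names, so the stat calls race with changes to the parent
directory.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NFILES 16 /* hypothetical: number of distinct file names reused */

/* Build "<dir>/gfs_stat_test.<n>"; the name pattern is made up. */
static void make_path(char *buf, size_t len, const char *dir, int i)
{
	snprintf(buf, len, "%s/gfs_stat_test.%d", dir, i % NFILES);
}

/* Creator: create and then unlink regular files in dir. */
static int create_loop(const char *dir, int rounds)
{
	char path[4096];

	for (int i = 0; i < rounds; i++) {
		make_path(path, sizeof(path), dir, i);
		int fd = open(path, O_CREAT | O_WRONLY, 0644);
		if (fd < 0)
			return -1;
		close(fd);
		if (unlink(path) < 0 && errno != ENOENT)
			return -1;
	}
	return 0;
}

/* Statter: stat the same names; ENOENT just means we lost the race. */
static int stat_loop(const char *dir, int rounds)
{
	char path[4096];
	struct stat st;

	for (int i = 0; i < rounds; i++) {
		make_path(path, sizeof(path), dir, i);
		if (stat(path, &st) < 0 && errno != ENOENT)
			return -1;
	}
	return 0;
}

/*
 * Run both loops concurrently. Returns 0 if both finish without an
 * unexpected error, -1 otherwise. If the suspected deadlock exists,
 * the symptom would be that this call never returns.
 */
int run_stat_stress(const char *dir, int rounds)
{
	assert(dir != NULL && rounds > 0);

	pid_t pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(create_loop(dir, rounds) == 0 ? 0 : 1);

	int rc = stat_loop(dir, rounds);
	int status;

	if (waitpid(pid, &status, 0) < 0)
		return -1;
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;
	return rc;
}
```

To try it, call run_stat_stress() from a small main() with a directory on
the GFS mount and a large round count, once with the patch applied and once
without. The loops are bounded so that a clean run terminates by itself.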

Thanks
--Abhi
