Re: fcntl locking lockup (dlm 1.07, GFS 6.1.5, kernel 2.6.9-67.EL)

On Fri, Jan 04, 2008 at 04:18:45PM -0500, Charlie Brady wrote:
> We've reduced the application code to a simple test case. The following 
> code run on each node will soon block, and doesn't receive signals until 
> the peer node is shut down:
> 
> ...
>     fl.l_whence=SEEK_SET;
>     fl.l_start=0;
>     fl.l_len=1;
> 
>     while (1)
>     {
>       fl.l_type=F_WRLCK;
>       retval=fcntl(filedes,F_SETLKW,&fl);
>       if (retval==-1)
>       {
>         perror("lock");
>         exit(1);
>       }
>       // attempt to unlock the index file
>       fl.l_type=F_UNLCK;
>       retval=fcntl(filedes,F_SETLKW,&fl);
>       if (retval==-1)
>       {
>         perror("unlock");
>         exit(1);
>       }
>     }

Yes, this stresses a design limitation in the RHEL4 dlm. The test keeps
taking and dropping the only lock on the file, so the dlm master for that
resource ping-pongs between the nodes and becomes so unstable that
everything comes to a halt. One possible workaround is to modify the
program to hold a lock on filedes for as long as it runs, which keeps the
master stable, e.g. hold a zero-length lock at some unused offset like
0xFFFFFF.
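
A minimal sketch of that workaround, untested on RHEL4 GFS. The helper
names and PLACEHOLDER_OFFSET are made up for illustration, and the use of
a shared (read) lock is an assumption, chosen so that every node can hold
the placeholder at once. It needs filedes to be open for reading.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical offset, assumed to lie beyond any byte range the
 * application really locks. */
#define PLACEHOLDER_OFFSET 0xFFFFFF

/* Take a lock that stays held until the descriptor is closed.
 * Assumption: a read lock, so all nodes can hold it concurrently.
 * l_len = 0 means "from l_start to end of file". */
static int hold_placeholder_lock(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_RDLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = PLACEHOLDER_OFFSET;
    fl.l_len = 0;
    return fcntl(fd, F_SETLKW, &fl);
}

/* One iteration of the quoted test loop: write-lock byte 0, then
 * unlock it. The unlock covers only byte 0, so the placeholder lock
 * at PLACEHOLDER_OFFSET is left in place. */
static int lock_unlock_once(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 1;
    if (fcntl(fd, F_SETLKW, &fl) == -1) {
        perror("lock");
        return -1;
    }

    fl.l_type = F_UNLCK;
    if (fcntl(fd, F_SETLKW, &fl) == -1) {
        perror("unlock");
        return -1;
    }
    return 0;
}
```

In the test program, hold_placeholder_lock(filedes) would be called once
after opening the file and before entering the while loop, so the file
always has at least one lock held on it while the loop runs.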

Dave

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
