Re: F_SETLK fails after recovery

----- Original Message -----
> Hi,
>  In our two-node system, if one node fails, the other node takes over the
>  application and uses the shared gfs2 target successfully. However, after
>  the failed node comes back, any attempt to lock files on the gfs2 resource
>  results in -ENOSYS. The following test program exhibits the problem: in
>  normal operation the lock succeeds, but in the fail/recover scenario we
>  get -ENOSYS:
> 
> #include <stdio.h>
> #include <fcntl.h>
> #include <unistd.h>
> 
> int
> main(int argc, char **argv)
> {
> 	int fd;
> 	struct flock fl;
> 
> 	fd = open("/mnt/test.file",O_RDONLY);
> 	if (fd != -1) {
> 		if (fcntl(fd, F_SETFL, O_RDONLY|O_DSYNC) != -1) {
> 			fl.l_type = F_RDLCK;
> 			fl.l_whence = SEEK_SET;
> 			fl.l_start = 0;
> 			fl.l_len = 0;
> 			if (fcntl(fd, F_SETLK, &fl) != -1)
> 				printf("File locked successfully\n");
> 			else
> 				perror("fcntl(F_SETLK)");
> 		} else
> 			perror("fcntl(F_SETFL)");
> 		close(fd);
> 	} else
> 		perror("open");
> 
> 	return 0;
> }
> 
> I've tracked things down to these messages:
> 
> 1409631951 lockspace lvclusdidiz0360 plock disabled our sig 816fba01 nodeid 2
> sig 2f6b
> :
> 1409634840 lockspace lvclusdidiz0360 plock disabled our sig 0 nodeid 2 sig
> 2f6b
> 
> Which indicates the lockspace attribute disable_plock has been set by way of
> the other node calling send_plocks_stored().
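> 
> My reading of what that message means (a paraphrase based on the log format,
> not the actual source; the names below are hypothetical) is that the receiver
> compares the signature carried in the plocks_stored message against its own
> and disables plocks on a mismatch:
> 
> 	/* hypothetical sketch of the receive side */
> 	if (sig != our_sig) {
> 		log_error("plock disabled our sig %x nodeid %d sig %x",
> 			  our_sig, nodeid, sig);
> 		ls->disable_plock = 1;
> 	}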
> 
> Looking at cpg.c:
> 
> static void prepare_plocks(struct lockspace *ls)
> {
> 	struct change *cg = list_first_entry(&ls->changes, struct change, list);
> 	struct member *memb;
> 	uint32_t sig;
> 	:
> 	:
> 	:
> 	if (nodes_added(ls))
> 		store_plocks(ls, &sig);
> 	send_plocks_stored(ls, sig);
> }
> 
> If nodes_added(ls) returns false, an uninitialized "sig" value will be
> passed to send_plocks_stored(). Do the "our sig" and "sig" values in the
> log messages above make sense?
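> 
> If that is the bug, the minimal kind of fix I would expect (untested, and
> I am assuming a zero signature is a safe "nothing stored" default here) is
> simply:
> 
> 	uint32_t sig = 0;	/* don't send stack garbage */
> 	:
> 	if (nodes_added(ls))
> 		store_plocks(ls, &sig);
> 	send_plocks_stored(ls, sig);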
> 
> If this is not the case, what is supposed to happen in order to re-enable
> plocks on the recovered node?
> 
> Neale

Hi Neale,

For what it's worth: GFS2 just passes plock requests down to the cluster
infrastructure (unlike flocks, which are handled internally by GFS2). It
will be important for the cluster folks to know what release this is; at
this point I'm not sure whether it's openais, corosync, or something else.
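
For reference, here is that distinction in code terms (a rough sketch, not
tied to any particular release, and the function name is mine; the fcntl
variety is what your test program uses):

	#include <fcntl.h>
	#include <sys/file.h>
	#include <unistd.h>

	void lock_both_ways(const char *path)
	{
		int fd = open(path, O_RDONLY);
		if (fd < 0)
			return;

		/* POSIX lock ("plock"): gfs2 hands this down to the
		   cluster infrastructure, which is where the -ENOSYS
		   would originate */
		struct flock fl = { .l_type = F_RDLCK, .l_whence = SEEK_SET };
		fcntl(fd, F_SETLK, &fl);

		/* BSD lock (flock): handled inside gfs2 itself */
		flock(fd, LOCK_SH);

		close(fd);
	}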

Regards,

Bob Peterson
Red Hat File Systems




