Is there a way to get flock on GFS to behave the way it does on the EXT3 file system?
I have attached sample C source code and here are instructions to demonstrate this issue.
My cluster is running GFS 6.1, RHEL 4 update 5 with all of the patches.
Compile both programs:
[mbrookov@imagine locktest]$ cc -o flock_EX_SH flock_EX_SH.c
[mbrookov@imagine locktest]$ cc -o flockwritelock flockwritelock.c
[mbrookov@imagine locktest]$
EXT3 test:
Start up xterm twice and cd to the directory where you compiled the two programs. On my system, /tmp is an EXT3 file system.
In the first xterm, run './flock_EX_SH /tmp/bar' and hit return. In the second xterm, run './flockwritelock /tmp/bar' and hit return. The flockwritelock process will block waiting for an exclusive lock on the file /tmp/bar.
In the first xterm, hit return. The flock_EX_SH process will attempt to demote the exclusive lock to a shared lock and display a prompt. The flockwritelock process in the second xterm will stay blocked.
In the first xterm, hit return again. The flock_EX_SH process will free the lock, close the file and exit. The flockwritelock process will then receive the exclusive lock on /tmp/bar and display a prompt. Hit return in the second xterm to get flockwritelock to close the file and exit.
Output on first xterm:
[mbrookov@imagine locktest]$ ./flock_EX_SH /tmp/bar
Have exclusive lock, hit return to free write lock on /tmp/bar and exit
Attempt to demote lock on /tmp/bar to shared lock
Have shared lock, hit return to free lock on /tmp/bar and exit
[mbrookov@imagine locktest]$
Output on second xterm:
[mbrookov@imagine locktest]$ ./flockwritelock /tmp/bar
Have write lock, hit return to free write lock on /tmp/bar and exit
[mbrookov@imagine locktest]$
GFS test:
Start up xterm twice and cd to the directory where you compiled the two programs. On my system, the locktest directory is on a GFS file system.
In the first xterm, run './flock_EX_SH bar' and hit return. In the second xterm, run './flockwritelock bar' and hit return. The flockwritelock process will block waiting for an exclusive lock on the file bar.
In the first xterm, hit return. The flock_EX_SH process will attempt to demote the exclusive lock on bar to a shared lock, but the attempt fails: the flock system call frees the exclusive lock, which lets the flockwritelock process get its exclusive lock, and then returns an error. The flock_EX_SH process will exit.
Hit return in the second xterm. flockwritelock will close bar and exit.
Output on first xterm:
[mbrookov@imagine locktest]$ ./flock_EX_SH bar
Have exclusive lock, hit return to free write lock on bar and exit
Attempt to demote lock on bar to shared lock
Could not demote to shared lock on file bar, Resource temporarily unavailable
[mbrookov@imagine locktest]$
Output on second xterm:
[mbrookov@imagine locktest]$ ./flockwritelock bar
Have write lock, hit return to free write lock on bar and exit
[mbrookov@imagine locktest]$
The results for flock on GFS are the same whether the two programs run on the same node or on two different nodes. The individual lock modes (shared, exclusive, blocking, non-blocking) otherwise work correctly on both file systems. The problem is the one case where GFS frees the exclusive lock and returns an error instead of demoting the exclusive lock to a shared lock.
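For anyone reproducing this, the failing case can be told apart from other flock errors by looking at errno. The following is only a minimal sketch: the helper name demote_to_shared is hypothetical and does not appear in the attached programs. It relies on the "Resource temporarily unavailable" message in the GFS output, which is the text for EAGAIN (the same value as EWOULDBLOCK on Linux).

```c
#include <errno.h>
#include <sys/file.h>

/* Hypothetical helper, not part of the attached programs.
 * Tries to convert an exclusive flock on fd to a shared one without blocking.
 * Returns  0 if the shared lock is now held,
 *          1 if flock() failed with EWOULDBLOCK (in the GFS test described
 *            above, the exclusive lock has already been given up by then),
 *         -1 on any other error (errno is left as set by flock()). */
int demote_to_shared(int fd)
{
    if (flock(fd, LOCK_SH | LOCK_NB) == 0)
        return 0;
    return (errno == EWOULDBLOCK) ? 1 : -1;
}
```

A caller that gets 1 back should assume it no longer holds any lock on the file and has to start over with a fresh LOCK_EX or LOCK_SH request.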
The program depends on the EXT3 flock behavior: the exclusive lock can be demoted to a shared lock with no chance that another process, blocked waiting for an exclusive lock, receives the lock in between.
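The two-xterm procedure can also be approximated in a single program, which makes it easier to repeat on different file systems. This is a rough sketch under several assumptions: the function name demotion_keeps_waiter_blocked and the FLOCK_DEMO_MAIN macro are made up for illustration, and the one-second sleeps are only a heuristic for "the child has blocked" and "the child had time to take the lock". The Linux flock(2) manual page describes lock conversion as not guaranteed to be atomic, so the result may differ between kernels and file systems.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical single-program version of the two-xterm test.
 * Returns  1 if the demotion succeeded and the waiter stayed blocked,
 *          0 if the demotion failed or the waiter got the exclusive lock,
 *         -1 on a setup error. */
int demotion_keeps_waiter_blocked(const char *path)
{
    int status = 0;
    int fd = open(path, O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;
    if (flock(fd, LOCK_EX) == -1) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child plays the role of flockwritelock. It drops the inherited
         * descriptor and opens the file again, so that it has its own open
         * file description and its request conflicts with the parent's lock. */
        close(fd);
        int cfd = open(path, O_RDWR);
        if (cfd == -1 || flock(cfd, LOCK_EX) == -1)
            _exit(2);
        _exit(0);               /* exclusive lock obtained */
    }

    sleep(1);   /* heuristic: let the child reach its blocking flock() */
    int demoted = (flock(fd, LOCK_SH | LOCK_NB) == 0);
    sleep(1);   /* heuristic: let the child take the lock if it became free */

    /* With WNOHANG, waitpid() returns 0 while the child is still blocked. */
    int still_blocked = (waitpid(pid, &status, WNOHANG) == 0);

    flock(fd, LOCK_UN);         /* release whatever we still hold */
    if (still_blocked)
        waitpid(pid, &status, 0);
    close(fd);

    if (WIFEXITED(status) && WEXITSTATUS(status) == 2)
        return -1;              /* child could not open or lock the file */
    return (demoted && still_blocked) ? 1 : 0;
}

#ifdef FLOCK_DEMO_MAIN  /* build with -DFLOCK_DEMO_MAIN for a standalone test */
int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 2;
    }
    int r = demotion_keeps_waiter_blocked(argv[1]);
    printf("%s\n", r == 1 ? "demotion kept the waiter blocked"
                 : r == 0 ? "demotion failed or the waiter got the lock"
                          : "setup error");
    return r == 1 ? 0 : 1;
}
#endif
```

Running it once against a file on /tmp and once against a file in the GFS directory should mirror the two manual tests, with the caveat that a timing-based check like this can only suggest, not prove, that the demotion is safe.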
Thank you
Matt
mbrookov@xxxxxxxxx
/* flock_EX_SH.c */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/file.h>
#include <string.h>
#include <time.h>
#include <errno.h>

int writelock(char *filename)
{
    int fd;

    if ((fd = open(filename, O_CREAT|O_RDWR, S_IRUSR|S_IWUSR|S_IRGRP)) == -1) {
        fprintf(stderr, "Could not open %s:", filename);
        perror("");
        exit(1);
    }

    /* Block until the exclusive lock is granted. */
    while (flock(fd, LOCK_EX) == -1) {
        if (errno != EAGAIN) {
            fprintf(stderr, "Could not get write lock on %s errno=%d:", filename, errno);
            perror("");
            exit(1);
        }
        printf("%s locked, trying again\n", filename);
    }
    printf("Have exclusive lock, hit return to free write lock on %s and exit\n", filename);
    fgetc(stdin);

    /* Convert the exclusive lock to a shared lock without blocking. */
    printf("Attempt to demote lock on %s to shared lock\n", filename);
    if (flock(fd, LOCK_SH|LOCK_NB) == -1) {
        fprintf(stderr, "Could not demote to shared lock on file %s, %s\n", filename, strerror(errno));
        exit(1);
    }
    printf("Have shared lock, hit return to free lock on %s and exit\n", filename);
    fgetc(stdin);

    if (flock(fd, LOCK_UN) == -1) {
        fprintf(stderr, "Could not free shared lock on file %s:", filename);
        perror("");
        exit(1);
    }
    close(fd);
    return 0;
}

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }
    writelock(argv[1]);
    return 0;
}
/* flockwritelock.c */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/file.h>
#include <string.h>
#include <time.h>
#include <errno.h>

int writelock(char *filename)
{
    int fd;

    if ((fd = open(filename, O_CREAT|O_RDWR, S_IRUSR|S_IWUSR|S_IRGRP)) == -1) {
        fprintf(stderr, "Could not open %s:", filename);
        perror("");
        exit(1);
    }

    /*
    ** F_SETLKW seems to work on GFS under light load
    ** looping over F_SETLK will fail with errno operation not permitted
    */
    while (flock(fd, LOCK_EX) == -1) {
        if (errno != EAGAIN) {
            fprintf(stderr, "Could not get write lock on %s errno=%d:", filename, errno);
            perror("");
            exit(1);
        }
        printf("%s locked, trying again\n", filename);
    }
    printf("Have write lock, hit return to free write lock on %s and exit\n", filename);
    fgetc(stdin);

    if (flock(fd, LOCK_UN) == -1) {
        fprintf(stderr, "Could not unlock %s:", filename);
        perror("");
        exit(1);
    }
    close(fd);
    return 0;
}

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }
    writelock(argv[1]);
    return 0;
}
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster