Re: NFS bug with utime/file create

On Thu, Oct 07, 2010 at 06:06:59PM +0200, Lukas Hejtmanek wrote:
> Hello,
> 
> I noticed an interesting bug in the 2.6.26 kernel; I am not sure whether it
> has been fixed in a newer version or not.
> 
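> I have an application that basically does the following:
> #include <fcntl.h>
> #include <unistd.h>
> #include <utime.h>
> 
> int
> main(int argc, char *argv[]) {
>         if (argc < 2)
>                 return 1;
>         /* create (or truncate) the file named on the command line */
>         int fd = open(argv[1], O_WRONLY|O_CREAT|O_TRUNC, 0666);
>         if (fd < 0)
>                 return 1;
>         char buff[300];  /* left uninitialized: only the size matters */
>         write(fd, buff, 300);
>         while(1) {
>                 /* set atime/mtime to "now" every 30 seconds */
>                 utime(argv[1], NULL);
>                 sleep(30);
>         }
>         return 0;
> }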
> 
> 
> Another application, running on a different client, does:
> 17:57:30.605304 nanosleep({24, 0}, {24, 0}) = 0
> 17:57:54.605526 stat64("/storage/home/xhejtman/gangadir/workspace/xhejtman/LocalXML/0",
> {st_mode=S_IFDIR|0700, st_size=59, ...}) = 0
> 17:57:54.606029 mkdir("/storage/home/xhejtman/gangadir/workspace/xhejtman/LocalXML/0/output",
> 0777) = -1 EEXIST (File exists)
> 17:57:54.606414 open("/storage/home/xhejtman/gangadir/workspace/xhejtman/LocalXML/0/output/__jobstatus__",
> O_RDONLY|O_LARGEFILE) = 3
> 17:57:54.607073 fstat64(3, {st_mode=S_IFREG|0644, st_size=300, ...}) = 0
> 17:57:54.607230 mmap2(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb74ef000
> 17:57:54.607325 _llseek(3, 0, [0], SEEK_CUR) = 0
> 17:57:54.607408 read(3, "\0\0\0\0\0\0\0\0\336e\356,\345\177\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 300
> 17:57:54.607765 read(3, "", 1048576)    = 0
> 17:57:54.607854 close(3)                = 0
> 17:57:54.608128 munmap(0xb74ef000, 1048576) = 0
> 17:57:54.608224 stat64("/storage/home/xhejtman/gangadir/workspace/xhejtman/LocalXML/0/output/__jobstatus__",
> {st_mode=S_IFREG|0644, st_size=300, ...}) = 0
> 
> The first application then tries to create a file, with something like:
> open("time_elapsed.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) (*)
> 
> At this point, it emits a PUTFH, SAVEFH, OPEN, GETFH, GETATTR, RESTOREFH,
> GETATTR compound. The server replies with NFS4ERR_EXPIRED.
> 
> The client tries to RENEW, the server replies NFS4ERR_EXPIRED.
> 
> The client restarts using SETCLIENTID and so on. During this phase, the first
> application issues a utime call. It seems that the original open (*) gets lost
> and the system deadlocks.
> 
> Using NFS debugging, I can see a warning that the lease is not expired (from
> the client's point of view, but the server is convinced that the lease is expired).
> 
> I can reliably reproduce it with the diane/ganga framework. I cannot fully
> reproduce it using just simple C programs.
> 
> 
> Is there something I could do?

There have been a number of fixes to the client state recovery code
since then, so it may be worth just retrying with a newer kernel on the
client.
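
If it helps when retesting, here is a rough sketch (not taken from the
report) of a probe that attempts the same open(O_WRONLY|O_CREAT|O_TRUNC) in
a child process and reports whether it completed within a timeout. The
names probe_create, probe_result and the timeout parameter are made up for
illustration, and the probe only detects a stuck open; it does not by
itself trigger the lease expiry described above.

```c
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
enum probe_result { PROBE_OK = 0, PROBE_FAILED = 1, PROBE_HUNG = 2 };

/* Try open(O_WRONLY|O_CREAT|O_TRUNC) on `path` in a child process and wait
 * at most about `timeout_sec` seconds for it to finish. */
enum probe_result probe_create(const char *path, unsigned timeout_sec)
{
    pid_t pid = fork();
    if (pid < 0)
        return PROBE_FAILED;

    if (pid == 0) {
        /* Child: same open() flags as in the report. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fd < 0)
            _exit(EXIT_FAILURE);
        close(fd);
        _exit(EXIT_SUCCESS);
    }

    /* Parent: poll once per second until the child exits or time runs out. */
    for (unsigned waited = 0; ; waited++) {
        int status;
        pid_t done = waitpid(pid, &status, WNOHANG);
        if (done == pid)
            return (WIFEXITED(status) && WEXITSTATUS(status) == 0)
                       ? PROBE_OK : PROBE_FAILED;
        if (done < 0)
            return PROBE_FAILED;
        if (waited >= timeout_sec)
            break;
        sleep(1);
    }

    /* Still blocked. SIGKILL may not take effect while the child sits in an
     * uninterruptible wait, so do not block here trying to reap it. */
    kill(pid, SIGKILL);
    return PROBE_HUNG;
}
```

Calling probe_create() with a path on the NFS mount and a timeout of, say,
60 seconds while the utime loop is running on the same file would return
PROBE_HUNG if the open never comes back. The child is deliberately not
reaped in that case, so a long-running caller would need to clean it up
later.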

--b.