NFS bug with utime/file create

Hello,

I noticed an interesting bug in the 2.6.26 kernel; I am not sure whether it has
been fixed in newer versions.

I have an application that basically does the following:
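#include <fcntl.h>
#include <unistd.h>
#include <utime.h>

int
main(int argc, char *argv[]) {
        /* create (or truncate) the file named on the command line */
        int fd = open(argv[1], O_WRONLY|O_CREAT|O_TRUNC, 0666);
        char buff[300];         /* contents do not matter, only the size */
        write(fd, buff, 300);
        while(1) {
                /* set atime/mtime to "now" every 30 s; fd is never closed */
                utime(argv[1], NULL);
                sleep(30);
        }
        return 0;
}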


Another application, running on a different client, does:
17:57:30.605304 nanosleep({24, 0}, {24, 0}) = 0
17:57:54.605526 stat64("/storage/home/xhejtman/gangadir/workspace/xhejtman/LocalXML/0",
{st_mode=S_IFDIR|0700, st_size=59, ...}) = 0
17:57:54.606029 mkdir("/storage/home/xhejtman/gangadir/workspace/xhejtman/LocalXML/0/output",
0777) = -1 EEXIST (File exists)
17:57:54.606414 open("/storage/home/xhejtman/gangadir/workspace/xhejtman/LocalXML/0/output/__jobstatus__",
O_RDONLY|O_LARGEFILE) = 3
17:57:54.607073 fstat64(3, {st_mode=S_IFREG|0644, st_size=300, ...}) = 0
17:57:54.607230 mmap2(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb74ef000
17:57:54.607325 _llseek(3, 0, [0], SEEK_CUR) = 0
17:57:54.607408 read(3, "\0\0\0\0\0\0\0\0\336e\356,\345\177\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 300
17:57:54.607765 read(3, "", 1048576)    = 0
17:57:54.607854 close(3)                = 0
17:57:54.608128 munmap(0xb74ef000, 1048576) = 0
17:57:54.608224 stat64("/storage/home/xhejtman/gangadir/workspace/xhejtman/LocalXML/0/output/__jobstatus__",
{st_mode=S_IFREG|0644, st_size=300, ...}) = 0
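
For readers who want to replay the second client's access pattern without the original program, the file-related part of the trace above (open, fstat, read until EOF, close, stat) can be approximated in C roughly as follows. This is only a sketch: the function name read_status_once is made up for illustration, the directory stat/mkdir calls from the trace are left out, and nothing here is taken from the real application.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: one polling pass over a status file.
 * Returns the number of bytes read, or -1 on error. */
long read_status_once(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    size_t cap = 1048576;              /* same 1 MiB buffer size as in the trace */
    char *buf = malloc(cap);
    if (buf == NULL) {
        close(fd);
        return -1;
    }

    long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, cap)) > 0)   /* loop ends at EOF (0) or error (-1) */
        total += n;

    free(buf);
    close(fd);
    if (n < 0)
        return -1;

    /* Re-stat by path, as the traced program does after close(). */
    if (stat(path, &st) < 0)
        return -1;
    return total;
}
```

Calling read_status_once() on the __jobstatus__ path in a loop with a 24 second sleep would mimic the polling interval seen in the trace.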

The first application tries to create a file, with something like:
open("time_elapsed.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) (*)

At this point, the client emits a PUTFH, SAVEFH, OPEN, GETFH, GETATTR, RESTOREFH,
GETATTR compound. The server replies with NFS4ERR_EXPIRED.

The client then tries to RENEW, and the server again replies with NFS4ERR_EXPIRED.

The client restarts with SETCLIENTID and so on. During this phase, the first
application issues a utime call. It seems that the original open (*) gets lost
and the system deadlocks.
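
The overlap described here, a utime call arriving while an open with O_CREAT is still in flight, could be exercised with something like the sketch below. It is an assumption-laden illustration only: the helper name race_open_against_utime and its parameters are invented, it does nothing to make the server expire the lease, and a plain program like this may well not trigger the hang. If the hang does occur, the open() call would simply never return.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <utime.h>

/* Hypothetical stress helper: a child process keeps calling utime() on
 * touch_path while the parent repeatedly creates/truncates create_path.
 * Returns the number of open() calls that failed, or -1 if fork() failed. */
int race_open_against_utime(const char *touch_path, const char *create_path,
                            int iterations)
{
    assert(touch_path != NULL && create_path != NULL);

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* Child: update the timestamps in a tight loop until killed. */
        for (;;) {
            utime(touch_path, NULL);
            usleep(1000);
        }
    }

    /* Parent: issue the same kind of open as (*) over and over. */
    int failures = 0;
    for (int i = 0; i < iterations; i++) {
        int fd = open(create_path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fd < 0)
            failures++;
        else
            close(fd);
        usleep(1000);
    }

    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return failures;
}
```

Both paths would need to live on the affected NFSv4 mount for the calls to reach the server at all; on a local filesystem the function just returns 0.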

With NFS debugging enabled, I can see a warning that the lease is not expired
(that is the client's point of view; the server is convinced that the lease has
expired).

I can reliably reproduce it with the diane/ganga framework. I cannot fully
reproduce it using just simple C programs.


Is there something I could do?

-- 
Lukáš Hejtmánek