Hi gluster developers,

I have encountered a situation where a file cannot be found, even though
it does exist and is on the correct node. The file can be stat()-ed but
not opened. After a Gluster restart the file is accessible again.

GlusterFS: 3.0.3 with an altered hashing function (my own modification).

== On the Gluster mounted volume:

archive@cgmarchive0:~/archive/incoming$ ls -l www.funkyfish.nl#59493#cgmspider0
-rw-rw-r-- 1 archive archive 599065 Mar 30 15:16 www.funkyfish.nl#59493#cgmspider0
archive@cgmarchive0:~/archive/incoming$ wc -l www.funkyfish.nl#59493#cgmspider0
wc: www.funkyfish.nl#59493#cgmspider0: No such file or directory

== On the local (node0) volume:

archive@cgmarchive0:/local.mnt/md0/glfs-data/incoming$ ls -l www.funkyfish.nl#59493#cgmspider0
-rw-rw-r-- 1 vagabond vagabondo 599065 Mar 30 15:16 www.funkyfish.nl#59493#cgmspider0
archive@cgmarchive0:/local.mnt/md0/glfs-data/incoming$ wc -l www.funkyfish.nl#59493#cgmspider0
10767 www.funkyfish.nl#59493#cgmspider0

== Error log:

[2010-03-31 12:02:47] D [dht-common.c:1590:dht_fd_cbk] dht: subvolume node0 returned -1 (No such file or directory)
[2010-03-31 12:02:47] W [fuse-bridge.c:858:fuse_fd_cbk] glusterfs-fuse: 10346982: OPEN() /incoming/www.funkyfish.nl#59493#cgmspider0 => -1 (No such file or directory)

Then after a Gluster restart (umount/mount sequence):

== On the Gluster mounted volume:

archive@cgmarchive0:~/archive/incoming$ ls -l www.funkyfish.nl#59493#cgmspider0
-rw-rw-r-- 1 archive archive 599065 Mar 30 15:16 www.funkyfish.nl#59493#cgmspider0
archive@cgmarchive0:~/archive/incoming$ wc -l www.funkyfish.nl#59493#cgmspider0
10767 www.funkyfish.nl#59493#cgmspider0

The application access pattern for these files is (a minimal repro loop
is sketched at the end of this message):

* a file is copied onto the filesystem under a temporary name
* the file is renamed to its final name
* the file is read once, then deleted
* the filename is normally not used again, or at least not any time soon

All file operations went through the Gluster fs (no direct local access).

The hashing function has been replaced by one that implements a
'consistent hashing' scheme, adapted so that the temporary filename and
the final filename always hash to the same node (a rough sketch of the
idea is also appended at the end of this message).

The problem is not isolated to a single case, but it takes a long time
(days) to occur. Given enough time it is reproducible, so if you need
more debugging info I can try to extract it for you.

Any ideas?

== Volume file

volume posix
  type storage/posix
  option directory /local.mnt/md0/glfs-data
end-volume

volume locks
  type features/posix-locks
  subvolumes posix
end-volume

volume fixed-id
  type features/filter
  option fixed-uid 2224
  option fixed-gid 224
  subvolumes locks
end-volume

volume brick
  type performance/io-threads
  subvolumes fixed-id
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick.allow 10.0.0.*,10.1.0.*
  subvolumes brick
end-volume

volume node0
  type protocol/client
  option transport-type tcp
  option remote-host cgmarchive0
  option remote-subvolume brick
end-volume

volume node1
  type protocol/client
  option transport-type tcp
  option remote-host cgmarchive1
  option remote-subvolume brick
end-volume

volume dht
  type cluster/dht
  subvolumes node0 node1
end-volume

--
Arend-Jan Wijtzes -- Wiseguys -- www.wise-guys.nl
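== Appendix 1: access pattern repro sketch

To make the access pattern concrete, here is a minimal C loop that mimics
it. This is illustrative only: the mount point, the naming scheme, and the
file contents are placeholders, not our actual application code. It runs
until an operation fails, since the bug takes days to appear.

#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

int main(void)
{
    const char *dir = "/mnt/gluster/incoming";  /* placeholder mount point */
    char tmp[512], final[512], buf[4096];
    unsigned long i;

    for (i = 0; ; i++) {
        snprintf(tmp, sizeof(tmp), "%s/.tmp.%lu", dir, i);
        snprintf(final, sizeof(final), "%s/file.%lu", dir, i);

        /* 1. copy the file in under a temporary name */
        int fd = open(tmp, O_CREAT | O_WRONLY | O_TRUNC, 0664);
        if (fd < 0) { perror("open(tmp)"); return 1; }
        memset(buf, 'x', sizeof(buf));
        if (write(fd, buf, sizeof(buf)) < 0) { perror("write"); return 1; }
        close(fd);

        /* 2. rename it to its final name */
        if (rename(tmp, final) < 0) { perror("rename"); return 1; }

        /* stat() succeeds even once the bug has been hit ... */
        struct stat st;
        if (stat(final, &st) < 0) { perror("stat"); return 1; }

        /* 3. read it once -- this is the open() that fails with ENOENT */
        fd = open(final, O_RDONLY);
        if (fd < 0) { perror("open(final)"); return 1; }
        while (read(fd, buf, sizeof(buf)) > 0)
            ;
        close(fd);

        /* 4. delete it; the name is not reused any time soon */
        if (unlink(final) < 0) { perror("unlink"); return 1; }
    }
    return 0;
}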
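== Appendix 2: hash adaptation sketch

A rough sketch of the idea behind the replacement hash -- not the actual
implementation. The temp-name convention (a ".tmp" suffix), the FNV-1a
hash, and the 64 virtual points per node are stand-ins chosen for
illustration. The two relevant properties are: (1) a temporary name is
normalized to its final name before hashing, so both always land on the
same node, and (2) node selection is a consistent-hashing ring lookup
rather than the stock dht layout.

#include <stdio.h>
#include <string.h>

#define NODES 2
#define TMP_SUFFIX ".tmp"   /* placeholder temp-name convention */

/* FNV-1a, standing in for whatever hash one prefers */
static unsigned int hash_str(const char *s, size_t len)
{
    unsigned int h = 2166136261u;
    while (len--)
        h = (h ^ (unsigned char)*s++) * 16777619u;
    return h;
}

/* Hash the name with any temporary suffix stripped, so that
 * "foo.tmp" and "foo" always hash identically. */
static unsigned int hash_normalized(const char *name)
{
    size_t len = strlen(name), sfx = strlen(TMP_SUFFIX);
    if (len > sfx && strcmp(name + len - sfx, TMP_SUFFIX) == 0)
        len -= sfx;
    return hash_str(name, len);
}

/* Toy consistent-hash lookup: each node owns 64 virtual points on a
 * ring; a name goes to the node owning the first point at or after
 * its hash, wrapping around to the smallest point if none follows. */
static int pick_node(const char *name)
{
    unsigned int h = hash_normalized(name);
    unsigned int succ = 0, min = 0;
    int succ_node = -1, min_node = -1;
    for (int n = 0; n < NODES; n++) {
        for (int r = 0; r < 64; r++) {
            char label[32];
            snprintf(label, sizeof(label), "node%d-%d", n, r);
            unsigned int p = hash_str(label, strlen(label));
            if (min_node < 0 || p < min) { min = p; min_node = n; }
            if (p >= h && (succ_node < 0 || p < succ)) { succ = p; succ_node = n; }
        }
    }
    return succ_node >= 0 ? succ_node : min_node;
}

int main(void)
{
    const char *name = "www.funkyfish.nl#59493#cgmspider0";
    char tmp[256];
    snprintf(tmp, sizeof(tmp), "%s%s", name, TMP_SUFFIX);

    /* both print the same node, by construction */
    printf("%s -> node%d\n", tmp, pick_node(tmp));
    printf("%s -> node%d\n", name, pick_node(name));
    return 0;
}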