hello,

we use GlusterFS 3.2.0 (two glusterfs servers with SLES 11.1) and several clients which access the gfs volumes.

configuration:

info
----
type=2
count=2
status=1
sub_count=2
version=1
transport-type=0
volume-id=05168b54-6a5c-4aa3-91ee-63d16976c6cd
brick-0=10.0.1.xxx:-glusterstorage-macm03
brick-1=10.0.1.xxy:-glusterstorage-macm03

macm03-fuse.vol
---------------
volume macm03-client-0
    type protocol/client
    option remote-host 10.0.1.xxx
    option remote-subvolume /glusterstorage/macm03
    option transport-type tcp
end-volume

volume macm03-client-1
    type protocol/client
    option remote-host 10.0.1.xxy
    option remote-subvolume /glusterstorage/macm03
    option transport-type tcp
end-volume

volume macm03-replicate-0
    type cluster/replicate
    subvolumes macm03-client-0 macm03-client-1
end-volume

volume macm03-write-behind
    type performance/write-behind
    subvolumes macm03-replicate-0
end-volume

volume macm03-read-ahead
    type performance/read-ahead
    subvolumes macm03-write-behind
end-volume

volume macm03-io-cache
    type performance/io-cache
    subvolumes macm03-read-ahead
end-volume

volume macm03-quick-read
    type performance/quick-read
    subvolumes macm03-io-cache
end-volume

volume macm03-stat-prefetch
    type performance/stat-prefetch
    subvolumes macm03-quick-read
end-volume

volume macm03
    type debug/io-stats
    subvolumes macm03-stat-prefetch
end-volume

macm03.10.0.1.xxx.glusterstorage-macm03.vol
-------------------------------------------
volume macm03-posix
    type storage/posix
    option directory /glusterstorage/macm03
end-volume

volume macm03-access-control
    type features/access-control
    subvolumes macm03-posix
end-volume

volume macm03-locks
    type features/locks
    subvolumes macm03-access-control
end-volume

volume macm03-io-threads
    type performance/io-threads
    subvolumes macm03-locks
end-volume

volume /glusterstorage/macm03
    type debug/io-stats
    subvolumes macm03-io-threads
end-volume

volume macm03-server
    type protocol/server
    option transport-type tcp
    option auth.addr./glusterstorage/macm03.allow *
    subvolumes /glusterstorage/macm03
end-volume

macm03.10.0.1.xxy.glusterstorage-macm03.vol
-------------------------------------------
(identical to macm03.10.0.1.xxx.glusterstorage-macm03.vol above)

client
------
the client has mounted the volume via fstab like this:

server:/macm03  /srv/www/GFS  glusterfs  defaults,_netdev  0 0
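for completeness, the equivalent manual mount on a client (assuming the same server name and mount point as in the fstab entry) would be:

    # native glusterfs client mount, same as the fstab entry above
    mount -t glusterfs server:/macm03 /srv/www/GFS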
now we registered strange behavior, and i have some questions:

1) files with size 0

we find many files with size 0 (almost all files in one directory have size 0). in the server log we only find the following. what does this mean?

[2011-04-28 23:52:00.630869] I [server-resolve.c:580:server_resolve] 0-macm03-server: pure path resolution for /xxx/preview/4aa76fa541413.jpg (LOOKUP)
[2011-04-28 23:52:00.637384] I [server-resolve.c:580:server_resolve] 0-macm03-server: pure path resolution for /xxx/preview/4aa76fa541413.jpg (UNLINK)
[2011-04-28 23:52:00.693183] I [server-resolve.c:580:server_resolve] 0-macm03-server: pure path resolution for /xxx/preview/4aa76fa541413.jpg (LOOKUP)
[2011-04-28 23:52:00.711092] I [server-resolve.c:580:server_resolve] 0-macm03-server: pure path resolution for /xxx/preview/4aa76fa541413.jpg (MKNOD)
[2011-04-28 23:52:00.746289] I [server-resolve.c:580:server_resolve] 0-macm03-server: pure path resolution for /xxx/preview/4aa76fa541413.jpg (SETATTR)
[2011-04-28 23:52:16.373532] I [server-resolve.c:580:server_resolve] 0-macm03-server: pure path resolution for /xxx/preview/4aa76fa541413.jpg (LOOKUP)

2) the client is then self-healing meta-data all the time (because the file has size 0 on one of the servers?), but we have triggered self-healing several times as described here:

http://europe.gluster.org/community/documentation/index.php/Gluster_3.1:_Triggering_Self-Heal_on_Replicate

[2011-04-29 07:55:27.188743] I [afr-common.c:581:afr_lookup_collect_xattr] 0-macm03-replicate-0: data self-heal is pending for /videos12/29640/preview/4aadf4b757de6.jpg.
[2011-04-29 07:55:27.188829] I [afr-common.c:735:afr_lookup_done] 0-macm03-replicate-0: background meta-data data self-heal triggered. path: /videos12/29640/preview/4aadf4b757de6.jpg
[2011-04-29 07:55:27.194446] W [dict.c:437:dict_ref] (-->/opt/glusterfs/3.2.0/lib64/glusterfs/3.2.0/xlator/protocol/client.so(client3_1_fstat_cbk+0x2bb) [0x2aaaaafe833b] (-->/opt/glusterfs/3.2.0/lib64/glusterfs/3.2.0/xlator/cluster/replicate.so(afr_sh_data_fstat_cbk+0x17d) [0x2aaaab11c9ad] (-->/opt/glusterfs/3.2.0/lib64/glusterfs/3.2.0/xlator/cluster/replicate.so(afr_sh_data_fix+0x1fc) [0x2aaaab11c64c]))) 0-dict: dict is NULL

3) on some of the clients we then cannot access the whole directory:

# dir xxx/preview/
/bin/ls: reading directory xxx/preview/: File descriptor in bad state
total 0

in the logs we find this:

[2011-04-29 08:36:17.224301] W [afr-common.c:634:afr_lookup_self_heal_check] 0-macm03-replicate-0: /videos12/30181: gfid different on subvolume
[2011-04-29 08:36:17.241330] I [afr-common.c:680:afr_lookup_done] 0-macm03-replicate-0: entries are missing in lookup of /xxx/preview.
[2011-04-29 08:36:17.241373] I [afr-common.c:735:afr_lookup_done] 0-macm03-replicate-0: background meta-data data entry self-heal triggered. path: /xxx/preview
[2011-04-29 08:36:17.243160] I [afr-self-heal-metadata.c:595:afr_sh_metadata_lookup_cbk] 0-macm03-replicate-0: path /videos12/30181/preview on subvolume macm03-client-0 => -1 (No such file or directory)
[2011-04-29 08:36:17.302228] I [afr-dir-read.c:120:afr_examine_dir_readdir_cbk] 0-macm03-replicate-0: /videos12/30181/preview: failed to do opendir on macm03-client-0
[2011-04-29 08:36:17.303836] I [afr-dir-read.c:174:afr_examine_dir_readdir_cbk] 0-macm03-replicate-0: entry self-heal triggered. path: /xxx/preview, reason: checksums of directory differ, forced merge option set

4) sometimes when we unmount the glusterfs volume on a client and mount it again, we can access the directory that was in a bad state before, and then self-healing also works as it should. but sometimes a remount does not help either.

any help would be appreciated. thank you very much!

christopher
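PS: for reference, the self-heal trigger we run is roughly the crawl from the linked page, using the mount point from our fstab entry above:

    # stat every file on the client mount to force lookups,
    # which triggers self-heal on the replicate volume
    find /srv/www/GFS -noleaf -print0 | xargs --null stat >/dev/null

and regarding the "gfid different on subvolume" message in 3): the gfid of a path can be compared directly on the two bricks like this (example path; run on each server against the backend directory, not the client mount):

    # shows the trusted.gfid xattr of the directory on this brick
    getfattr -n trusted.gfid -e hex /glusterstorage/macm03/xxx/preview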