Hi,

the "process", if we want to call it that, has finished. Maybe there was
a process running that accessed/deleted/... files which hadn't been
accessed for a while, resulting in ctime mdata fixes. However, the heal
count is down to 0 on all bricks. Very strange; I see ~34K such log
entries for each brick. Let's think positive: gluster is running
properly and doing what it should do. Great! :D


Hubert

On Mon, 8 Jun 2020 at 15:36, Strahil Nikolov <hunter86_bg@xxxxxxxxx> wrote:
>
> Hm... That's something I didn't expect.
>
> By the way, have you checked if all clients are connected to all bricks (if using FUSE)?
> Maybe you have some clients that cannot reach a brick.
>
> Best Regards,
> Strahil Nikolov
>
> On 8 June 2020 at 12:48:22 GMT+03:00, Hu Bert <revirii@xxxxxxxxxxxxxx> wrote:
> >Hi Strahil,
> >
> >thanks for your answer, but I assume that your approach won't help. It
> >seems this behaviour is permanent; e.g. a log entry like this:
> >
> >[2020-06-08 09:40:03.948269] E [MSGID: 113001]
> >[posix-metadata.c:234:posix_store_mdata_xattr] 0-persistent-posix:
> >file:
> >/gluster/md3/persistent/.glusterfs/38/30/38306ef8-6588-40cf-8be3-c0a022714612:
> >gfid: 38306ef8-6588-40cf-8be3-c0a022714612 key:trusted.glusterfs.mdata
> >[No such file or directory]
> >[2020-06-08 09:40:03.948333] E [MSGID: 113114]
> >[posix-metadata.c:433:posix_set_mdata_xattr_legacy_files]
> >0-persistent-posix: gfid: 38306ef8-6588-40cf-8be3-c0a022714612
> >key:trusted.glusterfs.mdata [No such file or directory]
> >[2020-06-08 09:40:03.948422] I [MSGID: 115060]
> >[server-rpc-fops.c:938:_gf_server_log_setxattr_failure]
> >0-persistent-server: 14193413: SETXATTR
> >/images/generated/207/039/2070391/484x425r.jpg
> >(38306ef8-6588-40cf-8be3-c0a022714612) ==> set-ctime-mdata, client:
> >CTX_ID:b738017c-20a3-4547-afba-5b8933d8e6e5-GRAPH_ID:0-PID:1078-HOST:pepe-PC_NAME:persistent-client-2-RECON_NO:-1,
> >error-xlator: persistent-posix
> >
> >tells me that an error (ctime-mdata) is found and fixed. And this is
> >happening over and over again. A couple of minutes ago I wanted to
> >start with what you suggested, ran 'gluster volume heal persistent
> >info' and suddenly saw:
> >
> >Brick gluster1:/gluster/md3/persistent
> >Status: Connected
> >Number of entries: 0
> >
> >Brick gluster2:/gluster/md3/persistent
> >Status: Connected
> >Number of entries: 0
> >
> >Brick gluster3:/gluster/md3/persistent
> >Status: Connected
> >Number of entries: 0
> >
> >I thought 'wtf...'; the heal-count was 0 as well, but the next call
> >~15s later showed this again:
> >
> >Brick gluster1:/gluster/md3/persistent
> >Number of entries: 31
> >
> >Brick gluster2:/gluster/md3/persistent
> >Number of entries: 27
> >
> >Brick gluster3:/gluster/md3/persistent
> >Number of entries: 4
> >
> >To me it looks like the 'error found -> heal it' process works as it
> >should, but due to the permanent errors (log file entries) a heal
> >count of zero is almost impossible to reach.
> >
> >Well, one could deactivate features.ctime, as this seems to be the
> >cause (as the log entries suggest), but I don't know whether that is
> >reasonable, i.e. whether this feature is needed.
> >
> >
> >Best regards,
> >Hubert
> >
> >On Mon, 8 Jun 2020 at 11:22, Strahil Nikolov <hunter86_bg@xxxxxxxxx> wrote:
> >>
> >> Hi Hubert,
> >>
> >> Here is one idea:
> >> Using 'gluster volume heal VOL info' can provide the gfids of files
> >> pending heal.
> >> Once you have them, you can find the inode of each file via 'ls -li
> >> /gluster/brick/.glusterfs/<first_two_characters_of_gfid>/<next_two_characters>/<full_gfid>'
> >>
> >> Then you can search the brick with find for that inode number (don't
> >> forget 'ionice' to reduce the pressure).
> >>
> >> Once you have the list of files, stat them via the FUSE client and
> >> check if they got healed.
> >>
> >> I fully agree that you need to heal the volumes first before
> >> proceeding further, or you might get into a nasty situation.
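[Editor's note: the gfid-to-inode procedure Strahil describes can be sketched as a small bash helper. The brick path and example gfid are taken from the log excerpts in this thread; the two-level .glusterfs/<aa>/<bb>/<gfid> hardlink layout is gluster's standard on-brick scheme. A sketch only, not part of the original mails.]

```shell
#!/bin/bash
# Resolve a gfid reported by 'gluster volume heal persistent info'
# to the real file path on a brick, via the inode number.
# Brick path from this thread; adjust for your setup.
BRICK=/gluster/md3/persistent

gfid_to_file() {
    local gfid="$1"
    # gluster keeps a hardlink to every file under
    # .glusterfs/<first two gfid chars>/<next two chars>/<full gfid>
    local link="$BRICK/.glusterfs/${gfid:0:2}/${gfid:2:2}/$gfid"
    local inode
    inode=$(stat -c %i "$link") || return 1
    # use ionice, if available, to reduce I/O pressure while find walks
    # the brick; prune .glusterfs so only the real path is printed
    local nice=""
    command -v ionice >/dev/null && nice="ionice -c2 -n7"
    $nice find "$BRICK" -path "$BRICK/.glusterfs" -prune \
        -o -inum "$inode" -print
}

# e.g., with the gfid from the log excerpt above:
# gfid_to_file 38306ef8-6588-40cf-8be3-c0a022714612
```

Once the real paths are known, stat'ing them through the FUSE mount, as described above, lets you check whether they got healed.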
> >>
> >> Best Regards,
> >> Strahil Nikolov
> >>
> >> On 8 June 2020 at 08:30:57 GMT+03:00, Hu Bert <revirii@xxxxxxxxxxxxxx> wrote:
> >> >Good morning,
> >> >
> >> >I just wanted to update the version from 6.8 to 6.9 on our replica 3
> >> >system (formerly on version 5.11), and I see tons of these messages:
> >> >
> >> >[2020-06-08 05:25:55.192301] E [MSGID: 113001]
> >> >[posix-metadata.c:234:posix_store_mdata_xattr] 0-persistent-posix:
> >> >file:
> >> >/gluster/md3/persistent/.glusterfs/43/31/43312aba-75c6-42c2-855c-e0db66d7748f:
> >> >gfid: 43312aba-75c6-42c2-855c-e0db66d7748f key:trusted.glusterfs.mdata
> >> >[No such file or directory]
> >> >[2020-06-08 05:25:55.192375] E [MSGID: 113114]
> >> >[posix-metadata.c:433:posix_set_mdata_xattr_legacy_files]
> >> >0-persistent-posix: gfid: 43312aba-75c6-42c2-855c-e0db66d7748f
> >> >key:trusted.glusterfs.mdata [No such file or directory]
> >> >[2020-06-08 05:25:55.192426] I [MSGID: 115060]
> >> >[server-rpc-fops.c:938:_gf_server_log_setxattr_failure]
> >> >0-persistent-server: 13382741: SETXATTR
> >> ><gfid:43312aba-75c6-42c2-855c-e0db66d7748f>
> >> >(43312aba-75c6-42c2-855c-e0db66d7748f) ==> set-ctime-mdata, client:
> >> >CTX_ID:e223ca30-6c30-4a40-ae98-a418143ce548-GRAPH_ID:0-PID:1006-HOST:sam-PC_NAME:persistent-client-2-RECON_NO:-1,
> >> >error-xlator: persistent-posix
> >> >
> >> >Still the ctime message. And a lot of these messages:
> >> >
> >> >[2020-06-08 05:25:53.016606] W [MSGID: 101159]
> >> >[inode.c:1330:__inode_unlink] 0-inode:
> >> >7043eed7-dbd7-4277-976f-d467349c1361/21194684.jpg: dentry not found in
> >> >839512f0-75de-414f-993d-1c35892f8560
> >> >
> >> >Well... the problem is: the volume seems to be in a permanent heal
> >> >status:
> >> >
> >> >Gathering count of entries to be healed on volume persistent has been
> >> >successful
> >> >Brick gluster1:/gluster/md3/persistent
> >> >Number of entries: 31
> >> >Brick gluster2:/gluster/md3/persistent
> >> >Number of entries: 6
> >> >Brick gluster3:/gluster/md3/persistent
> >> >Number of entries: 5
> >> >
> >> >A bit later:
> >> >
> >> >Gathering count of entries to be healed on volume persistent has been
> >> >successful
> >> >Brick gluster1:/gluster/md3/persistent
> >> >Number of entries: 100
> >> >Brick gluster2:/gluster/md3/persistent
> >> >Number of entries: 74
> >> >Brick gluster3:/gluster/md3/persistent
> >> >Number of entries: 1
> >> >
> >> >The number of entries never reaches 0-0-0; I already updated one of
> >> >the systems from 6.8 to 6.9, but updating the other 2 while heal isn't
> >> >at zero doesn't seem to be a good idea. Well... any idea?
> >> >
> >> >
> >> >Best regards,
> >> >Hubert
> >> >
> >> >On Fri, 8 May 2020 at 21:47, Strahil Nikolov <hunter86_bg@xxxxxxxxx> wrote:
> >> >>
> >> >> On April 21, 2020 8:00:32 PM GMT+03:00, Amar Tumballi <amar@xxxxxxxxx> wrote:
> >> >> >There seems to be a burst of issues when people upgraded to 5.x or
> >> >> >6.x from 3.12 (thanks to you and Strahil, who have reported most of
> >> >> >them).
> >> >> >
> >> >> >The latest update from Strahil is that if files are copied fresh on
> >> >> >the 7.5 series, there are no issues.
> >> >> >
> >> >> >We are in the process of identifying the patch, and will also
> >> >> >provide an option to disable 'acl' for testing. Will update once we
> >> >> >identify the issue.
> >> >> >
> >> >> >Regards,
> >> >> >Amar
> >> >> >
> >> >> >On Sat, Apr 11, 2020 at 11:10 AM Hu Bert <revirii@xxxxxxxxxxxxxx> wrote:
> >> >> >
> >> >> >> Hi,
> >> >> >>
> >> >> >> no one has seen such messages?
> >> >> >>
> >> >> >> Regards,
> >> >> >> Hubert
> >> >> >>
> >> >> >> On Mon, 6 Apr 2020 at 06:13, Hu Bert <revirii@xxxxxxxxxxxxxx> wrote:
> >> >> >> >
> >> >> >> > Hello,
> >> >> >> >
> >> >> >> > I just upgraded my servers and clients from 5.11 to 6.8; besides
> >> >> >> > one connection problem to the gluster download server everything
> >> >> >> > went fine.
> >> >> >> >
> >> >> >> > On the 3 gluster servers I mount the 2 volumes as well, and only
> >> >> >> > there (not on any of the other clients) there are some messages
> >> >> >> > in both mount logs:
> >> >> >> >
> >> >> >> > [2020-04-06 04:10:53.552561] W [MSGID: 114031]
> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> >> > 0-persistent-client-2: remote operation failed [Permission denied]
> >> >> >> > [2020-04-06 04:10:53.552635] W [MSGID: 114031]
> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> >> > 0-persistent-client-1: remote operation failed [Permission denied]
> >> >> >> > [2020-04-06 04:10:53.552639] W [MSGID: 114031]
> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> >> > 0-persistent-client-0: remote operation failed [Permission denied]
> >> >> >> > [2020-04-06 04:10:53.553226] E [MSGID: 148002]
> >> >> >> > [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-persistent-utime:
> >> >> >> > dict set of key for set-ctime-mdata failed [Permission denied]
> >> >> >> > The message "W [MSGID: 114031]
> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> >> > 0-persistent-client-2: remote operation failed [Permission denied]"
> >> >> >> > repeated 4 times between [2020-04-06 04:10:53.552561] and
> >> >> >> > [2020-04-06 04:10:53.745542]
> >> >> >> > The message "W [MSGID: 114031]
> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> >> > 0-persistent-client-1: remote operation failed [Permission denied]"
> >> >> >> > repeated 4 times between [2020-04-06 04:10:53.552635] and
> >> >> >> > [2020-04-06 04:10:53.745610]
> >> >> >> > The message "W [MSGID: 114031]
> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> >> > 0-persistent-client-0: remote operation failed [Permission denied]"
> >> >> >> > repeated 4 times between [2020-04-06 04:10:53.552639] and
> >> >> >> > [2020-04-06 04:10:53.745632]
> >> >> >> > The message "E [MSGID: 148002]
> >> >> >> > [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-persistent-utime:
> >> >> >> > dict set of key for set-ctime-mdata failed [Permission denied]"
> >> >> >> > repeated 4 times between [2020-04-06 04:10:53.553226] and
> >> >> >> > [2020-04-06 04:10:53.746080]
> >> >> >> >
> >> >> >> > Anything to worry about?
> >> >> >> >
> >> >> >> >
> >> >> >> > Regards,
> >> >> >> > Hubert
> >> >> >> ________
> >> >> >>
> >> >> >> Community Meeting Calendar:
> >> >> >>
> >> >> >> Schedule -
> >> >> >> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> >> >> >> Bridge: https://bluejeans.com/441850968
> >> >> >>
> >> >> >> Gluster-users mailing list
> >> >> >> Gluster-users@xxxxxxxxxxx
> >> >> >> https://lists.gluster.org/mailman/listinfo/gluster-users
> >> >> >>
> >> >>
> >> >> Hi,
> >> >>
> >> >> Can you provide the xfs_info for the bricks from the volume?
> >> >>
> >> >> I have a theory that I want to confirm or reject.
> >> >>
> >> >> Best Regards,
> >> >> Strahil Nikolov
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
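[Editor's note: for reference, the checks and knobs discussed over the course of this thread map onto the following commands. A sketch only; the volume name (persistent) and brick device path come from the logs above, so verify them against your own setup, and weigh the trade-off before touching features.ctime, since consistent ctime semantics are lost while it is off.]

```shell
# Pending-heal entries per brick (full view and count-only view)
gluster volume heal persistent info
gluster volume heal persistent statistics heal-count

# Verify that every client is connected to all bricks (the FUSE check)
gluster volume status persistent clients

# Filesystem geometry of a brick, as requested for the xfs theory
xfs_info /gluster/md3

# Last resort for the recurring ctime-mdata errors: disable the feature
gluster volume set persistent features.ctime off
```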