On 15/03/21 3:39 pm, Zenon Panoussis wrote:
Does anyone know what healing error 22 "invalid argument" is and how to fix it, or at least how to troubleshoot it?

    while true; do date; gluster volume heal gv0 statistics heal-count; echo -e "--------------\n"; sleep 297; done

    Fri Mar 12 14:58:36 CET 2021
    Gathering count of entries to be healed on volume gv0 has been successful
    Brick node01:/gfs/gv0
    Number of entries: 4
    Brick node02:/gfs/gv0
    Number of entries: 343
    Brick node03:/gfs/gv0
    Number of entries: 344
    --------------

Three days later...

    Mon Mar 15 10:57:23 CET 2021
    Gathering count of entries to be healed on volume gv0 has been successful
    Brick node01:/gfs/gv0
    Number of entries: 4
    Brick node02:/gfs/gv0
    Number of entries: 343
    Brick node03:/gfs/gv0
    Number of entries: 344
    --------------

glustershd.log is full of entries like these:

    [2021-03-15 05:38:01.991945 +0000] I [MSGID: 108026] [afr-self-heal-entry.c:1053:afr_selfheal_entry_do] 0-gv0-replicate-0: performing entry selfheal on 011fcc1b-4d90-4c36-86ec-488aaa4db3b8
    [2021-03-15 05:59:02.812770 +0000] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-gv0-client-2: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
    [2021-03-15 05:59:02.813933 +0000] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-gv0-client-1: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
    [2021-03-15 05:59:03.061068 +0000] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-gv0-client-0: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
    [2021-03-15 05:59:05.547156 +0000] I [MSGID: 108026] [afr-self-heal-entry.c:1053:afr_selfheal_entry_do] 0-gv0-replicate-0: performing entry selfheal on 1a5121eb-a90b-4b23-92ba-f277124cb82a

That is, it starts healing an object, fails a few times, moves on to the next, fails on it too, and so on ad infinitum. The volume is a replica 3, gluster is v9.0, and all three bricks are up and connected.
This behaviour started shortly after I enabled granular-entry-heal. Whether that has anything to do with the problem or not, I don't know. Disabling granular-entry-heal again did not help.
- Was this an upgraded setup or a fresh v9.0 install? Asking because v9.0 has granular-entry-heal on by default for new volumes.
- When there are entries yet to be healed, the CLI should have prevented you from toggling this option - was that not the case?
- Can you find the directory name corresponding to the gfid 011fcc1b-4d90-4c36-86ec-488aaa4db3b8 (use https://github.com/gluster/glusterfs/blob/master/extras/gfid-to-dirname.sh if needed) and check whether all files and subdirectories (first level only) inside it are the same on all 3 bricks?
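In case it helps with the last point, here is a rough sketch of how the first-level comparison could be done by hand. It is not the gfid-to-dirname.sh script linked above; the function names gfid_backend_path and list_first_level are made up for illustration. It assumes the usual brick layout, where a directory's gfid entry is a symlink at <brick>/.glusterfs/<first 2 chars>/<next 2 chars>/<gfid>, and that GNU readlink is available on the nodes.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of Gluster.

# Print the .glusterfs backend path for a gfid on a given brick.
# Assumes the usual layout:
#   <brick>/.glusterfs/<first 2 chars of gfid>/<next 2 chars>/<gfid>
gfid_backend_path() {
    brick=$1
    gfid=$2
    p1=$(printf '%s' "$gfid" | cut -c1-2)
    p2=$(printf '%s' "$gfid" | cut -c3-4)
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$p1" "$p2" "$gfid"
}

# List the first-level entries of the directory behind a gfid, sorted,
# so that listings taken on different bricks can be compared with diff.
# For a directory the backend entry is a symlink, which readlink -f
# resolves to the real directory on the brick.
list_first_level() {
    dir=$(readlink -f "$(gfid_backend_path "$1" "$2")") || return 1
    ls -A "$dir" | LC_ALL=C sort
}
```

With these functions loaded on each node, something like "list_first_level /gfs/gv0 011fcc1b-4d90-4c36-86ec-488aaa4db3b8 > /tmp/$(hostname).list" would give one listing per brick, and the three files can then be compared with diff. The output file name is only an example.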
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users