Confusion supreme

Hello all

I have a mail store on a replica 3 volume with no arbiter. A while
ago the disk of one of the bricks failed, and I was several days
late in noticing it. When I did, I removed that brick from the volume,
replaced the failed disk, upgraded the OS on that machine from el8
to el9 and gluster on all three nodes from 10.3 to 11.1, added the
brick back and started a heal. Things appeared to work out OK, but
in fact they did not.
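
For reference, this is roughly the sequence I went through (the failed
brick shown as a placeholder, and from memory, so the exact flags may
be off):

# gluster volume remove-brick gv0 replica 2 <failed-host>:/vol/gfs/gv0 force
  ... replace the disk, upgrade the OS to el9, upgrade gluster to 11.1 ...
# gluster volume add-brick gv0 replica 3 <failed-host>:/vol/gfs/gv0 force
# gluster volume heal gv0 full

And this is what I have now: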

# gluster volume info gv0

Volume Name: gv0
Type: Replicate
Volume ID: 1e3ca399-8e57-4ee8-997f-f64479199d23
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: zephyrosaurus:/vol/gfs/gv0
Brick2: alvarezsaurus:/vol/gfs/gv0
Brick3: nanosaurus:/vol/gfs/gv0
Options Reconfigured:
cluster.entry-self-heal: on
cluster.metadata-self-heal: on
cluster.data-self-heal: on
<snip>

On all three hosts:

# ls /vol/vmail/net/provocation/oracle/Maildir/cur
ls: reading directory '/vol/vmail/net/provocation/oracle/Maildir/cur': Invalid argument

That is my mail inbox on the glusterfs mount.

If I list the bricks directly instead, I get different results on each host:

HostA
# ls /vol/gfs/gv0/net/provocation/oracle/Maildir/cur/ |wc -l
4848

HostB
# ls /vol/gfs/gv0/net/provocation/oracle/Maildir/cur/ |wc -l
522

HostC
# ls /vol/gfs/gv0/net/provocation/oracle/Maildir/cur/ |wc -l
4837

However,

# gluster volume heal gv0 info
Brick zephyrosaurus:/vol/gfs/gv0
/net/provocation/oracle/Maildir/cur
/net/provocation/oracle/Maildir/cur/1701712419.M379665P902306V000000000000002DI8264026770F33CFF_1.zephyrosaurus.nettheatre.org,S=14500:2,RS
/net/provocation/oracle/Maildir/cur/1701712390.M212926P902294V000000000000002DIA089A37BF7E58BB4_1.zephyrosaurus.nettheatre.org,S=19286:2,S
Status: Connected
Number of entries: 3

Brick alvarezsaurus:/vol/gfs/gv0
/net/provocation/oracle/Maildir/cur
Status: Connected
Number of entries: 1

Brick nanosaurus:/vol/gfs/gv0
/net/provocation/oracle/Maildir/cur
Status: Connected
Number of entries: 1

That's definitely not what it should be. There are at least 4300
files missing from hostB and about a dozen from hostC that are not
queued for healing. And nothing is in split-brain.
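
That last statement comes from the usual check, which reports zero
entries on every brick:

# gluster volume heal gv0 info split-brain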

So I checked the extended attributes of that Maildir/cur directory on each brick:

HostA
# getfattr -d -m . -e hex /vol/gfs/gv0/net/provocation/oracle/Maildir/cur
getfattr: Removing leading '/' from absolute path names
# file: vol/gfs/gv0/net/provocation/oracle/Maildir/cur
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gv0-client-1=0x000000000000000000002394
trusted.afr.gv0-client-2=0x0000000000000000000000a0
trusted.afr.gv0-client-4=0x00000001000000000000001a
trusted.gfid=0xbf3ed8b7b2a8457d88f19482ae1ce73d
trusted.glusterfs.dht=0x000000000000000000000000ffffffff
trusted.glusterfs.mdata=0x010000000000000000520318006fbb5600000000001c00ba4800000000667c1663000000002e7f0c84a2925c3455bf9e00000000001098f13e

HostB
# getfattr -d -m . -e hex /vol/gfs/gv0/net/provocation/oracle/Maildir/cur
getfattr: Removing leading '/' from absolute path names
# file: vol/gfs/gv0/net/provocation/oracle/Maildir/cur
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gv0-client-2=0x0000000000000000000000a0
trusted.gfid=0xbf3ed8b7b2a8457d88f19482ae1ce73d
trusted.glusterfs.dht=0x000000000000000000000000ffffffff
trusted.glusterfs.mdata=0x010000000000000000520318006fbb5600000000001c00ba4800000000667c1663000000002e7f0c84a2925c3455bf9e00000000001098f13e

HostC
# getfattr -d -m . -e hex /vol/gfs/gv0/net/provocation/oracle/Maildir/cur
getfattr: Removing leading '/' from absolute path names
# file: vol/gfs/gv0/net/provocation/oracle/Maildir/cur
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gv0-client-1=0x000000000000000000002394
trusted.afr.gv0-client-3=0x000000000000000000000000
trusted.afr.gv0-client-4=0x000000010000000000000020
trusted.gfid=0xbf3ed8b7b2a8457d88f19482ae1ce73d
trusted.glusterfs.dht=0x000000000000000000000000ffffffff
trusted.glusterfs.mdata=0x010000000000000000520318006fbb5600000000001c00ba4800000000667c1663000000002e7f0c84a2925c3455bf9e00000000001098f13e

There is the explanation of this mess. In a replica 3 volume I
should have exactly three clients, yet the afr attributes reference
four different ones (gv0-client-1 through gv0-client-4), and no two
bricks carry the same set of them.
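
If it helps anyone cross-check, the mapping of the gv0-client-N names
to actual bricks should be visible in the client volfile on each node,
with something like (the volfile name may differ between versions):

# grep -E 'volume gv0-client-|remote-host' /var/lib/glusterd/vols/gv0/trusted-gv0.tcp-fuse.vol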

In a recent thread about similar problems, Ilias used the word
"clueless". It applies equally to me: I have zero clue where to
begin or what to do. Any ideas, anyone?

Cheers,

Z


--
Glory to Ukraine!
Putler is a dickhead!