Re: [Gluster-users] Lot of EIO errors in disperse volume

Hi Ram,

On 16/01/17 12:33, Ankireddypalle Reddy wrote:
Xavi,
          Thanks. Is there any other way to map a GFID to a path?

The only way I know is to search all files from bricks and lookup for the trusted.gfid xattr.
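A rough sketch of that brute-force scan, for reference (assuming it is run as root directly on a brick; the helper names are ours, not a gluster tool). The second helper builds the `.glusterfs` hard-link path that every file on a brick has for its GFID:

```python
import os

def find_path_by_gfid(brick_root, gfid):
    """Walk a brick and return the first regular file whose
    trusted.gfid xattr matches the given GFID (dashed or not)."""
    want = bytes.fromhex(gfid.replace("-", ""))
    for dirpath, dirs, files in os.walk(brick_root):
        if ".glusterfs" in dirs:
            dirs.remove(".glusterfs")  # skip the internal gfid store
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                if os.getxattr(path, "trusted.gfid") == want:
                    return path
            except OSError:
                pass  # no xattr / permission denied
    return None

def gfid_to_backend_path(brick_root, gfid):
    """Backend hard-link kept for every file:
    <brick>/.glusterfs/<first 2 hex chars>/<next 2>/<full gfid>."""
    g = gfid.lower()
    return os.path.join(brick_root, ".glusterfs", g[:2], g[2:4], g)
```
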

I will look for a way to share the TRACE logs. An easier way might be to add some extra logging. I could do that if you let me know which functions you are interested in.

The problem is that I don't know where the problem is. One possibility could be to track all return values from all bricks for all writes and then identify which ones belong to an inconsistent file.

But if this doesn't reveal anything interesting we'll need to look at some other place. And this can be very tedious and slow.

Anyway, what we are looking at now is not the source of an EIO, since there are two bricks with consistent state and the file should be perfectly readable and writable. It's true that there's some problem here, and it could result in EIO if one of the healthy bricks degrades, but at least this file shouldn't be giving EIO errors for now.

Xavi


Sent from my iPhone

On Jan 16, 2017, at 6:23 AM, Xavier Hernandez <xhernandez@xxxxxxxxxx> wrote:

Hi Ram,

On 13/01/17 18:41, Ankireddypalle Reddy wrote:
Xavi,
            I enabled TRACE logging. The log grew to 120GB and I could not make much out of it. Then I started logging the GFID in the code where we were seeing errors.

[2017-01-13 17:02:01.761349] I [dict.c:3065:dict_dump_to_log] 0-glusterfsProd-disperse-0: dict=0x7fa6706bc690 ((trusted.ec.size:0:0:0:0:30:6b:0:0:)(trusted.ec.version:0:0:0:0:0:0:2a:38:0:0:0:0:0:0:2a:38:))
[2017-01-13 17:02:01.761360] I [dict.c:3065:dict_dump_to_log] 0-glusterfsProd-disperse-0: dict=0x7fa6706bed64 ((trusted.ec.size:0:0:0:0:0:0:0:0:)(trusted.ec.version:0:0:0:0:0:0:0:0:0:0:0:0:0:0:2a:38:))
[2017-01-13 17:02:01.761365] W [MSGID: 122056] [ec-combine.c:881:ec_combine_check] 0-glusterfsProd-disperse-0: Mismatching xdata in answers of 'LOOKUP'
[2017-01-13 17:02:01.761405] I [dict.c:166:key_value_cmp] 0-glusterfsProd-disperse-0: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-13 17:02:01.761417] I [dict.c:3065:dict_dump_to_log] 0-glusterfsProd-disperse-0: dict=0x7fa6706bbb14 ((trusted.ec.size:0:0:0:0:30:6b:0:0:)(trusted.ec.version:0:0:0:0:0:0:2a:38:0:0:0:0:0:0:2a:38:))
[2017-01-13 17:02:01.761428] I [dict.c:3065:dict_dump_to_log] 0-glusterfsProd-disperse-0: dict=0x7fa6706bed64 ((trusted.ec.size:0:0:0:0:0:0:0:0:)(trusted.ec.version:0:0:0:0:0:0:0:0:0:0:0:0:0:0:2a:38:))
[2017-01-13 17:02:01.761433] W [MSGID: 122056] [ec-combine.c:881:ec_combine_check] 0-glusterfsProd-disperse-0: Mismatching xdata in answers of 'LOOKUP'
[2017-01-13 17:02:01.761442] W [MSGID: 122006] [ec-combine.c:214:ec_iatt_combine] 0-glusterfsProd-disperse-0: Failed to combine iatt (inode: 11275691004192850514-11275691004192850514, gfid: 60b990ed-d741-4176-9c7b-4d3a25fb8252  -  60b990ed-d741-4176-9c7b-4d3a25fb8252,  links: 1-1, uid: 0-0, gid: 0-0, rdev: 0-0,size: 406650880-406683648, mode: 100775-100775)

The file for which we are seeing this error turns out to be having a GFID of 60b990ed-d741-4176-9c7b-4d3a25fb8252

Then I tried to find the file with this GFID. It pointed me to the following path. I was expecting a real file system path, based on the following tutorial:
https://gluster.readthedocs.io/en/latest/Troubleshooting/gfid-to-path/

I think this method only works if bricks have the inode cached.


getfattr -n trusted.glusterfs.pathinfo -e text /mnt/gfid/.gfid/60b990ed-d741-4176-9c7b-4d3a25fb8252
getfattr: Removing leading '/' from absolute path names
# file: mnt/gfid/.gfid/60b990ed-d741-4176-9c7b-4d3a25fb8252
trusted.glusterfs.pathinfo="(<DISTRIBUTE:glusterfsProd-dht> (<EC:glusterfsProd-disperse-0> <POSIX(/ws/disk1/ws_brick):glusterfs6:/ws/disk1/ws_brick/.glusterfs/60/b9/60b990ed-d741-4176-9c7b-4d3a25fb8252> <POSIX(/ws/disk1/ws_brick):glusterfs5:/ws/disk1/ws_brick/.glusterfs/60/b9/60b990ed-d741-4176-9c7b-4d3a25fb8252>))"
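For scripting, the (host, brick path) pairs can be pulled out of such a pathinfo string with a small helper (a sketch of ours, matching the POSIX entries shown above):

```python
import re

def parse_pathinfo(value):
    """Extract (host, backend-path) pairs from a
    trusted.glusterfs.pathinfo xattr value."""
    return re.findall(r"<POSIX\([^)]*\):([^:>]+):([^>]+)>", value)

# the value returned by getfattr above
sample = ('(<DISTRIBUTE:glusterfsProd-dht> (<EC:glusterfsProd-disperse-0> '
          '<POSIX(/ws/disk1/ws_brick):glusterfs6:/ws/disk1/ws_brick/.glusterfs/60/b9/60b990ed-d741-4176-9c7b-4d3a25fb8252> '
          '<POSIX(/ws/disk1/ws_brick):glusterfs5:/ws/disk1/ws_brick/.glusterfs/60/b9/60b990ed-d741-4176-9c7b-4d3a25fb8252>))')

for host, path in parse_pathinfo(sample):
    print(host, path)
```

Note that only two bricks are listed here even though the disperse set has three; the missing one is the brick that did not answer with a consistent state.
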

Then I looked at the xattrs of these files on all 3 bricks:

[root@glusterfs4 glusterfs]# getfattr -d -m . -e hex /ws/disk1/ws_brick/.glusterfs/60/b9/60b990ed-d741-4176-9c7b-4d3a25fb8252
getfattr: Removing leading '/' from absolute path names
# file: ws/disk1/ws_brick/.glusterfs/60/b9/60b990ed-d741-4176-9c7b-4d3a25fb8252
trusted.bit-rot.version=0x02000000000000005877a8dc00041138
trusted.ec.config=0x0000080301000200
trusted.ec.size=0x0000000000000000
trusted.ec.version=0x00000000000000000000000000002a38
trusted.gfid=0x60b990edd74141769c7b4d3a25fb8252

[root@glusterfs5 bricks]# getfattr -d -m . -e hex /ws/disk1/ws_brick/.glusterfs/60/b9/60b990ed-d741-4176-9c7b-4d3a25fb8252
getfattr: Removing leading '/' from absolute path names
# file: ws/disk1/ws_brick/.glusterfs/60/b9/60b990ed-d741-4176-9c7b-4d3a25fb8252
trusted.bit-rot.version=0x02000000000000005877a8dc000c92d0
trusted.ec.config=0x0000080301000200
trusted.ec.dirty=0x00000000000000160000000000000000
trusted.ec.size=0x00000000306b0000
trusted.ec.version=0x0000000000002a380000000000002a38
trusted.gfid=0x60b990edd74141769c7b4d3a25fb8252

[root@glusterfs6 ee]# getfattr -d -m . -e hex /ws/disk1/ws_brick/.glusterfs/60/b9/60b990ed-d741-4176-9c7b-4d3a25fb8252
getfattr: Removing leading '/' from absolute path names
# file: ws/disk1/ws_brick/.glusterfs/60/b9/60b990ed-d741-4176-9c7b-4d3a25fb8252
trusted.bit-rot.version=0x02000000000000005877a8dc000c9436
trusted.ec.config=0x0000080301000200
trusted.ec.dirty=0x00000000000000160000000000000000
trusted.ec.size=0x00000000306b0000
trusted.ec.version=0x0000000000002a380000000000002a38
trusted.gfid=0x60b990edd74141769c7b4d3a25fb8252

It turns out that the size and version in fact do not match on one of the bricks.
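For reference, the two 64-bit counters packed into trusted.ec.version, and the byte count in trusted.ec.size, can be decoded with a few lines of Python (a sketch; the data-version-first layout is inferred from the values above, not taken from the ec source):

```python
def decode_ec_version(hex_value):
    """Split the 16-byte trusted.ec.version xattr into its two 64-bit
    counters: (data version, metadata version), data first."""
    v = int(hex_value, 16)
    return v >> 64, v & ((1 << 64) - 1)

def decode_ec_size(hex_value):
    """trusted.ec.size is a big-endian 64-bit byte count."""
    return int(hex_value, 16)

# glusterfs5 / glusterfs6 bricks above: both counters at 0x2a38
print(decode_ec_version("0x0000000000002a380000000000002a38"))  # (10808, 10808)
print(decode_ec_size("0x00000000306b0000"))                     # 812318720 bytes
# glusterfs4 brick: data version stuck at 0, metadata in sync, size 0
print(decode_ec_version("0x00000000000000000000000000002a38"))  # (0, 10808)
```

This makes the asymmetry obvious: glusterfs4 agrees on the metadata counter but has never bumped its data counter or size, which fits Xavi's reading that it missed all the writes.
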

It seems as if the brick on glusterfs4 didn't receive any write requests (or they failed for some reason). Do you still have the trace log? Is there any way I could download it?

Xavi


Thanks and Regards,
Ram

-----Original Message-----
From: gluster-devel-bounces@xxxxxxxxxxx [mailto:gluster-devel-bounces@xxxxxxxxxxx] On Behalf Of Ankireddypalle Reddy
Sent: Friday, January 13, 2017 4:17 AM
To: Xavier Hernandez
Cc: gluster-users@xxxxxxxxxxx; Gluster Devel (gluster-devel@xxxxxxxxxxx)
Subject: Re:  [Gluster-users] Lot of EIO errors in disperse volume

Xavi,
       Thanks for explanation. Will collect TRACE logs today.

Thanks and Regards,
Ram

Sent from my iPhone

On Jan 13, 2017, at 3:03 AM, Xavier Hernandez <xhernandez@xxxxxxxxxx> wrote:

Hi Ram,

On 12/01/17 22:14, Ankireddypalle Reddy wrote:
Xavi,
           I changed the logging to log the individual bytes. Consider the following from ws-glus.log file where /ws/glus is the mount point.

[2017-01-12 20:47:59.368102] I [MSGID: 109063] [dht-layout.c:718:dht_layout_normalize] 0-glusterfsProd-dht: Found anomalies in /Folder_01.05.2017_21.15/CV_MAGNETIC/V_30970/CHUNK_390607 (gfid = e694387f-dde7-410b-9562-914a994d5e85). Holes=1 overlaps=0
[2017-01-12 20:47:59.391218] I [MSGID: 109036] [dht-common.c:9082:dht_log_new_layout_for_dir_selfheal] 0-glusterfsProd-dht: Setting layout of /Folder_01.05.2017_21.15/CV_MAGNETIC/V_30970/CHUNK_390607 with [Subvol_name: glusterfsProd-disperse-0, Err: -1 , Start: 2505397587 , Stop: 2863311527 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-1, Err: -1 , Start: 2863311528 , Stop: 3221225468 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-10, Err: -1 , Start: 3221225469 , Stop: 3579139409 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-11, Err: -1 , Start: 3579139410 , Stop: 3937053350 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-2, Err: -1 , Start: 3937053351 , Stop: 4294967295 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-3, Err: -1 , Start: 0 , Stop: 357913940 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-4, Err: -1 , Start: 357913941 , Stop: 715827881 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-5, Err: -1 , Start: 715827882 , Stop: 1073741822 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-6, Err: -1 , Start: 1073741823 , Stop: 1431655763 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-7, Err: -1 , Start: 1431655764 , Stop: 1789569704 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-8, Err: -1 , Start: 1789569705 , Stop: 2147483645 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-9, Err: -1 , Start: 2147483646 , Stop: 2505397586 , Hash: 1 ],

          Self-heal seems to be triggered for path /Folder_01.05.2017_21.15/CV_MAGNETIC/V_30970/CHUNK_390607 due to anomalies as per DHT. It would be great if someone could explain what the anomaly could be here. The setup where we encountered this is fairly stable, with no brick or node failures.

This is not really a self-heal, at least from the point of view of ec. This means that DHT has found a discrepancy in the layout of that directory, however this doesn't mean any problem (notice the 'I' in the log, meaning that it's informative, not a warning nor error).

Not sure how DHT works in this case or why it finds this "anomaly", but if there aren't any previous errors before that message, it can be completely ignored.

Not sure if it can be related to option cluster.weighted-rebalance that is enabled by default.


         Then Self-heal seems to have encountered the following error.

[2017-01-12 20:48:23.418432] I [dict.c:166:key_value_cmp] 0-glusterfsProd-disperse-2: 'trusted.ec.version' is different in two dicts (16, 16)
[2017-01-12 20:48:23.418496] I [dict.c:3065:dict_dump_to_log] 0-glusterfsProd-disperse-2: dict=0x7f0b649520ac ((trusted.glusterfs.dht:0:0:0:1:0:0:0:0:0:0:0:0:15:55:55:54:)(trusted.ec.version:0:0:0:0:0:0:0:b:0:0:0:0:0:0:0:e:))
[2017-01-12 20:48:23.418519] I [dict.c:3065:dict_dump_to_log] 0-glusterfsProd-disperse-2: dict=0x7f0b6495b4e0 ((trusted.glusterfs.dht:0:0:0:1:0:0:0:0:0:0:0:0:15:55:55:54:)(trusted.ec.version:0:0:0:0:0:0:0:d:0:0:0:0:0:0:0:e:))
[2017-01-12 20:48:23.418531] W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-2: Mismatching xdata in answers of 'LOOKUP'

That's a real problem. Here we have two bricks that differ in the trusted.ec.version xattr. However, this xattr does not necessarily belong to the previous directory; they are unrelated messages.


           In this case glusterfsProd-disperse-2 sub volume actually consists of the following bricks.
           glusterfs4sds:/ws/disk11/ws_brick, glusterfs5sds:
/ws/disk11/ws_brick, glusterfs6sds: /ws/disk11/ws_brick

           I went ahead and checked the value of trusted.ec.version on all the 3 bricks inside this sub vol:

           [root@glusterfs6 ~]# getfattr -e hex -n trusted.ec.version /ws/disk11/ws_brick//Folder_01.05.2017_21.15/CV_MAGNETIC/V_30970/CHUNK_390607
           # file: ws/disk11/ws_brick//Folder_01.05.2017_21.15/CV_MAGNETIC/V_30970/CHUNK_390607
           trusted.ec.version=0x0000000000000009000000000000000b

           [root@glusterfs4 ~]# getfattr -e hex -n trusted.ec.version /ws/disk11/ws_brick//Folder_01.05.2017_21.15/CV_MAGNETIC/V_30970/CHUNK_390607
           # file: ws/disk11/ws_brick//Folder_01.05.2017_21.15/CV_MAGNETIC/V_30970/CHUNK_390607
            trusted.ec.version=0x0000000000000009000000000000000b

            [root@glusterfs5 glusterfs]# getfattr -e hex -n trusted.ec.version /ws/disk11/ws_brick//Folder_01.05.2017_21.15/CV_MAGNETIC/V_30970/CHUNK_390607
            # file: ws/disk11/ws_brick//Folder_01.05.2017_21.15/CV_MAGNETIC/V_30970/CHUNK_390607
            trusted.ec.version=0x0000000000000009000000000000000b

The attribute value seems to be same on all the 3 bricks.

That's a clear indication that the ec warning is not related to this directory, because trusted.ec.version always increases, never decreases, and the directory has a value smaller than the one that appears in the log message.

Looking at the dict entries shown in the log, it seems that it does refer to a directory, because trusted.ec.size is not present, but it must be a different directory than the one you looked at. We would need to find which one is having this issue. The TRACE log would be helpful here.



             Also please note that every single time that the trusted.ec.version was found to mismatch the  same values are getting logged. Following are 2 more instances of trusted.ec.version mismatch.

[2017-01-12 20:14:25.554540] I [dict.c:166:key_value_cmp] 0-glusterfsProd-disperse-2: 'trusted.ec.version' is different in two dicts (16, 16)
[2017-01-12 20:14:25.554588] I [dict.c:3065:dict_dump_to_log] 0-glusterfsProd-disperse-2: dict=0x7f0b6495a9f0 ((glusterfs.open-fd-count:30:0:)(trusted.glusterfs.dht:0:0:0:1:0:0:0:0:0:0:0:0:15:55:55:54:)(trusted.ec.version:0:0:0:0:0:0:0:d:0:0:0:0:0:0:0:e:))
[2017-01-12 20:14:25.554608] I [dict.c:3065:dict_dump_to_log] 0-glusterfsProd-disperse-2: dict=0x7f0b6495903c ((glusterfs.open-fd-count:30:0:)(trusted.glusterfs.dht:0:0:0:1:0:0:0:0:0:0:0:0:15:55:55:54:)(trusted.ec.version:0:0:0:0:0:0:0:b:0:0:0:0:0:0:0:e:))
[2017-01-12 20:14:25.554624] W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-2: Operation failed on some subvolumes (up=7, mask=7, remaining=0, good=3, bad=4)
[2017-01-12 20:14:25.554632] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 0-glusterfsProd-disperse-2: Heal failed [Invalid argument]
[2017-01-12 20:14:25.555598] I [dict.c:166:key_value_cmp] 0-glusterfsProd-disperse-2: 'trusted.ec.version' is different in two dicts (16, 16)
[2017-01-12 20:14:25.555622] I [dict.c:3065:dict_dump_to_log] 0-glusterfsProd-disperse-2: dict=0x7f0b64956c24 ((glusterfs.open-fd-count:30:0:)(trusted.glusterfs.dht:0:0:0:1:0:0:0:0:0:0:0:0:15:55:55:54:)(trusted.ec.version:0:0:0:0:0:0:0:b:0:0:0:0:0:0:0:e:))
[2017-01-12 20:14:25.555638] I [dict.c:3065:dict_dump_to_log] 0-glusterfsProd-disperse-2: dict=0x7f0b64964e8c ((glusterfs.open-fd-count:30:0:)(trusted.glusterfs.dht:0:0:0:1:0:0:0:0:0:0:0:0:15:55:55:54:)(trusted.ec.version:0:0:0:0:0:0:0:d:0:0:0:0:0:0:0:e:))


I think that this refers to the same directory. This seems an attempt to heal it that has failed. So it makes sense that it finds exactly the same values.



In glustershd.log lot of similar errors are logged.

[2017-01-12 21:10:53.728770] I [dict.c:166:key_value_cmp] 0-glusterfsProd-disperse-0: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-12 21:10:53.728804] I [dict.c:3065:dict_dump_to_log] 0-glusterfsProd-disperse-0: dict=0x7f21694b6f50 ((trusted.ec.size:0:0:0:0:42:3a:0:0:)(trusted.ec.version:0:0:0:0:0:0:37:5f:0:0:0:0:0:0:37:5f:))
[2017-01-12 21:10:53.728827] I [dict.c:3065:dict_dump_to_log] 0-glusterfsProd-disperse-0: dict=0x7f21694b62bc ((trusted.ec.size:0:0:0:0:0:ca:0:0:)(trusted.ec.version:0:0:0:0:0:0:0:a1:0:0:0:0:0:0:37:5f:))
[2017-01-12 21:10:53.728842] W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-0: Mismatching xdata in answers of 'LOOKUP'
[2017-01-12 21:10:53.728854] W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-0: Operation failed on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
[2017-01-12 21:10:53.728876] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 0-glusterfsProd-disperse-0: Heal failed [Invalid argument]

This seems an attempt to heal a file, but I see a lot of differences between both versions. The size on one brick is 13,238,272 bytes, but on the other brick it's 1,111,097,344 bytes. That's a huge difference.

Looking at the trusted.ec.version, I see that the 'data' version is very different (from 161 to 14,175), however the metadata version is exactly the same. This really seems like a lot of writes while one brick was down (or disconnected for some reason, or the writes failed for some reason). One brick has lost about 14,000 writes of ~80KB.
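Those numbers can be double-checked by parsing the colon-separated byte dumps from dict_dump_to_log directly (the helper name is ours, not a gluster utility):

```python
def dump_bytes_to_int(dump):
    """Turn the colon-separated hex bytes printed by dict_dump_to_log
    (e.g. '0:0:0:0:42:3a:0:0:') into an integer."""
    return int.from_bytes(bytes(int(b, 16) for b in dump.strip(":").split(":")), "big")

# trusted.ec.size from the two LOOKUP answers logged above
size_a = dump_bytes_to_int("0:0:0:0:42:3a:0:0:")  # 1111097344 bytes
size_b = dump_bytes_to_int("0:0:0:0:0:ca:0:0:")   # 13238272 bytes

# data versions: the first 8 bytes of each trusted.ec.version dump
ver_a = dump_bytes_to_int("0:0:0:0:0:0:37:5f:")   # 14175
ver_b = dump_bytes_to_int("0:0:0:0:0:0:0:a1:")    # 161

avg_write = (size_a - size_b) // (ver_a - ver_b)  # roughly 78 KiB per lost write
```
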

I think the most important thing right now would be to identify which files and directories are having these problems to be able to identify the cause. Again, the TRACE log will be really useful.

Xavi


Thanks and Regards,
Ram

-----Original Message-----
From: Xavier Hernandez [mailto:xhernandez@xxxxxxxxxx]
Sent: Thursday, January 12, 2017 6:40 AM
To: Ankireddypalle Reddy
Cc: Gluster Devel (gluster-devel@xxxxxxxxxxx);
gluster-users@xxxxxxxxxxx
Subject: Re: [Gluster-users]  Lot of EIO errors in
disperse volume

Hi Ram,


On 12/01/17 11:49, Ankireddypalle Reddy wrote:
Xavi,
        As I mentioned before, the error can happen for any FOP. I will try to run with the TRACE debug level. Is there a possibility that we are checking for this attribute on a directory? A directory does not seem to have this attribute set.

No, directories do not have this attribute and no one should be reading it from a directory.

Also, is the function that checks size and version called after it has been decided that a heal should run, or is this check the one that decides whether a heal should run?

Almost all checks that trigger a heal are done in the lookup fop when some discrepancy is detected.

The function that checks size and version is called later once a lock on the inode is acquired (even if no heal is needed). However further failures in the processing of any fop can also trigger a self-heal.

Xavi


Thanks and Regards,
Ram

Sent from my iPhone

On Jan 12, 2017, at 2:25 AM, Xavier Hernandez <xhernandez@xxxxxxxxxx> wrote:

Hi Ram,

On 12/01/17 02:36, Ankireddypalle Reddy wrote:
Xavi,
       I added some more logging information. The trusted.ec.size field values are in fact different.
        trusted.ec.size    l1 = 62719407423488    l2 = 0

That's very weird. Directories do not have this attribute. It's only present on regular files. But you said that the error happens while creating the file, so it doesn't make much sense because file creation always sets trusted.ec.size to 0.

Could you reproduce the problem with diagnostics.client-log-level set to TRACE and send the log to me ? it will create a big log, but I'll have much more information about what's going on.

Do you have a mixed setup with nodes of different types ? for example mixed 32/64 bits architectures or different operating systems ? I ask this because 62719407423488 in hex is 0x390B00000000, which has the lower 32 bits set to 0, but has garbage above that.
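That byte-layout observation is easy to verify with plain arithmetic:

```python
v = 62719407423488            # the suspicious trusted.ec.size value (l1)
assert hex(v) == "0x390b00000000"

low32 = v & 0xFFFFFFFF        # 0: the lower 32 bits are all zero
high32 = v >> 32              # 0x390b: non-zero garbage above bit 32
print(hex(high32), low32)
```
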


         This is a fairly static setup with no brick/node failures. Please explain why a heal is being triggered and what could have actually caused these size xattrs to differ. This is causing random I/O failures and is impacting the backup schedules.

The launch of self-heal is normal because it has detected an inconsistency. The real problem is what originates that inconsistency.

Xavi


[2017-01-12 01:19:18.256970] W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-8: Mismatching xdata in answers of 'LOOKUP'
[2017-01-12 01:19:18.257015] W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-8: Operation failed on some subvolumes (up=7, mask=7, remaining=0, good=3, bad=4)
[2017-01-12 01:19:18.257018] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 0-glusterfsProd-disperse-8: Heal failed [Invalid argument]
[2017-01-12 01:19:21.002028] E [dict.c:197:key_value_cmp] 0-glusterfsProd-disperse-4: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-12 01:19:21.002056] E [dict.c:166:log_value] 0-glusterfsProd-disperse-4: trusted.ec.size [ l1 = 62719407423488 l2 = 0 i1 = 0 i2 = 0 ]
[2017-01-12 01:19:21.002064] W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-4: Mismatching xdata in answers of 'LOOKUP'
[2017-01-12 01:19:21.209640] E [dict.c:197:key_value_cmp] 0-glusterfsProd-disperse-4: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-12 01:19:21.209673] E [dict.c:166:log_value] 0-glusterfsProd-disperse-4: trusted.ec.size [ l1 = 62719407423488 l2 = 0 i1 = 0 i2 = 0 ]
[2017-01-12 01:19:21.209686] W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-4: Mismatching xdata in answers of 'LOOKUP'
[2017-01-12 01:19:21.209719] W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-4: Operation failed on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
[2017-01-12 01:19:21.209753] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 0-glusterfsProd-disperse-4: Heal failed [Invalid argument]

Thanks and Regards,
Ram

-----Original Message-----
From: Ankireddypalle Reddy
Sent: Wednesday, January 11, 2017 9:29 AM
To: Ankireddypalle Reddy; Xavier Hernandez; Gluster Devel
(gluster-devel@xxxxxxxxxxx); gluster-users@xxxxxxxxxxx
Subject: RE: [Gluster-users]  Lot of EIO errors in
disperse volume

Xavi,
         I built a debug binary to log more information. This is what is getting logged. Looks like it is the attribute trusted.ec.size which is different among the bricks in a sub volume.

In glustershd.log :

[2017-01-11 14:19:45.023845] N [MSGID: 122029] [ec-generic.c:683:ec_combine_lookup] 0-glusterfsProd-disperse-8: Mismatching iatt in answers of 'GF_FOP_LOOKUP'
[2017-01-11 14:19:45.027718] E [dict.c:166:key_value_cmp] 0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.027736] W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.027763] E [dict.c:166:key_value_cmp] 0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.027781] W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.027793] W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-6: Operation failed on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
[2017-01-11 14:19:45.027815] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 0-glusterfsProd-disperse-6: Heal failed [Invalid argument]
[2017-01-11 14:19:45.029035] E [dict.c:166:key_value_cmp] 0-glusterfsProd-disperse-8: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.029057] W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-8: Mismatching xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.029089] E [dict.c:166:key_value_cmp] 0-glusterfsProd-disperse-8: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.029105] W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-8: Mismatching xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.029121] W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-8: Operation failed on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
[2017-01-11 14:19:45.032566] E [dict.c:166:key_value_cmp] 0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.029138] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 0-glusterfsProd-disperse-8: Heal failed [Invalid argument]
[2017-01-11 14:19:45.032585] W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.032614] E [dict.c:166:key_value_cmp] 0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.032631] W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.032638] W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-6: Operation failed on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
[2017-01-11 14:19:45.032654] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 0-glusterfsProd-disperse-6: Heal failed [Invalid argument]
[2017-01-11 14:19:45.037514] E [dict.c:166:key_value_cmp] 0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.037536] W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.037553] E [dict.c:166:key_value_cmp] 0-glusterfsProd-disperse-6: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:19:45.037573] W [MSGID: 122056] [ec-combine.c:873:ec_combine_check] 0-glusterfsProd-disperse-6: Mismatching xdata in answers of 'LOOKUP'
[2017-01-11 14:19:45.037582] W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-glusterfsProd-disperse-6: Operation failed on some subvolumes (up=7, mask=7, remaining=0, good=6, bad=1)
[2017-01-11 14:19:45.037599] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 0-glusterfsProd-disperse-6: Heal failed [Invalid argument]
[2017-01-11 14:20:40.001401] E [dict.c:166:key_value_cmp] 0-glusterfsProd-disperse-3: 'trusted.ec.size' is different in two dicts (8, 8)
[2017-01-11 14:20:40.001387] E [dict.c:166:key_value_cmp] 0-glusterfsProd-disperse-5: 'trusted.ec.size' is different in two dicts (8, 8)

In the mount daemon log:

[2017-01-11 14:20:17.806826] E [MSGID: 122001] [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-0: Invalid or corrupted config [Invalid argument]
[2017-01-11 14:20:17.806847] E [MSGID: 122066] [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-0: Invalid config xattr [Invalid argument]
[2017-01-11 14:20:17.807076] E [MSGID: 122001] [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-1: Invalid or corrupted config [Invalid argument]
[2017-01-11 14:20:17.807099] E [MSGID: 122066] [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-1: Invalid config xattr [Invalid argument]
[2017-01-11 14:20:17.807286] E [MSGID: 122001] [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-10: Invalid or corrupted config [Invalid argument]
[2017-01-11 14:20:17.807298] E [MSGID: 122066] [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-10: Invalid config xattr [Invalid argument]
[2017-01-11 14:20:17.807409] E [MSGID: 122001] [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-11: Invalid or corrupted config [Invalid argument]
[2017-01-11 14:20:17.807420] E [MSGID: 122066] [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-11: Invalid config xattr [Invalid argument]
[2017-01-11 14:20:17.807448] E [MSGID: 122001] [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-4: Invalid or corrupted config [Invalid argument]
[2017-01-11 14:20:17.807462] E [MSGID: 122066] [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-4: Invalid config xattr [Invalid argument]
[2017-01-11 14:20:17.807539] E [MSGID: 122001] [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-2: Invalid or corrupted config [Invalid argument]
[2017-01-11 14:20:17.807550] E [MSGID: 122066] [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-2: Invalid config xattr [Invalid argument]
[2017-01-11 14:20:17.807723] E [MSGID: 122001] [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-3: Invalid or corrupted config [Invalid argument]
[2017-01-11 14:20:17.807739] E [MSGID: 122066] [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-3: Invalid config xattr [Invalid argument]
[2017-01-11 14:20:17.807785] E [MSGID: 122001] [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-5: Invalid or corrupted config [Invalid argument]
[2017-01-11 14:20:17.807796] E [MSGID: 122066] [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-5: Invalid config xattr [Invalid argument]
[2017-01-11 14:20:17.808020] E [MSGID: 122001] [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-9: Invalid or corrupted config [Invalid argument]
[2017-01-11 14:20:17.808034] E [MSGID: 122066] [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-9: Invalid config xattr [Invalid argument]
[2017-01-11 14:20:17.808054] E [MSGID: 122001] [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-6: Invalid or corrupted config [Invalid argument]
[2017-01-11 14:20:17.808066] E [MSGID: 122066] [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-6: Invalid config xattr [Invalid argument]
[2017-01-11 14:20:17.808282] E [MSGID: 122001] [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-8: Invalid or corrupted config [Invalid argument]
[2017-01-11 14:20:17.808292] E [MSGID: 122066] [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-8: Invalid config xattr [Invalid argument]
[2017-01-11 14:20:17.809212] E [MSGID: 122001] [ec-common.c:872:ec_config_check] 2-glusterfsProd-disperse-7: Invalid or corrupted config [Invalid argument]
[2017-01-11 14:20:17.809228] E [MSGID: 122066] [ec-common.c:969:ec_prepare_update_cbk] 2-glusterfsProd-disperse-7: Invalid config xattr [Invalid argument]

[2017-01-11 14:20:17.812660] I [MSGID: 109036] [dht-common.c:8043:dht_log_new_layout_for_dir_selfheal] 2-glusterfsProd-dht: Setting layout of /Folder_01.05.2017_21.15/CV_MAGNETIC/V_31500/CHUNK_402578 with [Subvol_name: glusterfsProd-disperse-0, Err: -1 , Start: 1789569705 , Stop: 2147483645 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-1, Err: -1 , Start: 2147483646 , Stop: 2505397586 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-10, Err: -1 , Start: 2505397587 , Stop: 2863311527 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-11, Err: -1 , Start: 2863311528 , Stop: 3221225468 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-2, Err: -1 , Start: 3221225469 , Stop: 3579139409 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-3, Err: -1 , Start: 3579139410 , Stop: 3937053350 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-4, Err: -1 , Start: 3937053351 , Stop: 4294967295 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-5, Err: -1 , Start: 0 , Stop: 357913940 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-6, Err: -1 , Start: 357913941 , Stop: 715827881 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-7, Err: -1 , Start: 715827882 , Stop: 1073741822 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-8, Err: -1 , Start: 1073741823 , Stop: 1431655763 , Hash: 1 ], [Subvol_name: glusterfsProd-disperse-9, Err: -1 , Start: 1431655764 , Stop: 1789569704 , Hash: 1 ],


-----Original Message-----
From: gluster-users-bounces@xxxxxxxxxxx
[mailto:gluster-users-bounces@xxxxxxxxxxx] On Behalf Of
Ankireddypalle Reddy
Sent: Tuesday, January 10, 2017 10:09 AM
To: Xavier Hernandez; Gluster Devel (gluster-devel@xxxxxxxxxxx);
gluster-users@xxxxxxxxxxx
Subject: Re: [Gluster-users]  Lot of EIO errors in
disperse volume

Xavi,
        In this case it's the file creation which failed. So I provided the xattrs of the parent.

Thanks and Regards,
Ram

-----Original Message-----
From: Xavier Hernandez [mailto:xhernandez@xxxxxxxxxx]
Sent: Tuesday, January 10, 2017 9:10 AM
To: Ankireddypalle Reddy; Gluster Devel
(gluster-devel@xxxxxxxxxxx); gluster-users@xxxxxxxxxxx
Subject: Re:  Lot of EIO errors in disperse volume

Hi Ram,

On 10/01/17 14:42, Ankireddypalle Reddy wrote:
Attachments (2):

1. ec.txt (11.50 KB)

2. ws-glus.log (3.48 MB)

Xavi,
       We are encountering errors for different kinds of FOPS.
       The open failed for the following file:

       cvd_2017_01_10_02_28_26.log:98182 1f9fe 01/10 00:57:10 8414465 [MEDIAFS    ] 20117519-52075477 SingleInstancer_FS::StartDataFile2: Failed to create the data file [/ws/glus/Folder_07.11.2016_23.02/CV_MAGNETIC/V_8854974/CHUNK_51342720/SFILE_CONTAINER_062], error=0xECCC0005:{CQiFile::Open(92)} + {CQiUTFOSAPI::open(96)/ErrNo.5.(Input/output error)-Open failed, File=/ws/glus/Folder_07.11.2016_23.02/CV_MAGNETIC/V_8854974/CHUNK_51342720/SFILE_CONTAINER_062, OperationFlag=0xC1, PermissionMode=0x1FF}

       I've attached the extended attributes for the directories
/ws/glus/Folder_07.11.2016_23.02/CV_MAGNETIC/V_8854974/ and
/ws/glus/Folder_07.11.2016_23.02/CV_MAGNETIC/V_8854974/CHUNK_51342720
from all the bricks.

      The attributes look fine to me. I've also attached some
log cuts to illustrate the problem.

I need the extended attributes of the file itself, not the parent directories.

Xavi


Thanks and Regards,
Ram

-----Original Message-----
From: Xavier Hernandez [mailto:xhernandez@xxxxxxxxxx]
Sent: Tuesday, January 10, 2017 7:53 AM
To: Ankireddypalle Reddy; Gluster Devel
(gluster-devel@xxxxxxxxxxx); gluster-users@xxxxxxxxxxx
Subject: Re:  Lot of EIO errors in disperse volume

Hi Ram,

the error is caused by an extended attribute that does not match on all
3 bricks of the disperse set. The most probable candidate is
trusted.ec.version, but it could be others.
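A mismatch here is easier to eyeball once the hex value is decoded. The sketch below assumes trusted.ec.version is 16 bytes holding two big-endian 64-bit counters (data version followed by metadata version), which matches the TRACE dumps earlier in this thread; treat that layout as an assumption, not gospel:

```python
# Assumption: trusted.ec.version packs two big-endian 64-bit counters,
# the data version followed by the metadata version.
def decode_ec_version(hexval: str):
    raw = bytes.fromhex(hexval[2:] if hexval.startswith("0x") else hexval)
    return (int.from_bytes(raw[:8], "big"),    # data version
            int.from_bytes(raw[8:16], "big"))  # metadata version

# Value as seen in the TRACE log earlier in the thread (0x2a38 == 10808).
print(decode_ec_version("0x0000000000002a380000000000002a38"))  # (10808, 10808)
```

Healthy bricks should print identical pairs; a brick that missed writes shows a lower data version.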

At first sight, I don't see any change from 3.7.8 that could have
caused this. I'll check again.

What kind of operations are you doing? Knowing this can help me narrow the search.

Xavi

On 10/01/17 13:43, Ankireddypalle Reddy wrote:
Xavi,
       Thanks. If you could please explain what to look for in the
extended attributes, I will check and let you know if I find anything
suspicious. Also, we noticed that some of these operations succeed if
retried. Do you know of any communication-related errors that are
being reported/triaged?

Thanks and Regards,
Ram

-----Original Message-----
From: Xavier Hernandez [mailto:xhernandez@xxxxxxxxxx]
Sent: Tuesday, January 10, 2017 7:23 AM
To: Ankireddypalle Reddy; Gluster Devel
(gluster-devel@xxxxxxxxxxx); gluster-users@xxxxxxxxxxx
Subject: Re:  Lot of EIO errors in disperse
volume

Hi Ram,

On 10/01/17 13:14, Ankireddypalle Reddy wrote:
Attachment (1):

1. ecxattrs.txt (5.92 KB)
[Download] <https://imap.commvault.com/webconsole/api/contentstore/publicshare/346714/file/1272e68278744f15bf1a54f2b31b559d/action/download>

Xavi,
          Please find attached the extended attributes for a
directory from all the bricks. A free space check on this directory
failed with error EIO.

What do you mean? What operation did you perform to check the free
space on that directory?

If it's a recursive check, I need the extended attributes from the
exact file that triggers the EIO. The attached attributes seem
consistent and that directory shouldn't cause any problem. Does an 'ls'
on that directory fail or does it show the contents?

Xavi


Thanks and Regards,
Ram

-----Original Message-----
From: Xavier Hernandez [mailto:xhernandez@xxxxxxxxxx]
Sent: Tuesday, January 10, 2017 6:45 AM
To: Ankireddypalle Reddy; Gluster Devel
(gluster-devel@xxxxxxxxxxx); gluster-users@xxxxxxxxxxx
Subject: Re:  Lot of EIO errors in disperse
volume

Hi Ram,

can you execute the following command on all bricks on a file
that is giving EIO ?

getfattr -m. -e hex -d <path to file in brick>

Xavi
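To gather this from every brick on one server in a single pass, something along these lines can be used. This is only a sketch: the brick roots match the volume info quoted below, and the file path is the example from the earlier failure; adjust both for the file actually giving EIO:

```shell
# Sketch: run on each Gluster server. Brick roots are taken from
# 'gluster volume info'; the file path is one example from the logs.
f="Folder_07.11.2016_23.02/CV_MAGNETIC/V_8854974/CHUNK_51342720/SFILE_CONTAINER_062"
for d in /ws/disk{1..8}; do
    # '|| true' keeps the loop going on bricks that do not hold this file
    getfattr -m. -e hex -d "$d/ws_brick/$f" 2>/dev/null || true
done
```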

On 10/01/17 12:41, Ankireddypalle Reddy wrote:
Xavi,
         We have been running 3.7.8 on these servers. We upgraded
to 3.7.18 yesterday. We upgraded all the servers at the same time.
The volume was brought down during the upgrade.

Thanks and Regards,
Ram

-----Original Message-----
From: Xavier Hernandez [mailto:xhernandez@xxxxxxxxxx]
Sent: Tuesday, January 10, 2017 6:35 AM
To: Ankireddypalle Reddy; Gluster Devel
(gluster-devel@xxxxxxxxxxx); gluster-users@xxxxxxxxxxx
Subject: Re:  Lot of EIO errors in disperse
volume

Hi Ram,

how did you upgrade gluster ? from which version ?

Did you upgrade one server at a time and wait until self-heal
finished before upgrading the next server?

Xavi

On 10/01/17 11:39, Ankireddypalle Reddy wrote:
Hi,

   We upgraded to GlusterFS 3.7.18 yesterday. We see a lot of
failures in our applications. Most of the errors are EIO. The
following log lines are commonly seen in the logs:



The message "W [MSGID: 122056]
[ec-combine.c:873:ec_combine_check]
0-StoragePool-disperse-4: Mismatching xdata in answers of 'LOOKUP'"
repeated 2 times between [2017-01-10 02:46:25.069809] and
[2017-01-10 02:46:25.069835]

[2017-01-10 02:46:25.069852] W [MSGID: 122056]
[ec-combine.c:873:ec_combine_check] 0-StoragePool-disperse-5:
Mismatching xdata in answers of 'LOOKUP'

The message "W [MSGID: 122056]
[ec-combine.c:873:ec_combine_check]
0-StoragePool-disperse-5: Mismatching xdata in answers of 'LOOKUP'"
repeated 2 times between [2017-01-10 02:46:25.069852] and
[2017-01-10 02:46:25.069873]

[2017-01-10 02:46:25.069910] W [MSGID: 122056]
[ec-combine.c:873:ec_combine_check] 0-StoragePool-disperse-6:
Mismatching xdata in answers of 'LOOKUP'

...

[2017-01-10 02:46:26.520774] I [MSGID: 109036]
[dht-common.c:9076:dht_log_new_layout_for_dir_selfheal]
0-StoragePool-dht: Setting layout of
/Folder_07.11.2016_23.02/CV_MAGNETIC/V_8854213/CHUNK_51334585 with
[Subvol_name: StoragePool-disperse-0, Err: -1 , Start: 3221225466 , Stop: 3758096376 , Hash: 1 ],
[Subvol_name: StoragePool-disperse-1, Err: -1 , Start: 3758096377 , Stop: 4294967295 , Hash: 1 ],
[Subvol_name: StoragePool-disperse-2, Err: -1 , Start: 0 , Stop: 536870910 , Hash: 1 ],
[Subvol_name: StoragePool-disperse-3, Err: -1 , Start: 536870911 , Stop: 1073741821 , Hash: 1 ],
[Subvol_name: StoragePool-disperse-4, Err: -1 , Start: 1073741822 , Stop: 1610612732 , Hash: 1 ],
[Subvol_name: StoragePool-disperse-5, Err: -1 , Start: 1610612733 , Stop: 2147483643 , Hash: 1 ],
[Subvol_name: StoragePool-disperse-6, Err: -1 , Start: 2147483644 , Stop: 2684354554 , Hash: 1 ],
[Subvol_name: StoragePool-disperse-7, Err: -1 , Start: 2684354555 , Stop: 3221225465 , Hash: 1 ],

[2017-01-10 02:46:26.522841] N [MSGID: 122031]
[ec-generic.c:1130:ec_combine_xattrop] 0-StoragePool-disperse-3:
Mismatching dictionary in answers of 'GF_FOP_XATTROP'

The message "N [MSGID: 122031]
[ec-generic.c:1130:ec_combine_xattrop]
0-StoragePool-disperse-3: Mismatching dictionary in answers
of 'GF_FOP_XATTROP'" repeated 2 times between [2017-01-10
02:46:26.522841] and [2017-01-10 02:46:26.522894]

[2017-01-10 02:46:26.522898] W [MSGID: 122040]
[ec-common.c:919:ec_prepare_update_cbk] 0-StoragePool-disperse-3:
Failed to get size and version [Input/output error]

[2017-01-10 02:46:26.523115] N [MSGID: 122031]
[ec-generic.c:1130:ec_combine_xattrop] 0-StoragePool-disperse-6:
Mismatching dictionary in answers of 'GF_FOP_XATTROP'

The message "N [MSGID: 122031]
[ec-generic.c:1130:ec_combine_xattrop]
0-StoragePool-disperse-6: Mismatching dictionary in answers
of 'GF_FOP_XATTROP'" repeated 2 times between [2017-01-10
02:46:26.523115] and [2017-01-10 02:46:26.523143]

[2017-01-10 02:46:26.523147] W [MSGID: 122040]
[ec-common.c:919:ec_prepare_update_cbk] 0-StoragePool-disperse-6:
Failed to get size and version [Input/output error]

[2017-01-10 02:46:26.523302] N [MSGID: 122031]
[ec-generic.c:1130:ec_combine_xattrop] 0-StoragePool-disperse-2:
Mismatching dictionary in answers of 'GF_FOP_XATTROP'

The message "N [MSGID: 122031]
[ec-generic.c:1130:ec_combine_xattrop]
0-StoragePool-disperse-2: Mismatching dictionary in answers
of 'GF_FOP_XATTROP'" repeated 2 times between [2017-01-10
02:46:26.523302] and [2017-01-10 02:46:26.523324]

[2017-01-10 02:46:26.523328] W [MSGID: 122040]
[ec-common.c:919:ec_prepare_update_cbk] 0-StoragePool-disperse-2:
Failed to get size and version [Input/output error]
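As an aside, the DHT layout being set in the log above looks internally consistent: the eight hash ranges tile the full 32-bit hash space exactly once, so the layout itself is not the source of the EIO. A quick check:

```python
# Inclusive (start, stop) hash ranges copied from the DHT layout log above.
ranges = [
    (3221225466, 3758096376), (3758096377, 4294967295),
    (0, 536870910), (536870911, 1073741821),
    (1073741822, 1610612732), (1610612733, 2147483643),
    (2147483644, 2684354554), (2684354555, 3221225465),
]
total = sum(stop - start + 1 for start, stop in ranges)
print(total == 2**32)  # True: the ranges cover the whole 32-bit space
```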



[root@glusterfs3 Log_Files]# gluster --version

glusterfs 3.7.18 built on Dec  8 2016 06:34:26



[root@glusterfs3 Log_Files]# gluster volume info



Volume Name: StoragePool

Type: Distributed-Disperse

Volume ID: 149e976f-4e21-451c-bf0f-f5691208531f

Status: Started

Number of Bricks: 8 x (2 + 1) = 24

Transport-type: tcp

Bricks:

Brick1: glusterfs1sds:/ws/disk1/ws_brick

Brick2: glusterfs2sds:/ws/disk1/ws_brick

Brick3: glusterfs3sds:/ws/disk1/ws_brick

Brick4: glusterfs1sds:/ws/disk2/ws_brick

Brick5: glusterfs2sds:/ws/disk2/ws_brick

Brick6: glusterfs3sds:/ws/disk2/ws_brick

Brick7: glusterfs1sds:/ws/disk3/ws_brick

Brick8: glusterfs2sds:/ws/disk3/ws_brick

Brick9: glusterfs3sds:/ws/disk3/ws_brick

Brick10: glusterfs1sds:/ws/disk4/ws_brick

Brick11: glusterfs2sds:/ws/disk4/ws_brick

Brick12: glusterfs3sds:/ws/disk4/ws_brick

Brick13: glusterfs1sds:/ws/disk5/ws_brick

Brick14: glusterfs2sds:/ws/disk5/ws_brick

Brick15: glusterfs3sds:/ws/disk5/ws_brick

Brick16: glusterfs1sds:/ws/disk6/ws_brick

Brick17: glusterfs2sds:/ws/disk6/ws_brick

Brick18: glusterfs3sds:/ws/disk6/ws_brick

Brick19: glusterfs1sds:/ws/disk7/ws_brick

Brick20: glusterfs2sds:/ws/disk7/ws_brick

Brick21: glusterfs3sds:/ws/disk7/ws_brick

Brick22: glusterfs1sds:/ws/disk8/ws_brick

Brick23: glusterfs2sds:/ws/disk8/ws_brick

Brick24: glusterfs3sds:/ws/disk8/ws_brick

Options Reconfigured:

performance.readdir-ahead: on

diagnostics.client-log-level: INFO



Thanks and Regards,

Ram

***************************Legal Disclaimer***************************
"This communication may contain confidential and privileged material for the sole use of the intended recipient. Any unauthorized review, use or distribution by others is strictly prohibited. If you have received the message by mistake, please advise the sender by reply email and delete the message. Thank you."
**********************************************************************


_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel



_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users




