On Tue, Jul 24, 2018 at 8:35 PM, Raghavendra Gowdappa <rgowdapp@xxxxxxxxxx> wrote:
On Tue, Jul 24, 2018 at 6:30 PM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
On 07/24/2018 02:56 PM, Raghavendra Gowdappa wrote:
All,

I was trying to debug regression failures on [1] and observed that split-brain-resolution.t was failing consistently.
=========================
TEST 45 (line 88): 0 get_pending_heal_count patchy
./tests/basic/afr/split-brain-resolution.t .. 45/45 RESULT 45: 1
./tests/basic/afr/split-brain-resolution.t .. Failed 17/45 subtests
Test Summary Report
-------------------
./tests/basic/afr/split-brain-resolution.t (Wstat: 0 Tests: 45 Failed: 17)
Failed tests: 24-26, 28-36, 41-45
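As a quick sanity check on the summary above, the failed ranges can be expanded and counted. This is only a small sketch; `count_failed` is a hypothetical helper written for this note, not part of the gluster test harness.

```python
def count_failed(spec):
    """Count tests in a prove-style list such as '24-26, 28-36, 41-45'."""
    total = 0
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # Inclusive range, e.g. 24-26 covers three tests.
            lo, hi = part.split("-")
            total += int(hi) - int(lo) + 1
        else:
            total += 1
    return total


print(count_failed("24-26, 28-36, 41-45"))  # 17
```

The result agrees with the "Failed 17/45 subtests" line in the report.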
On probing deeper, I observed a curious fact: in most of the failures, stat was not served from md-cache but was instead wound down to afr, which failed the stat with EIO because the file was in split-brain. So I ran another test:

* disabled md-cache
* mounted glusterfs with attribute-timeout 0 and entry-timeout 0

Now the test always fails. So I think the test relied on stat requests being absorbed by either the kernel attribute cache or md-cache. When that does not happen, the stats reach afr and cause commands like getfattr to fail.
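The suspected behaviour can be illustrated with a toy model. This is not gluster code: `SplitBrainBackend`, `AttrCache` and `try_stat` are hypothetical names, and the sketch assumes the attributes were cached from some earlier successful reply. With a non-zero timeout the stat is absorbed by the cache; with a timeout of 0 it reaches the backend and fails with EIO.

```python
import errno


class SplitBrainBackend:
    """Stands in for afr: every stat on a split-brain file fails with EIO."""

    def __init__(self):
        self.stat_calls = 0

    def stat(self, path):
        self.stat_calls += 1
        raise OSError(errno.EIO, "Input/output error", path)


class AttrCache:
    """Toy attribute cache with a validity timeout in seconds."""

    def __init__(self, backend, timeout):
        self.backend = backend
        self.timeout = timeout
        self.entries = {}  # path -> (attrs, time stored)

    def prime(self, path, attrs, now):
        # Assumption: attributes arrived with an earlier successful reply.
        self.entries[path] = (attrs, now)

    def stat(self, path, now):
        cached = self.entries.get(path)
        if cached is not None and now - cached[1] < self.timeout:
            return cached[0]  # absorbed by the cache
        return self.backend.stat(path)  # wound down to the backend


def try_stat(timeout):
    """Return (outcome, number of stats that reached the backend)."""
    backend = SplitBrainBackend()
    cache = AttrCache(backend, timeout)
    cache.prime("/file", {"size": 0}, now=0.0)
    try:
        cache.stat("/file", now=0.5)
        return "served from cache", backend.stat_calls
    except OSError as e:
        return errno.errorcode[e.errno], backend.stat_calls


print(try_stat(1.0))  # ('served from cache', 0)
print(try_stat(0))    # ('EIO', 1)
```

With a timeout of 0 no cached entry is ever considered valid, which mirrors the always-failing run described above.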
This indeed seems to be the case. Is there any way we can avoid the stat? When a getfattr is performed on the mount, aren't lookup + getfattr the only fops that need to hit gluster?

It's a black box to me how the kernel decides whether to do a lookup or a stat. But I guess that if only a stat is needed and it is not available in the cache, it would do a stat.
Another thing you can do is mount with a higher value of attribute-timeout. Let us know whether it works.
-Ravi
_______________________________________________ Gluster-devel mailing list Gluster-devel@xxxxxxxxxxx https://lists.gluster.org/mailman/listinfo/gluster-devel