Hi Gluster experts,
We have set up a replicate volume with the following info:
Volume Name: test
Type: Replicate
Volume ID: 9373eba9-eb84-4618-a54c-f2837345daec
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: rcp:/trunk/brick/test1/sn0
Brick2: rcp:/trunk/brick/test1/sn1
Brick3: rcp:/trunk/brick/test1/sn2 (arbiter)
When we run a performance test in which multiple pthreads write to the same file at the same time (each at a different offset),
write performance drops a lot: it is about 60%-70% lower than on a volume without an arbiter.
When we studied the source code, we found the function
"afr_set_transaction_flock" in "afr-transaction.c".
For data transactions it locks the entire file when arbiter_count is non-zero. I suppose this is the root cause of the
performance drop.
Now my questions are:
1) Why is the entire file locked when an arbiter is configured? Could you please explain in detail why locking only the written range would lead to split-brain specifically with an arbiter?
2) If this is the root cause, and not locking the entire file really would lead to split-brain, is there any way to avoid the performance drop in this multi-writer case?
The source code of this function is included below, FYI:
--------------------------------------------------------------------------------------
int afr_set_transaction_flock (xlator_t *this, afr_local_t *local)
{
        afr_internal_lock_t *int_lock = NULL;
        afr_private_t       *priv     = NULL;

        int_lock = &local->internal_lock;
        priv = this->private;

        if ((priv->arbiter_count || local->transaction.eager_lock_on ||
             priv->full_lock) &&
            local->transaction.type == AFR_DATA_TRANSACTION) {
                /*Lock entire file to avoid network split brains.*/
                int_lock->flock.l_len   = 0;
                int_lock->flock.l_start = 0;
        } else {
                int_lock->flock.l_len   = local->transaction.len;
                int_lock->flock.l_start = local->transaction.start;
        }
        int_lock->flock.l_type = F_WRLCK;

        return 0;
}
------------------------------------------------------------------------------------
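To make the suspected effect concrete, here is a small standalone sketch. It is not Gluster code. It only models the range choice made in the function above and assumes the usual struct flock convention that l_len = 0 means "up to end of file". The names lock_range, choose_range and ranges_conflict are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the byte range requested in int_lock->flock. */
struct lock_range {
        int64_t start;
        int64_t len;    /* 0 means "to end of file", as with l_len */
};

/* Models the if/else in afr_set_transaction_flock for a data write.
 * whole_file stands for (arbiter_count || eager_lock_on || full_lock). */
static struct lock_range
choose_range (bool whole_file, int64_t start, int64_t len)
{
        struct lock_range r;

        assert (len > 0);       /* a write always covers at least one byte */
        if (whole_file) {
                r.start = 0;
                r.len   = 0;
        } else {
                r.start = start;
                r.len   = len;
        }
        return r;
}

/* Two write locks conflict unless one range ends before the other starts.
 * A range with len == 0 never ends, so it cannot be "before" another. */
static bool
ranges_conflict (struct lock_range a, struct lock_range b)
{
        bool a_before_b = (a.len != 0) && (a.start + a.len <= b.start);
        bool b_before_a = (b.len != 0) && (b.start + b.len <= a.start);

        return !(a_before_b || b_before_a);
}
```

With whole_file set to false, two 4096-byte writes at offsets 0 and 4096 do not conflict, so they can hold their locks at the same time. With whole_file set to true, both writes ask for the range (0, 0), which conflicts, so under this model writes from different threads would have to take turns on the lock even though their offsets do not overlap.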
Thanks & Best Regards,
George