Re: races in dict_foreach() causing crashes in tier-file-creat.t

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 03/11/2016 06:25 PM, Jeff Darcy wrote:
Raghavendra G and I discussed about this problem and the right way to
fix it is to take a copy(without dict_foreach) of the dictionary in
dict_foreach inside a lock and then loop over the local dictionary. I am
worried about the performance implication of this

I think what I forgot to say was that the copy will happen inside a lock dict already has.

I'm worried about the correctness implications.  Any such copy will have
to do the equivalent of dict_foreach even if it doesn't call the function
of that name, and will be subject to the same races.

Also included Xavi, who earlier said we need to change dict.c but it is
a bigger change. May be the time has come? I would love to gather all
your inputs and implement a better version of dict if we need one.
There are three solutions I can think of.

(1) Have tier use STACK_WIND_COOKIE to pass the address of a lock down
both paths, so the two can synchronize between themselves.

(2) Have tier issue the lookup down the two paths *serially* instead
of in parallel, so there's no contention on the dictionary.  This is
probably most complicated *and* worst for performance, but I include
it for the sake of completeness.

Tier does send lookups serially, which fail on the hashed subvolumes of dhts. Both of them trigger lookup_everywhere which is executed in epoll threads, thus the they are executed in parallel. So (1) won't be simple enough, in the sense it has to remember it in local etc.


(3) Enhance dict_t with a gf_lock_t that can be used to serialize
access.  We don't have to use the lock in every invocation of
dict_foreach (though we should probably investigate that).  For
now, we can just use it in the code paths we know are contending.

dict already has a lock. The performance implication I was talking about was that we will allow syscalls like getxattr in posix which happen inside these locks if we execute dict_foreach inside this lock, which is costly. That is the reason Du and I wanted to do the copy of dict inside locks and then use this dictionary to carry out dict_foreach without any locks so that we don't have to be inside locks for too long. the copy shouldn't use dict_foreach for this reason :-). But I forgot to mention about the lock. I am wondering if anyone has better solution than this.


I suspect Xavi was suggesting (3) and I concur that it's the best
long-term solution.
Xavi was mentioning that dict_copy_with_ref is too costly, which is true, if we make this change it will be even more costly :-(.

Pranith
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel



[Index of Archives]     [Gluster Users]     [Ceph Users]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux