Re: races in dict_foreach() causing crashes in tier-file-creat.t

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 11/03/16 13:55, Jeff Darcy wrote:
Raghavendra G and I discussed about this problem and the right way to
fix it is to take a copy(without dict_foreach) of the dictionary in
dict_foreach inside a lock and then loop over the local dictionary. I am
worried about the performance implication of this

I'm worried about the correctness implications.  Any such copy will have
to do the equivalent of dict_foreach even if it doesn't call the function
of that name, and will be subject to the same races.

Also included Xavi, who earlier said we need to change dict.c but it is
a bigger change. May be the time has come? I would love to gather all
your inputs and implement a better version of dict if we need one.

There are three solutions I can think of.

(1) Have tier use STACK_WIND_COOKIE to pass the address of a lock down
both paths, so the two can synchronize between themselves.

(2) Have tier issue the lookup down the two paths *serially* instead
of in parallel, so there's no contention on the dictionary.  This is
probably most complicated *and* worst for performance, but I include
it for the sake of completeness.

(3) Enhance dict_t with a gf_lock_t that can be used to serialize
access.  We don't have to use the lock in every invocation of
dict_foreach (though we should probably investigate that).  For
now, we can just use it in the code paths we know are contending.

I suspect Xavi was suggesting (3) and I concur that it's the best
long-term solution.

My initial idea was quite different and probably unpractical to implement due to its implications in existing code. However here it is:

The idea was to have a dict based on refcounting controlled by caller (instead of callee like it's done now) and without locks (only atomic operations to increase/decrease refcounting). When some dict needs to be modified, it's only done if refcount is 1, otherwise a copy is made before changing it. This means that any dict with more that one reference becomes read-only avoiding any interference between xlators.

The most critical change is the caller controlled refcounting. Currently, when some xlator needs to use a dict, it takes a ref. This makes impossible to know if the already existing ref (owned by the caller) will be used or not, so we should always do copies of dicts, which is inefficient.

When it's the caller who controls the refcounting, if it still needs the reference to do some other work after calling a function that needs the dict, it will increase the refcounting. Otherwise it will simply "give" its reference to the called function. With this approach we'll need to distinguish between functions that really need their own reference form those that can share the reference of the caller.


Example:

  Current code:

     dict = ...;

     /* We won't use dict anymore in this function, but we still hold
        a reference on the dict */
     do_something(..., dict, ...);
     /* We release our reference */
     dict_unref(dict);

     ---------

     /* Function that needs its own reference to dict */
     do_something(..., dict_t *dict, ...) {
         dict_ref(dict);

         /* We do something in background that uses the dict. It will
            take care of releasing the ref when not needed anymore. */
         do_something_in_background(..., dict, ...);
     }

     /* Function that doesn't need its own reference to dict */
     do_something(..., dict_t *dict, ...) {
         /* We do something with the dict here, but we don't return
            until all work has been done. In this case we can share the
            ref of the caller. */
         <do something with dict>
     }


  Proposed change:

    dict = ...;

    /* We won't use dict anymore, so we pass our reference to
       do_something() */
    do_something(..., dict, ...);

    /* If do_something() shares our reference, we'll need to release it
       here.

       dict_unref(dict);

    */

    /* Here we cannot use dict because we don't have any reference to
       it */

    ---------

    /* Function that needs its own reference to dict */
    do_something(..., dict_t *dict, ...) {
        /* We already have a reference that we can use */
        ...
        /* We do something in background that uses the dict. It will
           take care of releasing the ref when not needed anymore. */
        do_something_in_background(..., dict, ...);
        /* dict cannot be used here unless a ref has been taken before
           calling do_something_in_background(). */
    }

    /* Function that doesn't need its own reference to dict */
    do_something(..., dict_t *dict, ...) {
        /* We do something with the dict here, but we don't return
           until all work has been done. In this case we can share the
           ref of the caller. */
        <do something with dict>
    }


if we need to keep a reference on the caller:

    dict = ...;

    /* Acquire a new reference for do_something() if it won't share our
       reference */
    dict_ref(dict);

    do_something(..., dict, ...);

    /* Do whatever we need with our reference on dict. Note that our
       dict will be clean, exactly as we had it before calling
       do_something(), whatever this function has done. */
    <do anything on dict>

    /* Release our ref */
    dict_unref(dict);


Xavi

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel



[Index of Archives]     [Gluster Users]     [Ceph Users]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux