Re: What do extra_free and extrastd_free params do in the dictionary object?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



To conclude and ensure I understood both:
- I've sent a patch to remove extra_free parameter (and the associated gf_memdup() in 2 places) - https://review.gluster.org/#/c/glusterfs/+/23999/ (waiting for CI to finish working on it).
- I'll send a patch to remove extra_stdfree - but this one is slightly bigger:
1. There are more places it's set (18).
2. Somehow, we need to release that memory. I kinda assume the sooner the better?
So perhaps I can just replace:
dict->extra_stdfree = cli_req.dict.dict_val;
with:
free(cli_req.dict.dict_val) ?
3. In some places, there was actually a call to GF_FREE, which is kind of confusing. For example, in __server_get_snap_info() :
    if (snap_info_rsp.dict.dict_val) {
        GF_FREE(snap_info_rsp.dict.dict_val);
    }

I think I should remove that and stick to freeing right after unserialization?
Y.



On Thu, Jan 9, 2020 at 12:42 PM Xavi Hernandez <jahernan@xxxxxxxxxx> wrote:
On Thu, Jan 9, 2020 at 11:11 AM Yaniv Kaul <ykaul@xxxxxxxxxx> wrote:


On Thu, Jan 9, 2020 at 11:35 AM Xavi Hernandez <jahernan@xxxxxxxxxx> wrote:
On Thu, Jan 9, 2020 at 10:22 AM Amar Tumballi <amarts@xxxxxxxxx> wrote:


On Thu, Jan 9, 2020 at 2:33 PM Xavi Hernandez <jahernan@xxxxxxxxxx> wrote:
On Thu, Jan 9, 2020 at 9:44 AM Amar Tumballi <amarts@xxxxxxxxx> wrote:


On Thu, Jan 9, 2020 at 1:38 PM Xavi Hernandez <jahernan@xxxxxxxxxx> wrote:
On Sun, Dec 22, 2019 at 4:56 PM Yaniv Kaul <ykaul@xxxxxxxxxx> wrote:
I could not find a relevant use for them. Can anyone enlighten me?

I'm not sure why they are needed. They seem to be used to keep the unserialized version of a dict around until the dict is destroyed. I thought this could be because we were using pointers to the unserialized data inside dict, but that's not the case currently. However, checking very old versions (pre 3.2), I see that dict values were not allocated, but a pointer to the unserialized data was used.

Xavi,

While you are right about the intent, it is used still, at least when I grepped latest repo to keep a reference in protocol layer. 

This is done to reduce a copy after the dictionary's binary content is received from RPC. The 'extra_free' flag is used when we use a GF_*ALLOC()'d buffer in protocol to receive dictionary, and extra_stdfree is used when RPC itself allocates the buffer and hence uses 'free()' to free the buffer.

I don't see it. When dict_unserialize() is called, key and value are allocated and copied, so  why do we need to keep the raw data after that ?

In 3.1 the value was simply a pointer to the unserialized data, but starting with 3.2, value is memdup'ed. Key is always copied. I don't see any other reference to the unserialized data right now. I think that instead of assigning the raw data to extra_(std)free, we should simply release that memory and remove those fields.

Am I missing something else ?

I did grep on 'extra_stdfree' and 'extra_free' and saw that many handshake/ and protocol code seemed to use it. Haven't gone deeper to check which part.

[amar@kadalu glusterfs]$ git grep extra_stdfree | wc -l
40
[amar@kadalu glusterfs]$ git grep extra_free | wc -l
5

Yes, they call dict_unserialize() and then store the unserialized data into those variables. That's what I'm saying it's not necessary.

In at least 2 cases, there's even something stranger I could not understand (see in bold - from server_setvolume() function) :
    params = dict_new();
    reply = dict_new();
    ret = xdr_to_generic(req->msg[0], &args, (xdrproc_t)xdr_gf_setvolume_req);
    if (ret < 0) {
        // failed to decode msg;
        req->rpc_err = GARBAGE_ARGS;
        goto fail;
    }
    ctx = THIS->ctx;

    this = req->svc->xl;
    /* this is to ensure config_params is populated with the first brick
     * details at first place if brick multiplexing is enabled
     */
    config_params = dict_copy_with_ref(this->options, NULL);

    buf = gf_memdup(args.dict.dict_val, args.dict.dict_len);

This is probably unnecessary if we can really remove extra_free. Probably it's here because args.dict.dict_val will be destroyed later.


    if (buf == NULL) {
        op_ret = -1;
        op_errno = ENOMEM;
        goto fail;
    }

    ret = dict_unserialize(buf, args.dict.dict_len, &params);
    if (ret < 0) {
        ret = dict_set_str(reply, "ERROR",
                           "Internal error: failed to unserialize "
                           "request dictionary");
        if (ret < 0)
            gf_msg_debug(this->name, 0,
                         "failed to set error "
                         "msg \"%s\"",
                         "Internal error: failed "
                         "to unserialize request dictionary");

        op_ret = -1;
        op_errno = EINVAL;
        goto fail;
    }

    params->extra_free = buf;
    buf = NULL;

This is needed to prevent memory pointed by buf from being destroyed once if has been assigned to extra_free. Also unnecessary if extra_free can be removed.

So we could eliminate one memory copy here if those fields are not really necessary.


 



I think this is not needed anymore. Probably we could remove these fields if that's the only reason.

If keeping them is hard to maintain, we can add few allocation to remove those elements, that shouldn't matter much IMO too. We are not using dictionary itself as protocol now (which we did in 1.x series though).

Regards,
Amar
---

 
TIA,
Y.
_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-devel

_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-devel

_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-devel


[Index of Archives]     [Gluster Users]     [Ceph Users]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux