Memory corruption after #1686

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Ming,

I think I got it now. reset_entry fails to preserve the pj_hash_entry_buf in cached_res.

(the key was a red herring, that is also set again later).

If the pj_hash_entry_buf isn't preserved, pj_hash_set_np will call find_entry in hash.c and that will not find an entry, although there should be one.

In find_entry, after the scan around line 177, *p_entry will contain the beginning of the bucket's list (ht->table[hash & ht->rows]) and a new entry will be taken from the entry_buf (which will also point to the beginning of the bucket's list, because that's where the old entry was and reset_entry should not change the address of that). 

Then, in line 218, a loop will be created:

*p_entry = entry;

(*p_entry is the same as entry->next given the current layout).

This will create a loop in the bucket's list and either lead to the memory corruption I initially observed or maybe an infinite loop later in case of a hash collision on that hash table.

I hope this analysis finally makes sense.

This is the current patch:

--- a/pjlib-util/src/pjlib-util/resolver.c
+++ b/pjlib-util/src/pjlib-util/resolver.c
@@ -710,16 +710,19 @@ static struct cached_res *alloc_entry(pj_dns_resolver *resolver)
 static void reset_entry(struct cached_res **p_cached)
 {
     pj_pool_t *pool;
+    pj_hash_entry_buf hbuf;
     struct cached_res *cache = *p_cached;
     unsigned ref_cnt;
 
     pool = cache->pool;
+    pj_memcpy(hbuf, cache->hbuf, sizeof(hbuf));
     ref_cnt = cache->ref_cnt;
 
     pj_pool_reset(pool);
 
     cache = PJ_POOL_ZALLOC_T(pool, struct cached_res);
     cache->pool = pool;
+    pj_memcpy(cache->hbuf, hbuf, sizeof(hbuf));
     cache->ref_cnt = ref_cnt;
     *p_cached = cache;
 }

- Lars

P.S. Now that I have had a good look at it, I think this bug should be reproducable and shouldn't be platform dependent (not sure why I had sometimes problems reproducing it).

Also, if I may say so, pjsip's memory management is complicated.


> thanks for your quick answer.
> 
>> The entry update is done a few lines below the code patch you provided above:
>>    pj_hash_set_np(resolver->hrescache, &cache->key, sizeof(*key), hval,
>>     cache->hbuf, cache);
> 
> Ah, ok. I missed that.
> 
>> So I'm still not sure whether reset_entry() is the problem or it may be caused by something else.
> 
> I have thought about it some more: reset_entry clears the cache entry's key. pj_hash_set_np then sorts it into the wrong place. 
> 
> In my bug-case that lead to a loop in the linked list of the hash table (next pointed to itself). The actual memory corruption happened much later at pjsua_shutdown when the memory of the hash table was reclaimed, and the same memory was freed twice due to the loop in the linked list.
> 
> This patch fixes the problem and is much closer to your original intention:
> 
> --- a/pjlib-util/src/pjlib-util/resolver.c
> +++ b/pjlib-util/src/pjlib-util/resolver.c
> @@ -710,10 +710,12 @@ static struct cached_res *alloc_entry(pj_dns_resolver *resolver)
> static void reset_entry(struct cached_res **p_cached)
> {
>     pj_pool_t *pool;
> +    struct res_key key;
>     struct cached_res *cache = *p_cached;
>     unsigned ref_cnt;
> 
>     pool = cache->pool;
> +    key = cache->key;
>     ref_cnt = cache->ref_cnt;
> 
>     pj_pool_reset(pool);
> @@ -721,6 +723,7 @@ static void reset_entry(struct cached_res **p_cached)
>     cache = PJ_POOL_ZALLOC_T(pool, struct cached_res);
>     cache->pool = pool;
>     cache->ref_cnt = ref_cnt;
> +    cache->key = key;
>     *p_cached = cache;
> }
> 
> Regards,
> 
> - Lars
> 
>> 
>> On Wed, Aug 21, 2013 at 8:11 PM, Lars Immisch <lars at ibp.de> wrote:
>> Hi,
>> 
>> I've seen a memory corruption problem in pjlib-util/resolver.c that was introduced with the changes for #1686 (http://trac.pjsip.org/repos/ticket/1686)
>> 
>> This patch fixes the problem:
>> 
>> --- a/pjlib-util/src/pjlib-util/resolver.c
>> +++ b/pjlib-util/src/pjlib-util/resolver.c
>> @@ -706,24 +706,6 @@ static struct cached_res *alloc_entry(pj_dns_resolver *resolver)
>>     return cache;
>> }
>> 
>> -/* Re-allocate cache entry, to free cached packet */
>> -static void reset_entry(struct cached_res **p_cached)
>> -{
>> -    pj_pool_t *pool;
>> -    struct cached_res *cache = *p_cached;
>> -    unsigned ref_cnt;
>> -
>> -    pool = cache->pool;
>> -    ref_cnt = cache->ref_cnt;
>> -
>> -    pj_pool_reset(pool);
>> -
>> -    cache = PJ_POOL_ZALLOC_T(pool, struct cached_res);
>> -    cache->pool = pool;
>> -    cache->ref_cnt = ref_cnt;
>> -    *p_cached = cache;
>> -}
>> -
>> /* Put unused/expired cached entry to the free list */
>> static void free_entry(pj_dns_resolver *resolver, struct cached_res *cache)
>> {
>> @@ -816,7 +798,7 @@ PJ_DEF(pj_status_t) pj_dns_resolver_start_query( pj_dns_resolver *resolver,
>>             * the cache (as it has been expired).
>>             */
>>            cache->ref_cnt--;
>> -           if (cache->ref_cnt <= 0)
>> +           if (cache->ref_cnt == 0)
>>                free_entry(resolver, cache);
>> 
>>            /* Must return PJ_SUCCESS */
>> @@ -1251,11 +1233,8 @@ static void update_res_cache(pj_dns_resolver *resolver,
>>         */
>>        cache->ref_cnt--;
>>        cache = alloc_entry(resolver);
>> -    } else {
>> -       /* Reset cache to avoid bloated cache pool */
>> -       reset_entry(&cache);
>>     }
>> -
>> +
>>     /* Duplicate the packet.
>>      * We don't need to keep the NS and AR sections from the packet,
>>      * so exclude from duplication. We do need to keep the Query
>> --
>> 
>> (The second hunk is not strictly necessary, BTW, but ref_cnt is unsigned, so never < 0)
>> 
>> I haven't looked at it very closely, but I believe the problem ist that while reset_entry reallocates the cache entry, it does not refresh the links in the hash table.
>> 
>> There might be a better way to fix this and keep the intended memory usage optimization, but I didn't do it because the problem was hard to reproduce.
>> 
>> - Lars

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 496 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: <http://lists.pjsip.org/pipermail/pjsip_lists.pjsip.org/attachments/20130823/bed388f9/attachment-0001.asc>


[Index of Archives]     [Asterisk Users]     [Asterisk App Development]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [Linux API]
  Powered by Linux