RE: [PATCH 7/8] zswap: add to mm/

> From: Seth Jennings [mailto:sjenning@xxxxxxxxxxxxxxxxxx]
> Subject: Re: [PATCH 7/8] zswap: add to mm/
> 
> On 01/03/2013 04:33 PM, Dan Magenheimer wrote:
> >> From: Seth Jennings [mailto:sjenning@xxxxxxxxxxxxxxxxxx]
> >>
> >> However, once the flushing code was introduced and could free an entry
> >> from the zswap_fs_store() path, it became necessary to add a per-entry
> >> refcount to make sure that the entry isn't freed while another code
> >> path was operating on it.
> >
> > Hmmm... doesn't the refcount at least need to be an atomic_t?
> 
> An entry's refcount is only ever changed under the tree lock, so
> making them atomic_t would be redundantly atomic.

Maybe I'm still missing something, but then I think you also
need to evaluate and act on the refcount (not just read it) while
your tree lock is held.  I.e., in:

> +		/* page is already in the swap cache, ignore for now */
> +		spin_lock(&tree->lock);
> +		refcount = zswap_entry_put(entry);
> +		spin_unlock(&tree->lock);
> +
> +		if (likely(refcount))
> +			return 0;
> +
> +		/* if the refcount is zero, invalidate must have come in */
> +		/* free */
> +		zs_free(tree->pool, entry->handle);
> +		zswap_entry_cache_free(entry);
> +		atomic_dec(&zswap_stored_pages);

the entry's refcount may be changed by another processor
immediately after the unlock, and then the "if (refcount)"
is testing a stale value and you will get (I think) a memory leak.
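
To illustrate what I mean by "evaluate and act while the lock is held",
here is a minimal user-space sketch.  It is not the zswap code: a pthread
mutex stands in for the tree lock, and struct entry, struct tree,
entry_put_locked() and entry_put_and_free() are hypothetical names I made
up for the example.  The point is only that the decrement, the test and
the teardown all happen in one critical section, so no other path can
see the entry in between.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the zswap structures. */
struct entry {
	int refcount;		/* protected by tree.lock, so a plain int suffices */
};

struct tree {
	pthread_mutex_t lock;
	int stored;		/* count of live entries, also protected by lock */
};

/* Drop one reference.  Caller must hold the tree lock. */
static int entry_put_locked(struct entry *e)
{
	assert(e->refcount > 0);
	return --e->refcount;
}

/*
 * Drop a reference and, if it was the last one, tear the entry down
 * before the lock is released.  Returns true if the entry was freed.
 */
static bool entry_put_and_free(struct tree *t, struct entry *e)
{
	bool freed;

	pthread_mutex_lock(&t->lock);
	freed = (entry_put_locked(e) == 0);
	if (freed) {
		t->stored--;
		free(e);
	}
	pthread_mutex_unlock(&t->lock);

	return freed;
}
```

Whether the real free can stay inside the lock depends on what zs_free()
may do, which I haven't checked; if it can't, the decision about who
frees at least has to be made before the unlock.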

There is similar racy code in zswap_fs_invalidate_page, which
I think could lead to a double free.  There's another,
I think, in zswap_fs_load...  And the refcount is also dec'd
in one path inside zswap_fs_store, which may
race with the above.

When flushing multiple zpages to free a pageframe, you may
need to test refcounts for all the entries while within the lock.
If so, this is one place where the high-density storage will make
things messy, especially if page boundaries are crossed.
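
A rough user-space sketch of the "test all the entries within the lock"
idea, under loud assumptions: struct zentry, its flushing flag and
frame_claim_all() are hypothetical, a pthread mutex stands in for the
tree lock, and I am assuming that refcount == 1 means only the tree
itself holds a reference.  It ignores zpages that cross page boundaries,
which is exactly where it would get messy.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one compressed page kept in a pageframe. */
struct zentry {
	int refcount;	/* protected by the tree lock */
	bool flushing;	/* set once the flusher has claimed the entry */
};

/*
 * Claim every entry in a pageframe for flushing, or none of them.
 * All refcounts are tested and all flags are set in a single critical
 * section, so no entry can change state between the test and the claim.
 * Assumption: refcount == 1 means only the tree holds a reference.
 */
static bool frame_claim_all(pthread_mutex_t *lock,
			    struct zentry **entries, size_t n)
{
	bool idle = true;
	size_t i;

	assert(entries != NULL);

	pthread_mutex_lock(lock);
	for (i = 0; i < n; i++) {
		if (entries[i]->refcount != 1 || entries[i]->flushing) {
			idle = false;	/* someone else is using this entry */
			break;
		}
	}
	if (idle) {
		for (i = 0; i < n; i++)
			entries[i]->flushing = true;
	}
	pthread_mutex_unlock(lock);

	return idle;
}
```

If the claim fails, the flusher would simply move on to another
pageframe rather than wait.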

A nit: Even I, steeped in tmem terminology, was confused by
your use of "fs"... nearly all readers will
read it as "filesystem", which is mystifying.
Just spell it out as "frontswap", even if it causes a few
lines to be wrapped.

Have a good weekend!
Dan
