On 02/18/2013 11:24 AM, Seth Jennings wrote:
On 02/15/2013 10:04 PM, Ric Mason wrote:
On 02/14/2013 02:38 AM, Seth Jennings wrote:
<snip>
+/* invalidates all pages for the given swap type */
+static void zswap_frontswap_invalidate_area(unsigned type)
+{
+	struct zswap_tree *tree = zswap_trees[type];
+	struct rb_node *node, *next;
+	struct zswap_entry *entry;
+
+	if (!tree)
+		return;
+
+	/* walk the tree and free everything */
+	spin_lock(&tree->lock);
+	node = rb_first(&tree->rbroot);
+	while (node) {
+		entry = rb_entry(node, struct zswap_entry, rbnode);
+		zs_free(tree->pool, entry->handle);
+		next = rb_next(node);
+		zswap_entry_cache_free(entry);
+		node = next;
+	}
+	tree->rbroot = RB_ROOT;
Why isn't rb_erase() needed for every node?
We are freeing the entire tree here. try_to_unuse() in the swapoff
syscall should have already emptied the tree, but this is here for
completeness.
rb_erase() will do things like rebalancing the tree; something that
just wastes time since we are in the process of freeing the whole
tree. We are holding the tree lock here so we are sure that no one
else is accessing the tree while it is in this transient broken state.
If we have a sub-tree like:

      ...
     /
    A
   / \
  B   C

B == rb_first(tree)
A == rb_next(B)
C == rb_next(A)
The current code frees A (via zswap_entry_cache_free()) before
examining C. Since C has no right child, rb_next(C) climbs to C's
parent, which is A, and thus results in a use-after-free of A.
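To make the access order concrete, here is a small userspace sketch. It is not kernel code: struct demo_node, demo_next() and the other demo_* names are hypothetical stand-ins, and nodes are only marked as freed rather than actually freed, so the stale access can be counted safely. A is the root here instead of hanging off a larger tree.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rb_node: only the links plus a flag. */
struct demo_node {
	struct demo_node *parent, *left, *right;
	bool freed;	/* set instead of really freeing, so the demo is safe */
};

static int stale_reads;	/* parents visited after being marked freed */

static struct demo_node *touch(struct demo_node *n)
{
	if (n && n->freed)
		stale_reads++;
	return n;
}

/* In-order successor, following the same general shape as rb_next(). */
static struct demo_node *demo_next(struct demo_node *n)
{
	struct demo_node *p;

	if (n->right) {
		n = n->right;
		while (n->left)
			n = n->left;
		return n;
	}
	/* No right child: climb until we arrive from a left child. */
	while ((p = touch(n->parent)) && n == p->right)
		n = p;
	return p;
}

static struct demo_node *demo_first(struct demo_node *root)
{
	while (root && root->left)
		root = root->left;
	return root;
}

/* Mirrors the loop in the patch: fetch next, "free" current, advance. */
static int demo_inorder_free(struct demo_node *root)
{
	struct demo_node *node = demo_first(root), *next;

	stale_reads = 0;
	while (node) {
		next = demo_next(node);
		node->freed = true;
		node = next;
	}
	return stale_reads;
}

/* Builds the three-node sub-tree A(B, C) and runs the loop on it. */
static int demo_run(void)
{
	struct demo_node a = {0}, b = {0}, c = {0};

	a.left = &b;
	a.right = &c;
	b.parent = &a;
	c.parent = &a;
	return demo_inorder_free(&a);
}
```

demo_run() returns a non-zero count: the walk visits B, A, C in that order, and the successor lookup for C has to look at A after A was already released.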
You can solve this by doing a post-order traversal of the tree, either
a) in the destructive manner used in a number of filesystems, see
fs/ubifs/orphan.c ubifs_add_orphan(), for example.
b) or by doing something similar to this commit, which I've been
using for some yet-to-be-merged code:
https://github.com/jmesmon/linux/commit/d9e43aaf9e8a447d6802531d95a1767532339fad
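For reference, option a) might look roughly like the following userspace sketch. It is only an illustration of the destructive post-order idea, not the ubifs code and not the code from the commit above; struct po_node and the po_* helpers are hypothetical, and plain malloc-style allocation stands in for the entry cache.

```c
#include <stdlib.h>

/* Hypothetical node type with the same kind of links as an rb_node. */
struct po_node {
	struct po_node *parent, *left, *right;
};

/*
 * Destructive post-order walk: always descend to a leaf, unlink it from
 * its parent, free it, then continue from the parent. A node is freed
 * only after both of its children are gone, so no freed node is ever
 * read again. Returns the number of nodes freed.
 */
static size_t po_destroy(struct po_node *root)
{
	struct po_node *n = root, *parent;
	size_t freed = 0;

	while (n) {
		if (n->left) {
			n = n->left;
			continue;
		}
		if (n->right) {
			n = n->right;
			continue;
		}
		/* n is a leaf: detach it from its parent before freeing. */
		parent = n->parent;
		if (parent) {
			if (parent->left == n)
				parent->left = NULL;
			else
				parent->right = NULL;
		}
		free(n);
		freed++;
		n = parent;
	}
	return freed;
}

/* Allocates a zeroed node and records its parent. */
static struct po_node *po_new(struct po_node *parent)
{
	struct po_node *n = calloc(1, sizeof(*n));

	if (n)
		n->parent = parent;
	return n;
}

/* Builds the sub-tree A(B, C) from the example and tears it down. */
static size_t po_demo(void)
{
	struct po_node *a = po_new(NULL);

	if (!a)
		return 0;
	a->left = po_new(a);	/* B */
	a->right = po_new(a);	/* C */
	return po_destroy(a);
}
```

With this order B and C are freed before A, and no rebalancing is done, so it keeps the "don't bother with rb_erase()" property of the original loop. In the kernel the same shape would free each entry with zs_free() and zswap_entry_cache_free() in place of free().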