Re: [BUG 3.0.0-rc1] ksm: NULL pointer dereference in ksm_do_scan()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2 Jun 2011, Chris Wright wrote:
> * Andrea Righi (andrea@xxxxxxxxxxxxxxx) wrote:
> > mmh.. I can reproduce the bug also with the standard ubuntu (11.04)
> > kernel. Could you post your .config?
> 
> Andrea (Righi), can you tell me if this WARN fires?  This looks
> like a pure race between removing from list and checking list, i.e.
> insufficient locking.
> 
> ksm_scan.mm_slot == the only registered mm
> 
> CPU 1 (bug program)		CPU 2 (ksmd)
> 				list_empty() is false
> lock
> ksm_scan.mm_slot
> list_del
> unlock
> 				slot == &ksm_mm_head (but list is now empty_)
> 
> 
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 942dfc7..ab79a92 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -1301,6 +1301,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
>  		slot = list_entry(slot->mm_list.next, struct mm_slot, mm_list);
>  		ksm_scan.mm_slot = slot;
>  		spin_unlock(&ksm_mmlist_lock);
> +		WARN_ON(slot == &ksm_mm_head);
>  next_mm:
>  		ksm_scan.address = 0;
>  		ksm_scan.rmap_list = &slot->rmap_list;

AndreaR, good find, many thanks for discovering and reporting it.
I couldn't look at it until last night, and even then, it was not
obvious to me exactly where my assumptions were going wrong.

Even now it's unclear what role the SIGSEGV plays, as opposed to an
normal exit: I guess it just happens to change the timing enough to
make the race dangerous.

Your patch was not wrong, but I do prefer a patch that plugs the
exact hole; and I needed to understand what was going on - without
understanding it, there was a danger we might leak memory instead.

AndreaA, I didn't study the patch you posted half an hour ago,
since by that time I'd worked it out and was preparing patch below.
I think your patch would be for a different bug, hopefully one we
don't have, it looks more complicated than we should need for this.

ChrisW, yes, your WARN_ON is spot on, matches what I saw exactly.

I'll fill in the patch description later, must dash now, probably
offline until late tonight.  Or if you're satisfied and don't want to
wait, you guys fill that in and send off to Linus & Andrew - thanks.

[PATCH] ksm: fix easily reproduced NULL pointer dereference

Reported-by: Andrea Righi <andrea@xxxxxxxxxxxxxxx>
Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: stable@xxxxxxxxxx
---

 mm/ksm.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- 3.0-rc1/mm/ksm.c	2011-05-29 18:42:37.429882601 -0700
+++ linux/mm/ksm.c	2011-06-02 09:55:31.729702490 -0700
@@ -1302,6 +1302,13 @@ static struct rmap_item *scan_get_next_r
 		slot = list_entry(slot->mm_list.next, struct mm_slot, mm_list);
 		ksm_scan.mm_slot = slot;
 		spin_unlock(&ksm_mmlist_lock);
+
+		/*
+		 * Although we tested list_empty() above, a racing __ksm_exit
+		 * of the last mm on the list may have removed it since then.
+		 */
+		if (slot == &ksm_mm_head)
+			return NULL;
 next_mm:
 		ksm_scan.address = 0;
 		ksm_scan.rmap_list = &slot->rmap_list;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]