Re: kernel panic in latest vanilla stable, while using nameif with "alive" pppoe interfaces

Eric Dumazet <eric.dumazet@xxxxxxxxx> · Mon, 19 Oct 2009 14:36:11 +0200

Michal Ostrowski wrote:
> Here's my theory on this after an initial look...
> 
> Looking at the oops report and disassembly of the actual module binary
> that caused the oops, one can deduce that:
> 
> Execution was in pppoe_flush_dev().  %ebx contained the pointer "struct
> pppox_sock *po", which is what we faulted on, executing "cmp %eax, 0x190(%ebx)".
> %ebx value was 0xffffffff (hence we got "NULL pointer dereference at 0x18f").
> 
> At this point "i" (stored in %esi) is 15 (valid), meaning that we got a value
> of 0xffffffff in pn->hash_table[i].
> 
> From this I'd hypothesize that the combination of dev_put() and release_sock()
> may have allowed us to free "pn".  At the bottom of the loop we already
> recognize that since locks are dropped we're responsible for handling
> invalidation of objects, and perhaps that should be extended to "pn" as well.
> --
> Michal Ostrowski
> mostrows@xxxxxxxxx
> 
> 
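The fault address in the quoted analysis follows from 32-bit wraparound:
adding the 0x190 field offset to a pointer value of 0xffffffff wraps to
0x18f. A minimal userspace check of that arithmetic (fault_addr is a
hypothetical helper, not a kernel function):

```c
#include <stdint.h>

/* Effective address of "0x190(%ebx)" on a 32-bit machine:
 * base + displacement, truncated to 32 bits. */
static uint32_t fault_addr(uint32_t base, uint32_t disp)
{
    return base + disp;   /* unsigned arithmetic wraps modulo 2^32 */
}
```

With base 0xffffffff and displacement 0x190 this yields 0x18f, which is
why the oops is reported as a NULL pointer dereference even though the
pointer itself was not NULL.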

Looking at this code, I believe the flush_lock protection is not
done properly.

At the end of pppoe_connect(), for example, we find:

err_put:
        if (po->pppoe_dev) {
                dev_put(po->pppoe_dev);
                po->pppoe_dev = NULL;
        }

This is done without any protection, and can therefore race with
pppoe_flush_dev():

	spin_lock(&flush_lock);
	po->pppoe_dev = NULL; /* pppoe_dev can already be NULL before this point */
	spin_unlock(&flush_lock);

	dev_put(dev);    /* oops */
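
One way to close this kind of window, sketched here in userspace C with a
pthread mutex instead of kernel primitives: read and clear the pointer
inside a single critical section, so whichever path gets there first owns
the reference and the other one sees NULL. The names fake_dev, fake_sock,
detach_dev and release_dev are hypothetical, and this is only an
illustration of the idea, not a proposed patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct net_device and struct pppox_sock. */
struct fake_dev  { int refcnt; };
struct fake_sock { struct fake_dev *pppoe_dev; };

static pthread_mutex_t flush_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for dev_put(): drop one reference, trap on underflow. */
static void fake_dev_put(struct fake_dev *dev)
{
    assert(dev->refcnt > 0);
    dev->refcnt--;
}

/* Read and clear po->pppoe_dev atomically with respect to flush_lock.
 * Returns the previous pointer, or NULL if another path already took it. */
static struct fake_dev *detach_dev(struct fake_sock *po)
{
    struct fake_dev *dev;

    pthread_mutex_lock(&flush_lock);
    dev = po->pppoe_dev;
    po->pppoe_dev = NULL;
    pthread_mutex_unlock(&flush_lock);
    return dev;
}

/* Used by both the connect error path and the flush path in this sketch:
 * only the caller that actually detached the device drops the reference. */
static void release_dev(struct fake_sock *po)
{
    struct fake_dev *dev = detach_dev(po);

    if (dev)
        fake_dev_put(dev);
}
```

Calling release_dev() twice on the same socket drops the reference only
once, because the second call gets NULL back from detach_dev().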