[Michal Ostrowski - Mon, Oct 19, 2009 at 08:19:23AM -0500]
|
| The entire scheme for managing net namespaces seems unsafe.  We depend
| on synchronization via pn->hash_lock, but have no guarantee of the
| existence of the "net" object -- hence no way to ensure the existence
| of the lock itself.  This should be relatively easy to fix though as
| we should be able to get/put the net namespace as we add/remove
| objects to/from the pppoe hash.
|

Hmm... it seems not. The only possible scenario I see for such a
nonexistent namespace is that it was cached via RCU and returned
before the grace period elapsed, so perhaps we need to call
synchronize_net() somewhere.

| Once you solve this existence issue, the flush_lock can be eliminated
| altogether since all of the relevant code paths already depend on a
| write_lock_bh(&pn->hash_lock), and that's the lock that should be used
| to protect the pppoe_dev field.
|
| Another patch to follow later...
|
| --
| Michal Ostrowski
| mostrows@xxxxxxxxx
|
|
|
| On Mon, Oct 19, 2009 at 7:36 AM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
| > Michal Ostrowski wrote:
| >> Here's my theory on this after an initial look...
| >>
| >> Looking at the oops report and disassembly of the actual module binary
| >> that caused the oops, one can deduce that:
| >>
| >> Execution was in pppoe_flush_dev().  %ebx contained the pointer "struct
| >> pppox_sock *po", which is what we faulted on, executing "cmp %eax, 0x190(%ebx)".
| >> %ebx value was 0xffffffff (hence we got "NULL pointer dereference at 0x18f").
| >>
| >> At this point "i" (stored in %esi) is 15 (valid), meaning that we got a value
| >> of 0xffffffff in pn->hash_table[i].
| >>
| >> From this I'd hypothesize that the combination of dev_put() and release_sock()
| >> may have allowed us to free "pn".  At the bottom of the loop we already
| >> recognize that since locks are dropped we're responsible for handling
| >> invalidation of objects, and perhaps that should be extended to "pn" as well.
| >> --
| >> Michal Ostrowski
| >> mostrows@xxxxxxxxx
| >>
| >>
| >
| > Looking at this stuff, I do believe flush_lock protection is not
| > properly done.
| >
| > At the end of pppoe_connect() for example we can find :
| >
| > err_put:
| >	if (po->pppoe_dev) {
| >		dev_put(po->pppoe_dev);
| >		po->pppoe_dev = NULL;
| >	}

Yep, this is unsafe, thanks!

| >
| > This is done without any protection, and can therefore clash with
| > pppoe_flush_dev() :
| >
| >	spin_lock(&flush_lock);
| >	po->pppoe_dev = NULL; /* pppoe_dev can already be NULL before this point */
| >	spin_unlock(&flush_lock);
| >
| >	dev_put(dev); /* oops */
| >
|

Denys, could you check if the patch below helps?

	-- Cyrill
---
 drivers/net/pppoe.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6.git/drivers/net/pppoe.c
=====================================================================
--- linux-2.6.git.orig/drivers/net/pppoe.c
+++ linux-2.6.git/drivers/net/pppoe.c
@@ -312,9 +312,9 @@ static void pppoe_flush_dev(struct net_d
 			}
 			sk = sk_pppox(po);
 			spin_lock(&flush_lock);
+			dev_put(po->pppoe_dev);
 			po->pppoe_dev = NULL;
 			spin_unlock(&flush_lock);
-			dev_put(dev);
 
 			/* We always grab the socket lock, followed by the
 			 * hash_lock, in that order.  Since we should
@@ -708,10 +708,12 @@ end:
 	release_sock(sk);
 	return error;
 err_put:
+	spin_lock(&flush_lock);
 	if (po->pppoe_dev) {
 		dev_put(po->pppoe_dev);
 		po->pppoe_dev = NULL;
 	}
+	spin_unlock(&flush_lock);
 	goto end;
 }
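
For anyone following the thread, the idea behind taking flush_lock on both
paths can be modelled in userspace. This is only a sketch, not kernel code:
fake_dev, fake_sock, sock_release_dev() and the atomic_flag spinlock are
hypothetical stand-ins for struct net_device, struct pppox_sock, the
err_put/flush paths and flush_lock. The NULL re-check under the lock is an
assumption of the sketch; the pppoe_flush_dev() hunk in the patch does not
have it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device: only the refcount matters. */
struct fake_dev {
	int refcnt;
};

/* Hypothetical stand-in for struct pppox_sock. */
struct fake_sock {
	struct fake_dev *pppoe_dev;	/* models po->pppoe_dev */
};

/* Minimal spinlock modelling flush_lock (assumption: C11 atomics). */
static atomic_flag flush_lock = ATOMIC_FLAG_INIT;

static void lock_flush(void)
{
	while (atomic_flag_test_and_set_explicit(&flush_lock,
						 memory_order_acquire))
		;	/* spin until the flag is released */
}

static void unlock_flush(void)
{
	atomic_flag_clear_explicit(&flush_lock, memory_order_release);
}

static void fake_dev_hold(struct fake_dev *dev) { dev->refcnt++; }
static void fake_dev_put(struct fake_dev *dev)  { dev->refcnt--; }

/*
 * Drop the socket's device reference at most once.  The test of the
 * pointer, the put and the clearing all happen under the same lock, so
 * a second caller sees NULL and does nothing.
 * Returns 1 if this caller dropped the reference, 0 otherwise.
 */
static int sock_release_dev(struct fake_sock *po)
{
	int dropped = 0;

	lock_flush();
	if (po->pppoe_dev != NULL) {
		fake_dev_put(po->pppoe_dev);
		po->pppoe_dev = NULL;
		dropped = 1;
	}
	unlock_flush();
	return dropped;
}

/* Run both "paths" back to back and report the final refcount. */
int demo_refcount_after_both_paths(void)
{
	struct fake_dev dev = { .refcnt = 1 };	/* baseline reference */
	struct fake_sock po = { .pppoe_dev = NULL };

	fake_dev_hold(&dev);		/* the connect path takes a reference */
	po.pppoe_dev = &dev;

	sock_release_dev(&po);		/* e.g. the err_put path */
	sock_release_dev(&po);		/* e.g. the flush path, finds NULL */

	return dev.refcnt;
}
```

Calling sock_release_dev() from both paths leaves the count at its baseline,
because the second caller finds the pointer already cleared and skips the put.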