Hi Sasha, On 10/06/2017 04:17 AM, Levin, Alexander (Sasha Levin) wrote: > From: Florian Fainelli <f.fainelli@xxxxxxxxx> > > This patch has been added to the v4.1 stable tree. If you have any > objections, please let us know. > > =============== > > [ Upstream commit ebc8254aeae34226d0bc8fda309fd9790d4dccfe ] For some reason 7ad813f208533cebfcc32d3d7474dc1677d1b09a ("net: phy: > Correctly process PHY_HALTED in phy_stop_machine()") actually made it back into the stable-4.1.y tree: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-4.1.y&id=ef0680a31df994e1d77b840f37772ff727cdc8ca it should be dropped. Thanks! > > This reverts commit 7ad813f208533cebfcc32d3d7474dc1677d1b09a ("net: phy: > Correctly process PHY_HALTED in phy_stop_machine()") because it is > creating the possibility for a NULL pointer dereference. > > David Daney provide the following call trace and diagram of events: > > When ndo_stop() is called we call: > > phy_disconnect() > +---> phy_stop_interrupts() implies: phydev->irq = PHY_POLL; > +---> phy_stop_machine() > | +---> phy_state_machine() > | +----> queue_delayed_work(): Work queued. > +--->phy_detach() implies: phydev->attached_dev = NULL; > > Now at a later time the queued work does: > > phy_state_machine() > +---->netif_carrier_off(phydev->attached_dev): Oh no! It is NULL: > > CPU 12 Unable to handle kernel paging request at virtual address > 0000000000000048, epc == ffffffff80de37ec, ra == ffffffff80c7c > Oops[#1]: > CPU: 12 PID: 1502 Comm: kworker/12:1 Not tainted 4.9.43-Cavium-Octeon+ #1 > Workqueue: events_power_efficient phy_state_machine > task: 80000004021ed100 task.stack: 8000000409d70000 > $ 0 : 0000000000000000 ffffffff84720060 0000000000000048 0000000000000004 > $ 4 : 0000000000000000 0000000000000001 0000000000000004 0000000000000000 > $ 8 : 0000000000000000 0000000000000000 00000000ffff98f3 0000000000000000 > $12 : 8000000409d73fe0 0000000000009c00 ffffffff846547c8 000000000000af3b > $16 : 80000004096bab68 80000004096babd0 0000000000000000 80000004096ba800 > $20 : 0000000000000000 0000000000000000 ffffffff81090000 0000000000000008 > $24 : 0000000000000061 ffffffff808637b0 > $28 : 8000000409d70000 8000000409d73cf0 80000000271bd300 ffffffff80c7804c > Hi : 000000000000002a > Lo : 000000000000003f > epc : ffffffff80de37ec netif_carrier_off+0xc/0x58 > ra : ffffffff80c7804c phy_state_machine+0x48c/0x4f8 > Status: 14009ce3 KX SX UX KERNEL EXL IE > Cause : 00800008 (ExcCode 02) > BadVA : 0000000000000048 > PrId : 000d9501 (Cavium Octeon III) > Modules linked in: > Process kworker/12:1 (pid: 1502, threadinfo=8000000409d70000, > task=80000004021ed100, tls=0000000000000000) > Stack : 8000000409a54000 80000004096bab68 80000000271bd300 80000000271c1e00 > 0000000000000000 ffffffff808a1708 8000000409a54000 80000000271bd300 > 80000000271bd320 8000000409a54030 ffffffff80ff0f00 0000000000000001 > ffffffff81090000 ffffffff808a1ac0 8000000402182080 ffffffff84650000 > 8000000402182080 ffffffff84650000 ffffffff80ff0000 8000000409a54000 > ffffffff808a1970 0000000000000000 80000004099e8000 8000000402099240 > 0000000000000000 ffffffff808a8598 0000000000000000 8000000408eeeb00 > 8000000409a54000 00000000810a1d00 0000000000000000 8000000409d73de8 > 8000000409d73de8 0000000000000088 000000000c009c00 8000000409d73e08 > 8000000409d73e08 8000000402182080 ffffffff808a84d0 8000000402182080 > ... > Call Trace: > [<ffffffff80de37ec>] netif_carrier_off+0xc/0x58 > [<ffffffff80c7804c>] phy_state_machine+0x48c/0x4f8 > [<ffffffff808a1708>] process_one_work+0x158/0x368 > [<ffffffff808a1ac0>] worker_thread+0x150/0x4c0 > [<ffffffff808a8598>] kthread+0xc8/0xe0 > [<ffffffff808617f0>] ret_from_kernel_thread+0x14/0x1c > > The original motivation for this change originated from Marc Gonzales > indicating that his network driver did not have its adjust_link callback > executing with phydev->link = 0 while he was expecting it. > > PHYLIB has never made any such guarantees ever because phy_stop() merely just > tells the workqueue to move into PHY_HALTED state which will happen > asynchronously. > > Reported-by: Geert Uytterhoeven <geert+renesas@xxxxxxxxx> > Reported-by: David Daney <ddaney.cavm@xxxxxxxxx> > Fixes: 7ad813f20853 ("net: phy: Correctly process PHY_HALTED in phy_stop_machine()") > Signed-off-by: Florian Fainelli <f.fainelli@xxxxxxxxx> > Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx> > Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxx> > --- > drivers/net/phy/phy.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c > index 1ca78b46c01b..21a668faacd7 100644 > --- a/drivers/net/phy/phy.c > +++ b/drivers/net/phy/phy.c > @@ -512,9 +512,6 @@ void phy_stop_machine(struct phy_device *phydev) > if (phydev->state > PHY_UP && phydev->state != PHY_HALTED) > phydev->state = PHY_UP; > mutex_unlock(&phydev->lock); > - > - /* Now we can run the state machine synchronously */ > - phy_state_machine(&phydev->state_queue.work); > } > > /** > -- Florian