Re: [PATCH for v4.9 LTS 035/111] net: phy: Fix lack of reference count on PHY driver

On 06/05/2017 12:58 PM, Levin, Alexander (Sasha Levin) wrote:
> On Mon, Jun 05, 2017 at 09:56:18AM -0700, Florian Fainelli wrote:
>> On 06/05/2017 05:15 AM, Levin, Alexander (Sasha Levin) wrote:
>>> On Sun, Jun 04, 2017 at 10:17:49AM -0700, Florian Fainelli wrote:
>>>> Hi Alex,
>>>>
>>>> On 06/04/2017 01:12 AM, Levin, Alexander (Sasha Levin) wrote:
>>>>> From: Mao Wenan <maowenan@xxxxxxxxxx>
>>>>>
>>>>> [ Upstream commit cafe8df8b9bc9aa3dffa827c1a6757c6cd36f657 ]
>>>>>
>>>>> There is currently no reference count being held on the PHY driver,
>>>>> which makes it possible to remove the PHY driver module while the PHY
>>>>> state machine is running and polling the PHY. This could cause crashes
>>>>> similar to this one to show up:
>>>>>
>>>>> [   43.361162] BUG: unable to handle kernel NULL pointer dereference at 0000000000000140
>>>>> [   43.361162] IP: phy_state_machine+0x32/0x490
>>>>> [   43.361162] PGD 59dc067
>>>>> [   43.361162] PUD 0
>>>>> [   43.361162]
>>>>> [   43.361162] Oops: 0000 [#1] SMP
>>>>> [   43.361162] Modules linked in: dsa_loop [last unloaded: broadcom]
>>>>> [   43.361162] CPU: 0 PID: 1299 Comm: kworker/0:3 Not tainted 4.10.0-rc5+ #415
>>>>> [   43.361162] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>>>>> BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014
>>>>> [   43.361162] Workqueue: events_power_efficient phy_state_machine
>>>>> [   43.361162] task: ffff880006782b80 task.stack: ffffc90000184000
>>>>> [   43.361162] RIP: 0010:phy_state_machine+0x32/0x490
>>>>> [   43.361162] RSP: 0018:ffffc90000187e18 EFLAGS: 00000246
>>>>> [   43.361162] RAX: 0000000000000000 RBX: ffff8800059e53c0 RCX:
>>>>> ffff880006a15c60
>>>>> [   43.361162] RDX: ffff880006782b80 RSI: 0000000000000000 RDI:
>>>>> ffff8800059e5428
>>>>> [   43.361162] RBP: ffffc90000187e48 R08: ffff880006a15c40 R09:
>>>>> 0000000000000000
>>>>> [   43.361162] R10: 0000000000000000 R11: 0000000000000000 R12:
>>>>> ffff8800059e5428
>>>>> [   43.361162] R13: ffff8800059e5000 R14: 0000000000000000 R15:
>>>>> ffff880006a15c40
>>>>> [   43.361162] FS:  0000000000000000(0000) GS:ffff880006a00000(0000)
>>>>> knlGS:0000000000000000
>>>>> [   43.361162] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [   43.361162] CR2: 0000000000000140 CR3: 0000000005979000 CR4:
>>>>> 00000000000006f0
>>>>> [   43.361162] Call Trace:
>>>>> [   43.361162]  process_one_work+0x1b4/0x3e0
>>>>> [   43.361162]  worker_thread+0x43/0x4d0
>>>>> [   43.361162]  ? __schedule+0x17f/0x4e0
>>>>> [   43.361162]  kthread+0xf7/0x130
>>>>> [   43.361162]  ? process_one_work+0x3e0/0x3e0
>>>>> [   43.361162]  ? kthread_create_on_node+0x40/0x40
>>>>> [   43.361162]  ret_from_fork+0x29/0x40
>>>>> [   43.361162] Code: 56 41 55 41 54 4c 8d 67 68 53 4c 8d af 40 fc ff ff
>>>>> 48 89 fb 4c 89 e7 48 83 ec 08 e8 c9 9d 27 00 48 8b 83 60 ff ff ff 44 8b
>>>>> 73 98 <48> 8b 90 40 01 00 00 44 89 f0 48 85 d2 74 08 4c 89 ef ff d2 8b
>>>>>
>>>>> Keep references on the PHY driver module right before we are going to
>>>>> utilize it in phy_attach_direct(), and conversely when we don't use it
>>>>> anymore in phy_detach().
>>>>>
>>>>> Signed-off-by: Mao Wenan <maowenan@xxxxxxxxxx>
>>>>> [florian: rebase, rework commit message]
>>>>> Signed-off-by: Florian Fainelli <f.fainelli@xxxxxxxxx>
>>>>> Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
>>>>>
>>>>> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxx>
>>>>
>>>> This commit alone will cause problems; you will also need to pick this
>>>> one on top of it:
>>>>
>>>> 6d9f66ac7fec2a6ccd649e5909806dfe36f1fc25 ("net: phy: Fix PHY module
>>>> checks and NULL deref in phy_attach_direct()")
>>>
>>> Should I also be grabbing a7dac9f9c1
>>> ("phy: fix error case of phy_led_triggers_(un)register")?
>>>
>>> It says it fixes a commit that's not in -stable, but it looks like it's
>>> still relevant even without that commit.
>>
>> No, you don't have to pick this one, it does indeed fix something that
>> was only introduced in 4.10 and newer.
> 
> Hm, can you ack this conflict resolution of applying 6d9f66ac7fe on top
> of 4.9.30 (in particular, the code in phy_attach_direct()):

Acked-by: Florian Fainelli <f.fainelli@xxxxxxxxx>

Sorry this took a while to test: I also exercised the error paths to
make sure they were not blowing up on us. FWIW, attached is the patch
that I used.
-- 
Florian
From 2a33694744e3ed2d33c4a530118ab46d20fbe6fc Mon Sep 17 00:00:00 2001
From: Florian Fainelli <f.fainelli@xxxxxxxxx>
Date: Wed, 8 Feb 2017 19:05:26 -0800
Subject: [PATCH] net: phy: Fix PHY module checks and NULL deref in
 phy_attach_direct()

The Generic PHY drivers get assigned after we have checked that the
current PHY driver is NULL, so we need to check a few things before we
can safely dereference d->driver. Otherwise, a NULL dereference occurs
when a system binds to the Generic PHY driver. Update
phy_attach_direct() to do the following:

- grab the driver module reference after we have assigned the Generic
  PHY drivers accordingly, and remember we came from the generic PHY
  path

- update the error path to clean up the module reference in case the
  Generic PHY probe function fails

- split the error path involving phy_detach() to avoid double free/put
  since phy_detach() does all the clean up

- finally, have phy_detach() drop the module reference count before we
  call device_release_driver() for the Generic PHY driver case
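
The ordering in the list above can be illustrated with a small userspace
model. This is only a sketch: every model_* name is a hypothetical
stand-in rather than a kernel API, the bus module reference is left out,
and the real change is the diff that follows. It assumes, as in the
driver core, that releasing the driver clears d->driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct module / device_driver / device. */
struct model_module { int refcnt; bool unloading; };
struct model_driver { struct model_module *owner; int (*probe)(void); };
struct model_device { struct model_driver *driver; int refcnt; };

/* Models try_module_get(): fails once the module is going away. */
static bool model_try_get(struct model_module *m)
{
	if (m->unloading)
		return false;
	m->refcnt++;
	return true;
}

/* Models module_put(); the assert catches a double put. */
static void model_put(struct model_module *m)
{
	assert(m->refcnt > 0);
	m->refcnt--;
}

static int model_probe_ok(void)   { return 0; }
static int model_probe_fail(void) { return -1; }

/* Models the reworked attach path: assign the generic driver first,
 * only then take the module reference through d->driver->owner.
 */
static int model_attach(struct model_device *d, struct model_driver *generic)
{
	bool using_generic = false;
	int err;

	d->refcnt++;				/* models get_device(d) */

	if (!d->driver) {
		d->driver = generic;		/* d->driver is non-NULL from here on */
		using_generic = true;
	}

	if (!model_try_get(d->driver->owner)) {
		err = -1;
		goto error_put_device;
	}

	if (using_generic) {
		err = d->driver->probe();
		if (err)
			goto error_module_put;
	}

	return 0;

error_module_put:
	model_put(d->driver->owner);
error_put_device:
	d->refcnt--;				/* models put_device(d) */
	return err;
}

/* Models the detach path: drop the module reference while d->driver is
 * still valid, then release the generic driver.
 */
static void model_detach(struct model_device *d,
			 const struct model_driver *generic)
{
	model_put(d->driver->owner);
	if (d->driver == generic)
		d->driver = NULL;		/* models device_release_driver() */
	d->refcnt--;
}
```

With a device that starts without a driver, a successful model_attach()
leaves one module reference and one device reference, and model_detach()
drops both. A failing probe or an unloading module leaves both counts at
zero.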

Fixes: cafe8df8b9bc ("net: phy: Fix lack of reference count on PHY driver")
Signed-off-by: Florian Fainelli <f.fainelli@xxxxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
---
 drivers/net/phy/phy_device.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 67571f9627e5..14d57d0d1c04 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -860,6 +860,7 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
 	struct module *ndev_owner = dev->dev.parent->driver->owner;
 	struct mii_bus *bus = phydev->mdio.bus;
 	struct device *d = &phydev->mdio.dev;
+	bool using_genphy = false;
 	int err;
 
 	/* For Ethernet device drivers that register their own MDIO bus, we
@@ -872,11 +873,6 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
 		return -EIO;
 	}
 
-	if (!try_module_get(d->driver->owner)) {
-		dev_err(&dev->dev, "failed to get the device driver module\n");
-		return -EIO;
-	}
-
 	get_device(d);
 
 	/* Assume that if there is no driver, that it doesn't
@@ -890,12 +886,22 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
 			d->driver =
 				&genphy_driver[GENPHY_DRV_1G].mdiodrv.driver;
 
+		using_genphy = true;
+	}
+
+	if (!try_module_get(d->driver->owner)) {
+		dev_err(&dev->dev, "failed to get the device driver module\n");
+		err = -EIO;
+		goto error_put_device;
+	}
+
+	if (using_genphy) {
 		err = d->driver->probe(d);
 		if (err >= 0)
 			err = device_bind_driver(d);
 
 		if (err)
-			goto error;
+			goto error_module_put;
 	}
 
 	if (phydev->attached_dev) {
@@ -931,8 +937,14 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
 	return err;
 
 error:
-	put_device(d);
+	/* phy_detach() does all of the cleanup below */
+	phy_detach(phydev);
+	return err;
+
+error_module_put:
 	module_put(d->driver->owner);
+error_put_device:
+	put_device(d);
 	if (ndev_owner != bus->owner)
 		module_put(bus->owner);
 	return err;
@@ -993,6 +1005,8 @@ void phy_detach(struct phy_device *phydev)
 	phydev->attached_dev = NULL;
 	phy_suspend(phydev);
 
+	module_put(phydev->mdio.dev.driver->owner);
+
 	/* If the device had no specific driver before (i.e. - it
 	 * was using the generic driver), we unbind the device
 	 * from the generic driver so that there's a chance a
@@ -1013,7 +1027,6 @@ void phy_detach(struct phy_device *phydev)
 	bus = phydev->mdio.bus;
 
 	put_device(&phydev->mdio.dev);
-	module_put(phydev->mdio.dev.driver->owner);
 	if (ndev_owner != bus->owner)
 		module_put(bus->owner);
 }
-- 
2.9.3

