Re: [PATCH] mm:Avoid soft lockup due to possible attempt of double locking object's lock in __delete_object

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 2016-08-31 05:08 PM, Valdis.Kletnieks@xxxxxx wrote:
> On Wed, 31 Aug 2016 08:54:21 +0100, Catalin Marinas said:
>> On Tue, Aug 30, 2016 at 02:35:12PM -0400, Nicholas Krause wrote:
>>> This fixes a issue in the current locking logic of the function,
>>> __delete_object where we are trying to attempt to lock the passed
>>> object structure's spinlock again after being previously held
>>> elsewhere by the kmemleak code. Fix this by instead of assuming
>>> we are the only one contending for the object's lock their are
>>> possible other users and create two branches, one where we get
>>> the lock when calling spin_trylock_irqsave on the object's lock
>>> and the other when the lock is held else where by kmemleak.
>>
>> Have you actually got a deadlock that requires this fix?
> 
> Almost certainly not, but that's never stopped Nicholas before.  He's a well-known
> submitter of bad patches, usually totally incorrect, not even compile tested.
> 
> He's infamous enough that he's not allowed to post to any list hosted at vger.
>
Valdis,
Rather then argue since that will go nowhere. I am posing actual patches that have been tested on
hardware. Yes I known that is surprising but it's true.
>From 6364ca82f79a2b66f012f05ca4c7467c8fb2b1e8 Mon Sep 17 00:00:00 2001
From: Nicholas Krause <xerofoify@xxxxxxxxx>
Date: Wed, 31 Aug 2016 17:20:10 -0400
Subject: [PATCH] ata:Fix incorrect function call ordering in
 pata_ninja32_init_one

This fixes a incorrect function call ordering making cards using
this driver not being able to be read or written to due to the
incorrect calling of pci_set_master before other parts of the
card are registered before the pci master bus should be registered.

Signed-off-by: Nicholas Krause <xerofoify@xxxxxxxxx>
---
 drivers/ata/pata_ninja32.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/pata_ninja32.c b/drivers/ata/pata_ninja32.c
index 44f97ad..89320c9 100644
--- a/drivers/ata/pata_ninja32.c
+++ b/drivers/ata/pata_ninja32.c
@@ -128,7 +128,6 @@ static int ninja32_init_one(struct pci_dev *dev, const struct pci_device_id *id)
 	rc = dma_set_coherent_mask(&dev->dev, ATA_DMA_MASK);
 	if (rc)
 		return rc;
-	pci_set_master(dev);
 
 	/* Set up the register mappings. We use the I/O mapping as only the
 	   older chips also have MMIO on BAR 1 */
@@ -148,6 +147,7 @@ static int ninja32_init_one(struct pci_dev *dev, const struct pci_device_id *id)
 
 	ninja32_program(base);
 	/* FIXME: Should we disable them at remove ? */
+	pci_set_master(dev);
 	return ata_host_activate(host, dev->irq, ata_bmdma_interrupt,
 				 IRQF_SHARED, &ninja32_sht);
 }
-- 
2.7.4
>From 719ad39496679523c70c7dda006e6da31d9004b3 Mon Sep 17 00:00:00 2001
From: Nicholas Krause <xerofoify@xxxxxxxxx>
Date: Wed, 24 Aug 2016 02:09:39 -0400
Subject: [PATCH] pciehp: Avoid not bringing up device if already existing on
 bus

This fixes pcihp_resume to now avoid incorrectly bailing out if the
device is already live in the pci bus but currently suspended. Further
more this issue is caused by incorrectly checking the status of the
device adapter directly, instead since this adapter can be shared
we must instead also check if the pci_bus has no more links to this
adapter by checking if the pci_bus used by this adapter's device list
is also empty before enabling it. Finally do the opposite of checking
that the list is not empty before disabling in order to avoid the
same issue on disabling the slot instead.

Signed-off-by: Nicholas Krause <xerofoify@xxxxxxxxx>
---
 drivers/pci/hotplug/pciehp_core.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index ac531e6..1ce725e 100644
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -291,7 +291,7 @@ static int pciehp_resume(struct pcie_device *dev)
 	struct controller *ctrl;
 	struct slot *slot;
 	u8 status;
-
+	struct pci_bus *pbus = dev->port->subordinate;
 	ctrl = get_service_data(dev);
 
 	/* reinitialize the chipset's event detection logic */
@@ -302,10 +302,12 @@ static int pciehp_resume(struct pcie_device *dev)
 	/* Check if slot is occupied */
 	pciehp_get_adapter_status(slot, &status);
 	mutex_lock(&slot->hotplug_lock);
-	if (status)
-		pciehp_enable_slot(slot);
-	else
+	if (status) {
+		if (list_empty(&pbus->devices))
+			pciehp_enable_slot(slot);
+	} else if (!list_empty(&pbus->devices))
 		pciehp_disable_slot(slot);
+
 	mutex_unlock(&slot->hotplug_lock);
 	return 0;
 }
-- 
2.7.4
>From 5df90143c54be886d0894c40392e1e78868137a4 Mon Sep 17 00:00:00 2001
From: Nicholas Krause <xerofoify@xxxxxxxxx>
Date: Thu, 7 Apr 2016 14:56:12 -0400
Subject: [PATCH RESEND] aic94xx:Fix resource freeing on both error and succession paths in asd_abort_task

This fixes resource freeing of the linked list node as part of the
structure pointer, ascb by means of list_del_init in order to
avoid getting a WARN_ON due to the linked_list node not being
freed before the call to asd_adcb_free as seen by this kernel
panic trace:
[350548.713114] ------------[ cut here ]------------
[350548.713130] kernel BUG at drivers/scsi/aic94xx/aic94xx_hwi.h:344!
[350548.713144] invalid opcode: 0000 [#1] SMP
[350548.713159] Modules linked in: w83793 w83627hf hwmon_vid aic94xx
[350548.713184] CPU: 1 PID: 1372 Comm: scsi_eh_6 Tainted: G        W       4.4.3-gentoo #1
[350548.713200] Hardware name: Supermicro X7DB8/X7DB8, BIOS 2.1c 07/04/2011
[350548.713215] task: ffff8800bb992340 ti: ffff8800bb168000 task.ti: ffff8800bb168000
[350548.713231] RIP: 0010:[<ffffffffa000b9e4>]  [<ffffffffa000b9e4>] asd_abort_task+0x444/0x450 [aic94xx]
[350548.713257] RSP: 0018:ffff8800bb16bd18  EFLAGS: 00010283
[350548.713269] RAX: 0000000000000246 RBX: ffff88007b0d7840 RCX: 0000000000000000
[350548.713284] RDX: 0000000000000001 RSI: 0000000000000246 RDI: ffff8800bbbc6808
[350548.713299] RBP: ffff8800bb16bd90 R08: ffff8800bb168000 R09: 0000000000000000
[350548.713314] R10: 0000000000000000 R11: 0000000000000004 R12: ffff8800b9d18000
[350548.713329] R13: ffff880034e0bcc0 R14: ffff8800bbbc6808 R15: ffff8800bb16bd48
[350548.713590] FS:  0000000000000000(0000) GS:ffff88013fc80000(0000) knlGS:0000000000000000
[350548.713853] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[350548.713989] CR2: 00007fa7f9e49000 CR3: 0000000001e09000 CR4: 00000000000406e0
[350548.714086] Stack:
[350548.714391]  0000000000000000 0000880000000000 ffffffff00000000 ffff880000000000
[350548.714391]  ffff8800bb16bd38 ffff8800bb16bd38 ffff880000000000 ffff880000000000
[350548.714391]  ffff8800bb16bd58 ffff8800bb16bd58 0000000000000005 ffff8800bb16bde8
[350548.714391] Call Trace:
[350548.714391]  [<ffffffff8159dd6e>] sas_scsi_recover_host+0x42e/0xb60
[350548.714391]  [<ffffffff81584eb2>] scsi_error_handler+0xa2/0x520
[350548.714391]  [<ffffffff81806968>] ? __schedule+0x348/0x8b0
[350548.714391]  [<ffffffff81584e10>] ? scsi_eh_get_sense+0x170/0x170
[350548.714391]  [<ffffffff81584e10>] ? scsi_eh_get_sense+0x170/0x170
[350548.714391]  [<ffffffff81095154>] kthread+0xc4/0xe0
[350548.714391]  [<ffffffff81095090>] ? kthread_worker_fn+0x140/0x140
[350548.714391]  [<ffffffff8180a91f>] ret_from_fork+0x3f/0x70
[350548.714391]  [<ffffffff81095090>] ? kthread_worker_fn+0x140/0x140
[350548.714391] Code: fe ff ff b8 02 00 00 00 eb e9 8b 45 90 85 c0 75 e2 4c 89 e7 e8 de f4 ff ff 89 45 88 e9 cf fe ff ff b8 f4 ff ff ff e9 06 fe ff ff <0f> 0b c7 45 88 05 00 00 00 eb 99 90 55 31 c9 ba 02 00 00 00 48
[350548.714391] RIP  [<ffffffffa000b9e4>] asd_abort_task+0x444/0x450 [aic94xx]
[350548.714391]  RSP <ffff8800bb16bd18>
[350548.714978] ---[ end trace ce789fd9c515a590 ]---

Signed-off-by: Nicholas Krause <xerofoify@xxxxxxxxx>
---
 drivers/scsi/aic94xx/aic94xx_tmf.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/aic94xx/aic94xx_tmf.c b/drivers/scsi/aic94xx/aic94xx_tmf.c
index d4c35df..bd45b95 100644
--- a/drivers/scsi/aic94xx/aic94xx_tmf.c
+++ b/drivers/scsi/aic94xx/aic94xx_tmf.c
@@ -545,6 +545,7 @@ int asd_abort_task(struct sas_task *task)
 	tascb->completion = NULL;
 	if (res == TMF_RESP_FUNC_COMPLETE) {
 		task->lldd_task = NULL;
+		list_del_init(&ascb->list);
 		mb();
 		asd_ascb_free(tascb);
 	}
@@ -552,6 +553,7 @@ int asd_abort_task(struct sas_task *task)
 	return res;
 
  out_free:
+	list_del_init(&ascb->list);
 	asd_ascb_free(ascb);
 	ASD_DPRINTK("task 0x%p aborted, res: 0x%x\n", task, res);
 	return res;
-- 
2.5.0
Nick

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]