Re: [PATCH 1/2] lpfc: call lpfc_sli_validate_fcp_iocb() with the hbalock held

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jul 18, 2016 at 04:28:57PM -0400, Laurence Oberman wrote:
> 
> 
> ----- Original Message -----
> > From: "Johannes Thumshirn" <jthumshirn@xxxxxxx>
> > To: "Martin K . Petersen" <martin.petersen@xxxxxxxxxx>, "James Bottomley" <jejb@xxxxxxxxxxxxxxxxxx>, "James Smart"
> > <james.smart@xxxxxxxxxxxxx>, "Dick Kennedy" <dick.kennedy@xxxxxxxxxxxxx>
> > Cc: "Linux SCSI Mailinglist" <linux-scsi@xxxxxxxxxxxxxxx>, "Johannes Thumshirn" <jthumshirn@xxxxxxx>
> > Sent: Monday, July 18, 2016 10:06:03 AM
> > Subject: [PATCH 1/2] lpfc: call lpfc_sli_validate_fcp_iocb() with the hbalock held
> > 
> > Call lpfc_sli_validate_fcp_iocb() with the hbalock held, as the pointer to
> > iocbq is not guaranteed to still be valid after looking it up.
> > 
> > Signed-off-by: Johannes Thumshirn <jthumshirn@xxxxxxx>
> > ---
> >  drivers/scsi/lpfc/lpfc_sli.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
> > index 1a248a2..d58ec4c 100644
> > --- a/drivers/scsi/lpfc/lpfc_sli.c
> > +++ b/drivers/scsi/lpfc/lpfc_sli.c
> > @@ -9978,6 +9978,7 @@ lpfc_sli_sum_iocb(struct lpfc_vport *vport, uint16_t
> > tgt_id, uint64_t lun_id,
> >  	struct lpfc_iocbq *iocbq;
> >  	int sum, i;
> >  
> > +	spin_lock_irq(&phba->hbalock);
> >  	for (i = 1, sum = 0; i <= phba->sli.last_iotag; i++) {
> >  		iocbq = phba->sli.iocbq_lookup[i];
> >  
> > @@ -9985,6 +9986,7 @@ lpfc_sli_sum_iocb(struct lpfc_vport *vport, uint16_t
> > tgt_id, uint64_t lun_id,
> >  						ctx_cmd) == 0)
> >  			sum++;
> >  	}
> > +	spin_unlock_irq(&phba->hbalock);
> >  
> >  	return sum;
> >  }
> > --
> > 1.8.5.6
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> Hi Johannes
> 
> The code looks correct, but I would get a review from James Smart or Dick Kennedy at Emulex (now Broadcom :))

Sure, that's why they're on Cc ;-).

> Have you experienced an issue already due to this, can you share the stack trace of the issue if you have it.

We have had an issue at a customer's side. It's actually pretty hard to
trigger (it involves two separated SANs and several FC switches in between and
link bounces) but the core dumps show that in lpfc_sli_validate_fcp_iocb() the
iocbq address in invalid. It actually _is_ a valid kernel pointer and _was_ a
valid 'struct lpfc_iocbq' but isn't anymore. So we suggest a use-after-free.
This patch is running OK withing one of our partner's QA sites for 3 weeks now
(we've been chasing it since October).

c.f this part of our partner's analysis:
crash> bt
PID: 62174  TASK: ffff880285fe40c0  CPU: 0   COMMAND: "kworker/0:1"
 #0 [ffff88027dfb7bc0] crash_kexec at ffffffff8008e4ca
 #1 [ffff88027dfb7c90] oops_end at ffffffff804129a8
 #2 [ffff88027dfb7cb0] general_protection at ffffffff80411d28
    [exception RIP: lpfc_sli_validate_fcp_iocb+148]
    RIP: ffffffffa0683114  RSP: ffff88027dfb7d60  RFLAGS: 00010246
    RAX: 2e362e322f6e6f68  RBX: 0000000000000000  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: ffff8803c2417678  RDI: ffff88027d5a1a70
    RBP: ffff8803d8c28000   R8: 0000000000000001   R9: 0000000000000000
    R10: ffffffffa06f7898  R11: 0000000000000000  R12: 0000000000001340
    R13: ffff8803c2ac2000  R14: ffff8803c2417678  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: e030  SS: e02b
 #3 [ffff88027dfb7d68] lpfc_sli_abort_iocb at ffffffffa06831bb [lpfc]
 #4 [ffff88027dfb7dc8] fc_terminate_rport_io at ffffffffa0502aea [scsi_transport_fc]
 #5 [ffff88027dfb7de8] fc_timeout_deleted_rport at ffffffffa050338c [scsi_transport_fc]
 #6 [ffff88027dfb7e28] process_one_work at ffffffff80062ea8
 #7 [ffff88027dfb7e78] worker_thread at ffffffff800668aa
 #8 [ffff88027dfb7ee8] kthread at ffffffff8006a506
 #9 [ffff88027dfb7f48] kernel_thread_helper at ffffffff8041a244


crash> dis lpfc_sli_validate_fcp_iocb
...
0xffffffffa068310f <lpfc_sli_validate_fcp_iocb+143>:    nop
0xffffffffa0683110 <lpfc_sli_validate_fcp_iocb+144>:    mov    -0x58(%rdi),%rax
0xffffffffa0683114 <lpfc_sli_validate_fcp_iocb+148>:    mov    (%rax),%rax
0xffffffffa0683117 <lpfc_sli_validate_fcp_iocb+151>:    test   %rax,%rax
...

%RDI - 0x58  == 0xFFFF88027D5A1A18

crash>  rd 0xFFFF88027D5A1A18 20
ffff88027d5a1a18:  2e362e322f6e6f68 32356266650a0d39   hon/2.6.9..efb52
ffff88027d5a1a28:  2f6563697665642f 2f302f7375623278   /device/x2bus/0/
ffff88027d5a1a38:  00646e656b636162 ffff88027aaa0000   backend....z....
ffff88027d5a1a48:  0000002143ee2000 ffff88027aaa1f40   . .C!...@..z....

> 
> Reviewed-by Laurence Oberman <loberman@xxxxxxxxxx>

-- 
Johannes Thumshirn                                          Storage
jthumshirn@xxxxxxx                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux