On Thu, 2011-07-14 at 00:41 -0700, Mike Christie wrote:
> On 07/14/2011 01:33 AM, Eddie Wai wrote:
> >
> > On Mon, 2011-07-11 at 13:02 -0700, Mike Christie wrote:
> >> On 07/11/2011 01:14 PM, Eddie Wai wrote:
> >>> A kernel panic was observed when passing the sc->request->cpu = -1 to
> >>> retrieve the per_cpu variable pointer:
> >>> #0 [ffff880011203960] machine_kexec at ffffffff81022bc3
> >>> #1 [ffff8800112039b0] crash_kexec at ffffffff81088630
> >>> #2 [ffff880011203a80] __die at ffffffff8139ea20
> >>> #3 [ffff880011203aa0] no_context at ffffffff8102f3a7
> >>> #4 [ffff880011203ae0] __bad_area_nosemaphore at ffffffff8102f665
> >>> #5 [ffff880011203ba0] retint_signal at ffffffff8139dd1f
> >>> #6 [ffff880011203cc8] bnx2i_indicate_kcqe at ffffffffa03dc4f2
> >>> #7 [ffff880011203da8] service_kcqes at ffffffffa03cb04f
> >>> #8 [ffff880011203e68] cnic_service_bnx2x_kcq at ffffffffa03cb14a
> >>> #9 [ffff880011203e88] cnic_service_bnx2x_bh at ffffffffa03cb1b3
> >>>
> >>> The problem lies in the sg_io (and perhaps sg_scsi_ioctl) call to
> >>> blk_get_request->get_request/wait->blk_alloc_request->blk_rq_init which
> >>> re-initializes the request->cpu to -1. There is no assignment for cpu from
> >>> that to the request_fn call to low level drivers.
> >>>
> >>> When this happens, the sc->request->cpu will be using the init value of
> >>> -1. This will create a kernel panic when it hits bnx2i because the code
> >>> refers it to get the per_cpu variables ptr.
> >>>
> >>> This change is to put in a guard against that and also for cases when
> >>> CONFIG_SMP/BIO_CPU_AFFINE is not enabled. In those cases, the cpu
> >>> affinitization code would not get run in __make_request either; hence
> >>> the request->cpu will remain a -1 also.
> >>>
> >>
> >>
> >>>
> >>> diff --git a/drivers/scsi/bnx2i/bnx2i_iscsi.c b/drivers/scsi/bnx2i/bnx2i_iscsi.c
> >>> index 5c55a75..622383d 100644
> >>> --- a/drivers/scsi/bnx2i/bnx2i_iscsi.c
> >>> +++ b/drivers/scsi/bnx2i/bnx2i_iscsi.c
> >>> @@ -1225,6 +1225,10 @@ static int bnx2i_task_xmit(struct iscsi_task *task)
> >>>  	if (!sc)
> >>>  		return bnx2i_mtask_xmit(conn, task);
> >>>
> >>> +	if (!blk_rq_cpu_valid(sc->request)) {
> >>> +		sc->request->cpu = get_cpu();
> >>> +		put_cpu();
> >>> +	}
> >>
> >>
> >> If I understand you right, then I think this needs to get fixed in the
> >> block or scsi layer instead of each LLD.
> >
> > Absolutely, but this bnx2i fix is still applicable alongside the fixes I'm
> > proposing in the block layer below.
> >
> > I think the whole idea behind the tracking of the blk req->cpu is so that the blk completion
> > can be fired off from the same CPU to take advantage of the CPU's llc. However, this is only being
> > done when the queue is defined with the QUEUE_FLAG_SAME_COMP queue_flag enabled. In the case
> > when the queue is defined without this enforced, it would then be up to the blk completion
> > code to complete the blk request with the current CPU of the thread.
> >
> > The same analogy should apply to the iSCSI LLD for cmd completion as well. So if the
> > sc->request->cpu is left at -1, the LLD should then decide how it wants the cmd completion
> > to take place. For all the other cases, the request->cpu id should be used instead.
> > For bnx2i, if the blk layer didn't set the request->cpu, we would want to align and complete
> > the cmd against the task_xmit issuer's CPU id unconditionally; hence the explicit get_cpu call.
>
> Oh yeah, wrt the code we have today (so I mean ignoring my rant in the
> other mail :)), I think what you are doing in your patch is sort of ok.
> I am not sure if we want to be touching the request->cpu field in a LLD
> though.
> I do not think LLDs should be modifying a block layer struct
> like that probably.
>
Agreed. sc->request->cpu should be left at -1 in this case. I'll go
ahead and define a local variable to keep track of it.

Thanks.
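
For illustration, the local-variable approach could look roughly like the
sketch below. This is a userspace mock, not the actual patch:
struct mock_request, mock_rq_cpu_valid() and pick_completion_cpu() are
hypothetical names, and current_cpu stands in for whatever CPU id the
driver would read at submission time. The only point it makes is that the
request's cpu field is read and never written.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the one field of struct request used here. */
struct mock_request {
	int cpu;	/* -1 means the block layer never assigned a CPU */
};

/* Mirrors the intent of blk_rq_cpu_valid(): valid iff cpu is not -1. */
static int mock_rq_cpu_valid(const struct mock_request *rq)
{
	return rq->cpu != -1;
}

/*
 * Pick the CPU whose per-cpu data should handle the completion.
 * current_cpu is a hypothetical parameter standing in for the
 * submitting CPU; rq->cpu itself is left untouched.
 */
static int pick_completion_cpu(const struct mock_request *rq, int current_cpu)
{
	assert(rq != NULL);
	if (mock_rq_cpu_valid(rq))
		return rq->cpu;		/* block layer chose a CPU, honor it */
	return current_cpu;		/* fall back to the issuer's CPU */
}
```

With this shape the fallback decision stays local to the driver, and the
block layer's view of the request (cpu == -1) is preserved.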