Hi, James, Ewan, On 2016/12/8 10:33, James Bottomley wrote: > On Thu, 2016-12-08 at 10:28 +0800, Wei Fang wrote: >> Hi, James, Ewan, >> >> On 2016/12/8 7:43, James Bottomley wrote: >>> On Wed, 2016-12-07 at 15:30 -0500, Ewan D. Milne wrote: >>>> On Wed, 2016-12-07 at 12:09 -0800, James Bottomley wrote: >>>>> Hm, it looks like the state set in scsi_sysfs_add_sdev() is >>>>> bogus. >>>>> We expect the state to have been properly set before that (in >>>>> scsi_add_lun), so can we not simply remove it? >>>>> >>>>> James >>>>> >>>> >>>> I was considering that, but... >>>> >>>> enum scsi_device_state { >>>> SDEV_CREATED = 1, /* device created but not added >>>> to >>>> sysfs >>>> >>>> >>>> >>>> >>>> * Only internal commands allowed >>>> (for inq) */ >>>> >>>> So it seems the intent was for the state to not change until >>>> then. >>> >>> I think this is historical. There was a change somewhere that >>> moved >>> the sysfs state handling out of the sdev stat to is_visible, so the >>> sdev state no-longer reflects it. >>> >>>> The call to set the SDEV_RUNNING state earlier in scsi_add_lun() >>>> was added with: >>>> >>>> commit 6f4267e3bd1211b3d09130e626b0b3d885077610 >>>> Author: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> >>>> Date: Fri Aug 22 16:53:31 2008 -0500 >>>> >>>> [SCSI] Update the SCSI state model to allow blocking in the >>>> created state >>>> >>>> Which allows the device to go into ->BLOCK (which is needed, >>>> since it >>>> actually happens). >>>> >>>> Should we remove the call from scsi_sysfs_add_sdev() and change >>>> the >>>> comment in scsi_device.h to reflect the intent? >> >> This sounds reasonable. >> >>> Assuming someone with the problem actually tests it, yes. >> >> This problem can be stably reproduced on Zengxi Chen's machine, who >> reported the bug. We can test it on this machine. >> >> The patch is as below, just for sure: >> >> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c >> index 0734927..82dfe07 100644 >> --- a/drivers/scsi/scsi_sysfs.c >> +++ b/drivers/scsi/scsi_sysfs.c >> @@ -1204,10 +1204,6 @@ int scsi_sysfs_add_sdev(struct scsi_device >> *sdev) >> struct request_queue *rq = sdev->request_queue; >> struct scsi_target *starget = sdev->sdev_target; >> >> - error = scsi_device_set_state(sdev, SDEV_RUNNING); >> - if (error) >> - return error; >> - I looked through those code and found that if we fix this bug by removing setting the state in scsi_sysfs_add_sdev(), it can't be fixed completely: scsi_device_set_state(sdev, SDEV_RUNNING) in scsi_add_lun() and scsi_device_set_state(sdev, SDEV_CREATED_BLOCK) in scsi_internal_device_block() can be called simultaneously. Because there is no synchronization between scsi_device_set_state(), those calls may both return success, and the state may be SDEV_RUNNING after that, and the device queue is stopped. Thanks, Wei > That's it, although not the second hunk: CREATED still means device not > added to sysfs. It's just that RUNNING now doesn't mean it is. > > James > > > > . > -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html