Re: [PATCH] scsi: fix ata_port_wait_eh() hung caused by missing to wake up eh thread

"zhengbin (A)" <zhengbin13@xxxxxxxxxx> · Mon, 25 Mar 2019 20:49:12 +0800



On 2019/3/25 20:20, John Garry wrote:
> On 25/03/2019 11:55, zhengbin (A) wrote:
>> On 2019/3/25 18:02, Pavel Tikhomirov wrote:
>>> Likely we should do the same for host_eh_scheduled++ as we do for
>>> host_failed++ in scsi_eh_inc_host_failed. Just put it in call_rcu. These
>>> way rcu_read_lock would be enough: if we don't see RECOVERY state in
>>> scsi_dec_host_busy, that means that host_eh_scheduled++ and
>>> corresponding scsi_eh_wakeup not yet happened, and they will handle
>>> wakeup for us.
>>>
>>> Bart did these rcu-way to make common path faster, so better we do it
>>> without slow mem-barrier.
> 
> Out of curiousity, which platform did you observe this on?
ARM64
> 
> Thanks,
> John
> 
>> If we use rcu for host_eh_scheduled. we should add rcu_head in Scsi_Host,
>> replace scsi_schedule_eh(ap->scsi_host) with call_rcu in ata_std_sched_eh, sas_queue_reset??
>> This will trigger a kernel hang or oops because of double or more call_rcu() call.
>>
>> Any more suggestions?
>>>
>>> On 3/25/19 12:05 PM, zhengbin wrote:
>>>> When I use fio test kernel in the following steps:
>>>> 1.The sas controller mixes SAS/SATA disks
>>>> 2.Use fio test all disks
>>>> 3.Simultaneous enable/disable/link_reset/hard_reset PHY
>>>>
>>>> it will hung in ata_port_wait_eh
>>>> Call trace:
>>>>   __switch_to+0xb4/0x1b8
>>>>   __schedule+0x1e8/0x718
>>>>   schedule+0x38/0x90
>>>>   ata_port_wait_eh+0x70/0xf8
>>>>   sas_ata_wait_eh+0x24/0x30 [libsas]
>>>>   transport_sas_phy_reset.isra.3+0x128/0x160 [libsas]
>>>>   phy_reset_work+0x20/0x30 [libsas]
>>>>   process_one_work+0x1e4/0x460
>>>>   worker_thread+0x40/0x450
>>>>   kthread+0x12c/0x130
>>>>   ret_from_fork+0x10/0x18
>>>>
>>>> The key code process is like this:
>>>> scsi_dec_host_busy
>>>>     atomic_dec(&shost->host_busy);
>>>>     if (unlikely(scsi_host_in_recovery(shost))) {
>>>>         spin_lock_irqsave(shost->host_lock, flags);
>>>>         ...
>>>>         scsi_eh_wakeup(shost)
>>>>         ...
>>>>     }
>>>>
>>>> scsi_schedule_eh
>>>>     spin_lock_irqsave(shost->host_lock, flags);
>>>>     if (scsi_host_set_state(shost, SHOST_RECOVERY) == 0 ||
>>>>         scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY) == 0) {
>>>>         ...
>>>>         scsi_eh_wakeup(shost);
>>>>     }
>>>>
>>>> scsi_eh_wakeup
>>>>     if (scsi_host_busy(shost) == shost->host_failed)
>>>>         wake_up_process(shost->ehandler);
>>>>
>>>> In scsi_dec_host_busy, host_busy & shost_state not in spinlock. Neither
>>>> function wakes up the SCSI error handler in the following timing:
>>>>
>>>> CPU 0(call scsi_dec_host_busy)    CPU 1(call scsi_schedule_eh)
>>>> LOAD shost_state(!=recovery)
>>>>                                    scsi_host_set_state(SHOST_RECOVERY)
>>>>                                    scsi_eh_wakeup(host_busy != host_failed)
>>>> atomic_dec(&shost->host_busy);
>>>> if (scsi_host_in_recovery(shost))
>>>>
>>>> Add a smp_mb between host_busy and shost_state.
>>>>
>>>> Signed-off-by: zhengbin <zhengbin13@xxxxxxxxxx>
>>>> ---
>>>>   drivers/scsi/scsi_error.c | 7 +++++++
>>>>   drivers/scsi/scsi_lib.c   | 5 +++++
>>>>   2 files changed, 12 insertions(+)
>>>>
>>>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
>>>> index 1b8378f..c605a71 100644
>>>> --- a/drivers/scsi/scsi_error.c
>>>> +++ b/drivers/scsi/scsi_error.c
>>>> @@ -88,6 +88,13 @@ void scsi_schedule_eh(struct Scsi_Host *shost)
>>>>
>>>>       if (scsi_host_set_state(shost, SHOST_RECOVERY) == 0 ||
>>>>           scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY) == 0) {
>>>> +        /*
>>>> +         * We have to order shost_state store above and test of
>>>> +         * the host_busy(scsi_eh_wakeup will test it), because
>>>> +         * scsi_dec_host_busy accesses these variables without
>>>> +         * host_lock.
>>>> +         */
>>>> +        smp_mb();
>>>>           shost->host_eh_scheduled++;
>>>>           scsi_eh_wakeup(shost);
>>>>       }
>>>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>>>> index 601b9f1..9094d20 100644
>>>> --- a/drivers/scsi/scsi_lib.c
>>>> +++ b/drivers/scsi/scsi_lib.c
>>>> @@ -337,6 +337,11 @@ static void scsi_dec_host_busy(struct Scsi_Host *shost)
>>>>
>>>>       rcu_read_lock();
>>>>       atomic_dec(&shost->host_busy);
>>>> +    /*
>>>> +     * We have to order host_busy dec above and test of the shost_state
>>>> +     * below outside the host_lock.
>>>> +     */
>>>> +    smp_mb();
>>>>       if (unlikely(scsi_host_in_recovery(shost))) {
>>>>           spin_lock_irqsave(shost->host_lock, flags);
>>>>           if (shost->host_failed || shost->host_eh_scheduled)
>>>> -- 
>>>> 2.7.4
>>>>
>>>
>>
>>
>> .
>>
> 
> 
> 
> .
>