Re: Question about iscsi session block

Zhengyuan Liu <liuzhengyuang521@xxxxxxxxx> · Tue, 24 May 2022 14:29:22 +0800

Hi, Mike,

Sorry for the delayed reply; I had no environment in which to test your
patchset below until recently:

https://lore.kernel.org/all/20220226230435.38733-1-michael.christie@xxxxxxxxxx/

After applying that series, the total time dropped from 80s to
nearly 10s. It's a great improvement.
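
As a rough sanity check on these numbers, here is a small sketch of the
arithmetic from my original report quoted below (about 80ms per
synchronize_rcu(), 20 sessions on the faulted NIC, 50 devices per session).
The helper names and the "workers" parameter are made up for illustration
and are not kernel APIs. The model assumes devices within a session are still
blocked one after another and ignores every other cost, so it is only a
back-of-the-envelope estimate, not a description of what the patchset does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers for estimation only; none of these names exist
 * in the kernel.
 */

/* Every device of every session is blocked one after another. */
static uint64_t serial_block_ms(uint64_t sessions, uint64_t devs_per_session,
                                uint64_t rcu_ms)
{
    return sessions * devs_per_session * rcu_ms;
}

/*
 * Sessions are spread over `workers` concurrent workers (assumption),
 * while the devices inside one session are still handled sequentially.
 * Each "round" handles up to `workers` sessions at the same time.
 */
static uint64_t parallel_block_ms(uint64_t sessions, uint64_t devs_per_session,
                                  uint64_t rcu_ms, uint64_t workers)
{
    uint64_t rounds;

    assert(workers > 0);
    rounds = (sessions + workers - 1) / workers; /* ceil(sessions / workers) */
    return rounds * devs_per_session * rcu_ms;
}
```

With sessions=20, devs_per_session=50 and rcu_ms=80, serial_block_ms() gives
80000ms, which matches the 80s we measured. Under the same assumptions,
parallel_block_ms() with 20 workers gives 4000ms, so the measured ~10s is in
the same range as this simple model once other overhead is taken into account.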

Thanks again.

On Sun, Feb 27, 2022 at 7:00 AM Mike Christie
<michael.christie@xxxxxxxxxx> wrote:
>
> On 2/15/22 8:19 PM, michael.christie@xxxxxxxxxx wrote:
> > On 2/15/22 7:28 PM, Zhengyuan Liu wrote:
> >> On Wed, Feb 16, 2022 at 12:31 AM Mike Christie
> >> <michael.christie@xxxxxxxxxx> wrote:
> >>>
> >>> On 2/15/22 9:49 AM, Zhengyuan Liu wrote:
> >>>> Hi, all
> >>>>
> >>>> We have an online server which uses multipath + iscsi to attach storage
> >>>> from a storage server. There are two NICs on the server; each carries
> >>>> about 20 iscsi sessions, and each session includes about 50 iscsi devices
> >>>> (yes, that is about 2*20*50=2000 iscsi block devices on the server in
> >>>> total). The problem is: once a NIC faults, it takes too long (nearly 80s)
> >>>> for multipath to switch to the other, good NIC link, because it first
> >>>> needs to block all iscsi devices over the faulted NIC. The call stack is
> >>>> shown below:
> >>>>
> >>>>     void iscsi_block_session(struct iscsi_cls_session *session)
> >>>>     {
> >>>>         queue_work(iscsi_eh_timer_workq, &session->block_work);
> >>>>     }
> >>>>
> >>>> __iscsi_block_session() -> scsi_target_block() -> target_block() ->
> >>>> device_block() -> scsi_internal_device_block() -> scsi_stop_queue() ->
> >>>> blk_mq_quiesce_queue() -> synchronize_rcu()
> >>>>
> >>>> All sessions and all devices are processed sequentially, and we have
> >>>> traced that each synchronize_rcu() call takes about 80ms, so the total
> >>>> cost is about 80s (80ms * 20 * 50). That is longer than the application
> >>>> can tolerate and may interrupt service.
> >>>>
> >>>> So my question is: can we optimize the procedure to reduce the time spent
> >>>> blocking all iscsi devices? I'm not sure if it is a good idea to increase
> >>>> max_active of the iscsi_eh_timer_workq workqueue to improve concurrency.
> >>>
> >>> We need a patch so that the unblock call waits for/cancels/flushes the
> >>> block call; otherwise they could be running in parallel.
> >>>
> >>> I'll send a patchset later today so you can test it.
> >>
> >> I'm glad to test once you push the patchset.
> >>
> >> Thank you, Mike.
> >
> > I forgot I did this recently :)
> >
> > commit 7ce9fc5ecde0d8bd64c29baee6c5e3ce7074ec9a
> > Author: Mike Christie <michael.christie@xxxxxxxxxx>
> > Date:   Tue May 25 13:18:09 2021 -0500
> >
> >     scsi: iscsi: Flush block work before unblock
> >
> >     We set the max_active iSCSI EH works to 1, so all work is going to execute
> >     in order by default. However, userspace can now override this in sysfs. If
> >     max_active > 1, we can end up with the block_work on CPU1 and
> >     iscsi_unblock_session running the unblock_work on CPU2 and the session and
> >     target/device state will end up out of sync with each other.
> >
> >     This adds a flush of the block_work in iscsi_unblock_session.
> >
> >
> > It was merged in 5.14.
>
> Hey, I found one more bug when max_active > 1. While fixing it I decided to just
> fix this so we can do the session recoveries in parallel and the user doesn't have
> to worry about setting max_active.
>
> I'll send a patchset and cc you.