Re: [PATCH 1/3] target: iscsi: fix hang in the iSCSI login code

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



pá 10. 3. 2023 v 20:53 odesílatel Mike Christie
<michael.christie@xxxxxxxxxx> napsal:
>
> On 3/10/23 4:04 AM, Maurizio Lombardi wrote:
> > If the initiator suddenly stops sending data during a login while
> > keeping the TCP connection open, the sk_data_ready callback won't
> > schedule the login_work and the latter
> > will never timeout and release the login semaphore.
> >
> > All the other login operations will therefore get stuck waiting
> > for the semaphore to be released.
> >
>
> You mean np_login_sem right? Do you know why we have to serialize access
> to the np during login? Is it just a simple way to handle the internal
> target variables for things like MC/s, reinstatement, etc? I saw the tsih
> case, but am wondering how easy it is to remove.

Yes, I mean the np_login_sem, right now I don't know for sure how easy
is to remove it, but I agree that probably there is room for improvement here
and the login code can be simplified a lot.

>
> If we need the sem why do we use the sk_callback/non-blocking type of
> approach when we can only do 1 login at a time per tpg?

Yeah, I guess that the author just wanted the login thread to go back to
kernel_accept() as fast as possible to be able to serve the next connection.
But yes, if the new connection is against the same tpg, the login
thread will have
to wait for the semaphore anyway.


>
> It seems like we could simplify the login code by do a blocking type of
> implementation where:
>
> 1. __iscsi_target_login_thread starts a workstruct that runs
> iscsi_target_start_negotiation.  It would then run iscsi_target_do_login_rx
> which just waits for a response. When we get one, it does iscsi_target_do_login
> and if we need more PDUs loops. We have one timer for all this.
>
> 2. We can remove the np_login_sem. It would be replaced by a workstruct
> in the np. __iscsi_target_login_thread would just flush the work to make sure
> we are not running a login on the same np already.
>
> 3. We can remove all the sk_callback* related code for iscsit tcp since
> recvmsg just return failure when the state changes.
>
> 4. It looks like cxgbit will work with some small changes because I think it
> just kicks off iscsi_target_do_login_rx after sending a login PDU, then it
> just waits in its iscsit_get_login_rx.
>
> 5. It looks like isert will also work, because it's isert_get_login_rx can
> just wait on the login_req_comp/login_comp already.
>

Ok I think I could work on this new login code.
However, I would suggest to consider taking this patchset as a
stopgap, so the Delphix guys
can keep their customers happy and I can work on this
without feeling the pressure; also, this patchset is easier and safer
to backport to
stable kernels.

Maurizio





[Index of Archives]     [Linux SCSI]     [Kernel Newbies]     [Linux SCSI Target Infrastructure]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Device Mapper]

  Powered by Linux