RE: sg_remove and pending write request

Thank you for your reply and comments, Douglas. The user land product is
awaiting release, and the time frame doesn't allow many changes to it at
this time.

I read sg.c again. It seems that a reattached SAS device would take the
same sg slot if the following conditions are met:
1. Wait 2+ minutes for a pending SG_IO write request to come back
before pushing the cable back in. The 2+ minutes give the SCSI mid
level time to time out the pending IO request and do error recovery if
it is needed.
2. Close the user space fd properly (sg_release will then try to set
sg_dev_arr[k] = NULL).

Do you see any other conditions?

Thanks,

Yanling
-----Original Message-----
From: Douglas Gilbert [mailto:dougg@xxxxxxxxxx] 
Sent: Tuesday, October 17, 2006 2:37 PM
To: Qi, Yanling
Cc: linux-scsi@xxxxxxxxxxxxxxx
Subject: Re: sg_remove and pending write request

Qi, Yanling wrote:
> Hi All,
> 
> We are running a test case of SAS cable pull/push on a SAS RAID system.
> After the SAS cable is pulled from a SAS RAID, scsi devices are deleted.
> And then when the cable is pushed back, the scsi device with the same
> H:C:T:L sometimes will be assigned to a different sgX.

There is no guarantee of the naming stability of sg
nodes (e.g. /dev/sg3) when devices disappear and re-appear.
Actually the design of lk 2.6 seems to actively discourage
user space programs from making that assumption. The same
applies to all SCSI device nodes (and host numbers).

In the case of SAS, you really should be looking at the
target port SAS address in the device identification VPD
page (page 0x83). If the device in question is a SATA disk
then you have more work to do.
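
As a rough illustration of the VPD page 0x83 suggestion above, the
sketch below scans an already fetched device identification page for a
target port NAA designator. The function name is made up for this
example, and it assumes the page was obtained elsewhere (for instance
with an INQUIRY, EVPD=1, page code 0x83 sent through SG_IO, not shown).
It only handles the 8-byte binary NAA form that a SAS address uses.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: find the first designator with association 1
 * (target port), designator type 3 (NAA), code set 1 (binary) and
 * length 8 in a device identification VPD page.
 * Returns 0 and stores the address on success, -1 otherwise.
 */
int vpd83_target_port_naa(const uint8_t *page, size_t len, uint64_t *addr_out)
{
    if (page == NULL || addr_out == NULL || len < 4 || page[1] != 0x83)
        return -1;

    /* Bytes 2-3 hold the page length, counted from byte 4 onwards. */
    size_t end = 4 + (((size_t)page[2] << 8) | page[3]);
    if (end > len)
        end = len;              /* tolerate a truncated response */

    size_t off = 4;
    while (off + 4 <= end) {
        const uint8_t *d = page + off;
        unsigned code_set = d[0] & 0x0f;
        unsigned assoc = (d[1] >> 4) & 0x03;
        unsigned type = d[1] & 0x0f;
        size_t dlen = d[3];

        if (off + 4 + dlen > end)
            break;              /* descriptor runs past the buffer */

        if (code_set == 1 && assoc == 1 && type == 3 && dlen == 8) {
            uint64_t v = 0;
            for (size_t i = 0; i < 8; i++)
                v = (v << 8) | d[4 + i];   /* big-endian on the wire */
            *addr_out = v;
            return 0;
        }
        off += 4 + dlen;
    }
    return -1;
}
```

A caller could key its own bookkeeping on the returned address instead
of on the sgN name, so a device that comes back under a different node
is still recognised.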

> Reading through the sg.c, it seems that if the sg device has a pending
> write request, the sg slot (sg_dev_arr[k] = NULL) will not be freed
> during sg_remove time. Can someone confirm this?

Yes, I can confirm that. The sg driver waits for the mid
level to callback with the outstanding IO completions (or
timeouts). If the user kills the process, the sg driver
still waits for IO completion. [A problem arises if the
user tries to 'rmmod sg'.] The device could well re-appear
during that "wait" time and the sg driver will assign a
different device node (i.e. the first unused slot in
sg_dev_arr[]).
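
The behaviour described above can be pictured with a small user space
toy model. This is not the sg.c code: the struct, the function names
and the table size are invented for illustration, and open file
descriptors are ignored. It only shows why a slot that is still
waiting on IO pushes a re-appearing device to the next free index.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_SLOTS 8

/* Hypothetical stand-in for one sg_dev_arr[] entry; not the kernel struct. */
struct toy_slot {
    int in_use;      /* slot is occupied, so its sgN name is taken */
    int detached;    /* device was removed while the slot was held */
    int pending_io;  /* requests the mid level has not completed yet */
};

/* Attach a device: take the first unused slot, or -1 if the table is full. */
int toy_attach(struct toy_slot *arr)
{
    for (int k = 0; k < TOY_MAX_SLOTS; k++) {
        if (!arr[k].in_use) {
            memset(&arr[k], 0, sizeof arr[k]);
            arr[k].in_use = 1;
            return k;
        }
    }
    return -1;
}

/* Device removal: the slot is freed only if no IO is outstanding. */
void toy_remove(struct toy_slot *arr, int k)
{
    assert(k >= 0 && k < TOY_MAX_SLOTS);
    arr[k].detached = 1;
    if (arr[k].pending_io == 0)
        arr[k].in_use = 0;
}

/* Completion or timeout of one request; may release a detached slot. */
void toy_io_done(struct toy_slot *arr, int k)
{
    assert(k >= 0 && k < TOY_MAX_SLOTS && arr[k].pending_io > 0);
    arr[k].pending_io--;
    if (arr[k].detached && arr[k].pending_io == 0)
        arr[k].in_use = 0;
}
```

In this model, a device removed from slot 1 with one request pending
keeps slot 1 occupied, so a re-attach during the wait lands in the next
free slot. Once toy_io_done() has run for slot 1, a later attach gets
slot 1 again.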

> If this is the case, what can the user space process do to prevent this
> from happening?

Develop a user space program that applies fast acting
super glue to the SAS connectors when IOs are in flight
and hands approach.

As I said above, you cannot assume device node names will
be stable across disconnect, reconnect cycles.

> I see that sg.c sends SIGPOLL to the user space process
> (kill_fasync(&sfp->async_qp, SIGPOLL, POLL_HUP);). What user space
> return code from the read/write call will this signal be translated to?

You would need to be running asynchronous IO with the
sg driver (i.e. write(), poll(), read() rather than SG_IO)
and POLLHUP should appear in struct pollfd::revents.
You should also be able to run poll() from a signal handler
that catches SIGPOLL. [My knowledge is a bit rusty in this
area.]
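
A minimal sketch of the poll() check, assuming an fd that was opened
on an sg node and is being used in write()/read() mode. The two helper
names are made up for this example. The second helper closes the fd
once a hang-up is seen, since the thread notes that the slot can only
be released after the fd is closed. Whether sg reports the hang-up
exactly this way on a given kernel is not verified here.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: poll one fd and report whether POLLHUP is set.
 * Returns 1 on hang-up, 0 if none was reported within timeout_ms,
 * -1 if poll() itself failed.
 */
int fd_hung_up(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    if (poll(&pfd, 1, timeout_ms) < 0)
        return -1;
    /* POLLHUP is reported in revents even though it was not requested. */
    return (pfd.revents & POLLHUP) ? 1 : 0;
}

/*
 * Hypothetical helper: close the fd if the device has gone away.
 * Returns 1 if it was closed, 0 if there was no hang-up, -1 on error.
 */
int close_if_hung_up(int fd, int timeout_ms)
{
    int hup = fd_hung_up(fd, timeout_ms);

    if (hup == 1)
        return close(fd) == 0 ? 1 : -1;
    return hup;
}
```

The same check can be tried without any SCSI hardware: the read end of
a pipe reports POLLHUP once the write end has been closed.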

Doug Gilbert

