> -----Original Message-----
> From: linux-scsi-owner@xxxxxxxxxxxxxxx <linux-scsi-owner@xxxxxxxxxxxxxxx> On Behalf Of John Garry
> Sent: Friday, October 19, 2018 2:19 AM
> To: Chris Moore - C33997 <Chris.Moore@xxxxxxxxxxxxx>; hare@xxxxxxx;
> linux-scsi@xxxxxxxxxxxxxxx; Jason Yan <yanaijie@xxxxxxxxxx>
> Subject: Re: Looking for some help understanding error handling
>
> On 05/10/2018 16:51, Chris.Moore@xxxxxxxxxxxxx wrote:
> > Thanks Hannes,
> >
> > After some pointers from Shane Seymour I found that the FC and SRP
> > transport layers have a devloss timer, so that when a device
> > disappears they hold on to the target information for a time waiting
> > to see if it comes back. The SAS transport layer doesn't have that feature.
> >
> > The options for me then would be to modify scsi_transport_sas.c to
> > implement the devloss timeout, or to put that functionality into my LLDD.
> >
> > I'm willing to put the work into the SAS transport and libsas, but I
> > suspect there's not a universal need for it. And since my LLDD is for
> > internal use at our company and won't be upstreamed, I'll probably
> > just do the work there. If anyone feels that this is a feature that more
> > people would want then I'll look into doing that.
>
> Hello,
>
> This feature sounds interesting for libsas. I however have a question on
> feasibility of devloss here (note: I'm not familiar with the concept/realization
> for other standards): if a device is deattached and re-attached, how can we
> confirm the same device? For SAS device it's ok as a disk has the WWN, but
> what about SATA?
>
> Thanks,
> John

Would the serial number work? I haven't worked a lot with SATA drives, but
ATA8-ACS says the IDENTIFY DEVICE response must contain a unique serial number.

Chris

>
> >
> > Thanks,
> > Chris
> >
> >> -----Original Message-----
> >> From: Hannes Reinecke [mailto:hare@xxxxxxx]
> >> Sent: Friday, October 5, 2018 8:01 AM
> >> To: Chris Moore - C33997 <Chris.Moore@xxxxxxxxxxxxx>; linux-scsi@xxxxxxxxxxxxxxx
> >> Subject: Re: Looking for some help understanding error handling
> >>
> >> On 10/2/18 11:04 PM, Chris.Moore@xxxxxxxxxxxxx wrote:
> >>> I'm working on LLDD for a SAS/SATA host adapter, and trying to
> >>> understand how the system handles link loss and recovery.
> >>>
> >>> Say I have a device that gets recognized and attached as sd
> >>> 12:0:4:0, at /dev/sdb.
> >>> The drive goes offline temporarily, then comes back online.
> >>> When it does, it comes back as sd 12:0:5:0, and maybe /dev/sdb,
> >>> maybe /dev/sdc.
> >>>
> >>> I'm not sure how the Id gets assigned. Since this is the same
> >>> drive, is there some way my driver can tell libsas and/or SCSI core
> >>> that it's the same drive coming back?
> >>> Or is there no way to control that?
> >>>
> >> Not really. The target device is getting destroyed once the device
> >> disconnects, and when it reconnects a new structure is allocated. But
> >> as the target number is a simple counter it gets increased up each
> >> allocation.
> >>
> >>> I looked into /dev/disk/by-id, but that also didn't quite do what I
> >>> expected. If I open /dev/disk/by-id/some_identifier, that's a
> >>> symlink to, say, /dev/sdb.
> >>
> >> Yes.
> >>
> >>> /dev/sdb goes away, comes back as /dev/sdc, but my process doesn't
> >>> know that, it still has /dev/disk/by-id/some_identifier opened and
> >>> so it will never recover without closing and reopening the file.
> >>>
> >> Simply don't keep hold of the symlink; once you have opened you'll
> >> miss any updates to the symlink itself.
> >> So better to open the symlink, check the device, do whatever needs to
> >> be done, and _close the symlink_ again.
> >> Then you can listen for udev events telling you when a device appears
> >> or vanishes.
> >>
> >> Cheers,
> >>
> >> Hannes
> >> --
> >> Dr. Hannes Reinecke            Teamlead Storage & Networking
> >> hare@xxxxxxx                               +49 911 74053 688
> >> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> >> GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
> >> HRB 21284 (AG Nürnberg)
>
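
For reference, the open / check / close pattern Hannes describes could be
sketched in user space roughly as below. This is only an illustration and is
not code from the thread. The helper names (read_via_link, repoint, demo) are
hypothetical, and ordinary temporary files stand in for /dev/sdb and /dev/sdc
so the sketch runs without real hardware. Listening for udev events is not
shown.

```python
import os
import tempfile


def read_via_link(link_path, length=512):
    """Open through the symlink, note which node it resolved to, read, close.

    No descriptor is kept between calls, so the next call follows the
    symlink again and picks up any re-pointing that happened in between.
    """
    with open(link_path, "rb") as dev:
        # Resolved after opening; a real tool might also compare
        # os.fstat(dev.fileno()) with the resolved node to close the race.
        node = os.path.realpath(link_path)
        data = dev.read(length)
    return os.path.basename(node), data


def repoint(link_path, new_target):
    """Swap the symlink atomically (a stand-in for udev updating the link)."""
    tmp = link_path + ".tmp"
    os.symlink(new_target, tmp)
    os.replace(tmp, link_path)


def demo():
    """Simulate the by-id link moving from 'sdb' to 'sdc'."""
    results = []
    with tempfile.TemporaryDirectory() as d:
        sdb = os.path.join(d, "sdb")
        sdc = os.path.join(d, "sdc")
        link = os.path.join(d, "some_identifier")
        with open(sdb, "wb") as f:
            f.write(b"old")
        with open(sdc, "wb") as f:
            f.write(b"new")
        os.symlink(sdb, link)
        results.append(read_via_link(link))   # resolves to sdb
        repoint(link, sdc)
        results.append(read_via_link(link))   # now resolves to sdc
    return results
```

A process that kept the first descriptor open would keep talking to the old
node; reopening through the symlink on every access is what lets it follow
the device to its new name.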