So are you suggesting a larger/different fix, or do you see this minor change as an acceptable means of resolving the specific issue of the revalidation not going beyond the first child expander? At least with this fix it will go to the first expander with a change. It was a code-bug from fumble-fingers, missing a '*', not a request to revamp ;->

I know that by spec it should not stop at the first discovered change, ignoring changes in other children, but that is a much finer corner case requiring discovery events at a higher frequency than the revalidation period can handle.

Not sure if I have the stomach to wait for a rewrite of libsas ... Besides I feel libsas 'mostly works' and for today just needs a few tweaks (evolution, not revolution).

Sincerely -- Mark Salyzyn

-----Original Message-----
From: Luben Tuikov [mailto:ltuikov@xxxxxxxxx]
Sent: Thursday, September 01, 2011 1:18 PM
To: Mark Salyzyn; linux-scsi@xxxxxxxxxxxxxxx
Cc: Darrick J Wong; James Bottomley
Subject: Re: [PATCH] [SCSI]: libsas failure to revalidate domain for anything but the first expander child.

----- Original Message -----
> In an enclosure model where there are chaining expanders to a large body
> of storage, it was discovered that libsas, responding to a broadcast
> event change, would only revalidate the domain of first child expander
> in the list.
>
> The issue is that the pointer value to the discovered source device was
> used to break out of the loop, rather than the content of the pointer.
>
> This still remains non-compliant as the revalidate domain code is
> supposed to loop through all child expanders, and not stop at the first
> one it finds that reports a change count. However, the design of this
> routine does not allow multiple device discoveries and that would be a
> more complicated set of patches reserved for another day. We are fixing
> the glaring bug rather than refactoring the code.

Obviously I've tested this both when I was at Adaptec and at Vitesse.
I'd connect 7-8 expanders, run iogen with 1000 threads to say 30-40 disks, and then unplug the port between a level 1 and 2 expander and the I/O would quiesce, iogen would report a subset of the disks missing, and then when the port was reestablished, I/O would restart. However I'm not sure that Bottomley tested this scenario after changing my code off-line before submitting it into the Linux kernel.

Now a few notes to mention:

Your patch patches a function called sas_find_bcast_dev(). My original code does NOT have such a function. Revalidation is much more subtle and the code looks simpler in my original version. In my original code there is a lot more recursion, symmetry and code mirroring. Granted, while such code is shorter, and simpler, it is harder to figure out what it does, and I feel this is exactly why we see the current state of libsas to be so explicit, simplistic and introducing bugs.

See this: http://marc.info/?l=linux-scsi&m=131480962006471&w=2 where I described the state of libsas recently.

> Please note, as I am *stuck* on Outlook as per company policy, the
> following inline content will likely not patch clean even emailed as
> 'Plain Text', the enclosed attached file should do the job. I have Cc'd
> all the folks that originated the files in libsas, as there was no
> listed MAINTAINERs.
>
> Checkpatch.pl reports clean. Patch applies cleanly to a WIDE variety of
> kernels up to latest.
>
> Sincerely -- Mark Salyzyn
>
> Cc: Luben Tuikov <tuikov@xxxxxxxxx>
> Cc: Darrick J Wong <djwong@xxxxxxxxxx>
> Cc: James Bottomley <jbottomley@xxxxxxxxxxxxx>
>
> Signed-off-by: Mark Salyzyn <msalyzyn@xxxxxxxxxxxxxx>
>
>  sas_expander.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff -ru scsi-misc-2.6/drivers/scsi/libsas/sas_expander.c scsi-misc-2.6.new/drivers/scsi/libsas/sas_expander.c
> --- scsi-misc-2.6/drivers/scsi/libsas/sas_expander.c 2011-08-31 08:32:21.000000000 -0400
> +++ scsi-misc-2.6.new/drivers/scsi/libsas/sas_expander.c 2011-09-01 08:57:55.000000000 -0400
> @@ -1721,7 +1721,7 @@
>      list_for_each_entry(ch, &ex->children, siblings) {
>          if (ch->dev_type == EDGE_DEV || ch->dev_type == FANOUT_DEV) {
>              res = sas_find_bcast_dev(ch, src_dev);
> -            if (src_dev)
> +            if (*src_dev)
>                  return res;
>          }
>      }
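For anyone following the thread who wants to see the one-character bug in isolation, here is a minimal sketch. It is not the libsas code: struct node and both function names are hypothetical stand-ins, and the change-count check is reduced to a single flag. It only assumes what the patch description says, namely that the caller passes the address of a result pointer and the loop is meant to stop once that result has been filled in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an expander; NOT the libsas struct domain_device. */
struct node {
    int changed;           /* nonzero if this expander reports a change */
    struct node *child;    /* first child, or NULL */
    struct node *sibling;  /* next sibling, or NULL */
};

/* Buggy shape: tests the out-parameter itself. The caller always passes a
 * valid address, so the test is always true and the loop returns after the
 * first child whether or not anything was found. */
int find_changed_buggy(struct node *n, struct node **found)
{
    assert(found != NULL);
    if (n->changed) {
        *found = n;
        return 0;
    }
    for (struct node *ch = n->child; ch != NULL; ch = ch->sibling) {
        int res = find_changed_buggy(ch, found);
        if (found)            /* always true */
            return res;
    }
    return 0;
}

/* Fixed shape: tests what the out-parameter points to, so the loop moves on
 * to the next sibling until some subtree has actually reported a change. */
int find_changed_fixed(struct node *n, struct node **found)
{
    assert(found != NULL);
    if (n->changed) {
        *found = n;
        return 0;
    }
    for (struct node *ch = n->child; ch != NULL; ch = ch->sibling) {
        int res = find_changed_fixed(ch, found);
        if (*found)           /* true only once a changed node was recorded */
            return res;
    }
    return 0;
}
```

With a root that has two children where only the second one is flagged as changed, the buggy variant leaves the caller's pointer NULL, while the fixed variant sets it to the second child. Like the patch, the fixed variant still stops at the first change it finds; it makes no attempt to collect changes from several children.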