On 08/06/2013 04:52 AM, Douglas Gilbert wrote:
> On 13-08-04 10:19 PM, vaughan wrote:
>> On 08/03/2013 01:25 PM, Douglas Gilbert wrote:
>>> On 13-08-01 01:01 AM, Douglas Gilbert wrote:
>>>> On 13-07-22 01:03 PM, Jörn Engel wrote:
>>>>> On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:
>>>>>>
>>>>>> There is a race when open sg with O_EXCL flag. Also a race may
>>>>>> happen between sg_open and sg_remove.
>>>>>>
>>>>>> Changes from v4:
>>>>>>  * [3/4] use ERR_PTR series instead of adding another parameter in
>>>>>>    sg_add_sfp
>>>>>>  * [4/4] fix conflict for cherry-pick from v3.
>>>>>>
>>>>>> Changes from v3:
>>>>>>  * release o_sem in sg_release(), not in sg_remove_sfp().
>>>>>>  * not set exclude with sfd_lock held.
>>>>>>
>>>>>> Vaughan Cao (4):
>>>>>>   [SCSI] sg: use rwsem to solve race during exclusive open
>>>>>>   [SCSI] sg: no need sg_open_exclusive_lock
>>>>>>   [SCSI] sg: checking sdp->detached isn't protected when open
>>>>>>   [SCSI] sg: push file descriptor list locking down to per-device
>>>>>>     locking
>>>>>>
>>>>>>  drivers/scsi/sg.c | 178 +++++++++++++++++++++++++----------------------------
>>>>>>  1 file changed, 83 insertions(+), 95 deletions(-)
>>>>>
>>>>> Patchset looks good to me, although I didn't test it on hardware yet.
>>>>> Signed-off-by: Joern Engel <joern@xxxxxxxxx>
>>>>>
>>>>> James, care to pick this up?
>>>>
>>>> Acked-by: Douglas Gilbert <dgilbert@xxxxxxxxxxxx>
>>>>
>>>> Tested O_EXCL with multiple processes and threads; passed.
>>>> sg driver prior to this patch had "leaky" O_EXCL logic
>>>> according to the same test. Block device passed.
>>>>
>>>> James, could you clean this up:
>>>> drivers/scsi/sg.c:242:6: warning: unused variable ‘res’
>>>> [-Wunused-variable]
>>>
>>> Further testing suggests this patch on the sg driver is
>>> broken, so I'll rescind my ack.
>>>
>>> The case it is broken for is when a device is opened
>>> without O_EXCL. Now if, while it is open, a second
>>> thread/process tries to open the same device O_EXCL
>>> then IMO the second open should fail with EBUSY.
>>>
>>> My testing shows that O_EXCL opens properly deflect
>>> other O_EXCL opens.
>> Hi Doug,
>>
>> My test don't have this issue. The routine is something as below:
>>
>> I start three opens without O_EXCL, wait 30s each, and open with
>> O_EXCL|O_NONBLOCK, it failed with EBUSY.
>> And I also call myopen with/without O_EXCL many times in background at
>> the same time, and the test is passed. I don't know why it failed in
>> your test.
>>
>> Usage: myopen [-e][-n][-d delay] -f file
>>   -e: exclude
>>   -n: nonblock
>>   -d: delay N seconds and then close.
>>
>> [root@vacaowol5 16835013]# ./myopen -f /dev/sg5 -d 30 &
>> [1] 3417
>> [root@vacaowol5 16835013]# ./myopen -f /dev/sg5 -d 30 &
>> [2] 3418
>> [root@vacaowol5 16835013]# ./myopen -f /dev/sg5 -d 30 &
>> [3] 3419
>> [root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
>> max_active_device=6(origin 1)
>>  def_reserved_size=32768
>>  >>> device=sg5 scsi5 chan=0 id=1 lun=0 em=0 sg_tablesize=55 excl=0
>>    FD(1): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>    cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>      No requests active
>>    FD(2): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>    cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>      No requests active
>>    FD(3): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>    cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>      No requests active
>>
>> [root@vacaowol5 16835013]# ./myopen -e -n -f /dev/sg5 -d 30 &
>> [4] 3422
>> [3422:3351] /dev/sg5:exclude: Device or resource busy
>>
>> [4]+ Exit 1 ./myopen -e -n -f /dev/sg5 -d 30
>>
>> [root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
>> max_active_device=6(origin 1)
>>  def_reserved_size=32768
>>  >>> device=sg5 scsi5 chan=0 id=1 lun=0 em=0 sg_tablesize=55 excl=0
>>    FD(1): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>    cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>      No requests active
>>    FD(2): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>    cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>      No requests active
>>    FD(3): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
>>    cmd_q=0 f_packid=0 k_orphan=0 closed=0
>>      No requests active
>> [root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
>> [1] Done ./myopen -f /dev/sg5 -d 30
>> [2]- Done ./myopen -f /dev/sg5 -d 30
>> [3]+ Done ./myopen -f /dev/sg5 -d 30
>>
>
> Hi,
> After the initial failures about 36 hours ago, retesting
> yesterday and today has not produced any unexpected
> failures. And I have been trying hard on lk 3.10.4 and
> lk 3.10.5 .
>
> My test program is a bit more intense than yours and can
> be found in the sg3_utils beta in the News section of this
> page:
> http://sg.danny.cz/sg/
>
> It is in the examples directory, two variants called
> sg_tst_excl and sg_tst_excl2 . You will need a recent gcc
> compiler, IOW something that can compile c++11 . gcc 4.7.3
> in Ubuntu 13.04 only just manages, fedora 19 should do
> better with gcc 4.8.1 . The threading is implemented using
> pthreads so it should be reliable.
>
> Typically I run multiple instances (processes) and each has
> multiple threads. One instance can run '-x' which will cause
> its first thread not to use O_EXCL **. All my tests currently
> use O_NONBLOCK and that leads to lots of EBUSYs (sometimes
> in the billions).
>
> Doug Gilbert
>
>
> ** Using '-x' on two instances will cause an expected failure
> so can be used as a control.
>
Hi Doug,

Can I regard this as you ACK it again?

Vaughan
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
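
For reference, the open sequence used in the quoted myopen test can be
sketched in C roughly as below. This is not the myopen source (which was
not posted in the thread); the helper names build_open_flags() and
open_hold_close() are made up for illustration, and the mapping of -e and
-n onto O_EXCL and O_NONBLOCK is an assumption based on the usage text.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the -e (exclude) and -n (nonblock) switches
 * from the usage text onto open(2) flags. Assumes the device is opened
 * read-write; the real tool may differ.
 */
int build_open_flags(int excl, int nonblock)
{
    int flags = O_RDWR;

    if (excl)
        flags |= O_EXCL;      /* interpreted by the sg driver, no O_CREAT */
    if (nonblock)
        flags |= O_NONBLOCK;  /* fail with EBUSY instead of waiting */
    return flags;
}

/*
 * Hypothetical helper: open the node, keep it open for delay_secs
 * seconds, then close it. Returns 0 on success or -errno from open(2).
 */
int open_hold_close(const char *path, int flags, unsigned int delay_secs)
{
    int fd;

    assert(path != NULL);

    fd = open(path, flags);
    if (fd < 0) {
        int err = errno;  /* save before stdio can change it */

        fprintf(stderr, "[%ld] %s: %s\n", (long)getpid(), path,
                strerror(err));
        return -err;
    }

    sleep(delay_secs);    /* hold the descriptor, like -d N */
    close(fd);
    return 0;
}
```

Under these assumptions, while non-exclusive holders of /dev/sg5 are still
open, open_hold_close("/dev/sg5", build_open_flags(1, 1), 30) would be
expected to return -EBUSY, matching the "Device or resource busy" line in
the quoted transcript.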