Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 13-08-04 10:19 PM, vaughan wrote:
On 08/03/2013 01:25 PM, Douglas Gilbert wrote:
On 13-08-01 01:01 AM, Douglas Gilbert wrote:
On 13-07-22 01:03 PM, Jörn Engel wrote:
On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote:

There is a race when open sg with O_EXCL flag. Also a race may
happen between
sg_open and sg_remove.

Changes from v4:
   * [3/4] use ERR_PTR series instead of adding another parameter in
sg_add_sfp
   * [4/4] fix conflict for cherry-pick from v3.

Changes from v3:
   * release o_sem in sg_release(), not in sg_remove_sfp().
   * not set exclude with sfd_lock held.

Vaughan Cao (4):
    [SCSI] sg: use rwsem to solve race during exclusive open
    [SCSI] sg: no need sg_open_exclusive_lock
    [SCSI] sg: checking sdp->detached isn't protected when open
    [SCSI] sg: push file descriptor list locking down to per-device
      locking

   drivers/scsi/sg.c | 178
+++++++++++++++++++++++++-----------------------------
   1 file changed, 83 insertions(+), 95 deletions(-)

Patchset looks good to me, although I didn't test it on hardware yet.
Signed-off-by: Joern Engel <joern@xxxxxxxxx>

James, care to pick this up?

Acked-by: Douglas Gilbert <dgilbert@xxxxxxxxxxxx>

Tested O_EXCL with multiple processes and threads; passed.
sg driver prior to this patch had "leaky" O_EXCL logic
according to the same test. Block device passed.

James, could you clean this up:
    drivers/scsi/sg.c:242:6: warning: unused variable ‘res’
[-Wunused-variable]

Further testing suggests this patch on the sg driver is
broken, so I'll rescind my ack.

The case it is broken for is when a device is opened
without O_EXCL. Now if, while it is open, a second
thread/process tries to open the same device O_EXCL
then IMO the second open should fail with EBUSY.

My testing shows that O_EXCL opens properly deflect
other O_EXCL opens.
Hi  Doug,

My test don't have this issue. The routine is something as below:

I start three opens without O_EXCL, wait 30s each, and open with
O_EXCL|O_NONBLOCK, it failed with EBUSY.
And I also call myopen with/without O_EXCL many times in background at
the same time, and the test is passed. I don't know why it failed in
your test.

Usage: myopen [-e][-n][-d delay] -f file
       -e: exclude
       -n: nonblock
       -d: delay N seconds and then close.

[root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
[1] 3417
[root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
[2] 3418
[root@vacaowol5 16835013]# ./myopen  -f /dev/sg5 -d 30 &
[3] 3419
[root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
max_active_device=6(origin 1)
  def_reserved_size=32768
  >>> device=sg5 scsi5 chan=0 id=1 lun=0   em=0 sg_tablesize=55 excl=0
    FD(1): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
    cmd_q=0 f_packid=0 k_orphan=0 closed=0
      No requests active
    FD(2): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
    cmd_q=0 f_packid=0 k_orphan=0 closed=0
      No requests active
    FD(3): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
    cmd_q=0 f_packid=0 k_orphan=0 closed=0
      No requests active

[root@vacaowol5 16835013]# ./myopen -e -n  -f /dev/sg5 -d 30 &
[4] 3422
[3422:3351] /dev/sg5:exclude: Device or resource busy

[4]+  Exit 1                  ./myopen -e -n -f /dev/sg5 -d 30

[root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
max_active_device=6(origin 1)
  def_reserved_size=32768
  >>> device=sg5 scsi5 chan=0 id=1 lun=0   em=0 sg_tablesize=55 excl=0
    FD(1): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
    cmd_q=0 f_packid=0 k_orphan=0 closed=0
      No requests active
    FD(2): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
    cmd_q=0 f_packid=0 k_orphan=0 closed=0
      No requests active
    FD(3): timeout=60000ms bufflen=32768 (res)sgat=1 low_dma=0
    cmd_q=0 f_packid=0 k_orphan=0 closed=0
      No requests active
[root@vacaowol5 16835013]# cat /proc/scsi/sg/debug
[1]   Done                    ./myopen -f /dev/sg5 -d 30
[2]-  Done                    ./myopen -f /dev/sg5 -d 30
[3]+  Done                    ./myopen -f /dev/sg5 -d 30


Hi,
After the initial failures about 36 hours ago, retesting
yesterday and today has not produced any unexpected
failures. And I have been trying hard on lk 3.10.4 and
lk 3.10.5 .

My test program is a bit more intense than yours and can
be found in the sg3_utils beta in the News section of this
page:
  http://sg.danny.cz/sg/

It is in the examples directory, two variants called
sg_tst_excl and sg_tst_excl2 . You will need a recent gcc
compiler, IOW something that can compile c++11 . gcc 4.7.3
in Ubuntu 13.04 only just manages, fedora 19 should do
better with gcc 4.8.1 . The threading is implemented using
pthreads so it should be reliable.

Typically I run multiple instances (processes) and each has
multiple threads. One instance can run '-x' which will cause
its first thread not to use O_EXCL **. All my tests currently
use O_NONBLOCK and that leads to lots of EBUSYs (sometimes
in the billions).

Doug Gilbert


** Using '-x' on two instances will cause an expected failure
   so can be used as a control.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux