Re: [PATCH] sd: Fix crash due to race when removing scsi disk

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jul 1, 2016 at 12:14 PM, Howard Cochran
<hcochran@xxxxxxxxxxxxxxxx> wrote:
> This crash occurred while writing 1 to /sys/block/sda/device/delete at
> the same instant that another process was closing the block device:
>
>  BUG: unable to handle kernel NULL pointer dereference at 00000230
>  IP: [<c138fa9c>] blk_get_backing_dev_info+0xc/0x20
>  Oops: 0000 [#1] PREEMPT SMP
>  Call Trace:
>   [<c112da2a>] ? __filemap_fdatawrite_range+0x15a/0x180
>   [<c112d9b5>] ? __filemap_fdatawrite_range+0xe5/0x180
>   [<c112dae8>] filemap_write_and_wait+0x38/0x70
>   [<c11b79b1>] fsync_bdev+0x41/0x50
>   [<c13a4f7c>] invalidate_partition+0x1c/0x40
>   [<c13a5d0f>] del_gendisk+0xcf/0x1c0
>   [<c15c7143>] sd_remove+0x53/0xb0
>   [<c157eaf0>] __device_release_driver+0x80/0x120
>   [<c157ebad>] device_release_driver+0x1d/0x30
>   [<c157e392>] bus_remove_device+0xb2/0xf0
>   [<c157b45c>] device_del+0xec/0x1e0
>   [<c13b6d88>] ? kobject_put+0x58/0xc0
>   [<c15c12af>] __scsi_remove_device+0xaf/0xc0
>   [<c15c12df>] scsi_remove_device+0x1f/0x30
>   [<c15c131b>] sdev_store_delete+0x2b/0x40
>   [<c15c12f0>] ? scsi_remove_device+0x30/0x30
>   [<c157a87f>] dev_attr_store+0x1f/0x40
>                ...
>   [<c11829bc>] SyS_write+0x4c/0xb0
>  EIP: [<c138fa9c>] blk_get_backing_dev_info+0xc/0x20 SS:ESP 0068:f5eb9d18
>
> It is caused by this race: Between the time Thread B's instance of
> filemap_write_and_wait() has asked whether there are any pages to flush and
> when it it dereferences bdev->disk, Thread A can clear that pointer in
> __blkdev_put().
>
> Thread A:                             Thread B:
> blkdev_close()                        sdev_store_delete()
>   blkdev_put()                          sd_remove()
>     __blkdev_put()                        del_gendisk()
>       mutex_lock(bd_mutex);                 invalidate_partition()
>         sync_blkdev()                         fsync_bdev()
>           filemap_write_and_wait()              filemap_write_and_wait()
>             if (mapping has pages)                if (mapping has pages)
>               deref bdev->disk (OK)
>         Set bdev->bd_disk = NULL;
>       mutex_unlock(bd_mutex);                       deref. bdev->bd_disk (BOOM!)
>
> The "dereference bdev->disk" occurs on this sub-chain:
> filemap_write_and_wait()
>   __filemap_fdatawrite_range()
>     mapping_cap_writeback_dirty()
>       inode_to_bdi()
>         bdev_get_queue()
>           return bdev->disk->queue;
>
> The problem was introduced by de1414a654e6 ("fs: export inode_to_bdi and use
> it in favor of mapping->backing_dev_info"). Before that change,
> mapping_cap_writeback_dirty() directly retrieved the backing_dev_info from
> the mapping rather than looking it up through
> mapping->host->inode_dev->bdev->bd_disk->queue.
>
> This was found while running a stress test on an ARM-based embedded system
> which involved repeatedly shutting down many services simultaneously via
> systemd isolate (thereby making it likely that "Thread B" was preempted for
> awhile just before it dereferenced bdev->bd_disk). I subsequently reproduced
> this on vanilla Linux 4.6 in QEMU/x86.
>
> This patch fixes the race by making sd_remove() hold bd_mutex during the
> call to del_gendisk().
>
> Fixes: de1414a654e6 ("fs: export inode_to_bdi and use it in favor of
> mapping->backing_dev_info")
> Signed-off-by: Howard Cochran <hcochran@xxxxxxxxxxxxxxxx>
> Cc: Howard Cochran <cochran@xxxxxxxxxxx>
> Cc: linux-scsi@xxxxxxxxxxxxxxx
> Cc: Christoph Hellwig <hch@xxxxxx>
> Cc: James Bottomley <JBottomley@xxxxxxxx>
> Cc: Martin K. Petersen <martin.petersen@xxxxxxxxxx>
> ---
>  drivers/scsi/sd.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index f52b74c..0f53925 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3126,6 +3126,7 @@ static int sd_remove(struct device *dev)
>  {
>         struct scsi_disk *sdkp;
>         dev_t devt;
> +       struct block_device *bdev;
>
>         sdkp = dev_get_drvdata(dev);
>         devt = disk_devt(sdkp->disk);
> @@ -3134,7 +3135,13 @@ static int sd_remove(struct device *dev)
>         async_synchronize_full_domain(&scsi_sd_pm_domain);
>         async_synchronize_full_domain(&scsi_sd_probe_domain);
>         device_del(&sdkp->dev);
> +
> +       bdev = bdget_disk(sdkp->disk, 0);
> +       mutex_lock(&bdev->bd_mutex);
>         del_gendisk(sdkp->disk);
> +       mutex_unlock(&bdev->bd_mutex);
> +       bdput(bdev);
> +
>         sd_shutdown(dev);
>
>         blk_register_region(devt, SD_MINORS, NULL,
> --
> 1.9.1
>

Adding to Cc: Corrected email address for James Bottomley.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux