Re: [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 06/26/18 08:53, Martin K. Petersen wrote:
This series of three patches fixes a crash in the block layer core
that I encountered while retesting the SRP tests in blktests. Please
consider these patches for kernel v4.19.

Patches 1 and 2 look fine. However, I'm not sure I'm so keen on patch
3. It looks like it's papering over a more fundamental issue.

Can you elaborate a bit on why the existing code fails with dm in the
mix?

Hello Martin,

The issue I ran into is not specific to dm. The only role of dm in this is that I found a way to trigger the reported bug with dm.

A close look at the block layer core learned me that in several functions there is confusion between what is called the Data Out buffer size in the SCSI spec and the number of bytes that are affected on the medium. As you know these two numbers are identical for most commands but not for discard nor for write same requests. In the block layer core the number of bytes affected on the medium is __data_len. The size of the Data Out buffer is one of the following:
* rq->special_vec.bv_len if RQF_SPECIAL_PAYLOAD has been set.
* zero for discard requests for which RQF_SPECIAL_PAYLOAD has not been
  set.
* The cumulative size of all bio's associated with a request for all
  other request types, including write same.

Discard requests are generated and processed as follows:
* __blkdev_issue_discard() allocates a bio and sets .bi_size of that
  bio but does not attach any pages to the bio.
* generic_make_request() sets rq->__data_len by calling
  blk_rq_bio_prep() indirectly.
* The sd driver decides which SCSI command to translate discard requests
  into - UNMAP, WRITE SAME(10) or WRITE SAME(16).

Write same requests are generated and processed as follows:
* __blkdev_issue_write_same() allocates a bio, attaches a zero page to
  the bio and sets .bi_size.
* generic_make_request() sets rq->__data_len by calling
  blk_rq_bio_prep() indirectly.
* The sd driver translates the request into a WRITE SAME(10) or WRITE
  SAME(16) SCSI command.

Upon request completion scsi_finish_command() executes the following code:

	good_bytes = scsi_bufflen(cmd);
	scsi_io_completion(cmd, good_bytes);

In other words, the SCSI core passes the size of the Data Out buffer to blk_update_request(). In case of discard and write same requests, that number can differ from the number of bytes affected on the medium.

The loop in blk_update_request() iterates over the bio's attached to a request and hence finishes as soon as the value passed as the nr_bytes argument equals or exceeds the Data Out buffer size.

This is why I think that all blk_end_request_all() variants should pass the Data Out buffer size to blk_update_request() instead of the number of bytes affected on the medium.

Bart.




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux