> On 5 Feb 2019, at 10.23, Matias Bjørling <mb@xxxxxxxxxxx> wrote: > > On 2/1/19 9:22 AM, Chansol Kim wrote: >> On 01/31/19 22:14 PM, Matias Bjørling wrote: >>> On 1/30/19 2:53 AM, 김찬솔 wrote: >>>> Changes: >>>> 1. Function pblk_rw_io to get bio* as a reference >>>> 2. In pblk_rw_io bio_put call on read case removed >>>> >>>> A fix to address issue where >>>> 1. pblk_make_rq calls pblk_rw_io passes bio* pointer as a value (0xA) >>>> 2. pblk_rw_io calls blk_queue_split passing bio* pointer as reference >>>> 3. In blk_queue_split, when there is a split, the original bio* (0xA) >>>> is passed to generic_make_requests, and the newly allocated bio is >>>> returned >>>> 4. If NVM_IO_DONE returned, pblk_make_rq calls bio_endio on the bio*, >>>> that is not the one returned by blk_queue_split >>>> 5. As a result bio_endio is not called on the newly allocated bio. >>>> >>>> Signed-off-by: chansol.kim <chansol.kim@xxxxxxxxxxx> >>>> --- >>>> drivers/lightnvm/pblk-init.c | 22 ++++++++-------------- >>>> 1 file changed, 8 insertions(+), 14 deletions(-) >>>> >>>> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c >>>> index b57f764d..4efc929 100644 >>>> --- a/drivers/lightnvm/pblk-init.c >>>> +++ b/drivers/lightnvm/pblk-init.c >>>> @@ -31,30 +31,24 @@ static DECLARE_RWSEM(pblk_lock); >>>> struct bio_set pblk_bio_set; >>>> static int pblk_rw_io(struct request_queue *q, struct pblk *pblk, >>>> - struct bio *bio) >>>> + struct bio **bio) >>>> { >>>> - int ret; >>>> - >>>> /* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap >>>> * constraint. Writes can be of arbitrary size. >>>> */ >>>> - if (bio_data_dir(bio) == READ) { >>>> - blk_queue_split(q, &bio); >>>> - ret = pblk_submit_read(pblk, bio); >>>> - if (ret == NVM_IO_DONE && bio_flagged(bio, BIO_CLONED)) >>>> - bio_put(bio); >>> >>> Could we kill the NVM_DONE_IO check in the pblk_rw_io, that should >>> achieve the same? >> I think it is possible to remove NVM_DONE_IO check here. And in that >> case perhaps it is necessary to change bio_endio call to somewhere other >> than pblk_make_rq, otherwise endio call would not be made to the new >> bio*. >> Assuming pblk_rw_io's second parameter is to be remained as bio*, There >> are three cases I think needs consideration. NVM_IO_ERROR return case, >> the read case and the write case. >> In NVM_IO_ERROR return case, for both read and write. NVM_IO_ERROR >> received by pblk_make_rq and bio_io_error called on bio, since this bio* >> that pblk_submit_read and pblk_write_to_cache function tried and failed >> might be a new one, so bio_io_error call needs to be made inside >> pblk_rw_io. >> In read case, there are three sub-cases. The first is All data is available >> in ring buffer and NVM_IO_DONE is returned. The second is all to be read >> from the device, which currently NVM_IO_OK is returned and endio is >> called after read completion from the device. The third is partial read, >> where the data that needs to be read from the device is read >> synchronously and pblk_rw_io returns NVM_IO_DONE. >> In write case, there are two sub-cases. Firstly, non REQ_PRE_FLUSH case, >> pblk_write_cache wil return either NVM_IO_DONE or NVM_IO_ERROR. A endio >> call is required in place somewhere NVM_IO_DONE is decided. >> For REQ_PREFLUSH case bio (new bio* if split) is added to w_ctx.bios, >> pblk_write_to_cache will return either NVM_IO_OK or NVM_IO_ERROR. bio* >> added to w_ctx.bios will be called by bio_endio on write completion to >> the disk. So it is already taken care of. >> In summary my feeling is that having pblk_rw_io receive bio* as a >> reference and removing bio_put in pblk_rw_io would be the minimum >> change. Please share your insight, I will try experimenting alternatives. > > What rubs me the wrong way is that that pattern isn't used in the rest > of kernel. I would rather move the calls to bio_io_error and bio_endio > into the pblk_rw_io() function. The implementation of pblk_rw_io() > leaks out to pblk_make_rq(). The code is a mismatch of some bio_endio > calls inside the pblk_rw_io, and others outside. It's not coherent. I agree that NVM_IO_DONE is now more confusing than anything - this comes from the rrpc days... Removing it here will require some refactoring on the partial read path, but nothing too dramatic. I'm also OK with unfolding pblk_rw_io() into pblk_make_rq(). Chansol: do you want to give it a go? Javier
Attachment:
signature.asc
Description: Message signed with OpenPGP