On 02/05/19 11:20 AM, Javier González wrote:
>
>> On 5 Feb 2019, at 10.23, Matias Bjørling <mb@xxxxxxxxxxx> wrote:
>>
>> On 2/1/19 9:22 AM, Chansol Kim wrote:
>>> On 01/31/19 22:14 PM, Matias Bjørling wrote:
>>>> On 1/30/19 2:53 AM, 김찬솔 wrote:
>>>>> Changes:
>>>>> 1. Function pblk_rw_io to get bio* as a reference
>>>>> 2. In pblk_rw_io bio_put call on read case removed
>>>>>
>>>>> A fix to address issue where
>>>>> 1. pblk_make_rq calls pblk_rw_io passes bio* pointer as a value (0xA)
>>>>> 2. pblk_rw_io calls blk_queue_split passing bio* pointer as reference
>>>>> 3. In blk_queue_split, when there is a split, the original bio* (0xA)
>>>>> is passed to generic_make_requests, and the newly allocated bio is
>>>>> returned
>>>>> 4. If NVM_IO_DONE returned, pblk_make_rq calls bio_endio on the bio*,
>>>>> that is not the one returned by blk_queue_split
>>>>> 5. As a result bio_endio is not called on the newly allocated bio.
>>>>>
>>>>> Signed-off-by: chansol.kim <chansol.kim@xxxxxxxxxxx>
>>>>> ---
>>>>> drivers/lightnvm/pblk-init.c | 22 ++++++++--------------
>>>>> 1 file changed, 8 insertions(+), 14 deletions(-)
>>>>>
>>>>> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
>>>>> index b57f764d..4efc929 100644
>>>>> --- a/drivers/lightnvm/pblk-init.c
>>>>> +++ b/drivers/lightnvm/pblk-init.c
>>>>> @@ -31,30 +31,24 @@ static DECLARE_RWSEM(pblk_lock);
>>>>> struct bio_set pblk_bio_set;
>>>>> static int pblk_rw_io(struct request_queue *q, struct pblk *pblk,
>>>>> -			  struct bio *bio)
>>>>> +			  struct bio **bio)
>>>>> {
>>>>> -	int ret;
>>>>> -
>>>>> 	/* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
>>>>> 	 * constraint. Writes can be of arbitrary size.
>>>>> 	 */
>>>>> -	if (bio_data_dir(bio) == READ) {
>>>>> -		blk_queue_split(q, &bio);
>>>>> -		ret = pblk_submit_read(pblk, bio);
>>>>> -		if (ret == NVM_IO_DONE && bio_flagged(bio, BIO_CLONED))
>>>>> -			bio_put(bio);
>>>>
>>>> Could we kill the NVM_IO_DONE check in pblk_rw_io? That should
>>>> achieve the same.
>>> I think it is possible to remove the NVM_IO_DONE check here. In that
>>> case it is probably necessary to move the bio_endio call somewhere
>>> other than pblk_make_rq, otherwise the endio call would not be made on
>>> the new bio*.
>>> Assuming pblk_rw_io's second parameter remains a bio*, there are three
>>> cases that I think need consideration: the NVM_IO_ERR return case, the
>>> read case and the write case.
>>> In the NVM_IO_ERR return case, for both read and write, NVM_IO_ERR is
>>> received by pblk_make_rq and bio_io_error is called on the bio. Since
>>> the bio* that pblk_submit_read and pblk_write_to_cache tried and failed
>>> on might be a new one, the bio_io_error call needs to be made inside
>>> pblk_rw_io.
>>> In the read case, there are three sub-cases. The first is that all data
>>> is available in the ring buffer and NVM_IO_DONE is returned. The second
>>> is that everything has to be read from the device, in which case
>>> NVM_IO_OK is currently returned and endio is called after read
>>> completion from the device. The third is a partial read, where the data
>>> that needs to be read from the device is read synchronously and
>>> pblk_rw_io returns NVM_IO_DONE.
>>> In the write case, there are two sub-cases. In the non-REQ_PREFLUSH
>>> case, pblk_write_to_cache will return either NVM_IO_DONE or NVM_IO_ERR.
>>> An endio call is required at the place where NVM_IO_DONE is decided.
>>> In the REQ_PREFLUSH case the bio (the new bio* if split) is added to
>>> w_ctx.bios, and pblk_write_to_cache will return either NVM_IO_OK or
>>> NVM_IO_ERR. bio_endio will be called on the bio* added to w_ctx.bios on
>>> write completion to the disk, so it is already taken care of.
>>> In summary, my feeling is that having pblk_rw_io receive bio* as a
>>> reference and removing bio_put in pblk_rw_io would be the minimum
>>> change. Please share your insight, I will try experimenting with
>>> alternatives.
>>
>> What rubs me the wrong way is that that pattern isn't used in the rest
>> of the kernel. I would rather move the calls to bio_io_error and
>> bio_endio into the pblk_rw_io() function. The implementation of
>> pblk_rw_io() leaks out to pblk_make_rq(). The code is a mismatch of some
>> bio_endio calls inside the pblk_rw_io, and others outside. It's not
>> coherent.
>
> I agree that NVM_IO_DONE is now more confusing than anything - this
> comes from the rrpc days... Removing it here will require some
> refactoring on the partial read path, but nothing too dramatic.
>
> I'm also OK with unfolding pblk_rw_io() into pblk_make_rq().
>
> Chansol: do you want to give it a go?
>
> Javier
>

Matias, as you mentioned and Javier suggested, unfolding pblk_rw_io would
make the call sites of bio_endio more coherent, including the
REQ_OP_DISCARD case with REQ_PREFLUSH unset. pblk_make_rq would be the
place to call bio_io_error in case of NVM_IO_ERR, and to call bio_endio
for NVM_IO_DONE.

Javier: I am very up for it. I will unfold pblk_rw_io into pblk_make_rq,
test the change, and submit the patch (with a better comment this time).

Thank you.
Chansol Kim
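
For readers following the thread, the pointer problem described in the original patch can be reproduced in a small user-space sketch. This is not kernel code: `struct mock_bio`, `mock_queue_split`, `mock_bio_endio` and the `make_rq_*` helpers are hypothetical stand-ins, and the split is modelled only as far as the thread describes it (the caller's pointer is redirected to a new bio). It compares the pre-patch by-value call, the by-reference call from the patch, and the unfolded form agreed on above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the kernel's struct bio. */
struct mock_bio {
	int id;
	bool ended;	/* set once mock_bio_endio() has run on this bio */
};

/*
 * Assumed model of a split, following the thread's description: the
 * original bio is handed off elsewhere and *bio is redirected to a new
 * bio (here: caller-provided storage instead of an allocation).
 */
static void mock_queue_split(struct mock_bio **bio, struct mock_bio *fresh)
{
	fresh->id = (*bio)->id + 1;
	fresh->ended = false;
	*bio = fresh;
}

static void mock_bio_endio(struct mock_bio *bio)
{
	assert(bio != NULL);
	bio->ended = true;
}

/* Pre-patch shape: bio arrives by value, so the redirect stays local. */
static void rw_io_by_value(struct mock_bio *bio, struct mock_bio *fresh)
{
	mock_queue_split(&bio, fresh);
	/* pretend the read was served from the buffer (NVM_IO_DONE) */
}

/* Patch shape: bio arrives by reference, so the caller sees the redirect. */
static void rw_io_by_ref(struct mock_bio **bio, struct mock_bio *fresh)
{
	mock_queue_split(bio, fresh);
}

/* Each caller returns true if completion reached the split bio. */
static bool make_rq_by_value(void)
{
	struct mock_bio orig = { .id = 0xA, .ended = false };
	struct mock_bio fresh = { 0 };
	struct mock_bio *bio = &orig;

	rw_io_by_value(bio, &fresh);
	mock_bio_endio(bio);	/* bio still points at orig */
	return fresh.ended;
}

static bool make_rq_by_ref(void)
{
	struct mock_bio orig = { .id = 0xA, .ended = false };
	struct mock_bio fresh = { 0 };
	struct mock_bio *bio = &orig;

	rw_io_by_ref(&bio, &fresh);
	mock_bio_endio(bio);	/* bio now points at fresh */
	return fresh.ended;
}

/* Unfolded shape: split and completion live in the same function. */
static bool make_rq_unfolded(void)
{
	struct mock_bio orig = { .id = 0xA, .ended = false };
	struct mock_bio fresh = { 0 };
	struct mock_bio *bio = &orig;

	mock_queue_split(&bio, &fresh);
	mock_bio_endio(bio);
	return fresh.ended;
}
```

In this model only `make_rq_by_value()` leaves the split bio without a completion. The by-reference and unfolded variants both complete it; the unfolded one does so without passing a `struct mock_bio **` between functions, which is the pattern objected to in the review.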