On 25/10/2021 08:16, Daejun Park wrote: >> On Thu, 2021-10-21 at 09:22 -0700, Bart Van Assche wrote: >>> On 10/21/21 8:17 AM, Christoph Hellwig wrote: >>>> On Thu, Oct 21, 2021 at 05:15:20PM +0200, Christoph Hellwig wrote: >>>>>>> I just noticed the UFS HPB support landed in 5.15, and just >>>>>>> as before it is completely broken by allocating another >>>>>>> request on the same device and then reinserting it in the >>>>>>> queue. It is bad enough that we have to live with >>>>>>> blk_insert_cloned_request for dm-mpath, but this is too big >>>>>>> of an API abuse to make it into a release. We need to drop >>>>>>> this code ASAP, and I can prepare a patch for that. >>>>>> >>>>>> That sounds awful, do you have a link to the offending >>>>>> commit(s)? >>>>> >>>>> I'll need to look for it, busy in calls right now, but just grep >>>>> for blk_insert_cloned_request. >>>> >>>> Might as well finish the git blame: >>>> >>>> commit 41d8a9333cc96f5ad4dd7a52786585338257d9f1 >>>> Author: Daejun Park <daejun7.park@xxxxxxxxxxx> >>>> Date: Mon Jul 12 18:00:25 2021 +0900 >>>> >>>> scsi: ufs: ufshpb: Add HPB 2.0 support >>>> >>>> Version 2.0 of HBP supports reads of varying sizes from 4KB to >>>> 1MB. >>>> >>>> A read operation <= 32KB is supported as single HPB read. A >>>> read between >>>> 36KB and 1MB is supported by a combination of write buffer >>>> command and HPB >>>> read command to deliver more PPN. The write buffer commands >>>> may not be >>>> issued immediately due to busy tags. To use HPB read more >>>> aggressively, the >>>> driver can requeue the write buffer command. The requeue >>>> threshold is >>>> implemented as timeout and can be modified with >>>> requeue_timeout_ms entry in >>>> sysfs. >>> >>> (+Daejun) >>> >>> Daejun, can the HPB code be reworked such that it does not use >>> blk_insert_cloned_request()? I'm concerned that if the HPB code is >>> not reworked that it will be removed from the upstream kernel. >> >> Just to give urgency to Bart's request: we have two or three weeks >> before the kernel is due to go final. Can the problems identified by >> Christoph be fixed within that timeframe? > > I'm checking to see if I can replace blk_execute_rq_nowait with > blk_insert_cloned_request in the HPB code. In case it is of any use, there is an example of sending a REQUEST_SENSE command directly here: https://lore.kernel.org/all/20210930124224.114031-2-adrian.hunter@xxxxxxxxx > >> >> Specifically, looking at the paper you reference, it only uses READ >> BUFFER for the host cache sharing. Since the JDEC standard appears to >> be proprietary, I have no method of understanding why the driver now >> uses WRITE BUFFER as well, but it appears to be a simple optimization. >> If you can cut out the WRITE BUFFER code, blk_insert_cloned_request() >> will also be gone and thus the API abuse. Can you get us a simple >> patch doing this ASAP so we don't have to revert the driver? > > If WRITE BUFFER is not used, only READs with a size of 32KB or less can be > changed to HPB READs. This becomes a limiting factor in how READ > performance can be improved by the HPB. > > Thanks, > Daejun >