Re: [PATCH v19 3/3] scsi: ufs: Prepare HPB read for cached sub-region

Bean Huo <huobean@xxxxxxxxx> · Tue, 09 Feb 2021 15:21:54 +0100

On Tue, 2021-02-09 at 13:25 +0000, Avri Altman wrote:
> > 
> > 
> > > > > +     put_unaligned_be64(ppn, &cdb[6]);
> > > > 
> > > > You are assuming the HPB entries read out by "HPB Read Buffer"
> > > > cmd
> > > > are
> > > > in Little
> > > > Endian, which is why you are using put_unaligned_be64 here.
> > > > However,
> > > > this assumption
> > > > is not right for all the other flash vendors - HPB entries read
> > > > out
> > > > by
> > > > "HPB Read Buffer"
> > > > cmd may come in Big Endian, if so, their random read
> > > > performance are
> > > > screwed.
> > > 
> > > For this question, it is very hard to make a correct format since
> > > the
> > > Spec doesn't give a clear definition. Should we have a default
> > > format,
> > > if there is conflict, and then add quirk or add a vendor-specific
> > > table?
> > > 
> > > Hi Avri
> > > Do you have a good idea?
> > 
> > I don't know.  Better let Daejun answer this.
> > This was working for me for both Galaxy S20 (Exynos) as well as
> > Xiaomi Mi10
> > (8250).
> 
> As for the endianity issue - 
> I don't think that any fix is needed in the hpb driver.
> It is readily seen that the ppn from get_ppn, and the one in the upiu
> cdb (upiu trace) are identical.
> Therefore, if an issue exist, it is IMHO a device issue.
> 
> kworker/u16:10-315   [001] d..2    62.283264: ufshpb_get_ppn: Avri
> ppn 480d2f8244c21abd
>   kworker/u16:10-315   [001] d..2    62.283336: ufshcd_upiu: v:1.10
> send: T:62283314922, HDR:014000000000000000000000,
> CDB:8800002ddaac480d2f8244c21abd0100, D:
> 
> Again, verified on both gs20 (exynos) and mi10 (8250).
> Thanks,
> Avri

Hi Avri,
Your testing method is no problem, the current problem is in function
ufshpb_get_ppn().

+static u64 ufshpb_get_ppn(struct ufshpb_lu *hpb,
+                         struct ufshpb_map_ctx *mctx, int pos, int
*error)
+{
+       u64 *ppn_table; 
+       struct page *page;
+       int index, offset;
+
+       index = pos / (PAGE_SIZE / HPB_ENTRY_SIZE);
+       offset = pos % (PAGE_SIZE / HPB_ENTRY_SIZE);
+
+       page = mctx->m_page[index];
+       if (unlikely(!page)) {
+               *error = -ENOMEM;
+               dev_err(&hpb->sdev_ufs_lu->sdev_dev,
+                       "error. cannot find page in mctx\n");
+               return 0;
+       }
+
+       ppn_table = page_address(page);
+       if (unlikely(!ppn_table)) {
+               *error = -ENOMEM;
+               dev_err(&hpb->sdev_ufs_lu->sdev_dev,
+                       "error. cannot get ppn_table\n");
+               return 0;
+       }
+
+       return ppn_table[offset];
+}

Say, the UFS device outputs the L2P entry in big-endian, which means
the most significant byte of an L2P entry will be output firstly, then
the less significant byte..., let's take an example of one L2P entry:

0x 12 34 56 78 90 12 34 56

0x12 is the most significant byte, will be store in the lowest address
in the L2P cache.

eg,

F0000008: 1234 5678 9012 3456

In the ARM based system, If we use "return ppn_table[offset]",
the original L2P entry 0x1234 5678 9012 3456, will be converted to
0x5634 1290 7856 3412. then use put_unaligned_be64(), UFS receive
unexpected L2P entry(L2P entry miss).

If the UFS output L2P entry in the big-endian, this is a problem.

For the UFS outputs L2P entry in little-endian, no problem,

Because of the L2P entry in the memory:

F0000008: 5634 1290 7856 3412

After return ppn_table[offset], L2P entry will be correct L2P entry:

0x1234567890123456. then use put_unaligned_be64(), UFS can receive
expected L2P etnry(L2P entry hit).

we need to figure out which way is the JEDEC recommended L2P entry
output endianness. otherwise, two methods co-exist in HPB driver, there
will confuse customer.
If you have a look at the JEDEC HPB 2.0, seems the big-endian is
correct way. This need you and Daejun to double check inside your
company.

thanks,
Bean