RE: PCIe EPF

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Prabhakar-san,

> From: Lad, Prabhakar, Sent: Wednesday, April 1, 2020 2:26 AM
<snip>
> > > > Did you try to probe the failure further by comparing the hexdumps? Where does
> > > > the mismatch happen?
> > > >
> > > I shall dump the memory and check the values, but basically crc is failing.
> >
> > I'm also interesting about comparing the hexdump between host and ep.
> > This is my gut feeling though, I'm guessing this is a timing issue
> > because using coherent memory API will be on non-cache even if CPU access,
> > but using steaming DMA API will be on cache if CPU access by
> > dma_unmap_single(DMA_FROM_DEVICE) and get_random_bytes() in pci_endpoint_test_write().
> >
> > If my guess is correct, almost all hexdumps are the same, but last
> > some bytes (I'm not sure how much bytes though) are not match.
> >
> So I did some experiments only focusing on read operation for now
> 
> 1:  I tried printing the crc value in pci_epf_test_write() function
> with below code snippet:
> 
> @@ -472,6 +474,9 @@ static int pci_epf_test_write(struct pci_epf_test *epf_test)
>          */
>         usleep_range(1000, 2000);
> 
> +       dev_err(dev, "%s %llx %llx csum: %x = %x\n", __func__, reg->dst_addr,
> +               phys_addr, reg->checksum, crc32_le(~0, dst_addr, reg->size));
> +
>  err_dma_map:
>         kfree(buf);
> 
> But with this I get :

I don't know why this happen for now.

<snip>
> 2: Instead of get_random_bytes() in pci_epf_test_write() I used
> memset() with the fixed value
<snip>
> +++ b/drivers/pci/endpoint/functions/pci-epf-test.c
> @@ -431,6 +431,8 @@ static int pci_epf_test_write(struct pci_epf_test *epf_test)
>         }
> 
>         get_random_bytes(buf, reg->size);
> +       memset(buf, 0xDF, reg->size);
>         reg->checksum = crc32_le(~0, buf, reg->size);
> 
> And later for the dst buffer I did memset for dst_buffer
> 
> @@ -472,6 +474,10 @@ static int pci_epf_test_write(struct pci_epf_test
> *epf_test)
>          */
>         usleep_range(1000, 2000);
> 
> +       memset(dst_addr, 0xDF, reg->size);

I think this memset is not needed because this function calls memcpy_toio(dst_addr, buf,..).

> +       dev_err(dev, "%s %llx %llx csum: %x\n", __func__, reg->dst_addr,
> +               phys_addr, reg->checksum);
> +
>  err_dma_map:
>         kfree(buf);
> 
> But with this I get similar issue as above:

I don't know why like the case 1 for now...

<snip>
> 
> 3: After making sure -s was aligned to 128 bytes memset and crc32_le
> in pci_epf_test_write() didnt hit above issues,
>  and now when I compared the crc value of both the buffers this
> matched on endpoint!

OK.

> 4: Now focusing more in rootdevice, I added memset(buf, 0xDF,
> reg->size); in pci_epf_test_write() function
> and was dumping the contents of buffer in pci_endpoint_test_read()
> with below function:
> 
> static void print_buffer(struct device *dev, void *buff_addr, size_t size)
> {
>     size_t i;
>     u64 *addr = buff_addr;
> 
>     for(i = 0; i < size; i += sizeof(addr))
>         dev_err(dev, "buf[%zu] : %llx\n", i, *addr);
> 
>     dev_err(dev, "\n\n\n\n");
> }
> 
> Below is the dump from pci_endpoint_test_read() of one passing case
> and other no-passing case:
> The contents of buffer match as expected, but the crc32_le is
> differrent and I am clue-less why is this
> being caused when using non-coherent memory.
> 
> root@hihope-rzg2m:~# pcitest -r -s 2176
<snip>
> [  528.556991] pci-endpoint-test 0000:01:00.0: pci_endpoint_test_read
> ffff0004b61f9000 7e951000 910c690d=910c690d

I'd like to know how to print these crc values. Your report on the case 1
mentioned on pci-epf-test.c like below though.

> +       dev_err(dev, "%s %llx %llx csum: %x = %x\n", __func__, reg->dst_addr,
> +               phys_addr, reg->checksum, crc32_le(~0, dst_addr, reg->size));

Also, as I mentioned on previous email, this is possible to cause timing issue.
So, I'd like to where you added the hexdump on pci_endpoint_test.c.
Perhaps, copy and pasting your whole debug diff is useful to understand about it.

> READ (   2176 bytes):           OKAY
> root@hihope-rzg2m:~# pcitest -r -s 2176
<snip>
> [  533.457921] pci-endpoint-test 0000:01:00.0: pci_endpoint_test_read
> ffff0004b61f9000 7e959800  ce535039=910c690d
> READ (   2176 bytes):           NOT OKAY
> 
> Note: for failure case the crc is always ce535039

The value of ce535039 is from pci_endpoint_test_readl(test, PCI_ENDPOINT_TEST_CHECKSUM)?
If so, it's strange because all 0xdf values with 2176 bytes should be 910c690d by crc32_le().

Best regards,
Yoshihiro Shimoda





[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux