Re: [PATCH v3] misc: pci_endpoint_test: Use memcpy_toio()/memcpy_fromio() for BAR tests

On 3/21/24 1:45 AM, Niklas Cassel wrote:
> Hello Kuppuswamy,
>
> On Wed, Mar 20, 2024 at 08:53:12AM -0700, Kuppuswamy Sathyanarayanan wrote:
>> Hi,
>>
>> On 3/20/24 2:01 AM, Niklas Cassel wrote:
>>> The current code uses writel()/readl(), which has an implicit memory
>>> barrier for every single readl()/writel().
>>>
>>> Additionally, reading 4 bytes at a time over the PCI bus is not really
>>> optimal, considering that this code is running in an ioctl handler.
>>>
>>> Use memcpy_toio()/memcpy_fromio() for BAR tests.
>>>
>>> Before patch with a 4MB BAR:
>>> $ time /usr/bin/pcitest -b 1
>>> BAR1:           OKAY
>>> real    0m 1.56s
>>>
>>> After patch with a 4MB BAR:
>>> $ time /usr/bin/pcitest -b 1
>>> BAR1:           OKAY
>>> real    0m 0.54s
>>>
>>> Signed-off-by: Niklas Cassel <cassel@xxxxxxxxxx>
>>> ---
>>> Changes since v2:
>>> -Actually free the allocated memory... (thank you Kuppuswamy)
>>>
>>>  drivers/misc/pci_endpoint_test.c | 68 ++++++++++++++++++++++++++------
>>>  1 file changed, 55 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
>>> index 705029ad8eb5..1d361589fb61 100644
>>> --- a/drivers/misc/pci_endpoint_test.c
>>> +++ b/drivers/misc/pci_endpoint_test.c
>>> @@ -272,33 +272,75 @@ static const u32 bar_test_pattern[] = {
>>>  	0xA5A5A5A5,
>>>  };
>>>  
>>> +static int pci_endpoint_test_bar_memcmp(struct pci_endpoint_test *test,
>>> +					enum pci_barno barno, int offset,
>>> +					void *write_buf, void *read_buf,
>>> +					int size)
>>> +{
>>> +	memset(write_buf, bar_test_pattern[barno], size);
>>> +	memcpy_toio(test->bar[barno] + offset, write_buf, size);
>>> +
>>> +	memcpy_fromio(read_buf, test->bar[barno] + offset, size);
>>> +
>>> +	return memcmp(write_buf, read_buf, size);
>>> +}
>>> +
>>>  static bool pci_endpoint_test_bar(struct pci_endpoint_test *test,
>>>  				  enum pci_barno barno)
>>>  {
>>> -	int j;
>>> -	u32 val;
>>> -	int size;
>>> +	int j, bar_size, buf_size, iters, remain;
>>> +	void *write_buf;
>>> +	void *read_buf;
>>>  	struct pci_dev *pdev = test->pdev;
>>> +	bool ret;
>>>  
>>>  	if (!test->bar[barno])
>>>  		return false;
>>>  
>>> -	size = pci_resource_len(pdev, barno);
>>> +	bar_size = pci_resource_len(pdev, barno);
>>>  
>>>  	if (barno == test->test_reg_bar)
>>> -		size = 0x4;
>>> +		bar_size = 0x4;
>>>  
>>> -	for (j = 0; j < size; j += 4)
>>> -		pci_endpoint_test_bar_writel(test, barno, j,
>>> -					     bar_test_pattern[barno]);
>>> +	buf_size = min(SZ_1M, bar_size);
>> Why 1MB  limit?
> Could you please clarify your concern?

Since you are trying to optimize the number of read/write calls, I
was just wondering why you chose a maximum limit of 1MB per
read/write call. But your explanation below makes sense to
me. I recommend noting the rationale in the commit log or in a
code comment.

Code wise, your change looks fine to me.

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>

>
> A BAR could be several GB, so it does not make sense to always kmalloc()
> a buffer that is of the same size of the BAR.
> (Therefore we copy in to a smaller buffer, iterating over the whole BAR.)
>
> So we have to choose a max limit that we think is likely to succeed even
> when the memory is fragmented, and something that will work on embedded
> systems, etc.
>
> The highest BAR size used by pci-epf-test is by default 1MB, so 1MB
> seemed like a reasonable max limit. (Since we use min(), if the BAR is
> smaller than 1MB, the buffer we allocate will also be smaller than 1MB.)
>
> Since we allocate two buffers, we are in the worst case allocating 2x 1MB,
> so I don't think that it is reasonable to have a higher max limit.
>
> If you are using a _very_ resource-constrained system as RC (and EP) to test
> the pci-epf-test driver, you have probably reduced the default BAR sizes
> defined in pci-epf-test to something smaller already, so 1MB seemed like
> a reasonable max limit.
>
>
> Kind regards,
> Niklas
>
>>>  
>>> -	for (j = 0; j < size; j += 4) {
>>> -		val = pci_endpoint_test_bar_readl(test, barno, j);
>>> -		if (val != bar_test_pattern[barno])
>>> -			return false;
>>> +	write_buf = kmalloc(buf_size, GFP_KERNEL);
>>> +	if (!write_buf)
>>> +		return false;
>>> +
>>> +	read_buf = kmalloc(buf_size, GFP_KERNEL);
>>> +	if (!read_buf) {
>>> +		ret = false;
>>> +		goto err;
>>>  	}
>>>  
>>> -	return true;
>>> +	iters = bar_size / buf_size;
>>> +	for (j = 0; j < iters; j++) {
>>> +		if (pci_endpoint_test_bar_memcmp(test, barno, buf_size * j,
>>> +						 write_buf, read_buf,
>>> +						 buf_size)) {
>>> +			ret = false;
>>> +			goto err;
>>> +		}
>>> +	}
>>> +
>>> +	remain = bar_size % buf_size;
>>> +	if (remain) {
>>> +		if (pci_endpoint_test_bar_memcmp(test, barno, buf_size * iters,
>>> +						 write_buf, read_buf,
>>> +						 remain)) {
>>> +			ret = false;
>>> +			goto err;
>>> +		}
>>> +	}
>>> +
>>> +	ret = true;
>>> +
>>> +err:
>>> +	kfree(write_buf);
>>> +	kfree(read_buf);
>>> +
>>> +	return ret;
>>>  }
>>>  
>>>  static bool pci_endpoint_test_intx_irq(struct pci_endpoint_test *test)
-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer




