Re: [kvm-unit-tests PATCH v4 04/12] pci: Rework pci_bar_addr()

Alexander Gordeev <agordeev@xxxxxxxxxx> · Tue, 7 Jun 2016 22:00:42 +0200

On Tue, Jun 07, 2016 at 04:08:22PM +0200, Andrew Jones wrote:
> On Tue, Jun 07, 2016 at 12:33:54PM +0200, Alexander Gordeev wrote:
> > On Tue, Jun 07, 2016 at 09:03:18AM +0200, Andrew Jones wrote:
> > > On Tue, Jun 07, 2016 at 08:38:55AM +0200, Alexander Gordeev wrote:
> > > > On Mon, Jun 06, 2016 at 04:12:03PM +0200, Andrew Jones wrote:
> > > > > > +phys_addr_t pci_bar_size(pcidevaddr_t dev, int bar_num)
> > > > > >  {
> > > > > >  	uint32_t bar = pci_config_readl(dev, PCI_BASE_ADDRESS_0 + bar_num * 4);
> > > > > > +	phys_addr_t size = (int32_t)pci_bar_size32(dev, bar_num);
> > > > > > +	phys_addr_t mask = (int32_t)pci_bar_mask(bar);
> > > > > >  
> > > > > > -	if (bar & PCI_BASE_ADDRESS_SPACE_IO)
> > > > > > -		return bar & PCI_BASE_ADDRESS_IO_MASK;
> > > > > > -	else
> > > > > > -		return bar & PCI_BASE_ADDRESS_MEM_MASK;
> > > > > > +	if (pci_bar_is64(dev, bar_num)) {
> > > > > > +		uint32_t size_high = pci_bar_size32(dev, bar_num + 1);
> > > > > > +		size = ((phys_addr_t)size_high << 32) | (uint32_t)size;
> > > > > > +	}
> > > > > > +
> > > > > > +	return (~(size & mask)) + 1;
> > > > > 
> > > > > All this casting of size and mask is pointless. Please rework it
> > > > > similar to what I did above.
> > > > 
> > > > This is the most compact and straight variant I was able to come up with:
> > > > 
> > > > 	uint32_t bar = pci_bar_get(dev, bar_num);
> > > > 	phys_addr_t size = (int32_t)pci_bar_size32(dev, bar_num);
> > > 
> > > But this is wrong. If your 32-bit size was 0x80000000, then you now
> > > say it's 0xffffffff80000000.
> > 
> > Hmm.. I am either terribly missing the point or we are on different
> > pages.
> 
> Let's make sure I know what the function is supposed to do.
> 
> Based on your comment explaining how to get the PCI bar size,
> it sounds like we should do these steps; let's assume a device
> needs size=0x200000, and is 32-bit

I am not quite sure what barsz=0x00000000 at read-write1s-dance	
phase is, but otherwise it seems correct to me (well may be a nit -
the 32-bit variant denoted 64-bit mmio).

>  read-write1s-dance	barsz=0x00000000
>  readl			barsz=0xffe0000c
>  mask			barsz=0xffe00000
> 
>  barsz = ~barsz + 1	barsz=0x00200000
> 
> Now let's say it's 64-bit
> 
>  read-write1s-dance	barsz_lo=0x00000000
>  readl			barsz_lo=0xffe0000c
>  mask			barsz_lo=0xffe00000
>  read-write1s-dance	barsz_hi=0x00000000
>  readl			barsz_hi=0xffffffff
>  <upper 32-bit mask is 0xffffffff, so don't bother masking at all>
> 
>  barsz = ~((phys_addr_t)barsz_hi << 32 | barsz_lo) + 1 (barsz=0x200000)
> 
> So you need 5 functions
> 
> uint32_t get_mask(int bar)
> {
> ...
> }
> 
> uint32_t read_dance(int bar)
> {
>   read-write1s-dance
>   return readl
> }
> 
> uint32_t get_size32(int bar)
> {
>   uint32_t size = read_dance(bar) & get_mask(bar);
>   return ~size + 1;
> }
> 
> uint64_t get_size64(int bar)
> {
>   uint64_t size = read_dance(bar) & get_mask(bar);
>   uint64_t size_hi = read_dance(bar + 1);
> 
>   size |= size_hi << 32;
>   return ~size + 1;
> }
> 
> uint64_t get_size(int bar)
> {
>   return bar_size(bar) == BAR64 ? get_size64(bar) : get_size32(bar);
> }

If the above is a pseudo code or you want me to rework using these
functions? If not, below is how it could look like.

Also, looking at your get_size() I noticed a bug in my version :-)
when read_dance() is followed by readl() returns zero - that means no
(more) BAR(s).

phys_addr_t pci_bar_size(pcidevaddr_t dev, int bar_num)
{
	uint32_t size;
	uint32_t bar;

	size = pci_bar_size32(dev, bar_num);
	if (!size)
		return 0;

	bar = pci_bar_get(dev, bar_num);
	size &= pci_bar_mask(bar);

	if (pci_bar_is64(dev, bar_num)) {
		phys_addr_t size64 = pci_bar_size32(dev, bar_num + 1);
		size64 = (size64 << 32) | size;

		return ~size64 + 1;
	} else {
		return ~size + 1;
	}
}

> > So if pci_bar_size32() returned 0x80000000 the size sign-extension
> > gives 0xffffffff80000000 and the mask sign extension gives (i.e.
> > mmio) 0xfffffffffffffff0. The AND gives 0xffffffff80000000 and the
> > NOT gives 0x7fffffff. Finally, 0x7fffffff + 1 gives 0x80000000.
> 
> I don't see the point of using the upper 32-bits to calculate a
> 32-bit value from 32-bit inputs.

It is indeed too complicated. My point has faded away :)

> > If we do not sign-extend (or explicitly OR with 0xffffffff00000000)
> > size and/or mask then the return(~(size & mask)) + 1 gives a wrong
> > 0xffffffff80000000.
> > 
> > > > 	phys_addr_t mask = (int32_t)pci_bar_mask(bar);
> > > 
> > > It might be OK to do that here, on a mask, but even if it is, then I
> > > don't like it, because it's too subtle (like I said for the casting
> > > in pci_bar_addr in my last reply). I don't like that it requires us to
> > > know that masking bit 31 in a 32-bit mask means we also want to mask
> > > 63..32. That should at least be in a comment somewhere.
> > 
> > Yes, but it is not an arbitrary mask, it is an alignment mask.
> > We unconditionally want to mask 63..32, 31th and even lower.
> 
> If you use sign-extension to generate the alignment mask, then
> it needs a comment. But if we can avoid the headache of 64-bit
> data computing 32-bit results then let's avoid it. (Even though
> you described to me how it's supposed to work, I'm still not
> sure it's safe for all cases :-)
> 
> > 
> > > > 	if (pci_bar_is64(dev, bar_num))
> > > > 		size |= (phys_addr_t)pci_bar_size32(dev, bar_num + 1) << 32;
> > > > 
> > > > 	return (~(size & mask)) + 1;
> > > > 
> > > > The casting is needed to avoid putting explicitly all 1s into higher
> > > > bits of size and/or mask. Otherwise (~(size & mask)) + 1 expression would
> > > > not bring correct results. I really struggle to make something better
> > > > readable.
> > > 
> > > It's not just about readability, it's about correctness. You shouldn't
> > > use 32-bit sizes/masks with 64-bit data this way. Like I said in the
> > > last reply, you should rework it like I showed you, operate on the
> > > 32-bit data with the 32-bit mask/size, and then eventually construct
> > > 64-bit data.
> > 
> > It is all about those last +1 and shifting bits from lower to upper
> > word of 64-bit value. My hope was to off-load it to the compiler :)
> 
> I think the problem comes from trying to apply a mask to the upper bits,
> even though it doesn't need one. Off-loading to the compiler is a good
> idea, but we can avoid being compilers ourselves by writing short, simple
> to understand functions, and then letting the compiler inline and optimize
> them for us :-)
> 
> Thanks,
> drew
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html