Re: [kvm-unit-tests PATCH v4 04/12] pci: Rework pci_bar_addr()

Andrew Jones <drjones@xxxxxxxxxx> · Tue, 7 Jun 2016 16:08:22 +0200

On Tue, Jun 07, 2016 at 12:33:54PM +0200, Alexander Gordeev wrote:
> On Tue, Jun 07, 2016 at 09:03:18AM +0200, Andrew Jones wrote:
> > On Tue, Jun 07, 2016 at 08:38:55AM +0200, Alexander Gordeev wrote:
> > > On Mon, Jun 06, 2016 at 04:12:03PM +0200, Andrew Jones wrote:
> > > > > +phys_addr_t pci_bar_size(pcidevaddr_t dev, int bar_num)
> > > > >  {
> > > > >  	uint32_t bar = pci_config_readl(dev, PCI_BASE_ADDRESS_0 + bar_num * 4);
> > > > > +	phys_addr_t size = (int32_t)pci_bar_size32(dev, bar_num);
> > > > > +	phys_addr_t mask = (int32_t)pci_bar_mask(bar);
> > > > >  
> > > > > -	if (bar & PCI_BASE_ADDRESS_SPACE_IO)
> > > > > -		return bar & PCI_BASE_ADDRESS_IO_MASK;
> > > > > -	else
> > > > > -		return bar & PCI_BASE_ADDRESS_MEM_MASK;
> > > > > +	if (pci_bar_is64(dev, bar_num)) {
> > > > > +		uint32_t size_high = pci_bar_size32(dev, bar_num + 1);
> > > > > +		size = ((phys_addr_t)size_high << 32) | (uint32_t)size;
> > > > > +	}
> > > > > +
> > > > > +	return (~(size & mask)) + 1;
> > > > 
> > > > All this casting of size and mask is pointless. Please rework it
> > > > similar to what I did above.
> > > 
> > > This is the most compact and straight variant I was able to come up with:
> > > 
> > > 	uint32_t bar = pci_bar_get(dev, bar_num);
> > > 	phys_addr_t size = (int32_t)pci_bar_size32(dev, bar_num);
> > 
> > But this is wrong. If your 32-bit size was 0x80000000, then you now
> > say it's 0xffffffff80000000.
> 
> Hmm.. I am either terribly missing the point or we are on different
> pages.

Let's make sure I know what the function is supposed to do.

Based on your comment explaining how to get the PCI bar size,
it sounds like we should do these steps; let's assume a device
needs size=0x200000, and is 32-bit

 read-write1s-dance	barsz=0x00000000
 readl			barsz=0xffe0000c
 mask			barsz=0xffe00000

 barsz = ~barsz + 1	barsz=0x00200000

Now let's say it's 64-bit

 read-write1s-dance	barsz_lo=0x00000000
 readl			barsz_lo=0xffe0000c
 mask			barsz_lo=0xffe00000
 read-write1s-dance	barsz_hi=0x00000000
 readl			barsz_hi=0xffffffff
 <upper 32-bit mask is 0xffffffff, so don't bother masking at all>

 barsz = ~((phys_addr_t)barsz_hi << 32 | barsz_lo) + 1 (barsz=0x200000)

So you need 5 functions

uint32_t get_mask(int bar)
{
...
}

uint32_t read_dance(int bar)
{
  read-write1s-dance
  return readl
}

uint32_t get_size32(int bar)
{
  uint32_t size = read_dance(bar) & get_mask(bar);
  return ~size + 1;
}

uint64_t get_size64(int bar)
{
  uint64_t size = read_dance(bar) & get_mask(bar);
  uint64_t size_hi = read_dance(bar + 1);

  size |= size_hi << 32;
  return ~size + 1;
}

uint64_t get_size(int bar)
{
  return bar_size(bar) == BAR64 ? get_size64(bar) : get_size32(bar);
}

> 
> So if pci_bar_size32() returned 0x80000000 the size sign-extension
> gives 0xffffffff80000000 and the mask sign extension gives (i.e.
> mmio) 0xfffffffffffffff0. The AND gives 0xffffffff80000000 and the
> NOT gives 0x7fffffff. Finally, 0x7fffffff + 1 gives 0x80000000.

I don't see the point of using the upper 32-bits to calculate a
32-bit value from 32-bit inputs.

> 
> If we do not sign-extend (or explicitly OR with 0xffffffff00000000)
> size and/or mask then the return(~(size & mask)) + 1 gives a wrong
> 0xffffffff80000000.
> 
> > > 	phys_addr_t mask = (int32_t)pci_bar_mask(bar);
> > 
> > It might be OK to do that here, on a mask, but even if it is, then I
> > don't like it, because it's too subtle (like I said for the casting
> > in pci_bar_addr in my last reply). I don't like that it requires us to
> > know that masking bit 31 in a 32-bit mask means we also want to mask
> > 63..32. That should at least be in a comment somewhere.
> 
> Yes, but it is not an arbitrary mask, it is an alignment mask.
> We unconditionally want to mask 63..32, 31th and even lower.

If you use sign-extension to generate the alignment mask, then
it needs a comment. But if we can avoid the headache of 64-bit
data computing 32-bit results then let's avoid it. (Even though
you described to me how it's supposed to work, I'm still not
sure it's safe for all cases :-)

> 
> > > 	if (pci_bar_is64(dev, bar_num))
> > > 		size |= (phys_addr_t)pci_bar_size32(dev, bar_num + 1) << 32;
> > > 
> > > 	return (~(size & mask)) + 1;
> > > 
> > > The casting is needed to avoid putting explicitly all 1s into higher
> > > bits of size and/or mask. Otherwise (~(size & mask)) + 1 expression would
> > > not bring correct results. I really struggle to make something better
> > > readable.
> > 
> > It's not just about readability, it's about correctness. You shouldn't
> > use 32-bit sizes/masks with 64-bit data this way. Like I said in the
> > last reply, you should rework it like I showed you, operate on the
> > 32-bit data with the 32-bit mask/size, and then eventually construct
> > 64-bit data.
> 
> It is all about those last +1 and shifting bits from lower to upper
> word of 64-bit value. My hope was to off-load it to the compiler :)

I think the problem comes from trying to apply a mask to the upper bits,
even though it doesn't need one. Off-loading to the compiler is a good
idea, but we can avoid being compilers ourselves by writing short, simple
to understand functions, and then letting the compiler inline and optimize
them for us :-)

Thanks,
drew
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html