On Tue, Jun 07, 2016 at 12:33:54PM +0200, Alexander Gordeev wrote: > On Tue, Jun 07, 2016 at 09:03:18AM +0200, Andrew Jones wrote: > > On Tue, Jun 07, 2016 at 08:38:55AM +0200, Alexander Gordeev wrote: > > > On Mon, Jun 06, 2016 at 04:12:03PM +0200, Andrew Jones wrote: > > > > > +phys_addr_t pci_bar_size(pcidevaddr_t dev, int bar_num) > > > > > { > > > > > uint32_t bar = pci_config_readl(dev, PCI_BASE_ADDRESS_0 + bar_num * 4); > > > > > + phys_addr_t size = (int32_t)pci_bar_size32(dev, bar_num); > > > > > + phys_addr_t mask = (int32_t)pci_bar_mask(bar); > > > > > > > > > > - if (bar & PCI_BASE_ADDRESS_SPACE_IO) > > > > > - return bar & PCI_BASE_ADDRESS_IO_MASK; > > > > > - else > > > > > - return bar & PCI_BASE_ADDRESS_MEM_MASK; > > > > > + if (pci_bar_is64(dev, bar_num)) { > > > > > + uint32_t size_high = pci_bar_size32(dev, bar_num + 1); > > > > > + size = ((phys_addr_t)size_high << 32) | (uint32_t)size; > > > > > + } > > > > > + > > > > > + return (~(size & mask)) + 1; > > > > > > > > All this casting of size and mask is pointless. Please rework it > > > > similar to what I did above. > > > > > > This is the most compact and straight variant I was able to come up with: > > > > > > uint32_t bar = pci_bar_get(dev, bar_num); > > > phys_addr_t size = (int32_t)pci_bar_size32(dev, bar_num); > > > > But this is wrong. If your 32-bit size was 0x80000000, then you now > > say it's 0xffffffff80000000. > > Hmm.. I am either terribly missing the point or we are on different > pages. Let's make sure I know what the function is supposed to do. Based on your comment explaining how to get the PCI bar size, it sounds like we should do these steps; let's assume a device needs size=0x200000, and is 32-bit read-write1s-dance barsz=0x00000000 readl barsz=0xffe0000c mask barsz=0xffe00000 barsz = ~barsz + 1 barsz=0x00200000 Now let's say it's 64-bit read-write1s-dance barsz_lo=0x00000000 readl barsz_lo=0xffe0000c mask barsz_lo=0xffe00000 read-write1s-dance barsz_hi=0x00000000 readl barsz_hi=0xffffffff <upper 32-bit mask is 0xffffffff, so don't bother masking at all> barsz = ~((phys_addr_t)barsz_hi << 32 | barsz_lo) + 1 (barsz=0x200000) So you need 5 functions uint32_t get_mask(int bar) { ... } uint32_t read_dance(int bar) { read-write1s-dance return readl } uint32_t get_size32(int bar) { uint32_t size = read_dance(bar) & get_mask(bar); return ~size + 1; } uint64_t get_size64(int bar) { uint64_t size = read_dance(bar) & get_mask(bar); uint64_t size_hi = read_dance(bar + 1); size |= size_hi << 32; return ~size + 1; } uint64_t get_size(int bar) { return bar_size(bar) == BAR64 ? get_size64(bar) : get_size32(bar); } > > So if pci_bar_size32() returned 0x80000000 the size sign-extension > gives 0xffffffff80000000 and the mask sign extension gives (i.e. > mmio) 0xfffffffffffffff0. The AND gives 0xffffffff80000000 and the > NOT gives 0x7fffffff. Finally, 0x7fffffff + 1 gives 0x80000000. I don't see the point of using the upper 32-bits to calculate a 32-bit value from 32-bit inputs. > > If we do not sign-extend (or explicitly OR with 0xffffffff00000000) > size and/or mask then the return(~(size & mask)) + 1 gives a wrong > 0xffffffff80000000. > > > > phys_addr_t mask = (int32_t)pci_bar_mask(bar); > > > > It might be OK to do that here, on a mask, but even if it is, then I > > don't like it, because it's too subtle (like I said for the casting > > in pci_bar_addr in my last reply). I don't like that it requires us to > > know that masking bit 31 in a 32-bit mask means we also want to mask > > 63..32. That should at least be in a comment somewhere. > > Yes, but it is not an arbitrary mask, it is an alignment mask. > We unconditionally want to mask 63..32, 31th and even lower. If you use sign-extension to generate the alignment mask, then it needs a comment. But if we can avoid the headache of 64-bit data computing 32-bit results then let's avoid it. (Even though you described to me how it's supposed to work, I'm still not sure it's safe for all cases :-) > > > > if (pci_bar_is64(dev, bar_num)) > > > size |= (phys_addr_t)pci_bar_size32(dev, bar_num + 1) << 32; > > > > > > return (~(size & mask)) + 1; > > > > > > The casting is needed to avoid putting explicitly all 1s into higher > > > bits of size and/or mask. Otherwise (~(size & mask)) + 1 expression would > > > not bring correct results. I really struggle to make something better > > > readable. > > > > It's not just about readability, it's about correctness. You shouldn't > > use 32-bit sizes/masks with 64-bit data this way. Like I said in the > > last reply, you should rework it like I showed you, operate on the > > 32-bit data with the 32-bit mask/size, and then eventually construct > > 64-bit data. > > It is all about those last +1 and shifting bits from lower to upper > word of 64-bit value. My hope was to off-load it to the compiler :) I think the problem comes from trying to apply a mask to the upper bits, even though it doesn't need one. Off-loading to the compiler is a good idea, but we can avoid being compilers ourselves by writing short, simple to understand functions, and then letting the compiler inline and optimize them for us :-) Thanks, drew -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html