Re: RFC on writel and writel_relaxed

On Fri, 2018-03-23 at 10:35 -0600, Jason Gunthorpe wrote:
> On Fri, Mar 23, 2018 at 12:52:02AM +1100, Benjamin Herrenschmidt wrote:
> 
> > > >  - Make writel_relaxed() be a simple store without barriers, and
> > > > readl_relaxed() be "eieio, read, eieio", thus allowing write combining
> > > > to happen between successive writel_relaxed on WC space (no change on
> > > > normal NC space) while maintaining the ordering between relaxed reads
> > > > and writes. The flip side is a (slight) increased overhead of
> > > > readl_relaxed.
> > > 
> > > Are there many drivers that actually do writeX() on WC space?
> > > memory-barriers.txt pretty much says that all bets are off and no
> > > ordering guarantees can be assumed when using readX/writeX on
> > > prefetchable IO memory. It seems sketchy enough to give me some pause,
> > > but maybe it works fine elsewhere.
> > 
> > I don't know whether any do, but I want to provide a way for a
> > driver to somewhat reliably obtain write combine semantics without
> > having to hand code endian swap and other horrors involved with using
> > __raw_* accessors.
> 
> Many of the drivers in drivers/infiniband work with write combining
> memory.
> 
> The usual pattern is a desire to push 32 or 64 bytes to the WC BAR as
> efficiently as possible, ideally in a single PCI-E TLP.
> 
> A memcpy_to_wc primitive could probably cover these use cases, with no
> need to redesign the IO accessors.
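
For illustration only, here is a userspace sketch of what such a primitive
might look like, assuming the caller always hands it a multiple of 8 bytes.
The name memcpy_to_wc_sketch is hypothetical and a plain array stands in for
the WC BAR. A real kernel version would have to go through the proper
__iomem accessors (the existing __iowrite64_copy() has a similar shape), and
nothing in C itself guarantees that the stores get combined into one TLP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not an existing kernel API): copy 'bytes' bytes
 * from 'src' to a write-combining mapping 'dst' using ascending 64-bit
 * stores. Whether the stores are merged into a single PCI-E TLP is up to
 * the CPU's write-combining logic, not to this code.
 */
static void memcpy_to_wc_sketch(volatile uint64_t *dst, const void *src,
                                size_t bytes)
{
    const unsigned char *s = src;
    size_t i;

    /* Assumption: callers push whole 64-bit words (e.g. 32 or 64 bytes). */
    assert(bytes % sizeof(uint64_t) == 0);

    for (i = 0; i < bytes / sizeof(uint64_t); i++) {
        uint64_t v;

        /* memcpy avoids alignment and aliasing problems on the source. */
        memcpy(&v, s + i * sizeof(v), sizeof(v));
        dst[i] = v;
    }
}
```

A driver would call this once per 32 or 64 byte descriptor while holding
its lock, and follow it with whatever barrier the architecture needs to
flush the WC buffer.
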
> 
> The WC memory is never read, so read/write order is not important to
> any infiniband driver.
> 
> What is very important is keeping the WC behavior isolated within the
> spinlock. WC to the same addresses cannot be permitted in this pattern:
> 
>    writel(addr = 0);
>    mmiowmb();
>    spin_unlock();
>    spin_lock();
>    writel(addr = 0);
> 
> The CPU must always generate two PCI-E TLPs to the device.

On powerpc you'll never get write combining with writel. So that at
least is covered.
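
To make the two-TLP requirement concrete, here is a deliberately crude
userspace model, not real hardware behaviour. It assumes a single pending
write-combining entry that absorbs repeated stores to the same address until
a barrier drains it. All names (wc_model, wc_store, wc_flush,
doorbell_twice) are hypothetical, and wc_flush merely stands in for the
effect of mmiowmb() in the quoted pattern.

```c
#include <stdint.h>

/* Toy model: one pending write-combining entry plus a TLP counter. */
struct wc_model {
    int pending;      /* 1 while a store is waiting in the buffer */
    uint32_t addr;    /* address of the pending store */
    uint32_t val;     /* value of the pending store */
    unsigned tlps;    /* TLPs emitted to the device so far */
};

/* Stand-in for a barrier: drain the pending entry as one TLP. */
static void wc_flush(struct wc_model *m)
{
    if (m->pending) {
        m->tlps++;
        m->pending = 0;
    }
}

/* Store to WC space: a same-address store is merged into the buffer. */
static void wc_store(struct wc_model *m, uint32_t addr, uint32_t val)
{
    if (m->pending && m->addr != addr)
        wc_flush(m);  /* different address: emit the old entry first */
    m->pending = 1;
    m->addr = addr;
    m->val = val;
}

/* Two stores to address 0, with or without a barrier in between. */
static unsigned doorbell_twice(int use_barrier)
{
    struct wc_model m = {0};

    wc_store(&m, 0, 0);
    if (use_barrier)
        wc_flush(&m); /* plays the role of mmiowmb() before spin_unlock() */
    wc_store(&m, 0, 0);
    wc_flush(&m);     /* the buffer is drained eventually */
    return m.tlps;
}
```

In this model doorbell_twice(1) emits two TLPs, while doorbell_twice(0)
merges the two stores into one, which is exactly what the device must never
see.
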

> This is a super performance critical operation for most drivers and
> directly impacts network performance.
> 
> Jason