> -----Original Message----- > From: Guo Ren <guoren@xxxxxxxxxx> > Sent: 06 June 2021 22:42 > To: Anup Patel <Anup.Patel@xxxxxxx>; Atish Patra <atishp@xxxxxxxxxxxxxx> > Cc: Palmer Dabbelt <palmer@xxxxxxxxxxx>; anup@xxxxxxxxxxxxxx; > drew@xxxxxxxxxxxxxxx; Christoph Hellwig <hch@xxxxxx>; wefu@xxxxxxxxxx; > lazyparser@xxxxxxxxx; linux-riscv@xxxxxxxxxxxxxxxxxxx; linux- > kernel@xxxxxxxxxxxxxxx; linux-arch@xxxxxxxxxxxxxxx; linux- > sunxi@xxxxxxxxxxxxxxx; guoren@xxxxxxxxxxxxxxxxx; Paul Walmsley > <paul.walmsley@xxxxxxxxxx> > Subject: Re: [PATCH RFC 0/3] riscv: Add DMA_COHERENT support > > Hi Anup and Atish, > > On Thu, Jun 3, 2021 at 2:00 PM Anup Patel <Anup.Patel@xxxxxxx> wrote: > > > > > > > > > -----Original Message----- > > > From: Palmer Dabbelt <palmer@xxxxxxxxxxx> > > > Sent: 03 June 2021 09:43 > > > To: guoren@xxxxxxxxxx > > > Cc: anup@xxxxxxxxxxxxxx; drew@xxxxxxxxxxxxxxx; Christoph Hellwig > > > <hch@xxxxxx>; Anup Patel <Anup.Patel@xxxxxxx>; wefu@xxxxxxxxxx; > > > lazyparser@xxxxxxxxx; linux-riscv@xxxxxxxxxxxxxxxxxxx; linux- > > > kernel@xxxxxxxxxxxxxxx; linux-arch@xxxxxxxxxxxxxxx; linux- > > > sunxi@xxxxxxxxxxxxxxx; guoren@xxxxxxxxxxxxxxxxx; Paul Walmsley > > > <paul.walmsley@xxxxxxxxxx> > > > Subject: Re: [PATCH RFC 0/3] riscv: Add DMA_COHERENT support > > > > > > On Sat, 29 May 2021 17:30:18 PDT (-0700), Palmer Dabbelt wrote: > > > > On Fri, 21 May 2021 17:36:08 PDT (-0700), guoren@xxxxxxxxxx wrote: > > > >> On Wed, May 19, 2021 at 3:15 PM Anup Patel <anup@xxxxxxxxxxxxxx> > > > wrote: > > > >>> > > > >>> On Wed, May 19, 2021 at 12:24 PM Drew Fustini > > > <drew@xxxxxxxxxxxxxxx> wrote: > > > >>> > > > > >>> > On Wed, May 19, 2021 at 08:06:17AM +0200, Christoph Hellwig > > > wrote: > > > >>> > > On Wed, May 19, 2021 at 02:05:00PM +0800, Guo Ren wrote: > > > >>> > > > Since the existing RISC-V ISA cannot solve this problem, > > > >>> > > > it is better to provide some configuration for the SOC > > > >>> > > > vendor to > > > customize. > > > >>> > > > > > >>> > > We've been talking about this problem for close to five years. > > > >>> > > So no, if you don't manage to get the feature into the ISA > > > >>> > > it can't be supported. > > > >>> > > > > >>> > Isn't it a good goal for Linux to support the capabilities > > > >>> > present in the SoC that a currently being fab'd? > > > >>> > > > > >>> > I believe the CMO group only started last year [1] so the > > > >>> > RV64GC SoCs that are going into mass production this year > > > >>> > would not have had the opporuntiy of utilizing any RISC-V ISA > > > >>> > extension for handling cache management. > > > >>> > > > >>> The current Linux RISC-V policy is to only accept patches for > > > >>> frozen or ratified ISA specs. > > > >>> (Refer, Documentation/riscv/patch-acceptance.rst) > > > >>> > > > >>> This means even if emulate CMO instructions in OpenSBI, the > > > >>> Linux patches won't be taken by Palmer because CMO specification > > > >>> is still in draft stage. > > > >> Before CMO specification release, could we use a sbi_ecall to > > > >> solve the current problem? This is not against the specification, > > > >> when CMO is ready we could let users choose to use the new CMO in > Linux. > > > >> > > > >> From a tech view, CMO trap emulation is the same as sbi_ecall. > > > >> > > > >>> > > > >>> Also, we all know how much time it takes for RISCV international > > > >>> to freeze some spec. Judging by that we are looking at another > > > >>> 3-4 years at minimum. > > > > > > > > Sorry for being slow here, this thread got buried. > > > > > > > > I've been trying to work with a handful of folks at the RISC-V > > > > foundation to try and get a subset of the various in-development > > > > specifications (some simple CMOs, something about non-caching in > > > > the page tables, and some way to prevent speculative accesse from > > > > generating coherence traffic that will break non-coherent systems). > > > > I'm not sure we can get this together quickly, but I'd prefer to > > > > at least try before we jump to taking vendor-specificed behavior here. > > > > It's obviously an up-hill battle to try and get specifications > > > > through the process and I'm certainly not going to promise it will > > > > work, but I'm hoping that the impending need to avoid forking the > > > > ISA will be sufficient to get people behind producing some > > > > specifications in a timely > > > fashion. > > > > > > > > I wasn't aware than this chip had non-coherent devices until I saw > > > > this thread, so we'd been mostly focused on the Beagle V chip. > > > > That was in a sense an easier problem because the SiFive IP in it > > > > was never designed to have non-coherent devices so we'd have to > > > > make anything work via a series of slow workarounds, which would > > > > make emulating the eventually standardized behavior reasonable in > > > > terms of performance (ie, everything would be super slow so who really > cares). > > > > > > > > I don't think relying on some sort of SBI call for the CMOs whould > > > > be such a performance hit that it would prevent these systems from > > > > being viable, but assuming you have reasonable performance on your > > > > non-cached accesses then that's probably not going to be viable to > > > > trap and emulate. At that point it really just becomes silly to > > > > pretend that we're still making things work by emulating the > > > > eventually ratified behavior, as anyone who actually tries to use > > > > this thing to do IO would need out of tree patches. I'm not sure > > > > exactly what the plan is for the page table bits in the > > > > specification right now, but if you can give me a pointer to some > > > > documentation then I'm happy to try and push for something > compatible. > > > > > > > > If we can't make the process work at the foundation then I'd be > > > > strongly in favor of just biting the bullet and starting to take > > > > vendor-specific code that's been implemented in hardware and is > > > > necessarry to make things work acceptably. That's obviously a > > > > sub-optimal solution as it'll lead to a bunch of ISA > > > > fragmentation, but at least we'll be able to keep the software stack > together. > > > > > > > > Can you tell us when these will be in the hands of users? That's > > > > pretty important here, as I don't want to be blocking real users > > > > from having their hardware work. IIRC there were some plans to > > > > distribute early boards, but it looks like the foundation got > > > > involved and I guess I lost the thread at that point. > > > > > > > > Sorry this is all such a headache, but hopefully we can get things > > > > sorted out. > > > > > > I talked with some of the RISC-V foundation folks, we're not going > > > to have an ISA specification for the non-coherent stuff any time > > > soon. I took a look at this code and I definately don't want to > > > take it as is, but I'm not opposed to taking something that makes the > hardware work as long as it's a lot cleaner. > > > We've already got two of these non-coherent chips, I'm sure more > > > will come, and I'd rather have the extra headaches than make > > > everyone fork the software stack. > > > > Thanks for confirming. The CMO extension is still in early stages so > > it will certainly take more time for them. After CMO extension is > > finalized, it will take some more time to have actual RISC-V platforms with > CMO implemented. > > > > > > > > After talking to Atish it looks like there's likely to be an SBI > > > extension to handle the CMOs, which should let us avoid the bulk of > > > the vendor-specific behavior in the kernel. I know some people are > > > worried about adding to the SBI surface. I'm worried about that > > > too, but that's way better than sticking a bunch of vendor-specific > > > instructions into the kernel. The SBI extension should make for a > > > straight-forward cache flush implementation in Linux, so let's just plan on > that getting through quickly (as has been done before). > > > > Yes, I agree. We can have just a single SBI call which is meant for > > DMA sync purpose only which means it will flush/invalidate pages from > > all cache levels irrespective of the cache hierarchy (i.e. > > flush/invalidate to RAM). The CMO extension might more generic cache > > operations which can target any cache level. > > > > I am already preparing a write-up for SBI DMA sync call in SBI spec. I > > will share it with you separately as well. > > > > > > > > Unfortunately we've yet to come up with a way to handle the > > > non-cacheable mappings without introducing a degree of > > > vendor-specific behavior or seriously impacting performance (mark > > > them as not valid and deal with them in the trap handler). I'm not > > > really sure it counts as supporting the hardware if it's massively > > > slow, so that really leaves us with vendor-specific mappings as the only > option to make these chips work. > > > > A RISC-V platform can have non-cacheable mappings is following > > possible > > ways: > > 1) Fixed physical address range as non-cacheable using PMAs > > 2) Custom page table attributes > > 3) Svpmbt extension being defined by RVI > > > > Atish and me both think it is possible to have RISC-V specific DMA ops > > implementation which can handle all above case. Atish is already > > working on DMA ops implementation for RISC-V. > Not only DMA ops, but also icache_sync & __vdso_icache_sync. Please have a > look at: > https://lore.kernel.org/linux-riscv/1622970249-50770-12-git-send-email- > guoren@xxxxxxxxxx/T/#u The icache_sync and __vdso_icache_sync will have to be addressed differently. The SBI DMA sync extension cannot address this. It seems Allwinner D1 have more non-standard stuff: 1) Custom PTE bits for IO-coherent access 2) Custom data cache flush/invalidate for DMA sync 3) Custom icache flush/invalidate Other hand, BeagleV has only two problems: 1) Custom physical address range for IO-coherent access 2) Custom L2 cache flush/invalidate for DMA sync >From above #2, can be solved by SBI DMA sync call and Linux DMA ops for both BeagleV and Allwinner D1 On BeagleV, issue #1 can be solved using "dma-ranges". On Allwinner D1, issues #1 and #3 need to be addressed separately. I think supporting BeagleV in upstream Linux is relatively easy compared to Allwinner D1. @Guo, please check if you can reserve dedicated physical address range for IO-coherent access (just like BeagleV). If yes, then we can tackle issue #1 for Allwinner D1 using "dma-ranges" DT property. Regards, Anup > > > > > > > > > > This implementation, which adds some Kconfig entries that control > > > page table bits, definately isn't suitable for upstream. Allowing > > > users to set arbitrary page table bits will eventually conflict with > > > the standard, and is just going to be a mess. It'll also lead to > > > kernels that are only compatible with specific designs, which we're > > > trying very hard to avoid. At a bare minimum we'll need some way to > > > detect systems with these page table bits before setting them, and > > > some description of what the bits actually do so we can reason about > them. > > > > Yes, vendor specific Kconfig options are strict NO NO. We can't > > give-up the goal of unified kernel image for all platforms. > > > > Regards, > > Anup > > > > -- > Best Regards > Guo Ren > > ML: https://lore.kernel.org/linux-csky/